maxent_grpo.training.data

Dataset loading helpers for the training pipeline.

Functions

`_dataset_cache_path`(script_args, ...)	Resolve an on-disk cache directory for the processed dataset.
`_ensure_split_mapping`(dataset)	Coerce a dataset-like object into a split->dataset mapping.
`_format_eval_row`(example, *, prompt_column, ...)
`_normalize_eval_rows`(rows)
`_sample_eval_rows`(rows, keep, seed)
`_stable_hash`(value)	Return a short, stable hash for arbitrarily typed values.
`load_datasets`(script_args, training_args, ...)	Load train/eval datasets and return `(train_dataset, eval_rows)`.
`resolve_dataloader_kwargs`(training_args)	Return `torch.utils.data.DataLoader` kwargs derived from training_args.

maxent_grpo.training.data.load_datasets(script_args, training_args, tokenizer, *, accelerator=None)

Load train/eval datasets and return (train_dataset, eval_rows).

The helper handles prompt/answer column normalization, optional dataset caching, and prompt truncation. Evaluation rows are normalized into a list of dictionaries with prompt/answer keys.

Parameters:

script_args (Any) – Script arguments describing dataset identifiers and prompt/answer columns.
training_args (Any) – Training configuration providing prompt limits and cache settings.
tokenizer (Any) – Tokenizer used to format prompts.
accelerator (Any | None) – Optional Accelerator used for process synchronization.

Returns:

Tuple of the processed training dataset and a list of evaluation rows (possibly empty when eval is disabled).

Return type:

tuple[Any, list]

Raises:

ValueError – If required dataset columns are missing.
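The eval-row shape described above (a list of dictionaries with prompt/answer keys) can be illustrated with a minimal sketch. The helper name `to_prompt_answer_rows` and the column names `question` and `solution` are hypothetical and chosen only for illustration; this is not the module's `_normalize_eval_rows`, whose actual behavior may differ.

```python
from typing import Any, Iterable, Mapping


def to_prompt_answer_rows(
    rows: Iterable[Mapping[str, Any]],
    prompt_column: str,
    answer_column: str,
) -> list[dict[str, Any]]:
    """Hypothetical helper: map raw rows onto prompt/answer dictionaries."""
    normalized = []
    for index, row in enumerate(rows):
        missing = [c for c in (prompt_column, answer_column) if c not in row]
        if missing:
            # Assumption: mirrors the documented ValueError for missing columns.
            raise ValueError(f"row {index} is missing columns: {missing}")
        normalized.append(
            {"prompt": row[prompt_column], "answer": row[answer_column]}
        )
    return normalized


# Illustrative input; the column names are assumptions, not library defaults.
raw_rows = [{"question": "2 + 2 = ?", "solution": "4"}]
eval_rows = to_prompt_answer_rows(raw_rows, "question", "solution")
print(eval_rows)
```

Raising on the first incomplete row is one possible design; whether the real loader validates per row or once per dataset is not stated in this page.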

maxent_grpo.training.data.resolve_dataloader_kwargs(training_args)

Return torch.utils.data.DataLoader kwargs derived from training_args.

Parameters:

training_args (Any) – Training config or namespace containing DataLoader knobs such as dataloader_num_workers and dataloader_pin_memory.

Returns:

Dictionary of keyword arguments suitable for DataLoader.

Return type:

dict