maxent_grpo.training.data

Dataset loading helpers for the training pipeline.

Functions

_dataset_cache_path(script_args, ...)

Resolve an on-disk cache directory for the processed dataset.

_ensure_split_mapping(dataset)

Coerce a dataset-like object into a split->dataset mapping.

_format_eval_row(example, *, prompt_column, ...)

_normalize_eval_rows(rows)

_sample_eval_rows(rows, keep, seed)

_stable_hash(value)

Return a short, stable hash for arbitrarily typed values.

load_datasets(script_args, training_args, ...)

Load train/eval datasets and return (train_dataset, eval_rows).

resolve_dataloader_kwargs(training_args)

Return torch.utils.data.DataLoader kwargs derived from training_args.
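As an illustration of what "a short, stable hash for arbitrarily typed values" can mean in practice, here is a minimal sketch. It is not the body of _stable_hash, which this page does not show. The name stable_hash_sketch, the length parameter, and the choice of JSON serialization with SHA-256 are all assumptions made for the example. A content-based digest is used because Python's built-in hash() is randomized per process for strings, which makes it unsuitable for naming on-disk caches.

```python
import hashlib
import json


def stable_hash_sketch(value, length=12):
    """Hypothetical stand-in for _stable_hash; not the library's code."""
    # Serialize deterministically: keys are sorted so dict ordering does
    # not matter, and repr() is the fallback for values json cannot encode.
    payload = json.dumps(value, sort_keys=True, default=repr)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # Truncate the hex digest to keep cache directory names short.
    return digest[:length]


# The same settings in a different key order yield the same digest.
a = stable_hash_sketch({"dataset": "demo", "max_prompt_len": 512})
b = stable_hash_sketch({"max_prompt_len": 512, "dataset": "demo"})
```

One caveat of this sketch: the repr() fallback is only stable for objects whose repr does not embed a memory address.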

maxent_grpo.training.data.load_datasets(script_args, training_args, tokenizer, *, accelerator=None)

Load train/eval datasets and return (train_dataset, eval_rows).

The helper handles prompt/answer column normalization, optional dataset caching, and prompt truncation. Evaluation rows are normalized into a list of dictionaries with prompt/answer keys.

Parameters:
  • script_args (Any) – Script arguments describing dataset identifiers and prompt/answer columns.

  • training_args (Any) – Training configuration providing prompt limits and cache settings.

  • tokenizer (Any) – Tokenizer used to format prompts.

  • accelerator (Any | None) – Optional Accelerator used for process synchronization.

Returns:

A tuple of the processed training dataset and a list of evaluation rows. The list may be empty when evaluation is disabled.

Return type:

tuple[Any, list]

Raises:

ValueError – If required dataset columns are missing.
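The documented output shape for evaluation rows (a list of dictionaries with prompt/answer keys) and the ValueError on missing columns can be pictured with the sketch below. It is a hypothetical illustration, not the implementation of load_datasets or its private helpers. The function name normalize_rows_sketch, its parameters, and the sample column names question and solution are assumptions.

```python
def normalize_rows_sketch(rows, prompt_column, answer_column):
    """Hypothetical illustration of the documented eval-row shape."""
    normalized = []
    for index, row in enumerate(rows):
        # Fail early when a configured column is absent from a row.
        missing = [c for c in (prompt_column, answer_column) if c not in row]
        if missing:
            raise ValueError(f"row {index} is missing columns: {missing}")
        # Re-key each row onto the fixed prompt/answer names.
        normalized.append(
            {"prompt": row[prompt_column], "answer": row[answer_column]}
        )
    return normalized


eval_rows = normalize_rows_sketch(
    [{"question": "2 + 2?", "solution": 4}],
    prompt_column="question",
    answer_column="solution",
)
```

Re-keying onto fixed names means downstream evaluation code does not need to know which columns the source dataset used.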

maxent_grpo.training.data.resolve_dataloader_kwargs(training_args)

Return torch.utils.data.DataLoader kwargs derived from training_args.

Parameters:

training_args (Any) – Training config or namespace containing DataLoader knobs such as dataloader_num_workers and dataloader_pin_memory.

Returns:

Dictionary of keyword arguments suitable for DataLoader.

Return type:

dict
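A minimal sketch of how such a mapping might look, assuming only the two attributes named above are read. The function name dataloader_kwargs_sketch is hypothetical, and the real helper may handle more fields or different defaults. The output keys num_workers and pin_memory are genuine torch.utils.data.DataLoader parameters, and the fallbacks used here (0 and False) match DataLoader's own defaults.

```python
from types import SimpleNamespace


def dataloader_kwargs_sketch(training_args):
    """Hypothetical illustration; the real helper may map more fields."""
    # getattr with a default lets plain namespaces and partial configs work.
    return {
        "num_workers": int(getattr(training_args, "dataloader_num_workers", 0)),
        "pin_memory": bool(getattr(training_args, "dataloader_pin_memory", False)),
    }


args = SimpleNamespace(dataloader_num_workers=4, dataloader_pin_memory=True)
kwargs = dataloader_kwargs_sketch(args)
# The result could then be unpacked as DataLoader(dataset, **kwargs).
```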