maxent_grpo.cli.hydra_cli

Copyright 2025 Liv d’Aliberti

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Hydra-powered multi-command CLI for MaxEnt-GRPO workflows.

Functions

_apply_command_objective_default(...)

Apply command-specific objective defaults for recipe-less invocations.

_apply_overrides(target, overrides)

_build_grpo_configs(cmd)

Construct GRPO config objects from a command block.

_invoke_hydra_cli()

Invoke hydra_main through Hydra's decorator wrapper for CLI use.

_load_config_store()

Return Hydra's ConfigStore class if available.

_maybe_insert_command(default_command)

Ensure hydra sees a command override for convenience entrypoints.

_merge_mapping(base, updates)

_register_hydra_config()

Register HydraRootConfig with Hydra's config store.

_resolve_model_config_cls()

Return the TRL ModelConfig type or a stub when TRL is unavailable.

_resolve_recipe_path(cmd)

Return the explicit recipe path or fall back to $GRPO_RECIPE.

baseline_entry()

Console script wrapper for baseline training.

hydra_entry()

Entry point for the top-level Hydra CLI.

hydra_main([cfg])

Dispatch hydra-configured subcommands (direct-call friendly).
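
The fallback summarized for _resolve_recipe_path (explicit path first, then $GRPO_RECIPE) can be sketched as below. This is an illustration, not the module's implementation: the name resolve_recipe_path and its env parameter are hypothetical, and treating an empty string like a missing value is an assumption. Only the GRPO_RECIPE variable name comes from the summary above.

```python
import os
from typing import Mapping, Optional


def resolve_recipe_path(
    recipe: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit recipe path, else $GRPO_RECIPE, else None."""
    if env is None:
        env = os.environ
    # Assumption: an empty string counts as "not provided".
    if recipe:
        return recipe
    return env.get("GRPO_RECIPE") or None
```

Passing the environment as a mapping keeps the lookup easy to exercise in tests without modifying os.environ.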

Classes

BaselineCommand([recipe, script, training, ...])

GRPO training command options for the baseline recipe.

DictConfig

alias of _DictConfigStub

HydraRootConfig([command, baseline, maxent])

Hydra root configuration covering all supported CLI commands.

MaxentCommand([recipe, script, training, model])

GRPO training command options for the MaxEnt recipe.

OmegaConf

alias of _OmegaConfStub

_DictConfig

alias of _DictConfigStub

_DictConfigStub

Minimal stub so type hints resolve without hydra installed.

_FallbackModelConfig(**kwargs)

Trivial stand-in for trl.ModelConfig when TRL is absent.

_HydraStub()

Minimal Hydra-like stub used when hydra is absent.

_OmegaConf

alias of _OmegaConfStub

_OmegaConfStub()

maxent_grpo.cli.hydra_cli.DictConfig

alias of _DictConfigStub

maxent_grpo.cli.hydra_cli.OmegaConf

alias of _OmegaConfStub

class maxent_grpo.cli.hydra_cli.BaselineCommand(recipe=None, script=<factory>, training=<factory>, model=<factory>)

Bases: object

GRPO training command options for the baseline recipe.

Parameters:
  • recipe (str | None) – Optional recipe file path to load default configs from.

  • script (Dict[str, Any]) – Script-level overrides passed to GRPO script arguments.

  • training (Dict[str, Any]) – Training argument overrides passed to GRPO config.

  • model (Dict[str, Any]) – Model argument overrides passed to TRL model config.

recipe: str | None = None
script: Dict[str, Any]
training: Dict[str, Any]
model: Dict[str, Any]

class maxent_grpo.cli.hydra_cli.MaxentCommand(recipe=None, script=<factory>, training=<factory>, model=<factory>)

Bases: object

GRPO training command options for the MaxEnt recipe.

Parameters:
  • recipe (str | None) – Optional recipe file path to load default configs from.

  • script (Dict[str, Any]) – Script-level overrides passed to GRPO script arguments.

  • training (Dict[str, Any]) – Training argument overrides passed to GRPO config.

  • model (Dict[str, Any]) – Model argument overrides passed to TRL model config.

recipe: str | None = None
script: Dict[str, Any]
training: Dict[str, Any]
model: Dict[str, Any]

class maxent_grpo.cli.hydra_cli.HydraRootConfig(command='train-baseline', baseline=<factory>, maxent=<factory>)

Bases: object

Hydra root configuration covering all supported CLI commands.

Parameters:
  • command (str) – Name of the subcommand to run.

  • baseline (BaselineCommand) – Baseline training command configuration.

  • maxent (MaxentCommand) – MaxEnt training command configuration.

command: str = 'train-baseline'
baseline: BaselineCommand
maxent: MaxentCommand
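
The three signatures above, with their <factory> defaults, read like standard dataclasses. The sketch below mirrors the documented fields under that assumption. The empty-dict factories and the learning_rate key are illustrative guesses, not confirmed details of the module.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class BaselineCommand:
    recipe: Optional[str] = None
    # Assumption: each <factory> default produces an empty dict.
    script: Dict[str, Any] = field(default_factory=dict)
    training: Dict[str, Any] = field(default_factory=dict)
    model: Dict[str, Any] = field(default_factory=dict)


@dataclass
class MaxentCommand:
    recipe: Optional[str] = None
    script: Dict[str, Any] = field(default_factory=dict)
    training: Dict[str, Any] = field(default_factory=dict)
    model: Dict[str, Any] = field(default_factory=dict)


@dataclass
class HydraRootConfig:
    command: str = "train-baseline"
    baseline: BaselineCommand = field(default_factory=BaselineCommand)
    maxent: MaxentCommand = field(default_factory=MaxentCommand)


cfg = HydraRootConfig()
# Hypothetical override key, shown only to illustrate the nesting.
cfg.baseline.training["learning_rate"] = 1e-6
```

Using default factories rather than shared mutable defaults means each config instance gets its own dictionaries, so an override on one instance does not leak into another.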

maxent_grpo.cli.hydra_cli.hydra_main(cfg=None)

Dispatch hydra-configured subcommands (direct-call friendly).

Parameters:

cfg (_DictConfigStub | None) – Optional Hydra configuration object or plain dict derived from CLI files.

Returns:

Result of the executed command, or None for commands that only have side effects.

Raises:

ValueError – If an unsupported command name is supplied.

Return type:

Any
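
The contract described for hydra_main (read a command name, run the matching subcommand, raise ValueError for anything else) can be illustrated with a standalone dispatcher. Everything in this sketch is hypothetical: the dispatch function, the placeholder handlers, and the "train-maxent" command name are assumptions. Only the 'train-baseline' default and the ValueError behavior come from the documentation above.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Tuple


def run_baseline(options: Mapping[str, Any]) -> str:
    # Placeholder handler; a real one would launch training.
    return "baseline"


def run_maxent(options: Mapping[str, Any]) -> str:
    # Placeholder handler; a real one would launch training.
    return "maxent"


# Maps a command name to (config block name, handler).
# "train-maxent" is an assumed name, not confirmed by the documentation.
HANDLERS: Dict[str, Tuple[str, Callable[[Mapping[str, Any]], Any]]] = {
    "train-baseline": ("baseline", run_baseline),
    "train-maxent": ("maxent", run_maxent),
}


def dispatch(cfg: Optional[Mapping[str, Any]] = None) -> Any:
    """Run the handler selected by cfg['command']; reject unknown names."""
    settings = dict(cfg or {})
    command = settings.get("command", "train-baseline")
    if command not in HANDLERS:
        raise ValueError(f"Unsupported command: {command!r}")
    block_name, handler = HANDLERS[command]
    return handler(settings.get(block_name, {}))
```

Accepting a plain mapping as well as None is what makes a function like this convenient to call directly in tests, without going through Hydra's command-line wrapper.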

maxent_grpo.cli.hydra_cli.hydra_entry()

Entry point for the top-level Hydra CLI.

Returns:

None after invoking the configured command.

Return type:

None

maxent_grpo.cli.hydra_cli.baseline_entry()

Console script wrapper for baseline training.

Returns:

None after dispatching to Hydra.

Return type:

None
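
A convenience entry point such as baseline_entry presumably needs Hydra to see a command override even when the user did not type one, which is the role the summary assigns to _maybe_insert_command. The sketch below shows one way that could work, assuming the override uses Hydra's key=value syntax on the root config's command field. The function insert_default_command is hypothetical and does not reproduce the module's code.

```python
from typing import List, Sequence


def insert_default_command(argv: Sequence[str], default_command: str) -> List[str]:
    """Return argv with command=<default> added unless one is already present."""
    args = list(argv)
    # Skip the program name when looking for an existing override.
    if any(arg.startswith("command=") for arg in args[1:]):
        return args
    return args[:1] + [f"command={default_command}"] + args[1:]
```

Returning a new list instead of mutating sys.argv in place keeps the helper free of side effects; a wrapper could assign the result back to sys.argv before handing control to Hydra.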