maxent_grpo.cli.config_validation
Pydantic-powered validation for Hydra training configs.
This module inspects the resolved training arguments before a pipeline is
launched so that accidental MaxEnt toggles are caught early. The validator is
kept lightweight and depends only on pydantic, which is already part of the
runtime toolchain for several other components. Future guardrails can extend
this module with additional schema checks (including GRPO + entropy-bonus
overrides under train-maxent).
Functions
- Return MaxEnt fields whose values differ from their defaults.
- Return the canonical entropy-mode label used by config validation.
- Return a short string pointing at the config origin for error messages.
- Return a mapping containing the knobs relevant to validation.
- Reject entropy-loss settings that do not match the implemented math.
- Reject SEED-GRPO knobs that are incompatible with the selected objective.
- validate_training_config: Validate Hydrated training knobs before dispatching to a pipeline.
Classes
- Minimal schema capturing the knobs that need cross-field validation.
- maxent_grpo.cli.config_validation.validate_training_config(training_args, *, command, source=None)
Validate Hydrated training knobs before dispatching to a pipeline.
The validator ensures that the canonical objective matches the presence of MaxEnt-specific options. When MaxEnt knobs are supplied while the effective objective stays on the native GRPO path, a ValueError is raised so the job fails fast.
- Parameters:
training_args, command (keyword-only), source (keyword-only, defaults to None)
- Returns:
None. Raises on invalid or incompatible configurations.
- Raises:
ValueError – If incompatible knob combinations are detected.
- Return type:
None
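The fail-fast behaviour described for validate_training_config can be illustrated with a minimal sketch. This is not the module's implementation: it uses a plain standard-library dataclass instead of pydantic so that it runs without extra dependencies, and the names KnobSketch, maxent_tau, maxent_use_ref, non_default_maxent_fields, and validate_sketch are hypothetical. Only the keyword-only command/source shape and the ValueError on a GRPO objective combined with MaxEnt knobs are taken from the documentation above.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Mapping, Optional


@dataclass
class KnobSketch:
    """Hypothetical stand-in for the validation schema; field names are invented."""

    objective: str = "grpo"
    maxent_tau: float = 0.0        # hypothetical MaxEnt-only knob
    maxent_use_ref: bool = False   # hypothetical MaxEnt-only knob


def non_default_maxent_fields(knobs: KnobSketch) -> List[str]:
    """Return names of maxent_* fields whose values differ from their defaults."""
    return [
        f.name
        for f in fields(knobs)
        if f.name.startswith("maxent_") and getattr(knobs, f.name) != f.default
    ]


def validate_sketch(
    raw: Mapping[str, Any], *, command: str, source: Optional[str] = None
) -> None:
    """Raise ValueError when MaxEnt knobs are set but the objective is GRPO."""
    knobs = KnobSketch(**raw)
    changed = non_default_maxent_fields(knobs)
    if knobs.objective == "grpo" and changed:
        # Mention the config origin in the message when one was supplied.
        origin = f" (from {source})" if source else ""
        raise ValueError(
            f"{command}: MaxEnt knobs {changed} set while objective is 'grpo'{origin}"
        )
```

Under these assumptions, calling validate_sketch({"objective": "grpo", "maxent_tau": 0.5}, command="train-maxent", source="cfg.yaml") raises a ValueError that names the offending field and the config origin, while a configuration that leaves every MaxEnt knob at its default returns None.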