maxent_grpo.training.controller_optimizer
Meta-optimizer orchestration for controller updates.
Functions
Return a torch.no_grad context when available, else a nullcontext.
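The fallback described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name `no_grad_context` is hypothetical, and the sketch assumes that "available" means both that `torch` can be imported and that it exposes a callable `no_grad`.

```python
import contextlib

try:
    import torch
except ImportError:  # torch is treated as optional in this sketch
    torch = None


def no_grad_context():
    """Hypothetical helper: torch.no_grad() if usable, else a no-op context."""
    # getattr on None simply returns the default, so a missing torch is handled too.
    no_grad = getattr(torch, "no_grad", None)
    if callable(no_grad):
        return no_grad()
    return contextlib.nullcontext()


# Usage: the same `with` statement works whether or not torch is installed.
with no_grad_context():
    total = sum(range(5))
```

Returning a context manager in both branches keeps call sites free of `if torch is not None` checks.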
Classes
ControllerMetaManager: Manage meta-controller optimizer state and cadence.
- class maxent_grpo.training.controller_optimizer.ControllerMetaManager(cfg, weighting)
Bases: object
Manage meta-controller optimizer state and cadence.
- Parameters:
cfg (GRPOConfig)
weighting (WeightingSettings)
- make_backprop_fn()
Return a callback that computes gradients via autograd.
- Return type:
Callable[[int], ControllerGradients | None] | None
- apply_gradients(gradients, *, lr_scale)
Apply controller updates based on the configured method.
- Parameters:
gradients (ControllerGradients | None)
lr_scale (float)
- Return type:
None
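Both return types above are optional, so a caller has to handle a missing callback as well as a missing gradient. The sketch below shows one way to do that using stand-ins only: `StubManager` and `StubGradients` are hypothetical classes that mirror the two documented signatures and are not the library's implementation. It further assumes that the callback's `int` argument is a step counter and that passing `None` to `apply_gradients` is a no-op; neither is confirmed by this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StubGradients:
    """Hypothetical stand-in for ControllerGradients."""
    value: float


class StubManager:
    """Hypothetical stand-in mirroring the two documented method signatures."""

    def __init__(self, enabled: bool = True) -> None:
        self.enabled = enabled
        self.applied = []  # records each scaled update for inspection

    def make_backprop_fn(self) -> Optional[Callable[[int], Optional[StubGradients]]]:
        if not self.enabled:
            return None  # assumed: no callback when the autograd path is unused

        def backprop(step: int) -> Optional[StubGradients]:
            # Assumed cadence for illustration: produce gradients on even steps only.
            return StubGradients(float(step)) if step % 2 == 0 else None

        return backprop

    def apply_gradients(self, gradients: Optional[StubGradients], *, lr_scale: float) -> None:
        if gradients is None:
            return  # assumed: nothing to apply on this step
        self.applied.append(gradients.value * lr_scale)


def run_steps(manager: StubManager, num_steps: int, lr_scale: float) -> None:
    """Drive the manager, guarding against a missing callback."""
    backprop_fn = manager.make_backprop_fn()
    if backprop_fn is None:
        return
    for step in range(num_steps):
        # lr_scale is keyword-only, matching the documented signature.
        manager.apply_gradients(backprop_fn(step), lr_scale=lr_scale)


manager = StubManager()
run_steps(manager, num_steps=4, lr_scale=0.5)
```

With the real class, `cfg` and `weighting` would be a `GRPOConfig` and a `WeightingSettings` instance; their construction is outside the scope of this page.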