maxent_grpo.training.controller_optimizer

Meta-optimizer orchestration for controller updates.

Functions

_no_grad_context(torch_mod)

Return a torch.no_grad context when available, else a nullcontext.
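
The fallback described above can be sketched as follows. This is a minimal illustration, not the module's actual source: the name no_grad_context_sketch is hypothetical, and it assumes torch_mod may be None or a stub object that lacks a no_grad attribute.

```python
from contextlib import nullcontext
from types import SimpleNamespace


def no_grad_context_sketch(torch_mod):
    """Return torch_mod.no_grad() if it exists, else a do-nothing context."""
    # getattr with a default also covers torch_mod being None.
    no_grad = getattr(torch_mod, "no_grad", None)
    if callable(no_grad):
        return no_grad()
    # nullcontext() is a context manager that does nothing on enter or exit.
    return nullcontext()


# Works with a stub that has no no_grad attribute at all.
with no_grad_context_sketch(SimpleNamespace()):
    pass
```

Returning a context manager in both branches lets callers write a single with statement whether or not a real torch module is installed.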

Classes

ControllerMetaManager(cfg, weighting)

Manage meta-controller optimizer state and cadence.

class maxent_grpo.training.controller_optimizer.ControllerMetaManager(cfg, weighting)

Bases: object

Manage meta-controller optimizer state and cadence.

Parameters:

cfg

weighting

should_run(global_step)
Parameters:

global_step (int)

Return type:

bool
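
This page does not state the criterion should_run uses. One common way to implement a step-based cadence check is sketched below. The interval and enabled parameters are hypothetical and are not taken from the library's configuration.

```python
def should_run_sketch(global_step: int, interval: int, enabled: bool = True) -> bool:
    """Return True every `interval` steps, skipping step 0.

    `interval` and `enabled` are illustrative assumptions, not documented
    fields of the real configuration object.
    """
    if not enabled or interval <= 0:
        return False
    return global_step > 0 and global_step % interval == 0
```

With interval=4, for example, this sketch fires on steps 4, 8, 12 and so on.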

make_backprop_fn()

Return a callback that computes gradients via autograd.

Return type:

Callable[[int], ControllerGradients | None] | None

apply_gradients(gradients, *, lr_scale)

Apply controller updates based on the configured method.

Parameters:

gradients

lr_scale

Return type:

None
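
The three methods documented above could be combined in a training loop roughly as follows. This is a hedged sketch: maybe_update_controller and FakeManager are hypothetical names, the call order is an assumption, and so is passing the step as the callback's int argument, whose meaning is not documented here. The sketch handles both None cases allowed by the return type of make_backprop_fn.

```python
from typing import Any, Callable, Optional


def maybe_update_controller(manager: Any, global_step: int, lr_scale: float = 1.0) -> bool:
    """Run one controller update if allowed; return True only if gradients were applied."""
    if not manager.should_run(global_step):
        return False
    backprop_fn: Optional[Callable[[int], Any]] = manager.make_backprop_fn()
    if backprop_fn is None:  # the factory itself may return None
        return False
    gradients = backprop_fn(global_step)  # assumption: the int argument is the step
    if gradients is None:  # the callback may also return None
        return False
    manager.apply_gradients(gradients, lr_scale=lr_scale)  # lr_scale is keyword-only
    return True


class FakeManager:
    """Hypothetical stub exposing the documented method names, for illustration only."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.applied = []

    def should_run(self, global_step: int) -> bool:
        return global_step % self.interval == 0

    def make_backprop_fn(self):
        return lambda step: {"step": step}

    def apply_gradients(self, gradients, *, lr_scale) -> None:
        self.applied.append((gradients, lr_scale))


manager = FakeManager(interval=2)
for step in range(1, 5):
    maybe_update_controller(manager, step, lr_scale=0.5)
```

Checking should_run first keeps the cost of building the backprop callback off the steps where no update is due.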