maxent_grpo.training.scoring_common
Scoring helpers extracted from the MaxEnt-GRPO training loop.
Functions
|
Return value as a context manager when possible, otherwise a no-op. |
|
Return the right autocast context for the current accelerator/device. |
|
Return |
|
|
|
Return a human-friendly summary of an embedding module. |
|
Return True if flag is True on all ranks (best-effort). |
|
Return True if flag is True on any rank (best-effort). |
|
Return a dist module when initialized, otherwise None. |
|
Return a config value from either Mapping or object-style configs. |
|
Return the vocab size exposed by the model's embedding weights. |
|
Return a tensor cast to long when the stub lacks |
|
Return True when any known embedding weight is not 2-D. |
|
Yield from |
|
|
|
Return the active torch module. |
|
Normalize dtype objects coming from various stubs. |
|
|
|
Return |
|
Return a numpy view of |
|
Return True for tensor-like stubs used in tests. |
|
Return True if the provided weight exposes a 2-D shape. |
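
Two of the summaries above ("Return a config value from either Mapping or object-style configs" and "Return value as a context manager when possible, otherwise a no-op") describe small dispatch patterns that may be easier to follow in code. The following is a minimal sketch of those patterns only, not the module's implementation; the names `get_config_value` and `as_context` and their signatures are hypothetical and chosen for illustration.

```python
from collections.abc import Mapping
from contextlib import nullcontext
from typing import Any


def get_config_value(config: Any, key: str, default: Any = None) -> Any:
    """Hypothetical helper: read `key` from a Mapping or an attribute-style config."""
    if isinstance(config, Mapping):
        # Dict-like configs: use the mapping lookup with a fallback.
        return config.get(key, default)
    # Object-style configs (dataclasses, namespaces): fall back to getattr.
    return getattr(config, key, default)


def as_context(value: Any):
    """Hypothetical helper: return `value` if it is a context manager, else a no-op."""
    if hasattr(value, "__enter__") and hasattr(value, "__exit__"):
        return value
    # Anything else is replaced by a context manager that does nothing.
    return nullcontext()
```

With this shape, callers can write `with as_context(maybe_ctx):` without checking first whether `maybe_ctx` supports the context-manager protocol, and can read settings the same way from a plain dict or a config object.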
Classes
|
Minimal distributed API needed for best-effort gathers. |
|
Lightweight wrapper that compares equal to torch.long in stubs. |
|
Context manager that temporarily clamps padding attributes. |
|
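
The last class summary mentions a context manager that temporarily clamps padding attributes. The sketch below shows one way such a temporary clamp-and-restore pattern can look. It is an assumption about the general technique, not the class documented here: the name `ClampAttr`, its constructor arguments, and the choice of `padding_idx` in the usage example are all hypothetical.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel for "attribute was not set"


class ClampAttr:
    """Hypothetical sketch: clamp an integer attribute to [lo, hi] inside a with-block."""

    def __init__(self, obj, name, lo, hi):
        self.obj = obj
        self.name = name
        self.lo = lo
        self.hi = hi
        self._saved = _MISSING

    def __enter__(self):
        # Remember the original value so it can be restored on exit.
        self._saved = getattr(self.obj, self.name, _MISSING)
        if isinstance(self._saved, int):
            clamped = max(self.lo, min(self.hi, self._saved))
            setattr(self.obj, self.name, clamped)
        return self.obj

    def __exit__(self, exc_type, exc, tb):
        # Restore the original value only if the attribute existed beforehand.
        if self._saved is not _MISSING:
            setattr(self.obj, self.name, self._saved)
        return False  # never swallow exceptions


# Usage with a stand-in object (assumed attribute name, for illustration only).
emb = SimpleNamespace(padding_idx=50000)
with ClampAttr(emb, "padding_idx", 0, 31999):
    inside = emb.padding_idx
after = emb.padding_idx
```

Restoring in `__exit__` keeps the change local to the block, so code outside the `with` statement still sees the original attribute value even if the body raises.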