maxent\_grpo.training.rewards
=============================

.. automodule:: maxent_grpo.training.rewards

   .. rubric:: Functions

   .. autosummary::

      _apply_group_scales
      _build_reward_proxy
      _call_reward_fn
      _coerce_reward_names
      _completion_was_truncated
      _compute_seed_grpo_statistics
      _extract_completion_runtime_info
      _extract_ref_logprob_fields
      _group_q_distribution
      _has_recipe_path
      _rank_tag
      _sanitize_ref_logprob_meta
      _seed_extract_answer
      _seed_logsumexp
      _seed_logsumexp_by_id
      _seed_predictive_entropy_rao
      _seed_semantic_ids_by_answers
      _zero_truncated_completion_rewards
      compute_reward_statistics
      compute_reward_totals
      group_advantages
      load_eval_reward_functions
      load_reward_functions
      prepare_generation_batch
      reward_moments