maxent_grpo.training.runtime.setup
Setup utilities for loading runtime dependencies and accelerator plugins.
Functions
- Construct a DeepSpeedPlugin from Accelerate env/config when available.
- get_trl_prepare_deepspeed(): Return TRL's prepare_deepspeed helper when available.
- require_accelerator(context): Return accelerate.Accelerator or raise a helpful RuntimeError.
- require_dataloader(context): Return torch.utils.data.DataLoader with a descriptive error on failure.
- require_deepspeed(context, module='deepspeed'): Return a DeepSpeed module import or raise a contextual RuntimeError.
- Return the torch module or raise a helpful RuntimeError.
- Return (PreTrainedModel, PreTrainedTokenizer) with clear failure messages.
- class maxent_grpo.training.runtime.setup.GenerationSamplingConfig(max_prompt_len, max_completion_len, gen_temperature, gen_top_p, use_vllm, vllm, *, vllm_mode='server')
Bases: object
Shared completion sampling knobs (HF + vLLM).
- Parameters:
- vllm: VLLMClientConfig
- property vllm_frequency_penalty: float
Backward-compatible accessor for the frequency penalty value.
- property vllm_include_stop_str_in_output: bool
Whether vLLM should preserve matched stop strings in output text.
- property vllm_backoff_multiplier: float
Multiplier applied to the backoff delay after each attempt.
- property vllm_guided_json: str | None
Backward-compatible accessor for JSON schema-guided decoding.
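The `vllm_*` properties above read as thin accessors over the nested `vllm` config. A minimal sketch of that delegation pattern follows. It assumes, without confirmation from this page, that `VLLMClientConfig` stores the values under unprefixed field names such as `frequency_penalty`; both stand-in classes and all default values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchVLLMClientConfig:
    """Hypothetical stand-in for VLLMClientConfig; field names are assumed."""
    frequency_penalty: float = 0.0
    include_stop_str_in_output: bool = False
    backoff_multiplier: float = 2.0
    guided_json: Optional[str] = None


@dataclass
class SketchSamplingConfig:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""
    max_prompt_len: int
    max_completion_len: int
    gen_temperature: float
    gen_top_p: float
    use_vllm: bool
    vllm: SketchVLLMClientConfig
    vllm_mode: str = "server"  # keyword-only in the documented signature

    @property
    def vllm_frequency_penalty(self) -> float:
        # Old flat name kept working by forwarding to the nested config.
        return self.vllm.frequency_penalty

    @property
    def vllm_include_stop_str_in_output(self) -> bool:
        return self.vllm.include_stop_str_in_output

    @property
    def vllm_backoff_multiplier(self) -> float:
        return self.vllm.backoff_multiplier

    @property
    def vllm_guided_json(self) -> Optional[str]:
        return self.vllm.guided_json


cfg = SketchSamplingConfig(
    max_prompt_len=512,
    max_completion_len=256,
    gen_temperature=0.7,
    gen_top_p=0.95,
    use_vllm=True,
    vllm=SketchVLLMClientConfig(frequency_penalty=0.1),
)
```

With this layout, callers written against the flat `vllm_*` names keep working while the values themselves live in one nested object.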
- class maxent_grpo.training.runtime.setup.MaxEntOptions(tau=<factory>, q_temperature=<factory>, q_epsilon=<factory>, length_normalize_ref=<factory>)
Bases: object
Lightweight knobs specific to MaxEnt sequence-level updates.
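The `<factory>` placeholders in the signature are how Python renders a dataclass field declared with `default_factory`: the default is computed per instance instead of being fixed in the signature. A minimal sketch of that mechanism follows, using a hypothetical `SketchOptions` class. The field types and default values are placeholders, not the library's actual defaults.

```python
import inspect
from dataclasses import MISSING, dataclass, field, fields


def _default_tau() -> float:
    # Placeholder value; the real default is not stated on this page.
    return 0.5


@dataclass
class SketchOptions:
    """Hypothetical class showing fields backed by default factories."""
    tau: float = field(default_factory=_default_tau)
    length_normalize_ref: bool = field(default_factory=lambda: True)


opts = SketchOptions()                # every field comes from its factory
overridden = SketchOptions(tau=0.1)   # an explicit value bypasses the factory

# Each field records its factory, and the generated __init__ signature
# shows "<factory>" in place of a concrete default.
uses_factory = all(f.default_factory is not MISSING for f in fields(SketchOptions))
signature_text = str(inspect.signature(SketchOptions))
```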
- maxent_grpo.training.runtime.setup.get_trl_prepare_deepspeed()
Return TRL’s prepare_deepspeed helper when available.
- Return type: Any | None
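A return type of `Any | None` suggests the helper is looked up lazily, with `None` signalling that it is unavailable. A minimal sketch of that optional-lookup pattern follows, assuming a hypothetical `optional_attr` helper. The module path used by the real function is not stated on this page, so the example uses standard-library names.

```python
import importlib
from typing import Any, Optional


def optional_attr(module_name: str, attr: str) -> Optional[Any]:
    """Return module_name.attr, or None if the module or attribute is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# Callers branch on None instead of catching an import error themselves.
sqrt = optional_attr("math", "sqrt")
missing = optional_attr("not_a_real_module_xyz", "helper")
```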
- maxent_grpo.training.runtime.setup.require_accelerator(context)
Return accelerate.Accelerator or raise a helpful RuntimeError.
- maxent_grpo.training.runtime.setup.require_dataloader(context)
Return torch.utils.data.DataLoader with a descriptive error on failure.
- maxent_grpo.training.runtime.setup.require_deepspeed(context, module='deepspeed')
Return a DeepSpeed module import or raise a contextual RuntimeError.
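The `require_*` helpers share one contract: return the imported object, or raise a `RuntimeError` that names the calling context. A minimal sketch of that contract follows, assuming a hypothetical `require_module` helper and error wording; neither is taken from the library.

```python
import importlib
from types import ModuleType


def require_module(module: str, context: str) -> ModuleType:
    """Import `module` or raise a RuntimeError that mentions `context`."""
    try:
        return importlib.import_module(module)
    except ImportError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise RuntimeError(
            f"{module} is required for {context}; install it to continue."
        ) from exc


# Success path: the imported module is returned directly.
json_mod = require_module("json", "config loading")

# Failure path: the message carries the caller-supplied context.
try:
    require_module("not_a_real_module_xyz", "the training loop")
except RuntimeError as err:
    message = str(err)
```

Passing `context` lets the error say which part of the training code needed the dependency, which a bare `ImportError` would not.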