maxent_grpo.training.rollout
Copyright 2025 Liv d’Aliberti
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Rollout utilities for the MaxEnt runner.
- class maxent_grpo.training.rollout.CompletionGenerator(ctx)
Bases: LocalGenerationMixin, VLLMGenerationMixin
Stateful helper that handles both local HF and vLLM completions.
- Parameters:
ctx (GenerationContext)
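As a rough illustration of how a single stateful helper can combine a local path and a vLLM path through two mixins, the sketch below dispatches on the context's use_vllm flag. Every class and method name here (LocalSketchMixin, VLLMSketchMixin, CompletionGeneratorSketch, generate, _generate_local, _generate_vllm) is hypothetical and is not taken from maxent_grpo; the real mixins may be organized quite differently.

```python
from types import SimpleNamespace


class LocalSketchMixin:
    """Hypothetical stand-in for a local HF generation mixin."""

    def _generate_local(self, prompts):
        # Placeholder: a real implementation would call the HF model.
        return [f"local:{p}" for p in prompts]


class VLLMSketchMixin:
    """Hypothetical stand-in for a vLLM generation mixin."""

    def _generate_vllm(self, prompts):
        # Placeholder: a real implementation would query vLLM.
        return [f"vllm:{p}" for p in prompts]


class CompletionGeneratorSketch(LocalSketchMixin, VLLMSketchMixin):
    """Holds a context object and picks a backend per call (assumed design)."""

    def __init__(self, ctx):
        self.ctx = ctx

    def generate(self, prompts):
        # use_vllm is a documented GenerationContext field; using it as
        # the dispatch switch is an assumption of this sketch.
        if self.ctx.use_vllm:
            return self._generate_vllm(prompts)
        return self._generate_local(prompts)


# SimpleNamespace stands in for a full GenerationContext here.
gen = CompletionGeneratorSketch(SimpleNamespace(use_vllm=False))
local_out = gen.generate(["2 + 2 = ?"])
```

Keeping the backend choice inside one object means callers only hold a single generator and never branch on the backend themselves.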
- class maxent_grpo.training.rollout.GenerationContext(max_prompt_len, max_completion_len, gen_temperature, gen_top_p, use_vllm, vllm, accelerator, model, tokenizer, generation_stats, device, penalty=<factory>, prompt_char_limit=None, *, vllm_mode='server')
Bases: GenerationPenaltyPassthroughMixin, GenerationSamplingConfig
Configuration required to produce completions for each training batch.
- Parameters:
max_prompt_len (int)
max_completion_len (int)
gen_temperature (float)
gen_top_p (float)
use_vllm (bool)
vllm (VLLMClientConfig)
accelerator (Accelerator)
model (PreTrainedModel)
tokenizer (PreTrainedTokenizer)
device (Any)
penalty (GenerationPenaltyConfig)
prompt_char_limit (int | None)
vllm_mode (str)
- accelerator: TypesAccelerator
- model: TypesPreTrainedModel
- tokenizer: TypesPreTrainedTokenizer
- device: Any
- penalty: GenerationPenaltyConfig
Modules
- Shared generation context dataclass used by local and vLLM paths.
- Distributed helpers shared across generation utilities.
- Public CompletionGenerator that wires local and vLLM helpers together.
- Completion generation helpers for the MaxEnt-GRPO runner.
- Local HF generation helpers split from the vLLM adapter.
- vLLM-focused helpers split away from the local generation path.
- In-process (colocated) vLLM generation helpers for the custom loop.