maxent_grpo.training.runtime.prompts¶
Prompt-related helpers and sampling penalties.
Functions
|
Return the token cap used for prompt truncation. |
|
Resolve a prompt suffix from environment variables. |
|
Raise if the configured prompt column is missing from a dataset row. |
|
Shared prompt/answer builder used across training pipelines. |
|
Clamp prompt strings to a safe token length when possible. |
|
Append a short eval-only format reminder to the prompt. |
|
Append a format reminder to all prompts. |
|
Merge external truncation state into the shared warning cache. |
|
Clamp prompt strings to a safe token length when possible. |
Classes
|
Protocol for tokenizers with chat template capabilities. |
|
Shared penalty/stop sequence overrides for completion sampling. |
Expose penalty overrides via legacy |
- class maxent_grpo.training.runtime.prompts.ChatTokenizer(*args, **kwargs)[source]¶
Bases:
ProtocolProtocol for tokenizers with chat template capabilities.
- class maxent_grpo.training.runtime.prompts.GenerationPenaltyConfig(gen_top_k=None, gen_best_of=None, gen_frequency_penalty=0.0, gen_presence_penalty=0.0, gen_stop_sequences=None)[source]¶
Bases:
objectShared penalty/stop sequence overrides for completion sampling.
- Parameters:
- class maxent_grpo.training.runtime.prompts.GenerationPenaltyPassthroughMixin[source]¶
Bases:
objectExpose penalty overrides via legacy
gen_*accessors.- penalty: GenerationPenaltyConfig¶
- maxent_grpo.training.runtime.prompts.append_prompt_suffix(prompt)[source]¶
Append a format reminder to all prompts.
- maxent_grpo.training.runtime.prompts.append_eval_prompt_suffix(prompt)[source]¶
Append a short eval-only format reminder to the prompt.
- maxent_grpo.training.runtime.prompts.sync_trunc_state(state)[source]¶
Merge external truncation state into the shared warning cache.
- maxent_grpo.training.runtime.prompts.truncate_prompt(prompt, char_limit=None, *, tokenizer=None, max_tokens=None)[source]¶
Clamp prompt strings to a safe token length when possible.
- Parameters:
prompt (str) – Prompt string to clamp.
char_limit (int | None) – Optional character limit fallback. When
Nonethe module-levelPROMPT_CHAR_LIMITis used.tokenizer (Any | None) – Optional tokenizer used to enforce token limits.
max_tokens (int | None) – Optional token limit override (preferred when tokenizer is available).
- Returns:
The original prompt when under the limit, otherwise a truncated prefix.
- Return type: