maxent_grpo.training.generation.vocab_guard

Shared helpers for masking model-only token IDs during generation.

Functions

_resolve_served_model_id(ctx)

Best-effort resolution of the external vLLM-served model identifier.

_resolve_served_model_vocab_limit(ctx)

Return the output-vocab width exposed by the external vLLM model.

merge_invalid_token_block_logit_bias(ctx, ...)

Block model-only token IDs that the tokenizer cannot represent.

resolve_allowed_token_ids(ctx)

Return a cached hard allowlist for tokenizer-addressable token IDs.

resolve_blocked_token_ids(ctx)

Return tokenizer-inaccessible model token IDs for local generation guards.

resolve_model_vocab_limit(ctx)

Return the model output-vocab width exposed to generation.

resolve_tokenizer_vocab_limit(tokenizer)

Return one more than the largest token ID the tokenizer can address.

maxent_grpo.training.generation.vocab_guard.resolve_tokenizer_vocab_limit(tokenizer)

Return one more than the largest token ID the tokenizer can address.

Parameters:

tokenizer (Any)

Return type:

int | None
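The "largest ID plus one" convention can be illustrated with a minimal sketch. This is not the module's implementation: `sketch_tokenizer_vocab_limit` and `ToyTokenizer` are hypothetical names, and the sketch assumes a tokenizer that exposes a Hugging Face style `get_vocab()` mapping or supports `len()`.

```python
from typing import Any, Optional


def sketch_tokenizer_vocab_limit(tokenizer: Any) -> Optional[int]:
    """Illustrative only: largest token ID the tokenizer maps, plus one."""
    get_vocab = getattr(tokenizer, "get_vocab", None)
    if callable(get_vocab):
        ids = list(get_vocab().values())
        if ids:
            # IDs may be sparse, so use the maximum rather than the count.
            return max(ids) + 1
    try:
        # Assumed fallback: treat len(tokenizer) as the limit.
        return len(tokenizer)
    except TypeError:
        # No usable vocabulary information is available.
        return None


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer."""

    def get_vocab(self):
        return {"<pad>": 0, "hello": 1, "world": 2, "<extra>": 5}


limit = sketch_tokenizer_vocab_limit(ToyTokenizer())
no_limit = sketch_tokenizer_vocab_limit(object())
```

Here `limit` is 6 even though only four tokens are defined, because the bound follows the largest ID rather than the vocabulary size. `no_limit` is `None`, mirroring the `int | None` return type.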

maxent_grpo.training.generation.vocab_guard.resolve_model_vocab_limit(ctx)

Return the model output-vocab width exposed to generation.

Parameters:

ctx (Any)

Return type:

int | None
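As a rough illustration of what resolving an output-vocab width might involve, the sketch below reads `vocab_size` from a model config. The layout of `ctx` is not documented here, so the `ctx.model.config.vocab_size` path and the name `sketch_model_vocab_limit` are assumptions made for this example only.

```python
from types import SimpleNamespace
from typing import Any, Optional


def sketch_model_vocab_limit(ctx: Any) -> Optional[int]:
    """Illustrative only: read the output width from an assumed config path."""
    model = getattr(ctx, "model", None)
    config = getattr(model, "config", None)
    vocab_size = getattr(config, "vocab_size", None)
    # Reject missing or non-positive values rather than guessing.
    if isinstance(vocab_size, int) and vocab_size > 0:
        return vocab_size
    return None


# Hypothetical context object; the real ctx may be structured differently.
toy_ctx = SimpleNamespace(
    model=SimpleNamespace(config=SimpleNamespace(vocab_size=32128))
)
model_limit = sketch_model_vocab_limit(toy_ctx)
missing_limit = sketch_model_vocab_limit(SimpleNamespace())
```

Returning `None` when the width cannot be determined lets callers skip the guard instead of masking the wrong ID range.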

maxent_grpo.training.generation.vocab_guard.merge_invalid_token_block_logit_bias(ctx, existing_bias)

Block model-only token IDs that the tokenizer cannot represent.

Parameters:
  • ctx (Any)

  • existing_bias (Any)

Return type:

Dict[str, float] | None
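The merge can be pictured as adding a strongly negative bias for every ID in the gap between the tokenizer limit and the model limit. The sketch below takes the two limits directly instead of a `ctx`. Its name, the `-100.0` bias value, and the rule that block entries override existing ones are all assumptions, not the module's confirmed behavior. String keys follow the OpenAI-style `logit_bias` convention.

```python
from typing import Dict, Mapping, Optional

BLOCK_BIAS = -100.0  # assumed value; the module's constant may differ


def sketch_merge_block_bias(
    existing_bias: Optional[Mapping[str, float]],
    tokenizer_limit: Optional[int],
    model_limit: Optional[int],
) -> Optional[Dict[str, float]]:
    """Illustrative only: add block entries for IDs in [tokenizer_limit, model_limit)."""
    merged = {str(k): float(v) for k, v in (existing_bias or {}).items()}
    if (
        tokenizer_limit is None
        or model_limit is None
        or model_limit <= tokenizer_limit
    ):
        # Nothing to block: pass the caller's bias through, or None if empty.
        return merged or None
    for token_id in range(tokenizer_limit, model_limit):
        # Assumption: a block entry wins over any caller-supplied bias.
        merged[str(token_id)] = BLOCK_BIAS
    return merged


merged = sketch_merge_block_bias({"3": 1.5, "7": 2.0}, 6, 8)
unchanged = sketch_merge_block_bias(None, 6, 6)
```

In this sketch the caller's bias for ID 3 survives, while IDs 6 and 7 are forced to the block value. With no gap and no existing bias, the result is `None`.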

maxent_grpo.training.generation.vocab_guard.resolve_allowed_token_ids(ctx)

Return a cached hard allowlist for tokenizer-addressable token IDs.

Parameters:

ctx (Any)

Return type:

List[int] | None

maxent_grpo.training.generation.vocab_guard.resolve_blocked_token_ids(ctx)

Return tokenizer-inaccessible model token IDs for local generation guards.

Parameters:

ctx (Any)

Return type:

List[int]
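The blocked set is the complement of the allowlist within the model's output range. The sketch below assumes that "tokenizer-inaccessible" means IDs at or above the tokenizer limit. The function name and the limit-based signature are hypothetical, chosen to keep the example self-contained.

```python
from typing import List, Optional


def sketch_blocked_token_ids(
    tokenizer_limit: Optional[int], model_limit: Optional[int]
) -> List[int]:
    """Illustrative only: model IDs in [tokenizer_limit, model_limit)."""
    if tokenizer_limit is None or model_limit is None:
        # Always return a list, matching the documented List[int] return type.
        return []
    # range() is empty when model_limit <= tokenizer_limit.
    return list(range(tokenizer_limit, model_limit))


blocked = sketch_blocked_token_ids(6, 8)
nothing_blocked = sketch_blocked_token_ids(8, 6)
```

Unlike the allowlist, this list is usually short, since it holds only the padding rows of the output embedding that the tokenizer never emits.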