maxent_grpo.training.generation.vllm_requests

Request/retry helpers separated from vLLM weight sync and scatter logic.

Functions

_client_tag_fail_fast_enabled([ctx])

_coerce_bool(value)

_hash_prompts(prompts)

Return a stable identifier for the pending prompt batch.

_is_client_tag_error(err)

_normalize_vllm_url(raw_url)

Return a normalized vLLM /generate endpoint URL or raise on invalid input.

_record_logprob_status(ctx, has_payload)

_resolve_client_tag(ctx)

Return a stable client tag for this trainer rank if available.

_resolve_dataset_label(ctx)

Return the dataset label stored on the context or stats.

_resolve_default_limit()

Return the current default prompt token cap from the environment.

_resolve_served_model_id(ctx)

Best-effort resolution of the external model identifier.
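
The summaries for _hash_prompts and _normalize_vllm_url describe behaviour but not mechanics. The sketch below is one plausible reading, not the module's implementation: the function names, the SHA-256 digest, the 16-character truncation, and the rule of appending /generate when it is missing are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
from urllib.parse import urlsplit, urlunsplit


def hash_prompts_sketch(prompts: list[str]) -> str:
    """Hypothetical stand-in for _hash_prompts: an order-sensitive batch digest."""
    digest = hashlib.sha256()
    for prompt in prompts:
        encoded = prompt.encode("utf-8")
        # Length-prefix each prompt so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    # Truncation length is an arbitrary choice for readable log lines.
    return digest.hexdigest()[:16]


def normalize_vllm_url_sketch(raw_url: str) -> str:
    """Hypothetical stand-in for _normalize_vllm_url: return a /generate URL or raise."""
    if not isinstance(raw_url, str) or not raw_url.strip():
        raise ValueError("vLLM URL must be a non-empty string")
    parts = urlsplit(raw_url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"invalid vLLM URL: {raw_url!r}")
    path = parts.path.rstrip("/")
    if not path.endswith("/generate"):
        path += "/generate"
    # Query string and fragment are dropped in this sketch.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

Because the digest depends only on the prompt contents and their order, the same pending batch yields the same identifier across retries, which is the property the summary calls "stable".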

Classes

VLLMRequestMixin()

Mix-in that isolates request building, retries, and aggregation.

class maxent_grpo.training.generation.vllm_requests.VLLMRequestMixin

Bases: object

Mix-in that isolates request building, retries, and aggregation.

ctx: Any

set_safe_generate(safe_fn)

Allow callers to override the vLLM safe_generate hook.

Parameters:

safe_fn (Callable[..., Any]) – Callable matching the safe_generate signature.

Return type:

None

set_time_provider(time_mod)

Allow callers to override the time module for sleep/now calls.

Parameters:

time_mod (Any) – Replacement module or object exposing sleep and time as needed.

Return type:

None
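
Overriding the time provider is mainly useful in tests, where retry delays should not actually block. The sketch below shows the idea with a fake clock. RetryHost and its backoff method are hypothetical stand-ins rather than the real mix-in, and the exponential delay schedule is an assumption; only the sleep and time attribute names come from the parameter description above.

```python
from __future__ import annotations

import time
from typing import Any


class FakeTime:
    """Test double exposing the two calls named in the docs: sleep and time."""

    def __init__(self) -> None:
        self.now = 0.0
        self.sleeps: list[float] = []

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        # Advance the virtual clock instead of blocking the test.
        self.sleeps.append(seconds)
        self.now += seconds


class RetryHost:
    """Hypothetical host class; not the real VLLMRequestMixin."""

    def __init__(self) -> None:
        self._time: Any = time  # default to the real time module

    def set_time_provider(self, time_mod: Any) -> None:
        self._time = time_mod

    def backoff(self, attempts: int, base: float = 0.5) -> float:
        # Assumed exponential schedule: base, 2*base, 4*base, ...
        start = self._time.time()
        for attempt in range(attempts):
            self._time.sleep(base * (2 ** attempt))
        return self._time.time() - start


host = RetryHost()
fake = FakeTime()
host.set_time_provider(fake)
elapsed = host.backoff(3)  # returns immediately; only the virtual clock moves
```

The same pattern applies to the other set_* hooks: the replacement only needs to match the attributes or call signature the mix-in uses.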

set_fallback_generate(fallback_fn)

Allow callers to override the local fallback generation hook.

Parameters:

fallback_fn (Callable[..., Any]) – Callable invoked when vLLM cannot provide outputs.

Return type:

None

set_request_executor(executor_fn)

Allow callers to override the vLLM request executor.

Parameters:

executor_fn (Callable[[maxent_grpo.training.generation.vllm_state._VLLMGenerationState, list[int]], bool]) – Function that performs one vLLM request round for pending indices and returns True on success.

Return type:

None

set_request_batcher(batcher_fn)

Allow callers to override the vLLM batch request helper.

Parameters:

batcher_fn (Callable[[list[str], int], tuple[list[list[str]] | None, list[list[VLLMLogprobResult | None]] | None]]) – Callable used to build and dispatch a single vLLM request for a list of prompts and a target count.

Return type:

None

run_vllm_rounds(state)

Public entry point for executing vLLM retry rounds.

Parameters:

state (_VLLMGenerationState) – Mutable vLLM generation state tracked across retries.

Return type:

None
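
To make the relationship between run_vllm_rounds and the executor hook concrete, here is a minimal sketch of a retry loop driven by an injected executor. Only the executor signature (state, pending indices) -> bool is taken from the set_request_executor description. SketchState, its fields, the max_rounds limit, and the loop structure are assumptions, since the real _VLLMGenerationState is not documented on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SketchState:
    """Hypothetical stand-in for _VLLMGenerationState."""

    prompts: list[str]
    target_counts: list[int]
    max_rounds: int = 3
    completions: list[list[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.completions:
            self.completions = [[] for _ in self.prompts]

    def pending_indices(self) -> list[int]:
        # A prompt is pending while it has fewer completions than its target.
        return [
            i
            for i, (done, want) in enumerate(zip(self.completions, self.target_counts))
            if len(done) < want
        ]


def run_rounds_sketch(
    state: SketchState,
    executor: Callable[[SketchState, list[int]], bool],
) -> list[int]:
    """Call the executor until nothing is pending or the round limit is hit."""
    for _ in range(state.max_rounds):
        pending = state.pending_indices()
        if not pending:
            break
        # The executor returns True on success; a failed round is simply retried.
        executor(state, pending)
    return state.pending_indices()


calls: list[list[int]] = []


def flaky_executor(state: SketchState, pending: list[int]) -> bool:
    """Fails on the first call, then adds one completion per pending prompt."""
    calls.append(list(pending))
    if len(calls) == 1:
        return False
    for i in pending:
        state.completions[i].append(f"{state.prompts[i]}-out{len(state.completions[i])}")
    return True


state = SketchState(prompts=["p", "q"], target_counts=[2, 1])
missing = run_rounds_sketch(state, flaky_executor)
```

Indices still missing after the last round are the kind of input that backfill_missing and record_vllm_failure accept, although how the real mix-in chooses between them is not stated here.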

static expand_dedup_results(grouped, meta, mapping)

Public wrapper for expanding de-duplicated results.

Parameters:
  • grouped (list[list[str]]) – Grouped completions for unique prompts.

  • meta (list[list[VLLMLogprobResult | None]] | None) – Optional grouped metadata for unique prompts.

  • mapping (list[int] | None) – Mapping from original prompt indices to unique indices.

Returns:

Grouped completions and metadata expanded to the original prompt ordering.

Return type:

tuple[list[list[str]], list[list[VLLMLogprobResult | None]] | None]
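
The expansion step can be pictured as an index lookup through mapping. The following is a minimal sketch under that reading, not the library's code: the function name is hypothetical, and returning the inputs unchanged when mapping is None is an assumption based on the optional type.

```python
from __future__ import annotations

from typing import Any


def expand_dedup_sketch(
    grouped: list[list[str]],
    meta: list[list[Any]] | None,
    mapping: list[int] | None,
) -> tuple[list[list[str]], list[list[Any]] | None]:
    """Hypothetical illustration of expanding unique-prompt results."""
    if mapping is None:
        # Assumed: no deduplication happened, so there is nothing to expand.
        return grouped, meta
    # mapping[i] is the unique-prompt index for original prompt i.
    expanded = [list(grouped[unique_idx]) for unique_idx in mapping]
    expanded_meta = None
    if meta is not None:
        expanded_meta = [list(meta[unique_idx]) for unique_idx in mapping]
    return expanded, expanded_meta


grouped = [["a1", "a2"], ["b1"]]
expanded, expanded_meta = expand_dedup_sketch(grouped, None, [0, 1, 0])
```

Copying each inner list keeps duplicated prompts from sharing one mutable list, so a later in-place edit to one entry does not leak into the other.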

prepare_vllm_targets(prompts, num_samples, per_prompt_counts)

Public wrapper for resolving vLLM targets/dedup mapping.

Parameters:
  • prompts (list[str]) – Original prompt list.

  • num_samples (int) – Global completion target per prompt.

  • per_prompt_counts (list[int] | None) – Optional per-prompt completion overrides.

Returns:

Tuple of deduplicated prompts, target counts, and mapping back to the original order when deduplication occurs.

Return type:

tuple[list[str], list[int], list[int] | None]
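
A deduplication pass matching the documented return shape might look like the sketch below. It is illustrative only: the function name is hypothetical, and taking the maximum target count across duplicate prompts is an assumption, since the page does not say how conflicting per-prompt counts are resolved.

```python
from __future__ import annotations


def prepare_targets_sketch(
    prompts: list[str],
    num_samples: int,
    per_prompt_counts: list[int] | None = None,
) -> tuple[list[str], list[int], list[int] | None]:
    """Hypothetical illustration of target resolution with deduplication."""
    if per_prompt_counts is None:
        counts = [num_samples] * len(prompts)
    else:
        counts = list(per_prompt_counts)
    if len(counts) != len(prompts):
        raise ValueError("per_prompt_counts must match prompts in length")

    unique: list[str] = []
    unique_counts: list[int] = []
    mapping: list[int] = []
    index_of: dict[str, int] = {}
    for prompt, count in zip(prompts, counts):
        idx = index_of.get(prompt)
        if idx is None:
            idx = len(unique)
            index_of[prompt] = idx
            unique.append(prompt)
            unique_counts.append(count)
        else:
            # Assumed policy: a duplicate keeps the largest requested count.
            unique_counts[idx] = max(unique_counts[idx], count)
        mapping.append(idx)

    # The mapping is only returned when deduplication actually occurred.
    if len(unique) == len(prompts):
        return unique, unique_counts, None
    return unique, unique_counts, mapping
```

For example, prepare_targets_sketch(["p", "q", "p"], 2) yields two unique prompts and the mapping [0, 1, 0], which is the form that expand_dedup_results takes as its mapping argument.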

merge_vllm_results(state, grouped, grouped_meta, pending_indices)

Public wrapper for merging generated outputs.

Parameters:
  • state (_VLLMGenerationState) – Generation state to update.

  • grouped (list[list[str]]) – Generated completions aligned to pending_indices.

  • grouped_meta (list[list[VLLMLogprobResult | None]] | None) – Optional metadata aligned to pending_indices.

  • pending_indices (list[int]) – Prompt indices associated with the provided completions.

Return type:

None

backfill_missing(state, missing_indices)

Public wrapper for local fallback generation.

Parameters:
  • state (_VLLMGenerationState) – Generation state to update.

  • missing_indices (list[int]) – Prompt indices still missing completions.

Return type:

None

record_vllm_failure(state, missing_indices)

Public wrapper for reporting vLLM failures.

Parameters:
  • state (_VLLMGenerationState) – Generation state containing prompt counts.

  • missing_indices (list[int]) – Indices that remain incomplete.

Return type:

None

static coalesce_grouped_outputs(groups, prompt_count, requested_n, meta=None)

Public wrapper for regrouping vLLM outputs.

Parameters:
  • groups (list[list[str]]) – Raw grouped completions returned by vLLM.

  • prompt_count (int) – Number of prompts originally requested.

  • requested_n (int) – Target completions per prompt.

  • meta (list[list[VLLMLogprobResult | None]] | None) – Optional grouped metadata aligned with groups.

Returns:

Regrouped completions and metadata aligned to prompts. If regrouping is not possible, metadata may be dropped.

Return type:

tuple[list[list[str]], list[list[VLLMLogprobResult | None]] | None]
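
One situation where regrouping is needed is a server that returns one group per sample rather than one group per prompt. The sketch below handles only that case and is not the library's algorithm: the function name, the prompt_count * requested_n layout, and the choice to drop metadata whenever the shapes do not line up are assumptions.

```python
from __future__ import annotations

from typing import Any


def coalesce_sketch(
    groups: list[list[str]],
    prompt_count: int,
    requested_n: int,
    meta: list[list[Any]] | None = None,
) -> tuple[list[list[str]], list[list[Any]] | None]:
    """Hypothetical illustration of regrouping raw outputs per prompt."""
    if len(groups) == prompt_count:
        # Already one group per prompt; nothing to do.
        return groups, meta
    if prompt_count > 0 and requested_n > 0 and len(groups) == prompt_count * requested_n:
        keep_meta = meta is not None and len(meta) == len(groups)
        regrouped: list[list[str]] = []
        regrouped_meta: list[list[Any]] | None = [] if keep_meta else None
        for start in range(0, len(groups), requested_n):
            chunk = groups[start:start + requested_n]
            regrouped.append([text for group in chunk for text in group][:requested_n])
            if regrouped_meta is not None and meta is not None:
                meta_chunk = meta[start:start + requested_n]
                regrouped_meta.append([m for group in meta_chunk for m in group][:requested_n])
        return regrouped, regrouped_meta
    # Shapes cannot be reconciled: return the input and drop metadata,
    # mirroring the "metadata may be dropped" note above.
    return groups, None


raw = [["a1"], ["a2"], ["b1"], ["b2"]]
regrouped, regrouped_meta = coalesce_sketch(raw, 2, 2, meta=[[1], [2], [3], [4]])
```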

static merge_group_chunk(chunk, meta_chunk, requested_n)

Public wrapper for merging grouped chunks.

Parameters:
  • chunk (list[list[str]]) – Subset of grouped outputs belonging to one prompt.

  • meta_chunk (list[list[VLLMLogprobResult | None]] | None) – Optional metadata aligned to chunk.

  • requested_n (int) – Target number of completions for the prompt.

Returns:

Flattened completions and optional flattened metadata trimmed to requested_n.

Return type:

tuple[list[str], list[VLLMLogprobResult | None] | None]
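
The flatten-and-trim behaviour described in the Returns field can be sketched in a few lines. This is an illustration rather than the actual method: the function name is hypothetical, and both discarding metadata whose flattened length disagrees with the completions and skipping the trim for a non-positive requested_n are assumptions.

```python
from __future__ import annotations

from typing import Any


def merge_chunk_sketch(
    chunk: list[list[str]],
    meta_chunk: list[list[Any]] | None,
    requested_n: int,
) -> tuple[list[str], list[Any] | None]:
    """Hypothetical illustration of merging one prompt's grouped outputs."""
    flat = [text for group in chunk for text in group]
    flat_meta = None
    if meta_chunk is not None:
        flat_meta = [item for group in meta_chunk for item in group]
        if len(flat_meta) != len(flat):
            # Assumed: misaligned metadata is discarded rather than padded.
            flat_meta = None
    if requested_n > 0:
        flat = flat[:requested_n]
        if flat_meta is not None:
            flat_meta = flat_meta[:requested_n]
    return flat, flat_meta


merged, merged_meta = merge_chunk_sketch([["a1", "a2"], ["a3"]], [[1, 2], [3]], 2)
```

Trimming completions and metadata with the same slice keeps the two lists index-aligned, so entry i of the metadata still describes completion i.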