maxent_grpo.training.runtime.ops.vllm_startup

Detect and classify vLLM server startup stalls from log text.

Functions

_build_parser()

classify_vllm_startup_log(log_text[, ...])

Classify startup progress using marker patterns in log_text.

main()

should_trigger_v0_fallback(log_text, attempt)

Return True when vLLM startup appears stuck and should be relaunched in V0 mode.

Classes

StartupStatus(value)

High-level startup state derived from vLLM log lines.

class maxent_grpo.training.runtime.ops.vllm_startup.StartupStatus(value)

Bases: str, Enum

High-level startup state derived from vLLM log lines.

STARTING = 'starting'
HEALTHY = 'healthy'
CORE_ENGINE_STALL = 'core_engine_stall'
ERROR = 'error'
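
Because StartupStatus derives from both str and Enum, its members behave like plain strings in comparisons and serialization. The sketch below redeclares the enum locally from the members listed above so that it runs without the package installed; the json usage is only an illustration, not something this page states the module does.

```python
import json
from enum import Enum


# Local stand-in that mirrors the documented members; in real code,
# import StartupStatus from maxent_grpo.training.runtime.ops.vllm_startup.
class StartupStatus(str, Enum):
    STARTING = "starting"
    HEALTHY = "healthy"
    CORE_ENGINE_STALL = "core_engine_stall"
    ERROR = "error"


# The str mixin makes members compare equal to their raw string values.
is_healthy = StartupStatus.HEALTHY == "healthy"

# Enum lookup by value recovers the member from a stored string.
restored = StartupStatus("core_engine_stall")

# Members serialize as ordinary JSON strings without a custom encoder.
payload = json.dumps({"status": StartupStatus.ERROR})
```
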
maxent_grpo.training.runtime.ops.vllm_startup.classify_vllm_startup_log(log_text, stall_threshold=3)

Classify startup progress using marker patterns in log_text.

Parameters:
  • log_text (str)
  • stall_threshold (int)

Return type:

StartupStatus
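
This page does not list the marker patterns or say how stall_threshold is applied, so the following is only a minimal sketch of marker-based classification under stated assumptions. The function name classify_sketch, the three regular expressions, the check order, and the reading of stall_threshold as "number of stall-marker lines seen" are all hypothetical and may differ from the real implementation.

```python
import re
from enum import Enum


# Local stand-in mirroring the documented enum members.
class StartupStatus(str, Enum):
    STARTING = "starting"
    HEALTHY = "healthy"
    CORE_ENGINE_STALL = "core_engine_stall"
    ERROR = "error"


# ASSUMED marker patterns, chosen for illustration only.
HEALTHY_RE = re.compile(r"Application startup complete")
ERROR_RE = re.compile(r"Traceback \(most recent call last\)")
STALL_RE = re.compile(r"Waiting for .*core engine")


def classify_sketch(log_text: str, stall_threshold: int = 3) -> StartupStatus:
    """Hypothetical classifier: the first matching rule wins."""
    if HEALTHY_RE.search(log_text):
        return StartupStatus.HEALTHY
    if ERROR_RE.search(log_text):
        return StartupStatus.ERROR
    # ASSUMPTION: a stall is declared once the stall marker has
    # appeared at least stall_threshold times in the log text.
    if len(STALL_RE.findall(log_text)) >= stall_threshold:
        return StartupStatus.CORE_ENGINE_STALL
    return StartupStatus.STARTING
```

Checking the healthy marker first means a server that eventually came up is not reported as stalled because of earlier waiting lines; that ordering is a design choice of this sketch, not a documented behavior.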

maxent_grpo.training.runtime.ops.vllm_startup.should_trigger_v0_fallback(log_text, attempt, min_attempts=20, stall_threshold=3)

Return True when vLLM startup appears stuck and should be relaunched in V0 mode.

Parameters:
  • log_text (str)
  • attempt (int)
  • min_attempts (int)
  • stall_threshold (int)

Return type:

bool
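
One plausible way to use this predicate is inside a startup polling loop that passes the current attempt number on each iteration. The sketch below is an assumption about usage, not the package's own loop: wait_for_startup, read_log, and fake_should_fallback are hypothetical names, attempts are assumed to be 1-based, and the stand-in predicate only imitates the documented signature so that the example runs without the package.

```python
from typing import Callable, Tuple


def wait_for_startup(
    read_log: Callable[[], str],
    should_fallback: Callable[[str, int], bool],
    max_attempts: int = 60,
) -> Tuple[str, int]:
    """Poll the log and report when a V0 relaunch is requested."""
    for attempt in range(1, max_attempts + 1):  # ASSUMED 1-based attempts
        log_text = read_log()
        if should_fallback(log_text, attempt):
            return ("relaunch_v0", attempt)
        # A real loop would sleep and probe the server's health here.
    return ("gave_up", max_attempts)


# Hypothetical stand-in with the documented signature. ASSUMPTION: the
# fallback fires only after min_attempts polls AND enough stall lines.
def fake_should_fallback(
    log_text: str, attempt: int, min_attempts: int = 20, stall_threshold: int = 3
) -> bool:
    stalled = log_text.count("core engine") >= stall_threshold
    return attempt >= min_attempts and stalled
```

In real code the second argument would be should_trigger_v0_fallback itself. Injecting it as a callable keeps the loop easy to test with canned log text.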

maxent_grpo.training.runtime.ops.vllm_startup.main()

Return type:

int