vllm.model_executor.models.config
HybridAttentionMambaModelConfig
Bases: VerifyAndUpdateConfig
Source code in vllm/model_executor/models/config.py
verify_and_update_config classmethod
verify_and_update_config(vllm_config: VllmConfig) -> None
Perform early validation and setup for hybrid attention/mamba models.
Block size alignment with mamba page sizes is handled later by Platform.update_block_size_for_backend(), which runs after model layers are constructed and the attention backend is known.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vllm_config | VllmConfig | vLLM Config | required |
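As a rough illustration of what aligning a block size with a Mamba page size can involve, the sketch below picks the smallest attention block size whose page is at least as large as one Mamba state page. The function name, its parameters, and the rounding rule are assumptions made for illustration; they are not taken from Platform.update_block_size_for_backend().

```python
def aligned_block_size(mamba_page_bytes: int,
                       attn_bytes_per_token: int,
                       alignment: int = 16) -> int:
    """Hypothetical helper: smallest multiple of `alignment` tokens whose
    attention page holds at least `mamba_page_bytes` bytes."""
    if mamba_page_bytes <= 0 or attn_bytes_per_token <= 0 or alignment <= 0:
        raise ValueError("all sizes must be positive")
    # Integer ceiling division: tokens needed to cover one mamba page.
    min_tokens = -(-mamba_page_bytes // attn_bytes_per_token)
    # Round up to the next multiple of the assumed kernel alignment.
    return alignment * (-(-min_tokens // alignment))
```

The real alignment depends on the attention backend chosen at runtime, which is why the docstring above defers it until after the model layers are constructed.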
LlamaNemotronVLConfig
Bases: VerifyAndUpdateConfig
Config handler for LlamaNemotronVL embedding models.
MambaModelConfig
Bases: VerifyAndUpdateConfig
verify_and_update_config classmethod
verify_and_update_config(vllm_config: VllmConfig) -> None
Enable the FULL_AND_PIECEWISE CUDA graph mode by default, which is required for good performance of Mamba layers in V1.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vllm_config | VllmConfig | vLLM Config | required |
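The "by default" behavior can be pictured as filling in a value only when the user has not chosen one. The following is a minimal sketch of that pattern; CUDAGraphMode and CompilationSettings here are hypothetical stand-ins defined locally, not the actual vLLM classes, and the assumption that an unset mode is represented by None is ours.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CUDAGraphMode(Enum):
    """Hypothetical stand-in for the engine's CUDA graph mode enum."""
    NONE = "none"
    PIECEWISE = "piecewise"
    FULL = "full"
    FULL_AND_PIECEWISE = "full_and_piecewise"


@dataclass
class CompilationSettings:
    """Hypothetical config holder; None means the user made no choice."""
    cudagraph_mode: Optional[CUDAGraphMode] = None


def apply_mamba_default(settings: CompilationSettings) -> CompilationSettings:
    # Only fill in the default; an explicit user choice is left untouched.
    if settings.cudagraph_mode is None:
        settings.cudagraph_mode = CUDAGraphMode.FULL_AND_PIECEWISE
    return settings
```

Keeping the default separate from an explicit setting lets a user still opt into another mode, for example when debugging.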
NemotronHForCausalLMConfig
Bases: VerifyAndUpdateConfig
verify_and_update_config staticmethod
verify_and_update_config(vllm_config: VllmConfig) -> None
For NemotronH models, when mamba_ssm_cache_dtype is 'auto' (or not explicitly set), set it to the value specified in the HF config, falling back to float16 if the HF config does not specify one.
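The resolution order described above (explicit value, then HF config, then float16) can be sketched as a small pure function. This is an illustration only: the function name is hypothetical, and the HF config attribute name used as the default for hf_field is an assumption, since the docstring does not name it.

```python
from types import SimpleNamespace


def resolve_ssm_cache_dtype(requested: str,
                            hf_config: object,
                            hf_field: str = "mamba_ssm_cache_dtype",
                            fallback: str = "float16") -> str:
    """Hypothetical helper mirroring the documented resolution order."""
    # An explicit user setting always wins.
    if requested != "auto":
        return requested
    # Otherwise prefer the HF config value (attribute name is assumed).
    hf_value = getattr(hf_config, hf_field, None)
    # Fall back to float16 when the HF config says nothing.
    return hf_value if hf_value is not None else fallback


# Usage with stand-in config objects:
with_field = SimpleNamespace(mamba_ssm_cache_dtype="float32")
without_field = SimpleNamespace()
resolved_from_hf = resolve_ssm_cache_dtype("auto", with_field)
resolved_fallback = resolve_ssm_cache_dtype("auto", without_field)
```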
Qwen3_5ForConditionalGenerationConfig
Bases: VerifyAndUpdateConfig
verify_and_update_config staticmethod
verify_and_update_config(vllm_config: VllmConfig) -> None
For Qwen3.5 models, when mamba_ssm_cache_dtype is 'auto' (or not explicitly set), set it to the value of the HF config's mamba_ssm_dtype field. If the user explicitly overrides it with a different value, a warning is emitted.
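A minimal sketch of this "adopt the HF value, warn on a conflicting override" logic follows. The function name and logger are hypothetical, and the choice to leave the setting unchanged when the HF config has no mamba_ssm_dtype field is an assumption, since the docstring does not cover that case.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def resolve_qwen_ssm_dtype(requested: str, hf_config: object) -> str:
    """Hypothetical helper: pick the SSM cache dtype for a Qwen3.5-style config."""
    hf_dtype = getattr(hf_config, "mamba_ssm_dtype", None)
    if requested == "auto":
        # Adopt the HF value; keep "auto" if the field is absent (assumption).
        return hf_dtype if hf_dtype is not None else requested
    if hf_dtype is not None and requested != hf_dtype:
        # The explicit user choice is kept, but the mismatch is reported.
        logger.warning(
            "mamba_ssm_cache_dtype=%s overrides HF config mamba_ssm_dtype=%s",
            requested, hf_dtype,
        )
    return requested


# Usage with a stand-in HF config object:
hf = SimpleNamespace(mamba_ssm_dtype="float32")
adopted = resolve_qwen_ssm_dtype("auto", hf)
overridden = resolve_qwen_ssm_dtype("float16", hf)  # logs a warning
```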