vllm-project/vllm vulnerabilities

30 known vulnerabilities affecting vllm-project/vllm.

Total CVEs: 30
CISA KEV: 0
Public exploits: 0
Exploited in wild: 0

Severity breakdown: CRITICAL 6 · HIGH 13 · MEDIUM 9 · LOW 2

Vulnerabilities

Page 1 of 2
CVE-2026-34756 [MEDIUM] · CVSS 6.5 · CWE-770 · affected: >= 0.1.0, < 0.19.0 · published 2026-04-06
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of an upper bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models, an unauthenticated attacker can […]
Source: NVD
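A minimal sketch of the kind of upper-bound check the advisory says was missing, using a plain dataclass rather than vLLM's actual Pydantic models. `MAX_N` and the cap of 128 are illustrative assumptions, not vLLM's fix:

```python
from dataclasses import dataclass

MAX_N = 128  # illustrative upper bound, not vLLM's actual limit


@dataclass
class CompletionRequest:
    """Hypothetical request model that bounds the `n` parameter."""
    prompt: str
    n: int = 1  # number of completions requested

    def __post_init__(self) -> None:
        # Without this check, a huge `n` lets one request schedule an
        # unbounded amount of generation work (CWE-770).
        if not (1 <= self.n <= MAX_N):
            raise ValueError(f"n must be between 1 and {MAX_N}, got {self.n}")
```

With such a bound, `CompletionRequest(prompt="hi", n=4)` is accepted while `n=10**9` is rejected at validation time rather than reaching the scheduler.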
CVE-2026-34755 [MEDIUM] · CVSS 6.5 · CWE-770 · affected: >= 0.7.0, < 0.19.0 · published 2026-04-06
From 0.7.0 to before 0.19.0, the VideoMediaIO.load_base64() method at vllm/multimodal/media/video.py splits video/jpeg data URLs by comma to extract individual JPEG frames, but does not enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by […]
Source: NVD
CVE-2026-34753 [MEDIUM] · CVSS 5.4 · CWE-918 · affected: >= 0.16.0, < 0.19.0 · published 2026-04-06
From 0.16.0 to before 0.19.0, a server-side request forgery (SSRF) vulnerability in download_bytes_from_url allows any actor who can control batch input JSON to make the vLLM batch runner issue arbitrary HTTP/HTTPS requests from the server, without any URL validation or domain […]
Source: NVD
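A common mitigation for this kind of SSRF is to validate URLs before fetching: restrict schemes and reject hosts that resolve to private, loopback, or link-local ranges. A stdlib-only sketch; the function name and policy are illustrative, not vLLM's actual fix:

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_url_allowed(url: str) -> bool:
    """Illustrative SSRF guard: scheme allowlist + internal-address rejection."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        return False
    try:
        # Resolve the host so IP-literal and DNS-based targets are both checked.
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        # Reject internal targets (cloud metadata, localhost services, etc.).
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True
```

Note that resolving once and then fetching separately still leaves a DNS-rebinding window; a production guard would pin the resolved address for the actual request.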
CVE-2026-34760 [MEDIUM] · CVSS 5.9 · CWE-20 · affected: >= 0.5.5, < 0.18.0 · published 2026-04-02
From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing (to_mono), while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy results in inconsistency between audio heard by humans […]
Source: NVD
CVE-2026-27893 [HIGH] · CVSS 8.8 · CWE-693 · affected: >= 0.10.1, < 0.18.0 · published 2026-03-27
Starting in version 0.10.1 and prior to version 0.18.0, two model implementation files hardcode `trust_remote_code=True` when loading sub-components, bypassing the user's explicit `--trust-remote-code=False` security opt-out. This enables remote code execution via malicious […]
Source: NVD
CVE-2026-22778 [CRITICAL] · CVSS 9.8 · CWE-532 · affected: >= 0.8.3, < 0.14.1 · published 2026-02-02
From 0.8.3 to before 0.14.1, when an invalid image is sent to vLLM's multimodal endpoint, PIL throws an error. vLLM returns this error to the client, leaking a heap address. With this leak, we reduce ASLR from 4 billion guesses to ~8 guesses. This vulnerability can be chained […]
Source: NVD
CVE-2026-24779 [HIGH] · CVSS 7.1 · CWE-918 · affected: >= 0.15.1, < 0.17.0 · published 2026-01-27
Prior to version 0.14.1, a Server-Side Request Forgery (SSRF) vulnerability exists in the `MediaConnector` class within the vLLM project's multimodal feature set. The load_from_url and load_from_url_async methods obtain and process media from URLs provided by users, using […]
Source: NVD
CVE-2026-22807 [CRITICAL] · CVSS 9.8 · CWE-94 · affected: >= 0.10.1, < 0.14.0 · published 2026-01-21
Starting in version 0.10.1 and prior to version 0.14.0, vLLM loads Hugging Face `auto_map` dynamic modules during model resolution without gating on `trust_remote_code`, allowing attacker-controlled Python code in a model repo/path to execute at server startup. An attacker who […]
Source: NVD
CVE-2026-22773 [HIGH] · CVSS 7.5 · CWE-770 · affected: >= 0.6.4, < 0.12.0 · published 2026-01-10
In versions from 0.6.4 to before 0.12.0, users can crash the vLLM engine serving multimodal models that use the Idefics3 vision model implementation by sending a specially crafted 1x1 pixel image. This causes a tensor dimension mismatch that results in an unhandled runtime error […]
Source: NVD
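One defensive pattern for the crash above is to reject images below the model's minimum usable size before they reach the vision encoder. A hedged sketch; `MIN_SIDE` and the value 28 are illustrative assumptions, not the actual Idefics3 requirement or vLLM's fix:

```python
MIN_SIDE = 28  # illustrative minimum; real models have model-specific limits


def check_image_size(width: int, height: int) -> None:
    """Reject degenerate images (e.g. 1x1) that would cause a tensor
    dimension mismatch deeper in the vision pipeline."""
    if width < MIN_SIDE or height < MIN_SIDE:
        raise ValueError(
            f"image {width}x{height} is below the minimum {MIN_SIDE}x{MIN_SIDE}"
        )
```

Validating at the API boundary turns an unhandled engine crash into a well-formed 4xx response.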
CVE-2025-66448 [HIGH] · CVSS 8.8 · CWE-94 · fixed in 0.11.1 · published 2025-12-01
Prior to 0.11.1, vLLM has a critical remote code execution vector in a config class named Nemotron_Nano_VL_Config. When vLLM loads a model config that contains an auto_map entry, the config class resolves that mapping with get_class_from_dynamic_module(...) and immediately […]
Source: NVD
CVE-2025-62372 [HIGH] · CVSS 8.3 · CWE-129 · affected: >= 0.5.5, < 0.11.1 · published 2025-11-21
From version 0.5.5 to before 0.11.1, users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with correct ndim but incorrect shape (e.g. hidden dimension is wrong), regardless of whether the model is intended to support such inputs […]
Source: NVD
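A sketch of the kind of shape validation the advisory implies was missing: checking both ndim and the hidden dimension before user-supplied embeddings reach the model. The function name and sizes are illustrative, not vLLM's code:

```python
def validate_embedding_shape(shape: tuple[int, ...], hidden_size: int) -> None:
    """Illustrative guard: correct ndim alone is not enough (CWE-129);
    the hidden dimension must also match the model's expected size."""
    if len(shape) != 2:
        raise ValueError(f"expected 2-D embeddings, got ndim={len(shape)}")
    if shape[1] != hidden_size:
        raise ValueError(
            f"hidden dimension mismatch: expected {hidden_size}, got {shape[1]}"
        )
```

For a model with a hypothetical hidden size of 4096, `(17, 4096)` would pass while `(17, 1024)` would be rejected before any tensor indexing occurs.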
CVE-2025-62164 [HIGH] · CVSS 8.8 · CWE-20 · affected: >= 0.10.2, < 0.11.1 · published 2025-11-21
From version 0.10.2 to before 0.11.1, a memory corruption vulnerability that could lead to a crash (denial of service) and potentially remote code execution (RCE) exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized […]
Source: NVD
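The advisory above describes deserializing user-controlled data. As a general illustration of hardening such a path (not vLLM's actual fix, which concerns tensor loading), a restricted unpickler refuses every global lookup, so pickles that try to import callables, the usual code-execution vector, fail instead of running attacker code. `NoGlobalsUnpickler` and `safe_loads` are hypothetical names:

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuse all global lookups: plain containers and scalars still load,
    but any pickle that references a module-level callable is rejected."""

    def find_class(self, module: str, name: str):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(data: bytes):
    """Deserialize untrusted bytes with globals disabled."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

Plain data such as `{"a": [1, 2]}` round-trips, while a payload whose `__reduce__` references any callable raises `UnpicklingError` before anything executes.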
CVE-2025-62426 [MEDIUM] · CVSS 6.5 · CWE-770 · affected: >= 0.5.5, < 0.11.1 · published 2025-11-21
From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible […]
Source: NVD
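A typical hardening for the issue above is to allowlist which chat_template_kwargs keys may pass through to the template. A minimal sketch; the allowed key names are illustrative assumptions, not vLLM's actual policy:

```python
# Illustrative allowlist; a real deployment would derive this from the
# parameters the chat template actually accepts.
ALLOWED_KWARGS = {"add_generation_prompt", "enable_thinking"}


def filter_template_kwargs(kwargs: dict) -> dict:
    """Reject any chat_template_kwargs key not explicitly allowed,
    so user input cannot reach unexpected template variables."""
    unknown = set(kwargs) - ALLOWED_KWARGS
    if unknown:
        raise ValueError(f"unsupported chat_template_kwargs: {sorted(unknown)}")
    return kwargs
```

Validating before the template renders keeps request parameters from influencing code paths the template never intended to expose.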
CVE-2025-59425 [HIGH] · CVSS 7.5 · CWE-385 · fixed in 0.11.0rc2 · published 2025-10-07
Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string comparison that takes longer the more leading characters of the provided API key are correct. Data analysis across many attempts […]
Source: NVD
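The standard remedy for this class of timing attack is a constant-time comparison. A minimal sketch using the stdlib's `hmac.compare_digest`, whose runtime does not depend on how many leading characters match; the key value and function name are illustrative, not vLLM's code:

```python
import hmac

EXPECTED_API_KEY = "example-key-not-real"  # illustrative secret


def api_key_valid(provided: str) -> bool:
    """Constant-time comparison: unlike ==, compare_digest does not
    short-circuit on the first mismatching byte, so response timing
    leaks nothing about how much of the key was correct."""
    return hmac.compare_digest(
        provided.encode("utf-8"), EXPECTED_API_KEY.encode("utf-8")
    )
```

An ordinary `provided == EXPECTED_API_KEY` returns earlier the sooner a byte differs, which is exactly the signal the advisory says attackers could average out over many attempts.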
CVE-2025-48956 [HIGH] · CVSS 7.5 · CWE-400 · affected: >= 0.1.0, < 0.10.1.1 · published 2025-08-21
From 0.1.0 to before 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server memory exhaustion, potentially leading to a crash or unresponsiveness. The attacker […]
Source: NVD
CVE-2025-48887 [MEDIUM] · CVSS 6.5 · CWE-1333 · affected: >= 0.6.4, < 0.9.0 · published 2025-05-30
In versions 0.6.4 up to but excluding 0.9.0, vLLM has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py`. The root cause is the use of a highly complex and nested regular expression for tool call detection […]
Source: NVD
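The ReDoS class above (CWE-1333) stems from nested quantifiers that backtrack exponentially on near-matching input. A small self-contained illustration, not vLLM's actual pattern:

```python
import re

# Nested quantifier: on an almost-matching string, the engine tries every
# way to partition the run of "a"s before concluding there is no match.
vulnerable = re.compile(r"^(a+)+$")

# Flat pattern matching the same language, checked in linear time.
safe = re.compile(r"^a+$")

# An input that almost matches triggers the backtracking blow-up; kept
# short here so the demonstration completes quickly.
text = "a" * 16 + "b"
```

Both patterns accept `"aaaa"` and reject `text`, but `vulnerable` explores exponentially many backtracking states to reject it, which is why each extra "a" roughly doubles its rejection time while `safe` stays linear.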
CVE-2025-48944 [MEDIUM] · CVSS 6.5 · CWE-20 · affected: >= 0.8.0, < 0.9.0 · published 2025-05-30
In versions 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality is invoked. These inputs are not validated before being […]
Source: NVD
CVE-2025-48942 [MEDIUM] · CVSS 6.5 · CWE-248 · affected: >= 0.8.0, < 0.9.0 · published 2025-05-30
In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param kills the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue.
Source: NVD
CVE-2025-46722 [HIGH] · CVSS 7.3 · CWE-1023 · affected: >= 0.7.0, < 0.9.0 · published 2025-05-29
In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only […]
Source: NVD
CVE-2025-46570 [LOW] · CVSS 2.6 · CWE-208 · fixed in 0.9.0 · published 2025-05-29
Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to […]
Source: NVD