CVE-2026-54233
Published 2026-06-22
Priority P3 · 34 · medium · 6.5 (CVSS 3.1)
AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
EPSS: 0.24% (15.3th percentile)
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. This vulnerability is fixed in 0.23.1rc0.
Affected
24 ranges
| Vendor | Product | Version range | Fixed in |
|---|---|---|---|
| rhaii | vllm-cpu-rhel9 | — | — |
| rhaii | vllm-cuda-rhel9 | — | — |
| rhaii | vllm-gaudi-rhel9 | — | — |
| rhaii | vllm-neuron-rhel9 | — | — |
| rhaii | vllm-rocm-rhel9 | — | — |
| rhaii | vllm-spyre-rhel9 | — | — |
| rhaii | vllm-tpu-rhel9 | — | — |
| rhaiis | vllm-cpu-rhel9 | — | — |
| rhaiis | vllm-cuda-rhel9 | — | — |
| rhaiis | vllm-neuron-rhel9 | — | — |
| rhaiis | vllm-rocm-rhel9 | — | — |
| rhaiis | vllm-spyre-rhel9 | — | — |
| rhaiis | vllm-tpu-rhel9 | — | — |
| rhelai3 | bootc-aws-cuda-rhel9 | — | — |
| rhelai3 | bootc-azure-cuda-rhel9 | — | — |
| rhelai3 | bootc-azure-rocm-rhel9 | — | — |
| rhelai3 | bootc-cuda-rhel9 | — | — |
| rhelai3 | bootc-gaudi-rhel9 | — | — |
| rhelai3 | bootc-gcp-cuda-rhel9 | — | — |
| rhelai3 | bootc-rocm-rhel9 | — | — |
| rhoai | odh-vllm-gaudi-rhel9 | — | — |
| vllm-project | vllm | < 0.23.1rc0 | 0.23.1rc0 |
| vllm | vllm | < 0.23.1 | 0.23.1 |
| vllm | vllm | 0 – 0.23.0 | — |
CVSS provenance
nvd · v3.1 · 6.5 · MEDIUM · CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
vendor_redhat · 6.5 · MEDIUM
Red Hat
vllm: vLLM: Denial of Service via excessive memory allocation in audio transcription
vendor_redhat·2026-06-22·CVSS 6.5
CVE-2026-54233 [MEDIUM] CWE-770 vllm: vLLM: Denial of Service via excessive memory allocation in audio transcription
A flaw was found in vLLM, an inference and serving engine for large language models (LLMs). A remote attacker could exploit a vulnerability in the `/v1/audio/transcriptions` endpoint. By uploading a specially crafted compressed audio file, such as an OPUS file, the attacker could cause the system to allocate an excessive amount of memory during the decoding process. This uncontrolled memory allocation can le
GHSA
vLLM: OOM Denial of Service via Audio Decompression Bomb
ghsa·2026-06-17
CVE-2026-54233 [MEDIUM] CWE-409 vLLM: OOM Denial of Service via Audio Decompression Bomb
### Summary
vLLM's `/v1/audio/transcriptions` endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. Tested on vLLM v0.19.0.
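The scale of the expansion can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the advisory's 25MB and 14.9GB figures are read as decimal units, and the 16 kHz mono format in the second example is an assumption, not something the advisory states.

```python
BYTES_PER_FLOAT32 = 4


def decoded_pcm_bytes(duration_s: float, sample_rate_hz: int, channels: int = 1) -> int:
    """Bytes needed to hold the decoded audio as float32 PCM."""
    return int(duration_s * sample_rate_hz) * channels * BYTES_PER_FLOAT32


def expansion_ratio(compressed_bytes: int, decoded_bytes: int) -> float:
    """How many decoded bytes each uploaded byte turns into."""
    return decoded_bytes / compressed_bytes


# Figures quoted in the advisory, read as decimal MB/GB (an assumption).
compressed = 25 * 10**6
decoded = 14_900_000_000
print(expansion_ratio(compressed, decoded))  # 596.0

# Assumed example: one hour of 16 kHz mono audio held as float32.
print(decoded_pcm_bytes(3600, 16_000))  # 230400000
```

Under that assumed format, 14.9GB of float32 PCM would correspond to roughly 65 hours of audio, which shows why a limit on compressed bytes alone does not bound memory use.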
### Details
`SpeechToTextProcessor` rejects uploads over `VLLM_MAX_AUDIO_CLIP_FILESIZE_MB` (default 25MB) based on compressed byte length, but the audio decoder in `audio.py` accumulates all decoded frames into memory with no size limit before returning:
```python
# speech_to_text.py L184-189
if len(audio_data) / 1024 ** 2 > self.max_audio_filesize_mb:
    raise VLLMValidationError(...)
y, sr = load_audio(buf, sr=self.asr_config.sample_rate)  # decoded size unchecked

# audio.py L77-107
chunks: list[npt.NDArray] = []
for frame in conta
```
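One general way to defend against this class of bug is to enforce a budget on decoded bytes while frames are still being accumulated. The following is a minimal sketch of that idea, not the fix shipped in 0.23.1rc0: `collect_bounded`, `DecodedAudioTooLarge`, and `fake_decoder` are hypothetical names, and the standard-library `array` type stands in for the NumPy chunks used in the excerpt above.

```python
from array import array
from collections.abc import Iterable, Iterator


class DecodedAudioTooLarge(ValueError):
    """Hypothetical error: decoded PCM exceeded the configured budget."""


def collect_bounded(chunks: Iterable[array], max_decoded_bytes: int) -> array:
    """Concatenate float32 chunks, aborting once the decoded size passes the cap."""
    out = array("f")
    for chunk in chunks:
        # Check before copying so the output buffer never grows past the budget.
        if (len(out) + len(chunk)) * out.itemsize > max_decoded_bytes:
            raise DecodedAudioTooLarge(
                f"decoded audio exceeds {max_decoded_bytes} bytes"
            )
        out.extend(chunk)
    return out


def fake_decoder(n_chunks: int, samples_per_chunk: int) -> Iterator[array]:
    """Stand-in for a real decoder: lazily yields silent float32 frames."""
    for _ in range(n_chunks):
        yield array("f", bytes(4 * samples_per_chunk))


pcm = collect_bounded(fake_decoder(3, 100), max_decoded_bytes=1200)
print(len(pcm))  # 300
```

Because the decoder is consumed lazily, decoding stops at the first frame that would cross the limit, so peak memory stays near the budget instead of scaling with the attacker's input.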
No detection rules found.
No public exploits indexed.