cvebase

Ggml-Org Llama.Cpp vulnerabilities

7 known vulnerabilities affecting ggml-org/llama.cpp.

Total CVEs: 7
CISA KEV: 0
Public exploits: 0
Exploited in wild: 0
Severity breakdown: CRITICAL 2, HIGH 4, LOW 1

Vulnerabilities

CVE-2026-34159 | P2 | CRITICAL | CVSS 9.8 | fixed in b8492 | 2026-04-01
CWE-119. llama.cpp is an inference of several LLM models in C/C++. Prior to version b8492, the RPC backend's deserialize_tensor() skips all bounds validation when a tensor's buffer field is 0. An unauthenticated attacker can read and write arbitrary process memory via crafted GRAPH_COMPUTE messages. Combined with pointer leaks from ALLOC_BUFFER/BUFFER_GET_B…
nvd
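
To illustrate the kind of check that CVE-2026-34159 describes as skipped, here is a minimal Python sketch of a fail-closed range check. It is not llama.cpp's code: `Buffer`, `tensor_range_ok`, and the `buffers` table are hypothetical names, and rejecting a buffer id of 0 is an assumed policy; the actual fix in b8492 may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    """Hypothetical record of one server-side allocation."""
    base: int  # start address of the allocation
    size: int  # length of the allocation in bytes


def tensor_range_ok(buffers: dict, buffer_id: int, data: int, nbytes: int) -> bool:
    """Return True only if [data, data + nbytes) lies inside a known buffer."""
    buf = buffers.get(buffer_id)
    if buf is None:
        # Assumed policy: an unknown (or zero) buffer id is rejected
        # instead of being treated as "nothing to validate".
        return False
    if nbytes < 0 or data < buf.base:
        return False
    return data + nbytes <= buf.base + buf.size
```

The point of the sketch is that the validation result does not depend on the buffer field being non-zero: a request that names no known buffer is refused before any address is used.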
CVE-2026-21869 | P2 | CRITICAL | CVSS 9.8 | ≤ 55d4206c8 | 2026-01-08
CWE-787. llama.cpp is an inference of several LLM models in C/C++. In commits 55d4206c8 and prior, the n_discard parameter is parsed directly from JSON input in the llama.cpp server's completion endpoints without validation to ensure it's non-negative. When a negative value is supplied and the context fills up, llama_memory_seq_rm/add receives a reversed r…
nvd
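
The missing validation described in CVE-2026-21869 can be sketched as follows. This is a minimal Python illustration, not the server's actual C++ code: `parse_n_discard` is a hypothetical helper, and the default of 0 is an assumption made for the example.

```python
import json


def parse_n_discard(body: str, default: int = 0) -> int:
    """Read n_discard from a JSON request body, rejecting invalid values."""
    request = json.loads(body)
    if not isinstance(request, dict):
        raise ValueError("request body must be a JSON object")
    value = request.get("n_discard", default)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("n_discard must be an integer")
    if value < 0:
        # A negative count is refused at the boundary, before it can
        # reach any range arithmetic.
        raise ValueError("n_discard must be non-negative")
    return value
```

For example, `parse_n_discard('{"n_discard": 4}')` returns 4, while a body containing `"n_discard": -1` raises `ValueError` instead of passing the negative value on.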
CVE-2025-49847 | P3 | HIGH | CVSS 8.8 | fixed in b5662 | 2025-06-17
CWE-119. llama.cpp is an inference of several LLM models in C/C++. Prior to version b5662, an attacker‐supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp’s vocabulary‐loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length into an int32_t, cau…
nvd
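
The narrowing cast mentioned in CVE-2025-49847 can be modelled in a few lines. This Python sketch only imitates what a size_t to int32_t conversion does on common platforms (the low 32 bits are kept and reinterpreted as signed); `as_int32` and `fits_in_int32` are hypothetical helpers, not functions from llama.cpp.

```python
import ctypes


def as_int32(length: int) -> int:
    """Model a C-style cast of a size_t value to int32_t (keeps the low 32 bits)."""
    return ctypes.c_int32(length & 0xFFFFFFFF).value


def fits_in_int32(length: int) -> bool:
    """Check that a length would survive the cast unchanged."""
    return 0 <= length <= 2**31 - 1


# A length just above INT32_MAX turns negative after the cast,
# and one just above 2**32 turns into a small positive number.
negative_case = as_int32(2**31 + 5)
small_case = as_int32(2**32 + 16)
```

Either outcome defeats a later size comparison that trusts the narrowed value, which is why a range check such as `fits_in_int32` belongs before the cast rather than after it.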
CVE-2026-33298 | P3 | HIGH | CVSS 7.8 | fixed in b7824 | 2026-03-24
CWE-122. llama.cpp is an inference of several LLM models in C/C++. Prior to b7824, an integer overflow vulnerability in the `ggml_nbytes` function allows an attacker to bypass memory validation by crafting a GGUF file with specific tensor dimensions. This causes `ggml_nbytes` to return a significantly smaller size than required (e.g., 4MB instead of Exabytes),…
nvd
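
The "4MB instead of Exabytes" effect in CVE-2026-33298 follows from 64-bit wraparound, which the Python sketch below reproduces. It is a simplified model that multiplies the dimensions by an element size; it is not the real `ggml_nbytes` computation, and the dimensions, the 4-byte element size, and both function names are assumptions chosen for illustration.

```python
UINT64_MAX = 2**64 - 1


def nbytes_wrapping(dims, elem_size):
    """Simplified model: multiply with 64-bit unsigned wraparound."""
    total = elem_size
    for d in dims:
        total = (total * d) & UINT64_MAX
    return total


def nbytes_checked(dims, elem_size):
    """Same product, but raise instead of wrapping past 64 bits."""
    total = elem_size
    for d in dims:
        if d < 0:
            raise ValueError("negative dimension")
        if d != 0 and total > UINT64_MAX // d:
            raise OverflowError("tensor size does not fit in 64 bits")
        total *= d
    return total


# Hypothetical crafted dimensions: the true size is 2**64 + 2**22 bytes
# (about 16 EiB), but the wrapped result is only 2**22 bytes (4 MiB).
crafted_dims = [2**20, 2**42 + 1]
wrapped = nbytes_wrapping(crafted_dims, 4)
```

A size check performed on `wrapped` would accept the tensor, whereas `nbytes_checked(crafted_dims, 4)` raises `OverflowError` before any allocation or bounds decision is made.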
CVE-2025-53630 | P3 | HIGH | CVSS 8.9 | fixed in b8146 | 2025-07-10
CWE-122. llama.cpp is an inference of several LLM models in C/C++. Integer Overflow in the gguf_init_from_file_impl function in ggml/src/gguf.cpp can lead to Heap Out-of-Bounds Read/Write. This vulnerability is fixed in commit 26a48ad699d50b6268900062661bd22f3e792579.
nvd
CVE-2025-52566 | P3 | HIGH | CVSS 8.8 | fixed in b5721 | 2025-06-24
CWE-119. llama.cpp is an inference of several LLM models in C/C++. Prior to version b5721, there is a signed vs. unsigned integer overflow in llama.cpp's tokenizer implementation (llama_vocab::tokenize) (src/llama-vocab.cpp:3036) resulting in unintended behavior in tokens copying size comparison. Allowing heap-overflowing llama.cpp inferencing engine with caref…
nvd
CVE-2026-2069 | P4 | LOW | CVSS 3.3 | v55abc39 | 2026-02-06
CWE-119. A flaw has been found in ggml-org llama.cpp up to 55abc39. Impacted is the function llama_grammar_advance_stack of the file llama.cpp/src/llama-grammar.cpp of the component GBNF Grammar Handler. This manipulation causes stack-based buffer overflow. The attack needs to be launched locally. The exploit has been published and may be used. Patch name: 18993.
nvd