CVE-2023-46229
published 2023-10-19

CVE-2023-46229: LangChain before 0.0.317 allows SSRF via document_loaders/recursive_url_loader.py because crawling can proceed from an external server to an internal server.

Priority: P1 · 80 · high
CVSS 3.1: 8.8 (AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H)
ITW: Exploited in the wild (VulnCheck KEV)
EPSS: 44.71% (98.6th percentile)

Affected

3 ranges
Vendor     Product    Version range                                      Fixed in
langchain  langchain  < 0.0.317                                          0.0.317
langchain  langchain  >= 0, < 0.0.317                                    0.0.317
langchain  langchain  >= 0, < 9ecb7240a480720ec9d739b3877a52f76098a2b8   9ecb7240a480720ec9d739b3877a52f76098a2b8
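
As a rough illustration of the version ranges above, the following Python sketch flags an installed LangChain release older than 0.0.317. The helper names `is_affected` and `installed_langchain_version` are hypothetical, and the parser is a simplification: it compares only the leading numeric components and ignores pre-release or local suffixes.

```python
import re
from importlib import metadata

FIXED_IN = (0, 0, 317)  # first fixed release according to the table above


def parse_version(version):
    """Return the leading dotted-numeric part of a version as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(version):
    """True if the version sorts before 0.0.317 (numeric components only)."""
    return parse_version(version) < FIXED_IN


def installed_langchain_version():
    """Return the installed langchain version string, or None if not installed."""
    try:
        return metadata.version("langchain")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_langchain_version()
    if found is None:
        print("langchain is not installed in this environment")
    else:
        status = "AFFECTED" if is_affected(found) else "not affected"
        print(f"langchain {found}: {status}")
```

An inventory scan would more likely read versions from lock files or an SBOM than from the running interpreter; the comparison logic stays the same.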

Detection & IOCs (extracted from sources)

path: langchain/document_loaders/recursive_url_loader.py
path: langchain/libs/langchain/langchain/document_loaders/sitemap.py
path: langchain/document_loaders/web_base.py
  • Detect SSRF attempts via LangChain SitemapLoader: monitor for outbound HTTP requests initiated by aiohttp.ClientSession.get that traverse from external/public URLs to internal/RFC-1918 IP space, as the scrape_all method invokes _fetch without any filtering or sanitizing.
  • Flag LangChain versions earlier than 0.0.317 in software inventory; the vulnerability is present in all prior versions and was patched in pull request langchain#11925 released in version 0.0.317.
  • Alert on HTTP requests to intranet/internal resources (e.g., instance metadata endpoints, internal APIs) originating from a LangChain process, which may indicate exploitation of the SitemapLoader SSRF to access local services, conduct port scans, or retrieve instance metadata.
  • Inspect sitemap XML documents supplied to LangChain SitemapLoader for URLs pointing to internal/private IP ranges or localhost; a malicious actor can embed intranet resource URLs in a crafted sitemap to trigger SSRF.
  • The patch for CVE-2023-46229 introduces a function called _extract_scheme_and_domain and an allowlist; defenders should verify the allowlist is properly configured to restrict crawling scope, as a misconfigured or overly permissive allowlist may still expose internal resources.
  • The SSRF vulnerability is triggered through the SitemapLoader's load method, which parses a user-supplied web_path as a sitemap XML and then fetches all URLs within it without restriction; any deployment accepting untrusted sitemap URLs is at risk on versions before 0.0.317.
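
The sitemap inspection described in the list above could be sketched in Python roughly as follows. This is a minimal illustration, not LangChain code: the function names `is_internal_host` and `find_internal_urls` are hypothetical, and the check only looks at literal IP addresses and the name `localhost`. It does not resolve DNS, so a public hostname that points at an internal address would not be flagged. For untrusted XML, a hardened parser such as defusedxml is generally a safer choice than the standard-library parser used here.

```python
import ipaddress
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def is_internal_host(host):
    """True for localhost and for private, loopback, or link-local IP literals."""
    if not host:
        return False
    if host.lower() == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; not resolved in this sketch
    return addr.is_private or addr.is_loopback or addr.is_link_local


def find_internal_urls(sitemap_xml):
    """Return every <loc> URL in the sitemap whose host looks internal."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for element in root.iter():
        # Strip any XML namespace so both plain and namespaced <loc> tags match.
        if element.tag.rsplit("}", 1)[-1] != "loc" or not element.text:
            continue
        url = element.text.strip()
        if is_internal_host(urlparse(url).hostname):
            flagged.append(url)
    return flagged


SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs</loc></url>
  <url><loc>http://169.254.169.254/latest/meta-data/</loc></url>
  <url><loc>http://10.0.0.5:8080/admin</loc></url>
  <url><loc>http://localhost/status</loc></url>
</urlset>
"""

if __name__ == "__main__":
    for url in find_internal_urls(SAMPLE_SITEMAP.strip()):
        print("suspicious sitemap entry:", url)
```

A check like this complements, but does not replace, network-level egress filtering for the process that runs the loader.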

CVSS provenance

nvd            v3.1  8.8  HIGH  CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
vulncheck            8.8  HIGH
vendor_redhat        8.8  HIGH
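
To show how the 8.8 base score follows from the NVD vector, here is a small Python sketch of the CVSS v3.1 base-score equations. The weights and the rounding rule are taken from the CVSS v3.1 specification; the function names are illustrative, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a vector string."""
    parts = [p for p in vector.split("/") if not p.startswith("CVSS:")]
    m = dict(part.split(":") for part in parts)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score, derived from confidentiality, integrity, availability.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 8.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 2.84; their sum, roughly 8.71, rounds up to 8.8.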