Add SSRF guard for outbound URL fetches by alexkroman · Pull Request #261 · AssemblyAI/cli

alexkroman · 2026-06-23T13:49:52Z

Adds a new ssrf module that validates URLs before fetching them, preventing Server-Side Request Forgery attacks by refusing to connect to private, loopback, link-local, or internal addresses. The guard resolves hostnames to IP addresses and inspects them, catching both direct attempts to access internal addresses and public URLs that redirect to internal ones.

Key changes

New aai_cli/core/ssrf.py module: Provides assert_public_url() to validate that a URL's hostname resolves only to public IP addresses. Covers IPv4 and IPv6 (including IPv4-mapped IPv6), detects loopback, RFC 1918 private ranges, link-local (including the 169.254.169.254 cloud-metadata address), unique-local, and multicast addresses. Raises BlockedURLError (a UsageError subclass) for non-public hosts.
Manual redirect handling in webpage._fetch(): Changed from follow_redirects=True to manual per-hop redirect following so the SSRF guard runs on every redirect target. A public URL can redirect to an internal one, so each hop must be validated. Implements a redirect hop cap (MAX_REDIRECTS = 5) to prevent infinite loops. Also adds response body size capping (_MAX_BYTES = 10 MB) to prevent memory exhaustion from hostile URLs, and proper charset decoding using the response's declared charset rather than assuming UTF-8.
Manual redirect handling in feed._fetch(): Similarly updated to follow redirects manually with per-hop SSRF validation. Returns None on SSRF violations so the URL falls through to the API's server-side fetch.
Comprehensive test coverage: New tests/test_ssrf.py tests IP classification (internal vs. public), URL validation, and error cases. Updated tests/test_webpage.py with fixtures that stub DNS to a public IP for hermeticity, plus new tests for SSRF blocking, redirect-to-internal detection, body size capping, charset decoding, redirect loops, and DNS failures. Updated tests/test_transcribe_feed.py with similar fixtures and tests for redirect following and SSRF violations.
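
For readers without the diff open, the guard described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `aai_cli/core/ssrf.py`: `is_internal`, `resolve_host`, and the `resolve` parameter are hypothetical names, and `BlockedURLError` is a plain `ValueError` stand-in here rather than the `UsageError` subclass the PR adds. It uses only `socket`, `ipaddress`, and `urllib.parse`.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


class BlockedURLError(ValueError):
    """Stand-in for the PR's BlockedURLError (a UsageError subclass there)."""


def is_internal(ip_text: str) -> bool:
    """Return True when the address is one the guard should refuse."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return (
        ip.is_loopback
        or ip.is_private        # RFC 1918 and IPv6 unique-local
        or ip.is_link_local     # includes 169.254.169.254
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_host(host: str) -> list[str]:
    """Hypothetical helper: every address the host resolves to."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def assert_public_url(url: str, resolve=resolve_host) -> None:
    """Raise BlockedURLError unless every resolved address is public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise BlockedURLError(f"unsupported URL: {url}")
    try:
        addresses = resolve(parts.hostname)
    except OSError as exc:
        raise BlockedURLError(f"cannot resolve {parts.hostname}") from exc
    if not addresses:
        raise BlockedURLError(f"no addresses for {parts.hostname}")
    for address in addresses:
        if is_internal(address):
            raise BlockedURLError(f"{parts.hostname} resolves to {address}")
```

Passing the resolver as a parameter is one way to keep tests off the network while still exercising the real `ipaddress` checks, for example `assert_public_url("https://example.com/", resolve=lambda host: ["127.0.0.1"])` raises `BlockedURLError`.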

Implementation details

DNS resolution is abstracted via _resolve_host() so tests can stub it without network access while still exercising the real ipaddress-based IP classification.
The BlockedURLError exception is a UsageError subclass so it renders as a clean exit-2 message and integrates with existing error handling.
Redirect validation happens before the request body is read, minimizing wasted bandwidth on blocked targets.
The guard is stdlib-only (socket, ipaddress, urllib.parse) plus the existing errors module, keeping the import footprint minimal.
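
The per-hop redirect handling can be sketched in the same spirit. The function below is an illustration only, not the code in `webpage._fetch()` or `feed._fetch()`: `fetch_with_guard`, `send`, `guard`, and `TooManyRedirects` are hypothetical names, `send` is assumed to return `(status, headers, chunks)` with lowercase header keys, and truncating at the cap (rather than rejecting the response) is an assumption about how the size limit behaves.

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 5
MAX_BYTES = 10 * 1024 * 1024
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class TooManyRedirects(Exception):
    """Raised when the hop cap is exceeded (hypothetical name)."""


def fetch_with_guard(url, send, guard,
                     max_redirects=MAX_REDIRECTS, max_bytes=MAX_BYTES):
    """Follow redirects manually, validating every hop before requesting it.

    `guard(url)` is expected to raise for non-public targets.
    `send(url)` is a hypothetical transport returning
    (status, headers, chunks), where chunks is an iterable of bytes.
    """
    # One initial request plus at most `max_redirects` followed redirects.
    for _ in range(max_redirects + 1):
        guard(url)  # runs before any connection to this hop
        status, headers, chunks = send(url)
        if status in REDIRECT_STATUSES and "location" in headers:
            # Resolve relative Location values against the current URL;
            # the body of the redirect response is never read.
            url = urljoin(url, headers["location"])
            continue
        body = bytearray()
        for chunk in chunks:
            body.extend(chunk)
            if len(body) > max_bytes:
                del body[max_bytes:]  # assumption: truncate at the cap
                break
        return url, bytes(body)
    raise TooManyRedirects(f"more than {max_redirects} redirects")
```

Because the guard runs at the top of each iteration, a public URL that answers with `Location: http://169.254.169.254/...` is rejected before any request is sent to the metadata address.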

https://claude.ai/code/session_01AWsVSeWJjTXE6bsG4e1J3V

…check)

The `read_url` live-agent tool, `speak --url`, and the podcast-feed probe all fetched URLs through a guard that only string-matched the literal host (`risk._LOCAL_HOST`). That missed DNS-based bypasses (a public hostname that resolves to 127.0.0.1/169.254.169.254), alternate IP spellings (decimal/hex IPv4, IPv4-mapped IPv6), and, critically, redirects: the fetch followed 30x hops with no re-check, so a public URL could redirect to the cloud-metadata endpoint and the body came back to the model.

Add `core/ssrf.py`: resolve the host via getaddrinfo and refuse any private/loopback/link-local/reserved/multicast IP via `ipaddress`, enforced on the initial URL and on every redirect hop. `core/webpage._fetch` and `app/transcribe/feed._fetch` now follow redirects manually and call the guard each hop.

Also cap `webpage` response bodies at 10 MB so a hostile URL can't exhaust memory (the feed fetch already capped).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AWsVSeWJjTXE6bsG4e1J3V

alexkroman enabled auto-merge June 23, 2026 13:51

alexkroman added this pull request to the merge queue Jun 23, 2026

Merged via the queue into main with commit c16318e Jun 23, 2026
20 checks passed

alexkroman deleted the claude/security-review-e2d359 branch June 23, 2026 13:59

alexkroman mentioned this pull request Jun 23, 2026

Client-orchestrated voice agent: streaming TTS, --files sandbox, spoken approval #268

Open


alexkroman merged 1 commit into main from claude/security-review-e2d359

2 participants
