How Fable 5 found the SSRF in my phishing scanner

Sigmund

05 Jul 2026 • 7 min read

FalconEye v3.4.0 shipped in June with a phishing scanner tab. You paste a suspicious URL, the backend fetches the page, runs it through an indicator library, and returns whether the target is live and what phishing kit it matches. Simple, useful, and behind a validate_url function that I thought was safe.

Claude Code using Sonnet 4.6 wrote a good chunk of it. The validate_url helper resolved the hostname, checked the IP against a blocklist of private ranges, and rejected anything internal. The fetch itself used httpx.AsyncClient(follow_redirects=True, verify=False). Tests passed. It ran in production for a few days.

Then I thought, hell, let's give Fable 5 a try and see what it can actually do, or whether it had been fully neutered by Anthropic to avoid the government order to remove it. I opened a fresh Claude Code session, launched Fable 5 with /effort xhigh, pointed it at the repo, and asked for an adversarial review. Twenty minutes later I had a security report calling out two independent bypasses of my SSRF guard, plus a rate-limiting bug that had been silently draining my LLM budget for weeks.

The report came in a nice Markdown format, with a summary and per-finding details, good enough to fix the identified vulnerabilities directly.

PS: All of these were of course fixed before this article was published 😄

What Sonnet and I built

The scanner code was clean and the intent was right. Fable's own summary of the codebase called out real strengths first:

The codebase shows real security awareness: all SQL is parameterized, the LLM model is hardcoded with kill-switches, LLM output is HTML-escaped in the three LLM tabs, the whois subprocess uses list-form args over a strictly-validated domain, and inputs are normalized through tight allowlists. There is no SQL injection, no command injection, no eval/exec/deserialization, and no API-key leakage into logs or responses.

That is a defensible starting point. The problems were in the two places you would only catch by thinking adversarially: SSRF and the reverse-proxy header chain.

The SSRF

Fable's write-up of H-1:

validate_url resolves the hostname once and rejects private/loopback/link-local ranges. Two independent bypasses defeat it:

Redirect following. follow_redirects=True means an attacker-controlled public host can return HTTP 302 Location: http://169.254.169.254/… or http://127.0.0.1:6379/…. httpx follows the redirect without re-running validate_url, so the guard only ever inspects the first hop.

DNS rebinding (TOCTOU). validate_url calls socket.getaddrinfo() at check time; httpx performs its own DNS resolution at fetch time. An attacker who controls a domain's DNS can return a public IP during validation and an internal IP (short TTL) during the fetch.

That is textbook stuff, and Sonnet's implementation had both holes. My tests all passed because I had tested for 127.0.0.1 and 169.254.169.254 in their normal IPv4 form. The tests were consistent with my mental model, not with reality.

Fable also verified that the IP blocklist itself was incomplete, and pasted the actual bypass results:

::ffff:127.0.0.1          blocked=False
::ffff:169.254.169.254    blocked=False   # cloud metadata
::ffff:a9fe:a9fe          blocked=False   # same, hex form
::                        blocked=False   # unspecified → localhost on many stacks
100.64.1.1                blocked=False   # CGNAT / shared address space

::ffff:169.254.169.254 is the AWS metadata service IP wrapped in IPv6 syntax. To a human it reads as private. To a naive membership check against IPv4 networks, it parses as an IPv6Address and slips right through. That was the single most important finding in the report.
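To make the mechanism concrete, here is a small sketch of why a membership check against IPv4 networks misses the mapped form, and how unwrapping ipv4_mapped first changes the answer. The blocklist contents and both function names are my own illustration, not the original validate_url.

```python
import ipaddress

# Hypothetical IPv4-only blocklist, similar in spirit to the one described above.
NAIVE_BLOCKLIST = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
              "172.16.0.0/12", "192.168.0.0/16")
]

# Shared address space (CGNAT); is_private does not cover it, so check it explicitly.
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def naive_is_blocked(host: str) -> bool:
    """IPv4 network membership only. An IPv6Address is never 'in' an IPv4Network."""
    addr = ipaddress.ip_address(host)
    return any(addr in net for net in NAIVE_BLOCKLIST)


def is_blocked(host: str) -> bool:
    """Unwrap IPv4-mapped IPv6 first, then classify with the stdlib properties."""
    addr = ipaddress.ip_address(host)
    mapped = getattr(addr, "ipv4_mapped", None)  # None for IPv4 and plain IPv6
    if mapped is not None:
        addr = mapped
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
        or (addr.version == 4 and addr in CGNAT)
    )


if __name__ == "__main__":
    for host in ("::ffff:127.0.0.1", "::ffff:169.254.169.254",
                 "::ffff:a9fe:a9fe", "::", "100.64.1.1", "8.8.8.8"):
        print(f"{host:26} naive={naive_is_blocked(host)} hardened={is_blocked(host)}")
```

The naive check returns False for every mapped address because the version mismatch short-circuits the membership test. The hardened check blocks all five addresses from Fable's list and still lets an ordinary public address such as 8.8.8.8 through.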

The fix went to Sonnet. Fable's write-up had the shape of the correct pattern, and Sonnet ported it into a new app/utils/safe_fetch.py module. Per-hop revalidation on redirects. ipaddress stdlib classification (is_private, is_loopback, is_link_local, is_reserved, is_multicast, is_unspecified) with ipv4_mapped unwrapping before the check. TLS verification enforced. 18 unit tests including the mapped-IPv6 metadata case.
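Per-hop revalidation is the part that is easiest to get wrong, so here is a rough sketch of the idea, under my own assumptions rather than a copy of safe_fetch.py. follow_validated, validate, and fetch_once are hypothetical names: validate raises for a disallowed URL, and fetch_once performs exactly one request with automatic redirects turned off (in httpx that means follow_redirects=False) and returns the status, the Location header if any, and the body.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}

# (status code, Location header or None, response body)
FetchResult = Tuple[int, Optional[str], bytes]


def follow_validated(
    url: str,
    validate: Callable[[str], None],
    fetch_once: Callable[[str], FetchResult],
    max_hops: int = 5,
) -> Tuple[str, int, bytes]:
    """Follow redirects by hand, validating every hop before it is fetched."""
    for _ in range(max_hops + 1):
        validate(url)  # runs for the first URL AND for every redirect target
        status, location, body = fetch_once(url)
        if status in REDIRECT_STATUSES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        return url, status, body
    raise RuntimeError(f"more than {max_hops} redirects")
```

Because the HTTP call is passed in, the loop can be tested with a fake fetcher that redirects a public host to http://169.254.169.254/ and asserting that validate rejects the second hop. This loop alone does not address DNS rebinding; that still needs the validated IP to be the one actually connected to.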

Deployed to production, then smoke-tested with the actual bypass URL:

curl -X POST https://falconeye.osintph.info/api/scanner/scan \
  -d '{"url":"http://[::ffff:169.254.169.254]/"}'
{"detail":"URL blocked: Hostname '::ffff:169.254.169.254' resolves to a blocked address"}

Closed.

The wallet bomb

The other finding I want to name is the one that had been silently costing me money. Fable's M-1:

gunicorn runs uvicorn.workers.UvicornWorker with no --forwarded-allow-ips, so uvicorn's ProxyHeadersMiddleware trusts only 127.0.0.1. The right-most non-127.0.0.1 entry is therefore the Cloudflare edge IP. request.client.host becomes a Cloudflare edge address for every request.

Every per-IP rate limit in FalconEye had been keying on Cloudflare edge IPs instead of real client IPs. That includes the ten-per-day LLM cost cap. My daily Anthropic API spend cap had never actually been per-user. It was per-Cloudflare-edge, meaning one abuser rotating through Cloudflare PoPs could burn through my budget as many times as there are edge locations.

Fix was a single-header change: key on CF-Connecting-IP, which nginx already restricted to Cloudflare source ranges. Sonnet added app/utils/client_ip.py, migrated every slowapi limiter and every SQLite rate-limit table to use it, and shipped. Verified against production traffic: the rate-limit database now records real client IPs (112.204.173.xx, my PLDT IP) instead of 172.69.68.3 (Cloudflare edge).
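For illustration, a helper along these lines can be very small. This is a sketch, not the contents of app/utils/client_ip.py: the function name and signature are assumptions, and it is only safe when something upstream (here, nginx restricted to Cloudflare source ranges) guarantees that clients cannot set the header themselves.

```python
import ipaddress
from typing import Mapping


def client_ip(headers: Mapping[str, str], fallback: str) -> str:
    """Return the client address to use as the rate-limit key.

    Prefers CF-Connecting-IP when it parses as an IP address, otherwise
    returns `fallback` (for example request.client.host).
    """
    for name, value in headers.items():
        if name.lower() == "cf-connecting-ip":  # header names are case-insensitive
            try:
                # Normalise through ipaddress so junk values never become keys.
                return str(ipaddress.ip_address(value.strip()))
            except ValueError:
                break
    return fallback
```

Every limiter then keys on client_ip(request.headers, request.client.host) instead of request.client.host directly, which keeps the trust decision in one place.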

Calibration matters

Fable also flagged three XSS surfaces in the frontend (Telegram channel metadata, RDAP fields, RSS news content), a prompt injection concern in the LLM tabs, missing security headers at nginx, and a handful of lower-severity items. It also confirmed things that were actually correct in Sonnet's work: the whois subprocess handling was safe because of the strict domain regex, the script decoder path had no local execution, SQL was fully parameterized, and the temp-image URLs were properly signed and expiring.

That last part matters. A review that only names failures reads like a hit piece. A review that also names the things that hold up under scrutiny reads like calibrated engineering feedback.

Two reviewers, different priors

The workflow lesson is simple. Sonnet wrote FalconEye. Sonnet wrote the tests for FalconEye. Sonnet ran the tests and told me they passed. The tests passed because Sonnet's mental model was internally consistent, not because the code held up against adversarial input.

Fable did not catch these bugs because it is smarter than Sonnet at security. It caught them because it had not written the code and had not written the tests, and it approached the diff without any prior commitment to what the code was supposed to do. Independence matters more than raw capability.

The class of bug that gets caught this way is specific. Mapped IPv6. DNS rebinding. Wrong IP layer in a reverse proxy chain. Bugs where the naive implementation looks correct and fails only when something actively tries to break it. Human developers hit this too. It is not a model limitation, it is a general problem with self-tested code, and using a second model with different training is a cheap mitigation.

Sonnet then wrote the fixes for what Fable found. That worked well: Fable had already identified the class of bug and often the specific fix pattern, and Sonnet is fast and disciplined at porting patterns into working code with tests. Different models for different phases of the same job.

The safety filter friction

One complication worth naming honestly. When I first launched the review session, Fable did not run it. The classifier flagged the prompt as cybersecurity or biology content and downshifted me to Opus 4.8 with this message:

Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them.

I pay for Anthropic. The review was legitimate defensive work on my own code. The filter caught me anyway.

It shows that even though Anthropic advertises that "Fable is available again", it is not the same. Questionable, I would think.

The workaround was easy in this case: some additional prompting convinced it to do the security review anyway. That might not work again, or might not work for you.

However, this shows that they made some serious changes before re-releasing it for non-US citizens. Shady.

The users who benefit most from frontier reasoning on security work are also the users most likely to trip the filter: cybersecurity practitioners doing defensive work, incident responders, threat intelligence teams. If you are planning on doing that kind of work with Fable, you may be out of luck, but give it a try with some creative prompts.

Do this in your workflow

If you are using LLMs to write security-critical code, run the fix session with whichever model you prefer. Get the tests green. Deploy it. Then open a fresh session with a different model, xhigh effort, and ask it adversarially to find what was missed. Do not share the fix session's context. Do not tell it what you expect it to find. Let it read the code cold.

You will find bugs. Twenty minutes of cold review is worth more than any additional round of tests written by the same model that wrote the code. If you are working on anything that touches auth, network egress, deserialization, or state that spans workers, treat the second-opinion review as a mandatory step, not an optional one.

This closed out FalconEye v3.5.0. safe_fetch handles mapped IPv6 correctly, rate limits track real users, three XSS surfaces are escaped, LLM output is schema-validated, and nginx serves proper security headers. All of it started with a twenty-minute cold read from a model that had not seen the code before.

Two reviewers, different priors. Cheapest security control I have added all year.

One important thing to add: none of these models can do the work alone. In my experience it takes a lot of back and forth and a lot of revisions, but they do make the process faster for the bulk of the project, maybe 80 to 90%.

The GitHub repo has the release notes, and the portal is online again as well:

https://github.com/osintph/falconeye

https://falconeye.osintph.info

Reach out if you have questions or comments, or want to collaborate.

Session ID: 059db238ab37c3d92615c5cc24b694da29c598cc13e27886053722404118e14271