Mythos is Mostly Hype... (also the bugs it found were mostly unexploitable and exaggerated...)

Industry 1.2K points 222 comments 1 month ago

Source: https://www.tomshardware.com/tech-industry/artificial-intelligence/anthropics-claude-mythos-isnt-a-sentient-super-hacker-its-a-sales-pitch-claims-of-thousands-of-severe-zero-days-rely-on-just-198-manual-reviews

Free access: https://clearthis.page/?u=https%3A%2F%2Fwww.tomshardware.com%2Ftech-industry%2Fartificial-intelligence%2Fanthropics-claude-mythos-isnt-a-sentient-super-hacker-its-a-sales-pitch-claims-of-thousands-of-severe-zero-days-rely-on-just-198-manual-reviews

Source 2: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier

Key quotes:

- Anthropic's blog and verbose 250-page report on the model... includes over **20 pages** of Anthropic staff waxing lyrically about their novel impressions of the new model and its **"fondness for particular philosophers."**
- Alongside the repeated suggestions from Anthropic and its staff that we should be concerned, nay, terrified, of what AI like Claude Mythos can do, they repeatedly suggest they're **unsure if this new AI is conscious.**
- In the case of the FFMPeg vulnerability that has existed for 16 years, **Anthropic's own analysis** of the release suggested **"This bug ultimately is not a critical severity vulnerability," and "would be challenging to turn this vulnerability into a functioning exploit."**
- Mythos reportedly found several potential exploits in the Linux kernel, but was **unable to exploit any of them** because of Linux's defense-in-depth security systems. A number of the exploits had also been recently patched, too, making it rather confusing why they were included in the total.
- We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis. **Eight out of eight models detected Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens.** A 5.1B-active open model recovered the core chain of the 27-year-old OpenBSD bug.

TL;DR: The "thousands of zero-days" claim doesn't hold up. Most of the bugs were unexploitable or low-severity, and Anthropic manually verified fewer than 200 of them and extrapolated from there. Their research paper is mostly marketing hype. Eight cheap open-weights models were able to detect the flagship FreeBSD exploit once the relevant code was isolated.

There is one impressive thing here: an AI model can parse through a complex open-source project. However, with a month and endless compute, there's no doubt Opus could do the same. Unfortunately, **Anthropic never compared models directly (hmm, why would they not compare models directly? That's kind of the whole point...)** so we'll never know.
