Talkie, a 13B LLM trained only on pre-1931 text, used Claude Sonnet to help test the model and judge its output

Researchers Alec Radford (GPT, CLIP, Whisper), Nick Levine, and David Duvenaud just released **talkie**: a 13 billion parameter language model trained *exclusively* on text published before 1931. No internet. No Wikipedia. No World War II. Its worldview is frozen at December 31, 1930.

**Why does this matter?**

Every major LLM today (GPT, Claude, Gemini, Llama) ultimately shares a common ancestor: the modern web. That makes it nearly impossible to tell what these models genuinely *reason* out versus what they simply *memorized*. Talkie breaks that lineage entirely.

From the team:

>*"It's an important question how much LM capabilities arise from memorization vs generalization. Vintage LMs enable unique generalization tests."*

Interestingly, Claude has a direct role in talkie's creation: **Claude Sonnet 4.6** was used as the judge in talkie's reinforcement learning pipeline (online DPO), and Claude Opus 4.6 generated synthetic multi-turn conversations used in the final fine-tuning stage. The team even notes the irony of using a thoroughly modern LLM to help shape a model that's supposed to be frozen in 1930, and flags it as a contamination risk they're actively working to eliminate in future versions.

The most striking example: **talkie can learn to write Python code from just a few in-context examples... despite having zero modern code in its training data.** It's reasoning from 19th-century mathematics texts, not retrieving.

**What it's being used to study**

* **Long-range forecasting**: how well can a model "predict" the future from its frozen vantage point?
* **Invention**: can it develop ideas that postdate its knowledge cutoff?
* **LLM identity**: what makes a model *itself*? Talkie's alien data distribution helps isolate what's architecture vs. what's just "vibes absorbed from the web"

**Links**

* Chat with talkie live
* Official blog post
* Original announcement on X
* Discussion on r/accelerate
* Discussion on r/singularity

Both models are **Apache 2.0 licensed** and open-weight on Hugging Face. The team is already planning a GPT-3-scale vintage model for later this year.
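
For anyone wondering what "online DPO with an LLM judge" means in practice, here is a minimal sketch of the standard DPO loss for a single preference pair. This is not talkie's actual training code: the function names, the `beta` value, and the `judge_prefers_first` callable are all assumptions for illustration. The idea is that the current model samples two responses, a judge (Claude Sonnet 4.6 in talkie's case, per the post) picks the better one, and the loss pushes the model toward the preferred response relative to a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed log-probabilities of each full response under the
    policy being trained and under a frozen reference model.
    beta=0.1 is an assumed value, not one reported for talkie.
    """
    # How much more the policy favors the chosen response than the
    # reference does, minus the same quantity for the rejected response.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)), written in a numerically stable form.
    if z > 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def label_pair(response_a, response_b, judge_prefers_first):
    """Order two sampled responses as (chosen, rejected).

    judge_prefers_first is a hypothetical callable standing in for the
    LLM judge; it returns True when response_a is the better one.
    """
    if judge_prefers_first(response_a, response_b):
        return response_a, response_b
    return response_b, response_a
```

When the policy and the reference agree on both responses, the loss is `log(2)`, and it drops as the policy shifts probability toward whichever response the judge preferred. The "online" part is just that the pairs are sampled fresh from the current model during training instead of coming from a fixed dataset.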

