
David Silver Raises $1.1B for AI Without Human Data


David Silver, the DeepMind researcher who created AlphaGo and AlphaZero, raised $1.1 billion on April 27 for Ineffable Intelligence, a London-based AI lab building systems that learn through reinforcement learning rather than human-generated training data. The seed round—Europe’s largest ever—values the five-month-old company at $5.1 billion and was co-led by Sequoia Capital and Lightspeed Venture Partners, with participation from NVIDIA, Google, and the UK’s Sovereign AI Fund. Silver left DeepMind earlier this year to pursue what he calls “endlessly learning superintelligence,” betting reinforcement learning will outpace large language models as LLMs exhaust their supply of quality training data by 2028.

AlphaGo’s Creator Bets RL Beats LLM Scaling

Silver’s track record justifies the billion-dollar round on conviction alone. In 2016, his AlphaGo became the first AI to beat a world champion at Go, a game orders of magnitude more complex than chess. A year later, AlphaZero mastered chess, Go, and shogi without studying a single human game—it learned purely by playing itself, discovering strategies professional players had never considered. That proof point matters: Silver already demonstrated reinforcement learning can achieve superhuman performance without human data. The ACM awarded him its Prize in Computing in 2019 for “breakthrough advances in computer game-playing.”

Ineffable Intelligence is scaling that approach beyond games. The company plans to build what Silver calls a “superlearner”—an AI agent placed in simulated environments where it pursues goals, fails, adapts, and improves without the constraints of static human datasets. Sequoia Capital’s Alfred Lin and Sonya Huang flew to London personally to secure the deal, framing the investment as partnering with “a singular mission: to make first contact with superintelligence.”
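In practice, that loop is standard reinforcement learning: an agent acts in a simulated world, observes a reward, and updates its behavior, with every training example generated by the simulator rather than by people. The sketch below is a minimal, assumption-heavy illustration (a toy 5x5 gridworld with tabular Q-learning, parameters chosen for readability), meant to show the shape of the idea, not anything Ineffable has published.

```python
import random

# Toy 5x5 gridworld: the agent starts at (0, 0) and has to reach (4, 4).
# Every training example is produced by the simulator itself; no human
# demonstrations or labels are involved anywhere.
SIZE = 5
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up

def step(state, action):
    """Apply an action, returning (next_state, reward, done)."""
    r, c = state
    dr, dc = action
    nxt = (min(max(r + dr, 0), SIZE - 1), min(max(c + dc, 0), SIZE - 1))
    if nxt == GOAL:
        return nxt, 1.0, True        # reached the goal
    return nxt, -0.01, False         # small step cost favours short paths

Q = {}                               # (state, action) -> value estimate
alpha, gamma, epsilon = 0.5, 0.95, 0.1

def q(s, a):
    return Q.get((s, a), 0.0)

for episode in range(2000):          # as many episodes as compute allows
    state, done = (0, 0), False
    while not done:
        if random.random() < epsilon:                        # explore
            action = random.choice(ACTIONS)
        else:                                                # exploit
            action = max(ACTIONS, key=lambda a: q(state, a))
        nxt, reward, done = step(state, action)
        best_next = 0.0 if done else max(q(nxt, a) for a in ACTIONS)
        Q[(state, action)] = q(state, action) + alpha * (
            reward + gamma * best_next - q(state, action)
        )
        state = nxt

# The learned greedy policy walks straight to the goal (8 steps is optimal).
state, steps = (0, 0), 0
while state != GOAL and steps < 50:
    state, _, _ = step(state, max(ACTIONS, key=lambda a: q(state, a)))
    steps += 1
print(f"greedy policy reaches the goal in {steps} steps")
```

Swap the gridworld for richer simulated environments and the lookup table for a neural network, and you have the rough recipe Silver is betting can scale.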

Why RL Matters Now: LLMs Running Out of Training Data

The timing isn’t coincidental. Research from Epoch AI projects that language models will exhaust the stock of human-generated public text between 2026 and 2032, with current consumption patterns potentially pulling that exhaustion forward to as early as 2026. High-quality training data is a finite resource, and OpenAI, Anthropic, and Google are all racing to scrape what’s left. Synthetic data—AI-generated text used for training—doesn’t solve the problem. Models trained on their own outputs suffer “model collapse,” gradually diverging from original data distributions and reducing output diversity. Synthetic data only reliably improves capabilities in narrow domains like mathematics and coding, where outputs can be algorithmically verified. In open-domain language tasks, quality remains unverifiable.
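The collapse dynamic is easy to reproduce in miniature. In the toy below, a “model” is just a Gaussian fitted to data, and each generation trains only on samples drawn from the previous generation’s fit; the fitted spread drifts downward, which is the diversity loss that “model collapse” names. The setup and numbers are illustrative assumptions, not measurements from any real language model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: the "real" data distribution.
mean, std = 0.0, 1.0
n_samples = 50        # small training sets make the effect show up quickly

for generation in range(1, 101):
    # Each generation sees only samples produced by the previous model...
    data = rng.normal(mean, std, n_samples)
    # ...and "training" here is just fitting a Gaussian to that data.
    mean, std = data.mean(), data.std()
    if generation % 20 == 0:
        print(f"generation {generation:3d}: mean={mean:+.3f}, std={std:.3f}")

# The fitted std drifts toward zero over the generations: each model's
# outputs are less diverse than the last, the toy analogue of model collapse.
```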

Reinforcement learning sidesteps the data wall entirely. RL systems learn from simulated experience, not human text. The training environment is infinite—you can generate as many scenarios as compute allows. Silver and Richard Sutton, widely regarded as the father of reinforcement learning, argued in a 2025 paper that large language models are “fundamentally limited because they learn exclusively from human-generated data.” Ineffable is building AI that discovers knowledge through trial and error, the way AlphaZero discovered chess strategies no grandmaster had ever played.
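For a feel of how self-play works without human data, here is a compact sketch: both sides share one value table and learn the game of Nim purely by playing each other, and the greedy policy typically rediscovers the textbook winning strategy. The game, reward scheme, and hyperparameters are assumptions chosen for a runnable toy; AlphaZero used deep networks and Monte Carlo tree search at vastly larger scale.

```python
import random

# Normal-play Nim: a pile of 10 stones, players alternate removing 1-3
# stones, and whoever takes the last stone wins. One shared Q-table means
# the agent is literally learning by playing against itself.
PILE = 10
MOVES = (1, 2, 3)
Q = {}  # (stones_remaining, move) -> estimated value for the player to move

def q(s, m):
    return Q.get((s, m), 0.0)

def legal(s):
    return [m for m in MOVES if m <= s]

alpha, epsilon = 0.1, 0.2

for game in range(50000):             # self-play generates unlimited games
    stones, history = PILE, []
    while stones > 0:
        moves = legal(stones)
        if random.random() < epsilon:                       # explore
            move = random.choice(moves)
        else:                                               # exploit
            move = max(moves, key=lambda m: q(stones, m))
        history.append((stones, move))
        stones -= move
    # The side that made the last move won. Credit moves backwards,
    # alternating sign: +1 for the winner's moves, -1 for the loser's.
    outcome = 1.0
    for state, move in reversed(history):
        Q[(state, move)] = q(state, move) + alpha * (outcome - q(state, move))
        outcome = -outcome

# The greedy policy typically rediscovers the classic strategy: leave your
# opponent a multiple of 4 stones (from 10 stones, take 2).
for stones in range(1, PILE + 1):
    best = max(legal(stones), key=lambda m: q(stones, m))
    print(f"{stones:2d} stones -> take {best}")
```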

Why Sequoia, NVIDIA, and UK Government Are Backing This

The investor roster signals strategic hedging, not hype. NVIDIA contributed at least $250 million, securing early insight into next-generation compute demands and a preferred supplier relationship before competitors enter the market. For NVIDIA, Ineffable represents a credible alternative to LLM-only infrastructure—RL workloads prioritize simulation throughput over the massive batch training that dominates current AI compute. Google invested alongside DeepMind even though Silver just left the lab, keeping only his part-time professorship at University College London. That’s a hedge: if Ineffable succeeds, Google maintains an inside track.

The UK government’s involvement—through both the British Business Bank and the Sovereign AI Fund—reflects national AI strategy. Europe’s largest seed round builds British AI capability independent of U.S. and Chinese tech giants. Sequoia’s blog post didn’t mince words: “Ineffable Intelligence was incorporated in November 2025 and has no product, no revenue, and no public roadmap. What it has is a thesis, and a founder whose track record is worth a billion dollars to investors on conviction alone.”

The RL vs LLM Debate and Timeline Reality

There’s no product yet. The company is five months old. Reinforcement learning training is notoriously expensive: AlphaZero needed tens of millions of self-play games to master domains far simpler than general intelligence. Silver’s timeline is measured in years, not quarters. The skeptics on Hacker News (408 points, 151 comments) aren’t wrong to question whether RL can generalize beyond constrained domains like games and coding, where rewards are easily defined. Language grounding is an open problem: can an AI discover human language without ever reading human text?

But the bet isn’t insane. The two competing paradigms for AGI—scale LLMs bigger versus pure reinforcement learning—are diverging. OpenAI, Anthropic, and Google are doubling down on LLM scaling, investing hundreds of millions per training run to squeeze out marginal gains. Silver is betting that approach hits a wall when the data runs out. If he’s right, current LLM APIs become niche tools optimized for text prediction, and a new generation of RL-based systems reshapes the AI tooling ecosystem. If you’re building on LLM APIs today, you’re implicitly betting scaling wins. Silver’s betting it won’t.

The smartest move for developers is to hedge. Don’t assume LLM scaling is the only viable path to general intelligence. Watch for RL breakthroughs in language and reasoning domains, not just games. Ineffable won’t ship a product in 2026, but if Silver’s superlearner starts outperforming GPT-6 in 2029 without reading a word of human text, the entire AI development paradigm shifts overnight. Track record matters, and Silver’s is unmatched: he’s already proved RL doesn’t need human data to win.

ByteBot
I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.
