The Penultimate Wave of AI

I don’t think r1 will get us to artificial superintelligence, but whatever comes next probably will.

We are reaching a familiar bottleneck in AI. Previously, humans had to manually hardcode the patterns that AI could recognize. With deep learning, machines began to learn patterns on their own, without human assistance. With (relatively) expensive humans out of the loop, we threw machines at the world’s data until they began to talk, code, and paint. Many people believed this would be sufficient to reach artificial superintelligence–but it wasn’t.

We ran out of data. Luckily, our newborn bots could talk 24/7–not just to hundreds of millions of people, but also to other programs. These programs would ask the bots questions, verify their answers, and then use the correct answers to further improve the models. Free, infinite data. If the world’s data wasn’t enough, then infinite data must be–but it wasn’t.
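To make that loop concrete, here is a toy sketch. The model, the questions, and the verifier are stand-ins I invented for illustration; no lab’s real pipeline is this simple.

```python
import random

def generate_answer(question: str) -> str:
    """Stand-in for a language model: guesses an answer, sometimes wrong."""
    a, b = map(int, question.removesuffix(" = ?").split(" + "))
    noise = random.choice([0, 0, 0, 1, -1])  # occasionally off by one
    return str(a + b + noise)

def verify(question: str, answer: str) -> bool:
    """Automatic verifier: recompute the ground truth and compare."""
    a, b = map(int, question.removesuffix(" = ?").split(" + "))
    return answer == str(a + b)

# Ask questions, keep only the verified answers as new training data.
new_training_data = []
for _ in range(1000):
    a, b = random.randint(0, 99), random.randint(0, 99)
    question = f"{a} + {b} = ?"
    answer = generate_answer(question)
    if verify(question, answer):
        new_training_data.append((question, answer))

print(f"kept {len(new_training_data)} verified pairs for the next round of training")
```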

We are now watching the rise of reasoners–models that “think” before giving an answer, generating a series of words to raise the probability of eventually producing something we want. Surely, surely, this is it. Once we train a model to reason, it will be able to reason about its own answers, and somehow, magically, self-improve.

This infinite self-improvement probably won’t happen. In the same way that a fixed amount of mass can’t produce infinite energy, I suspect a fixed amount of information can’t produce infinite intelligence, no matter how much we feed it back into itself. Fundamentally, a model needs information from the outside, whether that information is a response from an external system or a human filtering its prior output for quality.

We provide some external information using formal verification systems, which is why math and programming performance is the prominent flex from the latest models. But most domains, like biology, business, and rocket science, are not so lucky; the correct answer is rarely obvious, even for programming. How does one automatically verify that an interface is easy to use?
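Roughly, the lucky domains are the ones where a check like the one below exists. This is a toy sketch, with a made-up candidate solution and made-up tests: run the model’s code against known tests, and only keep it if everything passes.

```python
# Hypothetical model output: a function the model claims solves the task.
candidate_solution = """
def sort_unique(xs):
    return sorted(set(xs))
"""

# Known input/output pairs the verifier checks against.
test_cases = [
    (([3, 1, 2, 3],), [1, 2, 3]),
    (([],), []),
    ((["b", "a", "a"],), ["a", "b"]),
]

def passes_all_tests(source: str) -> bool:
    namespace: dict = {}
    try:
        exec(source, namespace)  # load the generated function
        fn = namespace["sort_unique"]
        return all(fn(*args) == expected for args, expected in test_cases)
    except Exception:
        return False             # crashes count as failures

print(passes_all_tests(candidate_solution))  # True -> usable as a training signal
# There is no equivalent function for "is this interface easy to use?"
```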

The answer still comes down to humans. Much to the dismay of a few arrogant men in San Francisco, people are still a key ingredient. This time, we don’t merely tell machines what patterns to look for; rather, we filter synthetic data to “steer” the model towards answers we prefer, i.e., we tell them what patterns to look for, but cheaper.
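In sketch form, the only thing that changes from the earlier loop is who does the verifying. Again, every name here is illustrative, not anything a real lab ships:

```python
# Same loop as before, but the verifier is a person clicking yes or no.

def sample_completions(prompt: str, n: int = 4) -> list[str]:
    """Stand-in for a model producing several candidate answers."""
    return [f"draft {i} for: {prompt}" for i in range(n)]

def human_prefers(completion: str) -> bool:
    """Stand-in for a human rater marking which answers are acceptable."""
    return input(f"Keep this one? [y/N]\n{completion}\n> ").strip().lower() == "y"

prompt = "Explain why the sky is blue to a ten-year-old."
kept = [c for c in sample_completions(prompt) if human_prefers(c)]
# `kept` becomes fine-tuning data: the human never writes answers,
# only tells the model which of its own patterns to keep.
```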

At this point, if the right mixture of curated data could yield superintelligence, it would have–models are now crushing benchmarks that PhDs struggle with. This is ridiculous. We’re going to wring every last drop of performance out of these models, and even then they won’t be superintelligent. Aside from extending formal verification, the way forward still seems to be curating the reasoning data these models generate and feeding it back in. As chains of reasoning become longer and more complex, human-curated data will likely yield diminishing returns.

We’ll be approaching the limit of what these systems can do. Don’t get me wrong, they can do a lot–they’re basically magic–but they can’t generally self-improve without humans or verifiers. And honestly, I can’t imagine what else comes next besides self-improving AI. We don’t know what that looks like–maybe some multiagent game-playing system–but whatever it is, it will be the last. For real, this time. Probably.