Richard Sutton


DeepMind's Richard Sutton - The Long-term of AI & Temporal-Difference Learning

Published on Sep 6, 2017

DeepMind announced in July 2017 that Prof. Richard Sutton would be leading DeepMind Alberta.

Richard S. Sutton is a Canadian computer scientist. He is currently a professor of Computer Science and iCORE chair at the University of Alberta. Dr. Sutton is considered one of the founding fathers of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning, policy gradient methods, and the Dyna architecture.

Recorded: July 2017
 

Richard Sutton – Father of RL thinks LLMs are a dead end

Sep 26, 2025

Richard Sutton is the father of reinforcement learning, winner of the 2024 Turing Award, and author of The Bitter Lesson. And he thinks LLMs are a dead end. After interviewing him, my steel man of Richard’s position is this: LLMs aren’t capable of learning on-the-job, so no matter how much we scale, we’ll need some new architecture to enable continual learning. And once we have it, we won’t need a special training phase — the agent will just learn on-the-fly, like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete.

In our interview, I did my best to represent the view that LLMs might function as the foundation on which experiential learning can happen… Some sparks flew. A big thanks to the Alberta Machine Intelligence Institute for inviting me up to Edmonton and for letting me use their studio and equipment. Enjoy!

Transcript: dwarkesh.com/p/richard-sutton

TIMESTAMPS
00:00:00 – Are LLMs a dead end?
00:13:51 – Do humans do imitation learning?
00:23:57 – The Era of Experience
00:34:25 – Current architectures generalize poorly out of distribution
00:42:17 – Surprises in the AI field
00:47:28 – Will The Bitter Lesson still apply after AGI?
00:54:35 – Succession to AI
 