So what the heck happened in the field of AI in the last decade? It's like a strange new type of intelligence appeared on our planet. But it's not like human intelligence. It has remarkable capabilities, but it also makes egregious errors that we never make, and it doesn't yet do the deep logical reasoning that we can do. It has a very mysterious surface of both capabilities and fragilities, and we understand almost nothing about how it works. I would like a deeper scientific understanding of intelligence. But to understand AI, it's useful to place it in the historical context of biological intelligence.

The story of human intelligence might as well have started with this little critter. It's the last common ancestor of all vertebrates; we are all descended from it. It lived about 500 million years ago. Then evolution went on to build the brain, which in turn, in the space of 500 years from Newton to Einstein, developed the deep math and physics required to understand the universe, from quarks to cosmology. And it did all this without consulting ChatGPT. And then, of course, there are the advances of the last decade.

To really understand what just happened in AI, we need to combine physics, math, neuroscience, psychology, computer science and more to develop a new science of intelligence. This science of intelligence can simultaneously help us understand biological intelligence and create better artificial intelligence. And we need the science now, because the engineering of intelligence has vastly outstripped our ability to understand it.

I want to take you on a tour of our work in the science of intelligence, addressing five critical areas in which AI can improve: data efficiency, energy efficiency, going beyond evolution, explainability, and melding minds and machines. Let's address these critical gaps one by one.
First, data efficiency. AI is vastly more data hungry than humans. For example, we train our language models on the order of 1 trillion words. How many words do we get? Just 100 million. It's that tiny little red dot at the center; you might not be able to see it. It would take us 24,000 years to read the rest of the 1 trillion words. Now, you might say that's unfair. Sure, AI read for 24,000 human-equivalent years, but humans got 500 million years of vertebrate brain evolution. But there's a catch: your entire legacy of evolution is given to you through your DNA, and your DNA is only about 700 megabytes, or equivalently about 600 million words. So the combined information we get from learning and evolution is minuscule compared to what AI gets. You are all incredibly efficient learning machines. So how do we bridge the gap between AI and humans?
We started to tackle this problem by revisiting the famous scaling laws. Here's an example of a scaling law, where error falls off as a power law with the amount of training data. These scaling laws have captured the imagination of industry and motivated significant societal investments in energy, compute and data collection. But there's a problem: the exponents of these scaling laws are small. So to reduce the error by a little bit, you might need to 10x your amount of training data. This is unsustainable in the long run, and even if it leads to improvements in the short run, there must be a better way.
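To make that concrete, here is a back-of-the-envelope version of the argument (the exponent value is illustrative, chosen in the range typically reported for language models). If error falls off as a power law in dataset size $D$,

$$\varepsilon(D) = a\,D^{-\alpha},$$

then halving the error requires growing the data by a factor of $2^{1/\alpha}$. With a small exponent like $\alpha = 0.1$, that factor is $2^{10} \approx 1000$: a thousand times more data for a single halving of the error.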
We developed a theory that explains why these scaling laws are so bad. The basic idea is that large random data sets are incredibly redundant: if you already have billions of data points, the next data point doesn't tell you much that's new. But what if you could create a non-redundant data set, where each data point is chosen carefully to tell you something new compared to all the other data points? We developed theory and algorithms to do just this. We theoretically predicted and experimentally verified that we could bend these bad power laws down to much better exponentials, where adding a few more data points could reduce your error, rather than 10x-ing the amount of data.
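As a flavor of what "choosing each data point to tell you something new" can mean in practice, here is a minimal sketch of one classic selection heuristic, greedy farthest-point sampling. It is only an illustration of the idea, not the specific pruning algorithm behind the results above.

```python
import numpy as np

def farthest_point_selection(X, k, seed=0):
    """Greedily pick k points, each as far as possible from everything
    picked so far, so each new point adds information the chosen set
    does not already contain."""
    rng = np.random.default_rng(seed)
    n = len(X)
    chosen = [int(rng.integers(n))]                    # arbitrary first point
    dist = np.linalg.norm(X - X[chosen[0]], axis=1)    # distance to chosen set
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))                     # most novel point left
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)

# Toy usage: from 10,000 redundant 2-D points, keep the 100 most spread out.
X = np.random.default_rng(1).normal(size=(10_000, 2))
subset_idx = farthest_point_selection(X, k=100)
```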
04:10 - 04:16
so what theory did we use to get this
04:13 - 04:18
result we used ideas from cical physics
04:16 - 04:19
and these are the equations now for the
04:18 - 04:22
rest of this entire talk I'm going to go
04:19 - 04:25
through these equations one by
04:22 - 04:28
one you think I'm joking and explain
04:25 - 04:29
them to you okay you're right I'm joking
04:28 - 04:32
I'm not that mean but you should have
04:29 - 04:34
seen the the faces of the T organizers
04:32 - 04:36
when I said I was going to do that all
04:34 - 04:37
right let's move on let let's zoom out a
04:36 - 04:39
little bit right and think more
04:37 - 04:42
generally about what it takes to make AI
04:39 - 04:44
less data hungry imagine if we trained
04:42 - 04:46
our kids the same way we pre-train our
04:44 - 04:49
large language bottles by next word
04:46 - 04:51
prediction so I'd give my kid a random
04:49 - 04:53
chunk of the internet and say by the way
04:51 - 04:54
this is the next word I'd give them
04:53 - 04:56
another random chunk of the internet and
04:54 - 04:59
say yeah this is the next word if that's
04:56 - 05:01
all we did it would take our kids 24,000
04:59 - 05:04
years to learn anything useful but we do
05:01 - 05:06
so much more than that for example when
05:04 - 05:09
I teach my son math I teach him the
05:06 - 05:10
algorithm required to solve the problem
05:09 - 05:12
then he can immediately solve new
05:10 - 05:15
problems and generalize using far less
05:12 - 05:16
training data than any AI system would
05:15 - 05:19
do I don't just throw millions of math
05:16 - 05:23
problems at him all right so to really
05:19 - 05:25
make M AI more uh data efficient we have
05:23 - 05:27
to go far beyond our current training
05:25 - 05:30
algorithms and turn machine learning
05:27 - 05:32
into a new science of machine
05:30 - 05:36
teaching and Neuroscience psychology and
05:32 - 05:39
math can really help here let's go on to
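The gap between showing random examples and teaching the underlying rule can be made vivid in a toy setting. Below is a sketch of my own (an illustration, not from the talk) using a one-dimensional threshold concept: a learner fed random examples closes in on the boundary only at a rate of about 1/n, while a teacher who already knows the concept pins it down with two well-chosen examples.

```python
import numpy as np

rng = np.random.default_rng(0)
true_threshold = 0.62           # the concept: label = 1 iff x >= 0.62

def fit_threshold(xs, ys):
    """Learner: put the boundary midway between the largest negative
    example and the smallest positive example seen so far."""
    lo = max(x for x, y in zip(xs, ys) if y == 0)
    hi = min(x for x, y in zip(xs, ys) if y == 1)
    return (lo + hi) / 2

# Learning from random examples: the gap around the boundary
# shrinks only like ~1/n, so error falls slowly with data.
xs = rng.random(1000)
ys = (xs >= true_threshold).astype(int)
print(abs(fit_threshold(xs, ys) - true_threshold))   # ~1e-3 after 1000 examples

# Machine teaching: a teacher who knows the concept needs
# just two examples, placed tightly around the boundary.
print(abs(fit_threshold([0.6199, 0.6201], [0, 1]) - true_threshold))  # ~0
```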
Let's go on to the next big gap: energy efficiency. Our brains are incredibly efficient; we consume only 20 watts of power. For reference, our old light bulbs were 100 watts, so we are all literally dimmer than light bulbs. But what about AI? Training a large model can consume as much as 10 million watts, and there's talk of going nuclear to power 1-billion-watt data centers. So why is AI so much more energy hungry than brains? Well, the fault lies in the choice of digital computation itself, where we rely on fast and reliable bit flips at every intermediate step of the computation. And the laws of thermodynamics demand that every fast and reliable bit flip must consume a lot of energy.
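The textbook anchor for this claim is Landauer's principle: even a perfectly engineered, logically irreversible bit operation at temperature $T$ must dissipate at least

$$E_{\min} = k_B T \ln 2 \approx 3 \times 10^{-21}\ \text{J} \quad \text{at room temperature},$$

and real digital hardware, which flips bits quickly and with near-perfect reliability, pays many orders of magnitude more than this floor per operation.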
Biology took a very different route. Biology computes the right answer just in time, using intermediate steps that are as slow and as unreliable as possible. In essence, biology does not rev its engine any more than it needs to.

In addition, biology matches computation to physics much better. Consider, for example, addition. Our computers add numbers using really complex, energy-consuming transistor circuits, but neurons just directly add their voltage inputs, because Maxwell's laws of electromagnetism already know how to add voltages. In essence, biology matches its computation to the native physics of the universe.

So to really build more energy-efficient AI, we need to rethink our entire technology stack, from electrons to algorithms, and better match computational dynamics to physical dynamics. For example: what are the fundamental limits on the speed and accuracy of any given computation, given an energy budget? And what kinds of electrochemical computers can achieve those fundamental limits? We recently solved this problem for the computation of sensing, which is something that every neuron has to do. We were able to find fundamental lower bounds, or lower limits, on the error as a function of the energy budget (that's that red curve), and we were able to find the chemical computers that achieve these limits. Remarkably, they looked a lot like G-protein-coupled receptors, which every neuron uses to sense external signals. So this suggests that biology can achieve levels of efficiency that are close to fundamental limits set by the laws of physics itself.
Popping up a level: neuroscience now gives us the ability to measure not only neural activity but also energy consumption across, for example, the entire brain of the fly. The energy consumption is measured through ATP usage, the chemical fuel that powers all neurons. So now let me ask you a question. Let's say that in a certain brain region, neural activity goes up. Does the ATP go up or down? A natural guess would be that the ATP goes down, because neural activity costs energy, so it's got to consume the fuel. We found the exact opposite: when neural activity goes up, ATP goes up, and it stays elevated just long enough to power expected future neural activity. This suggests that the brain follows a predictive energy allocation principle, where it can predict how much energy is needed, where and when, and it delivers just the right amount of energy, at just the right location, for just the right amount of time.
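As a cartoon of what predictive energy allocation could look like, here is a toy model I am adding purely for intuition (it is not the actual model of the fly data): a region tops up its ATP pool in proportion to a running prediction of upcoming demand.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
activity = np.zeros(T)
activity[50:120] = 1.0 + 0.1 * rng.standard_normal(70)   # a burst of firing

atp = np.zeros(T)
atp[0] = 1.0          # baseline fuel level
prediction = 0.0      # running estimate of upcoming demand
for t in range(1, T):
    prediction = 0.5 * prediction + 0.5 * activity[t]  # predicted demand
    supply = 2.0 * prediction            # deliver fuel ahead of expected use
    cost = activity[t]                   # firing burns ATP
    leak = 0.1 * (atp[t - 1] - 1.0)      # slow relaxation back to baseline
    atp[t] = atp[t - 1] + supply - cost - leak

# In this toy, ATP climbs during the burst, stays elevated briefly after
# the firing stops, then relaxes back: a "just enough, just in time"
# allocation, qualitatively like the fly measurements described above.
```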
OK, so clearly we have a lot to learn from physics, neuroscience and evolution about building more energy-efficient AI. But we don't need to be limited by evolution. We can go beyond evolution, co-opting the neural algorithms discovered by evolution but implementing them in quantum hardware that evolution could never figure out. For example, we can replace neurons with atoms: the different firing states of neurons correspond to the different electronic states of atoms. And we can replace synapses with photons: just as synapses allow two neurons to communicate, photons allow two atoms to communicate, through photon emission and absorption.

So what can we build with this? We can build a quantum associative memory out of atoms and photons. This is the same memory system that won John Hopfield his recent Nobel Prize in physics, but this time it's a quantum mechanical system built of atoms and photons, and we can analyze its performance and show that the quantum dynamics yields enhanced memory capacity, robustness and recall. We can also build new types of quantum optimizers, built directly out of photons, and we can analyze their energy landscape and explain how they solve optimization problems in fundamentally new ways. This marriage between neural algorithms and quantum hardware opens up an entirely new field, which I like to call quantum neuromorphic computing.
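For readers who have not met Hopfield's associative memory, here is a minimal classical version in code (the quantum, atoms-and-photons system in the talk generalizes this idea; the sketch below is the standard textbook construction, with sizes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_patterns = 100, 5
patterns = rng.choice([-1, 1], size=(n_patterns, n))   # memories to store

# Hebbian weights: each stored pattern digs a valley in the energy landscape.
W = (patterns.T @ patterns) / n
np.fill_diagonal(W, 0)

def recall(x, sweeps=20):
    """Asynchronous updates descend the energy until the state settles
    into the nearest stored pattern (an attractor)."""
    x = x.copy()
    for _ in range(sweeps):
        for i in rng.permutation(n):
            x[i] = 1 if W[i] @ x >= 0 else -1
    return x

# Corrupt 10% of a stored memory, then let the network clean it up.
noisy = patterns[0].copy()
noisy[rng.choice(n, size=10, replace=False)] *= -1
print(np.array_equal(recall(noisy), patterns[0]))   # should print True
```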
OK, but let's return to the brain, where explainable AI can help us understand how it works. AI now allows us to build incredibly accurate but complicated models of the brain. So where is this all going? Are we simply replacing something we don't understand, the brain, with something else we don't understand, our complex model of it? As scientists, we'd like to have a conceptual understanding of how the brain works, not just have a model handed to us.

So I'd like to give you an example of our work on explainable AI applied to the retina. The retina is a multi-layer circuit of photoreceptors going to hidden neurons going to output neurons. So how does it work? Well, we recently built the world's most accurate model of the retina. It could reproduce two decades of experiments on the retina. So this is fantastic: we have a digital twin of the retina. But how does the twin work? Why is it designed the way it is?

To make these questions concrete, I'd like to discuss just one of those two decades of experiments, and we're going to do this experiment on you right now. I'd like you to focus on my hand, and I'd like you to track it. OK, great. Let's do that just one more time. OK. You might have been slightly surprised when my hand reversed direction, and you should be surprised, because my hand just violated Newton's first law of motion, which states that objects in motion tend to remain in motion. So where in your brain is a violation of Newton's first law first detected? The answer is remarkable: it's in your retina. There are neurons in your retina that will fire if and only if Newton's first law is violated.

So does our model do that? Yes, it does; it reproduces it. But now there's a puzzle: how does the model do it? Well, we developed explainable-AI methods that take any given stimulus that causes a neuron to fire, carve out the essential subcircuit responsible for that firing, and explain how it works.
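As a hedged illustration of what "carving out a subcircuit" can mean, here is a toy ablation procedure on a small random two-layer model (my own sketch; the methods in the talk are considerably more sophisticated): silence hidden units one at a time and keep those whose removal meaningfully changes the response to the stimulus.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 20, 50
W1 = 0.3 * rng.standard_normal((n_hid, n_in))   # input -> hidden weights
w2 = 0.3 * rng.standard_normal(n_hid)           # hidden -> output weights

def response(stimulus, keep=None):
    h = np.maximum(0.0, W1 @ stimulus)   # hidden layer (ReLU "interneurons")
    if keep is not None:
        h = h * keep                     # silence units outside the mask
    return w2 @ h                        # the output neuron's firing

stimulus = rng.standard_normal(n_in)     # a stimulus that drives the output
full = response(stimulus)

essential = []
for i in range(n_hid):
    keep = np.ones(n_hid)
    keep[i] = 0.0                        # ablate hidden unit i
    if abs(response(stimulus, keep) - full) > 0.05 * abs(full):
        essential.append(i)              # unit i matters for this firing

print(f"{len(essential)}/{n_hid} hidden units form the essential subcircuit")
```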
We were able to do this not only for Newton's-first-law violations, but for the two decades of experiments that our model reproduced. And so this one model reproduces two decades' worth of neuroscience and also makes some new predictions. This opens up a new pathway to accelerating neuroscience discovery using AI: build digital twins of the brain, and then use explainable AI to understand how they work. We're actually engaged in a big effort at Stanford to build a digital twin of the entire primate visual system and explain how it works.

But we can go beyond that, and use our digital twins to meld minds and machines by allowing bidirectional communication between them. Imagine a scenario where you have a brain and you record from it. You build a digital twin. Then you use control theory to learn neural activity patterns that you can write directly into the digital twin, to control it. Then you take those same neural activity patterns and write them into the brain, to control the brain. In essence, we can learn the language of the brain and then speak directly back to it.
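To ground the control-theory step, here is a deliberately simplified sketch assuming a linear digital twin. Real twins are nonlinear networks and the actual control methods are richer, but the logic is the same: invert the twin to find an input pattern that evokes a desired response.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_channels = 200, 20
A = rng.standard_normal((n_neurons, n_channels))   # twin fitted to recordings

target = rng.standard_normal(n_neurons)            # activity pattern we want
s, *_ = np.linalg.lstsq(A, target, rcond=None)     # control input for the twin

# The same pattern s would then be written into the real brain (for
# example via targeted stimulation) to try to evoke the target activity.
achieved = A @ s
print(np.corrcoef(achieved, target)[0, 1])   # how closely the twin is driven
```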
We recently carried out this program in mice, where we could use AI to read the mind of a mouse. On the top row, you're seeing images that we actually showed to the mouse, and on the bottom row, you're seeing images that we decoded from the brain of the mouse. Our decoded images are lower resolution than the actual images, but not because our decoders are bad; it's because mouse visual resolution is bad. So the decoded images actually show you what the world would look like if you were a mouse.
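In its simplest form, this kind of decoding can be posed as a regression from recorded activity to image pixels. The sketch below uses ridge regression on stand-in random arrays, purely to show the shape of the computation; the decoders in the actual work are far more powerful.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_neurons, n_pixels = 500, 300, 32 * 32
R = rng.standard_normal((n_trials, n_neurons))       # stand-in recordings
images = rng.standard_normal((n_trials, n_pixels))   # stand-in shown images

lam = 10.0                                           # ridge penalty
W = np.linalg.solve(R.T @ R + lam * np.eye(n_neurons), R.T @ images)

decoded = R @ W    # each row: a reconstructed (inevitably blurry) image
```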
Now, we can go beyond that. We can write neural activity patterns into the mouse's brain, so we can make it hallucinate any particular percept we would like it to hallucinate. And we got so good at this that we could make it reliably hallucinate a percept by controlling only 20 neurons in the mouse's brain, by figuring out the right 20 neurons to control. So essentially, we can control what the mouse sees, directly, by writing to its brain. The possibilities of bidirectional communication between brains and machines are limitless: to understand, to cure, and to augment the brain.

So I hope you'll see that the pursuit of a unified science of intelligence, one that spans brains and machines, can both help us better understand biological intelligence and help us create more efficient, explainable and powerful artificial intelligence. But it's important that this pursuit be done out in the open, so the science can be shared with the world, and it must be done with a very long time horizon. This makes academia the perfect place to pursue a science of intelligence. In academia, we're free from the tyranny of quarterly earnings reports, we're free from the censorship of corporate legal departments, we can be far more interdisciplinary than any one company, and our very mission is to share what we learn with the world. For all these reasons, we're actually building a new center for the science of intelligence at Stanford.

While there have been incredible advances in industry on the engineering of intelligence, now increasingly happening behind closed doors, I'm very excited about what the science of intelligence can achieve out in the open. You know, in the last century, one of the greatest intellectual adventures lay in humanity peering outwards into the universe to understand it, from quarks to cosmology. I think one of the greatest intellectual adventures of this century will lie in humanity peering inwards, both into ourselves and into the AIs that we create, in order to develop a deeper, new scientific understanding of intelligence. Thank you.