00:00 - 00:01
China's latest AI
00:01 - 00:03
breakthrough has leapfrogged...
00:04 - 00:05
I think we should take the
00:05 - 00:06
development out of China
00:06 - 00:07
very, very seriously.
00:08 - 00:09
A game changing move that
00:09 - 00:11
does not come from OpenAI,
00:11 - 00:13
Google or Meta.
00:13 - 00:14
There is a new model that
00:14 - 00:16
has all of the Valley...
00:17 - 00:18
But from a Chinese lab
00:18 - 00:20
called Deepseek.
00:20 - 00:22
It's opened a lot of eyes of
00:22 - 00:23
like what is actually
00:23 - 00:24
happening in AI in China.
00:25 - 00:26
What took Google and OpenAI
00:26 - 00:27
years and hundreds of
00:27 - 00:28
millions of dollars to
00:28 - 00:30
build... Deepseek says took
00:30 - 00:32
it just two months and less
00:32 - 00:34
than $6 million.
00:34 - 00:35
They have the best
00:35 - 00:37
open-source model, and all the
00:37 - 00:38
American developers are
00:38 - 00:39
building on that.
00:39 - 00:40
I'm Deirdre Bosa with the
00:40 - 00:42
TechCheck take... China's
00:42 - 00:43
AI breakthrough.
00:53 - 00:55
It was a technological leap
00:55 - 00:57
that shocked Silicon Valley.
00:57 - 00:59
A newly unveiled free,
00:59 - 01:01
open-source AI model that
01:01 - 01:02
beats some of the most
01:02 - 01:03
powerful ones on the market.
01:03 - 01:04
But it wasn't a new launch
01:04 - 01:06
from OpenAI or model
01:06 - 01:07
announcement from Anthropic.
01:07 - 01:09
This one was built in the
01:09 - 01:11
East by a Chinese research
01:11 - 01:12
lab called Deepseek.
01:13 - 01:14
And the details behind its
01:14 - 01:16
development stunned top AI
01:16 - 01:17
researchers here in the U.S.
01:17 - 01:19
First: the cost.
01:19 - 01:20
The AI lab reportedly spent
01:20 - 01:22
just $5.6 million to
01:22 - 01:24
build Deepseek version 3.
01:24 - 01:26
Compare that to OpenAI,
01:26 - 01:27
which is spending $5 billion
01:27 - 01:28
a year, and Google,
01:28 - 01:30
which expects capital
01:30 - 01:32
expenditures in 2024 to soar
01:32 - 01:34
to over $50 billion.
01:34 - 01:35
And then there's Microsoft
01:35 - 01:36
that shelled out more than
01:36 - 01:39
$13 billion just to invest in OpenAI.
01:40 - 01:42
But even more stunning: how
01:42 - 01:43
Deepseek's scrappier model
01:43 - 01:45
was able to outperform the
01:45 - 01:46
lavishly-funded American ones.
01:47 - 01:48
To see the Deepseek
01:49 - 01:51
new model, it's super
01:51 - 01:52
impressive in terms of both
01:52 - 01:53
how they have really
01:53 - 01:55
effectively done an
01:55 - 01:56
open-source model that does
01:56 - 01:58
this inference-time
01:58 - 01:59
compute, and it's super
01:59 - 02:00
compute-efficient.
02:00 - 02:02
It beat Meta's Llama,
02:02 - 02:04
OpenAI's GPT-4o and
02:04 - 02:05
Anthropic's Claude 3.5 Sonnet
02:05 - 02:07
on accuracy across
02:07 - 02:08
wide-ranging tests.
02:08 - 02:09
A subset of 500 math
02:09 - 02:11
problems, an AI math
02:11 - 02:12
evaluation, coding
02:12 - 02:14
competitions, and a test of
02:14 - 02:16
spotting and fixing bugs in
02:16 - 02:18
code. Quickly following that
02:18 - 02:19
up with a new reasoning
02:19 - 02:20
model called R1,
02:20 - 02:22
which just as easily
02:22 - 02:23
outperformed OpenAI's
02:23 - 02:25
cutting-edge o1 in some of
02:25 - 02:26
those third-party tests.
02:26 - 02:29
Today we released Humanity's
02:29 - 02:31
Last Exam, which is a new
02:31 - 02:32
evaluation or benchmark of
02:32 - 02:34
AI models that we produced
02:34 - 02:36
by getting math,
02:36 - 02:37
physics, biology,
02:37 - 02:39
chemistry professors to
02:39 - 02:40
provide the hardest
02:40 - 02:41
questions they could
02:41 - 02:42
possibly imagine. Deepseek,
02:42 - 02:44
which is the leading Chinese
02:44 - 02:47
AI lab, their model is
02:47 - 02:48
actually the top performing,
02:48 - 02:50
or roughly on par with the
02:50 - 02:51
best American models.
02:51 - 02:52
They accomplished all that
02:52 - 02:53
despite the strict
02:53 - 02:54
semiconductor restrictions
02:54 - 02:55
that the U.S. government
02:55 - 02:57
has imposed on China,
02:57 - 02:58
which has essentially
02:58 - 02:59
shackled the amount of
02:59 - 03:01
computing power. Washington
03:01 - 03:02
has drawn a hard line
03:02 - 03:03
against China in the AI
03:03 - 03:05
race. Cutting the country
03:05 - 03:06
off from receiving America's
03:06 - 03:08
most powerful chips like...
03:08 - 03:10
Nvidia's H100 GPUs.
03:10 - 03:11
Those were once thought to
03:11 - 03:13
be essential to building a
03:13 - 03:14
competitive AI model.
03:15 - 03:16
With startups and big tech
03:16 - 03:17
firms alike scrambling to
03:17 - 03:18
get their hands on any
03:18 - 03:20
available chips. But Deepseek
03:20 - 03:21
turned that on its head.
03:21 - 03:22
Side-stepping the rules by
03:22 - 03:24
using Nvidia's less
03:24 - 03:27
performant H800s to build
03:27 - 03:29
the latest model and showing
03:29 - 03:30
that the chip export
03:30 - 03:31
controls were not the
03:31 - 03:33
chokehold D.C. intended.
03:33 - 03:34
They were able to take whatever
03:34 - 03:36
hardware they were trained
03:36 - 03:37
on, but use it way more
03:37 - 03:38
efficiently.
03:38 - 03:40
But just who's behind
03:40 - 03:42
Deepseek anyway? Despite its
03:42 - 03:43
breakthrough, very,
03:43 - 03:45
very little is known about
03:45 - 03:46
its lab and its founder,
03:46 - 03:47
Liang Wenfeng.
03:48 - 03:49
According to Chinese media
03:49 - 03:50
reports, Deepseek was born
03:50 - 03:52
out of a Chinese hedge fund
03:52 - 03:53
called High-Flyer Quant,
03:53 - 03:55
which manages about $8
03:55 - 03:56
billion in assets.
03:56 - 03:57
The mission on its
03:57 - 03:58
developer site reads
03:58 - 04:00
simply: "unravel the mystery
04:00 - 04:02
of AGI with curiosity.
04:03 - 04:04
Answer the essential
04:04 - 04:06
question with long-termism."
04:06 - 04:08
The leading American AI
04:08 - 04:09
startups, meanwhile – OpenAI
04:09 - 04:11
and Anthropic – they have
04:11 - 04:12
detailed charters and
04:12 - 04:13
constitutions that lay out
04:13 - 04:14
their principles and their
04:14 - 04:15
founding missions,
04:15 - 04:17
like these sections on AI
04:17 - 04:19
safety and responsibility.
04:19 - 04:20
Despite several attempts to
04:20 - 04:22
reach someone at Deepseek,
04:22 - 04:24
we never got a response.
04:24 - 04:26
How did they actually
04:26 - 04:27
assemble this talent?
04:27 - 04:28
How did they assemble all
04:28 - 04:29
the hardware? How did they
04:29 - 04:30
assemble the data to do all
04:30 - 04:32
this? We don't know, and
04:32 - 04:33
it's never been publicized,
04:33 - 04:34
and hopefully we can learn...
04:35 - 04:36
But the mystery brings into
04:36 - 04:38
sharp relief just how urgent
04:38 - 04:40
and complex the AI face-off
04:40 - 04:42
against China has become.
04:42 - 04:43
Because it's not just
04:43 - 04:44
Deepseek. Other,
04:44 - 04:45
more well-known Chinese AI
04:45 - 04:47
models have carved out
04:47 - 04:48
positions in the race with
04:48 - 04:49
limited resources as well.
04:50 - 04:51
Kai-Fu Lee, he's one of the
04:51 - 04:53
leading AI researchers in
04:53 - 04:54
China, formerly leading
04:54 - 04:55
Google's operations there.
04:55 - 04:57
Now, his startup,
04:57 - 04:59
"Zero One Dot AI," it's
04:59 - 05:00
attracting attention,
05:00 - 05:01
becoming a unicorn just
05:01 - 05:02
eight months after founding
05:02 - 05:04
and bringing in almost $14
05:04 - 05:06
million in revenue in 2024.
05:06 - 05:08
The thing that shocks my
05:08 - 05:09
friends in the Silicon
05:09 - 05:10
Valley is not just our
05:10 - 05:12
performance, but that we
05:12 - 05:14
trained the model with only
05:14 - 05:17
$3 million, and GPT-4 was
05:17 - 05:18
trained with $80 to $100 million.
05:19 - 05:20
Trained with just three
05:20 - 05:22
million dollars. Alibaba's
05:22 - 05:23
Qwen, meanwhile, cut costs
05:23 - 05:25
by as much as 85% on its
05:25 - 05:27
large language models in a
05:27 - 05:27
bid to attract more
05:27 - 05:29
developers, signaling
05:29 - 05:35
that the race is on.
05:37 - 05:38
China's breakthrough
05:38 - 05:40
undermines the lead that our
05:40 - 05:42
AI labs were once thought to
05:42 - 05:44
have. In early 2024,
05:44 - 05:45
former Google CEO Eric
05:45 - 05:46
Schmidt predicted China
05:46 - 05:48
was 2 to 3 years behind the
05:48 - 05:50
U.S. in AI.
05:50 - 05:51
But now , Schmidt is singing
05:51 - 05:52
a different tune.
05:52 - 05:54
Here he is on ABC's "This
05:55 - 05:56
I used to think we were a
05:56 - 05:57
couple of years ahead of
05:57 - 05:59
China, but China has caught
05:59 - 06:01
up in the last six months in
06:01 - 06:02
a way that is remarkable.
06:02 - 06:03
The fact of the matter is
06:03 - 06:06
that a couple of the Chinese
06:06 - 06:07
programs, one,
06:07 - 06:08
for example, is called
06:08 - 06:10
Deepseek, looks like they've...
06:11 - 06:12
It raises major questions
06:12 - 06:15
about just how wide
06:15 - 06:16
OpenAI's moat really is.
06:16 - 06:17
Back when OpenAI released
06:17 - 06:19
ChatGPT to the world in
06:19 - 06:20
November of 2022,
06:21 - 06:22
it was unprecedented and
06:22 - 06:24
uncontested.
06:24 - 06:25
Now, the company faces not
06:25 - 06:26
only the international
06:26 - 06:27
competition from Chinese
06:27 - 06:29
models, but fierce domestic
06:29 - 06:30
competition from Google's
06:30 - 06:32
Gemini, Anthropic's Claude,
06:32 - 06:33
and Meta's open-source Llama
06:33 - 06:35
model. And now the game has
06:35 - 06:36
changed. The widespread
06:36 - 06:38
availability of powerful
06:38 - 06:40
open-source models allows
06:40 - 06:41
developers to skip the
06:41 - 06:43
demanding, capital-intensive
06:43 - 06:45
steps of building and
06:45 - 06:46
training models themselves.
06:46 - 06:48
Now they can build on top of
06:48 - 06:49
existing models,
06:49 - 06:51
making it significantly
06:51 - 06:52
easier to jump to the
06:52 - 06:53
frontier, that is the front
06:53 - 06:55
of the race, with a smaller
06:55 - 06:57
budget and a smaller team.
06:57 - 06:59
In the last two weeks,
06:59 - 07:01
AI research teams have
07:01 - 07:04
really opened their eyes and
07:04 - 07:05
have become way more
07:05 - 07:07
ambitious on what's possible
07:07 - 07:08
with a lot less capital.
07:09 - 07:10
So previously,
07:11 - 07:12
to get to the frontier,
07:13 - 07:13
you would have to think
07:13 - 07:14
about hundreds of millions
07:14 - 07:16
of dollars of investment and
07:16 - 07:17
perhaps a billion dollars of
07:17 - 07:18
investment. What Deepseek
07:18 - 07:19
has now done here in Silicon
07:19 - 07:21
Valley is it's opened our
07:21 - 07:22
eyes to what you can
07:22 - 07:24
actually accomplish with 10,
07:24 - 07:26
15, 20, or 30 million dollars.
07:27 - 07:28
It also means any company,
07:28 - 07:30
like OpenAI, that claims the
07:30 - 07:32
frontier today ...could lose
07:32 - 07:34
it tomorrow. That's how
07:34 - 07:35
Deepseek was able to catch
07:35 - 07:36
up so quickly. It started
07:36 - 07:37
building on the existing
07:37 - 07:39
frontier of AI,
07:39 - 07:40
its approach focusing on
07:40 - 07:41
iterating on existing
07:41 - 07:43
technology rather than
07:43 - 07:44
reinventing the wheel.
07:44 - 07:47
They can take a really good
07:47 - 07:49
big model and use a process
07:49 - 07:50
called distillation. And
07:50 - 07:51
what distillation is,
07:51 - 07:53
basically you use a very
07:53 - 07:56
large model to help your
07:56 - 07:57
small model get smart at the
07:57 - 07:58
thing that you want it to
07:58 - 07:59
get smart at. And that's
07:59 - 08:00
actually a very cost-efficient process.
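For the curious, here is a minimal sketch of the distillation recipe he's describing: a small "student" model is trained to match a big "teacher" model's softened output distribution. The models and data below are toy placeholders, not anyone's production setup.

```python
import torch
import torch.nn.functional as F

teacher = torch.nn.Linear(128, 1000).eval()  # stand-in for the big model
student = torch.nn.Linear(128, 1000)         # the small model being taught
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                                      # temperature softens the targets

for _ in range(100):
    x = torch.randn(32, 128)                 # a batch of (random toy) inputs
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x) / T, dim=-1)  # teacher's "knowledge"
    log_probs = F.log_softmax(student(x) / T, dim=-1)
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * T * T
    opt.zero_grad(); loss.backward(); opt.step()
```

The student never sees ground-truth labels here; it only imitates the teacher's probability distribution, which is what makes the approach so cheap relative to training from scratch.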
08:01 - 08:03
It closed the gap by using
08:03 - 08:04
available data sets,
08:04 - 08:06
applying innovative tweaks,
08:06 - 08:07
and leveraging existing
08:07 - 08:09
models. So much so,
08:09 - 08:10
that Deepseek's model has
08:10 - 08:12
run into an identity crisis.
08:13 - 08:14
It's convinced that it's
08:14 - 08:16
ChatGPT. When you ask it
08:16 - 08:17
directly, "What model are
08:17 - 08:19
you?" Deepseek responds...
08:19 - 08:20
I'm an AI language model
08:20 - 08:21
created by OpenAI,
08:22 - 08:23
specifically based on the
08:23 - 08:25
GPT-4 architecture.
08:25 - 08:26
Leading OpenAI CEO Sam
08:26 - 08:28
Altman to post in a thinly
08:28 - 08:30
veiled shot at Deepseek just
08:30 - 08:31
days after the model was
08:31 - 08:33
released. "It's relatively
08:33 - 08:34
easy to copy something that
08:34 - 08:35
you know works.
08:35 - 08:36
It's extremely hard to do
08:36 - 08:37
something new,
08:37 - 08:39
risky, and difficult when
08:39 - 08:40
you don't know if it will
08:40 - 08:42
work." But that's not
08:42 - 08:43
exactly what Deepseek did.
08:44 - 08:45
It emulated GPT by
08:45 - 08:47
leveraging OpenAI's existing
08:47 - 08:48
outputs and architecture
08:48 - 08:49
principles, while quietly
08:49 - 08:50
introducing its own
08:50 - 08:51
enhancements, really
08:51 - 08:53
blurring the line between
08:53 - 08:54
itself and ChatGPT.
08:55 - 08:56
It all puts pressure on a
08:56 - 08:57
closed source leader like
08:57 - 08:58
OpenAI to justify its
08:58 - 09:00
costlier model as more,
09:00 - 09:01
potentially nimbler,
09:01 - 09:02
competitors emerge.
09:02 - 09:03
Everybody copies everybody
09:03 - 09:05
in this field.
09:05 - 09:07
You can say Google did the
09:07 - 09:08
transformer first. It's not
09:08 - 09:10
OpenAI, and OpenAI just
09:10 - 09:12
copied it. Google built the
09:12 - 09:13
first large language models.
09:13 - 09:14
They didn't productize it,
09:14 - 09:16
but OpenAI did it in a
09:16 - 09:19
productized way. So you can
09:19 - 09:21
say all this in many ways.
09:21 - 09:22
It doesn't matter.
09:22 - 09:24
So if everyone is copying
09:24 - 09:25
one another, it raises the
09:25 - 09:28
question, is massive spend
09:28 - 09:31
on individual LLMs even a
09:31 - 09:32
good investment anymore?
09:32 - 09:34
Now, no one has as much at
09:34 - 09:35
stake as OpenAI.
09:35 - 09:36
The startup raised over $6
09:36 - 09:38
billion in its last funding
09:38 - 09:39
round alone. But,
09:39 - 09:41
the company has yet to turn
09:41 - 09:43
a profit. And with its core
09:43 - 09:44
business centered on
09:44 - 09:45
building the models -
09:45 - 09:46
it's much more exposed than
09:46 - 09:47
companies like Google and
09:47 - 09:49
Amazon, who have cloud and
09:49 - 09:51
ad businesses bankrolling
09:51 - 09:53
their spend. For OpenAI,
09:53 - 09:54
reasoning will be key.
09:54 - 09:56
A model that thinks before
09:56 - 09:57
it generates a response,
09:57 - 09:58
going beyond pattern
09:58 - 09:59
recognition to analyze,
09:59 - 10:01
draw logical conclusions,
10:01 - 10:02
and solve really complex
10:02 - 10:04
problems. For now,
10:04 - 10:05
the startup's o1 reasoning
10:05 - 10:07
model is still cutting edge.
10:08 - 10:09
But for how long?
10:09 - 10:10
Researchers at Berkeley
10:10 - 10:11
showed that they could build
10:11 - 10:13
a reasoning model for $450
10:13 - 10:15
just last week. So you can
10:15 - 10:16
actually create these models
10:16 - 10:18
that do thinking for much,
10:18 - 10:19
much less. You don't need
10:19 - 10:21
those huge amounts of compute to
10:21 - 10:22
pre-train the models. So I
10:22 - 10:24
think the game is shifting.
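One simple way to picture "models that do thinking for much, much less" is to spend extra compute at inference time instead of in pre-training. Below is a minimal sketch of one such technique, self-consistency voting over sampled reasoning paths; the `generate` helper is a hypothetical stand-in for any language-model call, not the Berkeley team's actual method.

```python
from collections import Counter
import random

def generate(prompt: str, temperature: float = 0.8) -> str:
    # Hypothetical stand-in for a real LLM call; replace with an actual API.
    return random.choice(["42", "42", "41"])

def answer_with_thinking(prompt: str, n_samples: int = 16) -> str:
    # Sample several independent reasoning paths, then majority-vote the
    # final answers: more samples = more test-time compute = often better
    # accuracy, with no extra pre-training.
    samples = [generate(prompt + "\nThink step by step.")
               for _ in range(n_samples)]
    return Counter(samples).most_common(1)[0][0]

print(answer_with_thinking("What is 6 * 7?"))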
10:24 - 10:26
It means that staying on top
10:26 - 10:27
may require as much
10:27 - 10:29
creativity as capital.
10:29 - 10:31
Deepseek's breakthrough also
10:31 - 10:32
comes at a very tricky time
10:32 - 10:33
for the AI darling,
10:33 - 10:35
just as OpenAI is moving to
10:35 - 10:37
a for-profit model and
10:37 - 10:39
facing unprecedented brain
10:39 - 10:41
drain. Can it raise more
10:41 - 10:42
money at ever higher
10:42 - 10:43
valuations if the game is
10:43 - 10:44
changing? As Chamath
10:44 - 10:46
Palihapitiya puts it...
10:46 - 10:47
let me say the quiet part
10:47 - 10:49
out loud: AI model building
10:49 - 10:51
is a money trap.
10:58 - 10:59
Those chip restrictions from
10:59 - 11:00
the U.S. government, they
11:00 - 11:03
were intended to slow down
11:03 - 11:04
China, to keep American
11:04 - 11:06
tech on American ground,
11:06 - 11:07
to stay ahead in the race.
11:07 - 11:08
What we want to do is we
11:08 - 11:09
want to keep it in this
11:09 - 11:10
country. China is a
11:10 - 11:11
competitor and others are
11:11 - 11:12
competitors.
11:12 - 11:14
So instead, the restrictions
11:14 - 11:15
might have been just what
11:15 - 11:16
China needed.
11:16 - 11:17
Necessity is the mother of invention.
11:19 - 11:22
Because they had to go
11:22 - 11:24
figure out workarounds,
11:25 - 11:25
they actually ended up
11:25 - 11:26
building something a lot
11:26 - 11:27
more efficient.
11:27 - 11:28
It's really remarkable the
11:28 - 11:29
amount of progress they've
11:29 - 11:31
made with as little capital
11:32 - 11:33
as it's taken them to make
11:33 - 11:34
that progress.
11:34 - 11:35
It drove them to get
11:35 - 11:36
creative. With huge
11:36 - 11:38
implications. Deepseek is an
11:38 - 11:39
open-source model, meaning
11:39 - 11:41
that developers have full
11:41 - 11:42
access and they can
11:42 - 11:43
customize its weights or
11:43 - 11:44
fine-tune it to their needs.
11:45 - 11:46
It's known that once
11:46 - 11:47
open-source has caught up to or
11:48 - 11:49
improved over closed-source
11:49 - 11:52
software, all developers
11:52 - 11:53
migrate to that.
11:53 - 11:55
The key is that it's also
11:55 - 11:57
inexpensive. The lower the
11:57 - 11:58
cost, the more attractive it
11:58 - 12:00
is for developers to adopt.
12:00 - 12:01
The bottom line is our
12:01 - 12:03
inference cost is 10 cents
12:03 - 12:05
per million tokens,
12:05 - 12:07
and that's 1/30th of what
12:07 - 12:08
the typical comparable model
12:08 - 12:10
charges. Where's it going?
12:10 - 12:11
Well, the 10 cents
12:11 - 12:14
would lead to building apps
12:14 - 12:15
for much lower costs.
12:15 - 12:17
So if you wanted to build a
12:17 - 12:19
You.com or Perplexity or some
12:19 - 12:21
other app, you can either
12:21 - 12:23
pay OpenAI $4.40 per million
12:23 - 12:26
tokens, or if you have our
12:26 - 12:27
model, it costs you just 10 cents.
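Back-of-envelope math on those quoted prices: the $4.40 and 10-cent figures are from the interview, while the monthly token volume below is a made-up workload for illustration only.

```python
tokens_per_month = 5_000_000_000   # hypothetical app serving 5B tokens/month
price_openai = 4.40 / 1_000_000    # dollars per token at $4.40 per 1M tokens
price_cheap = 0.10 / 1_000_000     # dollars per token at $0.10 per 1M tokens

print(f"bill at $4.40/M tokens: ${tokens_per_month * price_openai:,.0f}")  # $22,000
print(f"bill at $0.10/M tokens: ${tokens_per_month * price_cheap:,.0f}")   # $500
```

At those list prices the same workload costs 44x less, which is the kind of gap that moves developers.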
12:28 - 12:29
It could mean that the
12:29 - 12:30
prevailing model in global
12:30 - 12:32
AI may be open-source,
12:32 - 12:34
as organizations and nations
12:34 - 12:35
come around to the idea that
12:35 - 12:36
collaboration and
12:36 - 12:37
decentralization,
12:37 - 12:38
those things can drive
12:38 - 12:40
innovation faster and more
12:40 - 12:41
efficiently than
12:41 - 12:42
proprietary, closed
12:42 - 12:44
ecosystems. A cheaper,
12:44 - 12:45
more efficient, widely
12:45 - 12:47
adopted open-source model
12:47 - 12:49
from China that could lead
12:49 - 12:51
to a major shift in...
12:52 - 12:53
That's more dangerous,
12:54 - 12:56
because then they get to own
12:56 - 12:58
the mindshare, the...
12:59 - 13:00
In other words, the adoption
13:00 - 13:01
of a Chinese open-source
13:01 - 13:02
model at scale could
13:02 - 13:04
undermine U.S. leadership
13:04 - 13:06
while embedding China more
13:06 - 13:07
deeply into the fabric of
13:07 - 13:09
global tech infrastructure.
13:09 - 13:10
There's always a point where
13:10 - 13:12
open-source can stop being
13:12 - 13:13
open-source, too,
13:13 - 13:15
right? So, the licenses are
13:15 - 13:16
very favorable today,
13:16 - 13:17
but – it could close it.
13:17 - 13:19
Exactly, over time,
13:19 - 13:20
they can always change the
13:20 - 13:22
license. So, it's important
13:22 - 13:24
that we actually have people
13:24 - 13:26
here in America building,
13:26 - 13:27
and that's why Meta is so important.
13:28 - 13:29
Another consequence of
13:29 - 13:31
China's AI breakthrough is
13:31 - 13:32
giving its Communist Party
13:32 - 13:34
control of the narrative.
13:34 - 13:35
AI models built in China,
13:35 - 13:36
they're forced to adhere to a
13:36 - 13:37
certain set of rules set by
13:37 - 13:39
the state. They must embody
13:39 - 13:41
"core socialist values."
13:41 - 13:42
Studies have shown that
13:42 - 13:44
models created by Tencent
13:44 - 13:45
and Alibaba, they will
13:45 - 13:46
censor historical events
13:46 - 13:48
like Tiananmen Square,
13:48 - 13:49
deny human rights abuse,
13:50 - 13:51
and filter criticism of
13:51 - 13:53
Chinese political leaders.
13:53 - 13:54
That contest is about
13:54 - 13:55
whether we're going to have
13:55 - 13:56
democratic AI informed by
13:56 - 13:58
democratic values,
13:58 - 14:00
built to serve democratic
14:00 - 14:01
purposes, or we're going to
14:01 - 14:03
end up with autocratic AI.
14:03 - 14:04
If developers really begin
14:04 - 14:06
to adopt these models en
14:06 - 14:07
masse because they're more
14:07 - 14:08
efficient, that could have a
14:08 - 14:10
serious ripple effect,
14:10 - 14:11
trickling down to even
14:11 - 14:12
consumer-facing AI
14:12 - 14:13
applications and influencing
14:13 - 14:15
how trustworthy those
14:15 - 14:16
AI-generated responses from
14:16 - 14:18
chatbots really are.
14:18 - 14:19
And there's really only two
14:19 - 14:20
countries right now in the
14:20 - 14:22
world that can build this at
14:22 - 14:23
scale, you know,
14:23 - 14:25
and that is the U.S.
14:25 - 14:27
and China, and so,
14:27 - 14:28
you know, the consequences
14:28 - 14:30
of the stakes in and around
14:30 - 14:32
this are just enormous.
14:32 - 14:33
Enormous stakes,
14:33 - 14:35
enormous consequences,
14:35 - 14:37
and hanging in the balance:
14:37 - 14:38
America's lead.
14:42 - 14:44
For a topic so complex and
14:44 - 14:45
new, we turn to an expert
14:45 - 14:47
who's actually building in
14:47 - 14:48
the space, and is
14:48 - 14:50
model-agnostic. Perplexity
14:50 - 14:51
co-founder and CEO Aravind
14:51 - 14:52
Srinivas – who you heard
14:52 - 14:53
from throughout our piece.
14:54 - 14:55
He sat down with me for more
14:55 - 14:56
than 30 minutes to discuss
14:56 - 14:57
Deepseek and its
14:57 - 14:59
implications, as well as
14:59 - 15:00
Perplexity's roadmap.
15:00 - 15:01
We think it's worth
15:01 - 15:02
listening to that whole
15:02 - 15:04
conversation, so here it is.
15:04 - 15:05
So first I want to know what
15:05 - 15:07
the stakes are. What,
15:07 - 15:09
like describe the AI race
15:09 - 15:11
between China and the U.S.
15:11 - 15:12
and what's at stake.
15:13 - 15:14
Okay, so first of all,
15:14 - 15:16
China has a lot of
15:16 - 15:18
disadvantages in competing
15:18 - 15:21
with the U.S. Number one is,
15:21 - 15:22
the fact that they don't get
15:22 - 15:24
access to all the hardware
15:24 - 15:26
that we have access to here.
15:27 - 15:28
So they're kind of working
15:28 - 15:30
with lower-end GPUs than us.
15:31 - 15:32
It's almost like working
15:32 - 15:33
with the previous generation
15:33 - 15:35
GPUs, scrappily.
15:35 - 15:38
So, the fact that the
15:38 - 15:39
bigger models tend to be
15:39 - 15:42
smarter naturally puts
15:42 - 15:43
them at a disadvantage.
15:43 - 15:46
But the flip side of this is
15:46 - 15:47
that necessity is the mother
15:47 - 15:51
of invention, because they
15:51 - 15:52
had to go figure out
15:53 - 15:55
workarounds. They actually
15:55 - 15:56
ended up building something
15:56 - 15:58
a lot more efficient.
15:58 - 15:59
It's like saying, "hey look,
15:59 - 16:01
you guys really got to get a
16:01 - 16:04
top-notch model, and I'm not
16:04 - 16:05
going to give you resources,
16:05 - 16:07
so figure out something,"
16:07 - 16:08
right? Unless it's
16:08 - 16:09
impossible – unless it's
16:09 - 16:11
mathematically possible to
16:11 - 16:13
prove that it's impossible
16:13 - 16:14
to do – you can always try
16:14 - 16:15
to like come up with
16:15 - 16:17
something more efficient.
16:17 - 16:20
But that is likely to make
16:20 - 16:21
them come up with a more
16:21 - 16:22
efficient solution than
16:22 - 16:24
America. And of course,
16:24 - 16:25
they have open -sourced it,
16:25 - 16:27
so we can still adopt
16:27 - 16:28
something like that here.
16:28 - 16:30
But that kind of talent
16:30 - 16:32
they're building to do that
16:32 - 16:33
will become an edge for them
16:33 - 16:34
over time right?
16:35 - 16:36
The leading open-source
16:36 - 16:38
model in America is Meta's
16:38 - 16:40
Llama family. It's really
16:40 - 16:41
good. It's kind of like a
16:41 - 16:42
model that you can run on
16:42 - 16:43
your computer.
16:43 - 16:45
But even though it got
16:45 - 16:47
pretty close to GPT-4,
16:48 - 16:50
and at the time of its
16:50 - 16:51
release, the model that was
16:51 - 16:54
closest in quality was the
16:54 - 16:56
giant 405B, not the 70B that
16:56 - 16:56
you could run on your
16:56 - 16:59
computer. And so there was
16:59 - 17:01
still not a small,
17:01 - 17:02
cheap, fast, efficient,
17:02 - 17:04
open-source model that
17:04 - 17:06
rivaled the most powerful
17:06 - 17:07
closed models from OpenAI,
17:07 - 17:09
or Anthropic. Nothing from
17:09 - 17:11
America, nothing from
17:11 - 17:12
Mistral AI either.
17:12 - 17:13
And then these guys come
17:13 - 17:16
out, with like a crazy model
17:16 - 17:17
that's like 10x cheaper in
17:17 - 17:19
API pricing than GPT-4 and
17:19 - 17:21
15x cheaper than Sonnet,
17:21 - 17:23
I believe. Really fast,
17:23 - 17:24
16 tokens per second–60
17:24 - 17:25
tokens per second,
17:26 - 17:29
and pretty much equal or
17:29 - 17:30
better in some benchmarks
17:30 - 17:31
and worse in some others.
17:31 - 17:32
But like roughly in that
17:32 - 17:34
ballpark of GPT-4o's quality.
17:35 - 17:36
And they did it all with
17:36 - 17:39
like approximately just
17:39 - 17:41
2,048 H800 GPUs, which is
17:41 - 17:42
actually equivalent to like
17:42 - 17:44
somewhere around
17:44 - 17:47
1,000 to 1,500 H100 GPUs.
17:47 - 17:50
That's like 20 to 30x lower
17:50 - 17:52
than the amount of GPUs that
17:52 - 17:53
GPT-4 is usually trained
17:53 - 17:56
on, and roughly $5 million
17:56 - 17:58
in total compute budget.
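Those quoted figures hang together on a napkin: 2,048 H800s for roughly 60 days (a duration he mentions shortly below), at an assumed rental rate of about $2 per GPU-hour, lands near the roughly $5 million budget cited. The rental rate is our assumption, not from the interview.

```python
gpus = 2048   # H800s, as quoted above
days = 60     # approximate training duration, as quoted
rate = 2.00   # assumed $/GPU-hour rental price (our assumption)

gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours -> ~${gpu_hours * rate / 1e6:.1f}M")
# 2,949,120 GPU-hours -> ~$5.9M, in the ballpark of the quoted budget
```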
17:59 - 18:00
They did it with so little
18:00 - 18:02
money and such an amazing
18:02 - 18:04
model, gave it away for
18:04 - 18:04
free, wrote a technical
18:04 - 18:06
paper, and definitely it
18:06 - 18:09
makes us all question like,
18:09 - 18:10
"okay, like if we have the
18:10 - 18:12
equivalent of DOGE for, like,
18:12 - 18:14
model training,
18:14 - 18:15
this is an example of that,
18:15 - 18:16
right?"
18:16 - 18:17
Right. Yeah. Efficiency
18:18 - 18:19
is what you're getting at.
18:19 - 18:20
So, fraction of the price,
18:21 - 18:22
fraction of the time.
18:22 - 18:23
Yeah. Dumbed-down GPUs,
18:23 - 18:25
essentially. What was your
18:25 - 18:27
surprise when you understood
18:27 - 18:28
what they had done?
18:28 - 18:30
So my surprise was that when
18:30 - 18:31
I actually went through the
18:31 - 18:33
technical paper,
18:33 - 18:35
the amount of clever
18:35 - 18:37
solutions they came up with,
18:38 - 18:39
first of all, they train a
18:39 - 18:40
mixture of experts model.
18:40 - 18:42
It's not that easy to train;
18:43 - 18:44
there's a lot of like,
18:44 - 18:46
the main reason people find
18:46 - 18:46
it difficult to catch up
18:46 - 18:48
with OpenAI, especially on
18:48 - 18:49
the MoE architecture,
18:49 - 18:51
is that there's a lot of
18:52 - 18:54
irregular loss spikes.
18:54 - 18:56
The numerics are not stable,
18:56 - 18:57
so often, like,
18:57 - 18:59
you've got to restart the
18:59 - 19:00
training from a checkpoint again,
19:00 - 19:01
and a lot of infrastructure
19:01 - 19:03
needs to be built for that.
19:03 - 19:04
And they came up with very
19:04 - 19:06
clever solutions to balance
19:06 - 19:07
that without adding
19:07 - 19:09
additional hacks.
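For readers unfamiliar with the architecture being described, here is a minimal sketch of a top-k routed mixture-of-experts layer in PyTorch. The sizes are hypothetical, and it omits the load-balancing machinery he is praising; it only shows the basic token-routing idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Each token is routed to k of n experts; outputs are gate-weighted."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # the gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts))

    def forward(self, x):                      # x: (n_tokens, d_model)
        gate_logits = self.router(x)           # (n_tokens, n_experts)
        weights, expert_idx = gate_logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = expert_idx[:, slot] == e
                if mask.any():                 # run expert e on its tokens only
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE()
print(moe(torch.randn(10, 512)).shape)  # torch.Size([10, 512])
```

Only k of the n expert networks run per token, which is why MoE models can be large but cheap per token; the training instabilities he mentions come from keeping that routing balanced across experts.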
19:09 - 19:12
They also figured out
19:12 - 19:13
floating-point 8-bit (FP8)
19:13 - 19:15
training, at least for some
19:15 - 19:17
of the numerics. And they
19:17 - 19:18
cleverly figured out which
19:18 - 19:19
has to be in higher
19:19 - 19:20
precision, which has to be
19:20 - 19:22
in lower precision. To my
19:22 - 19:24
knowledge, I think
19:24 - 19:26
FP8 training is not that
19:26 - 19:27
well understood. Most of the
19:27 - 19:28
training in America is still
19:28 - 19:30
running in FP16.
19:30 - 19:31
Maybe OpenAI and some of the
19:31 - 19:32
people are trying to explore
19:32 - 19:33
that, but it's pretty
19:33 - 19:35
difficult to get it right.
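A toy illustration of the precision split he's describing: keep a high-precision master copy of the weights and run the compute-heavy matmuls in a lower-precision format. Here bfloat16 stands in for FP8, since real FP8 kernels need specific hardware support and per-tensor scaling; this only shows which pieces stay high-precision.

```python
import torch

# Master weights and optimizer math stay in float32 (high precision).
master_w = torch.randn(512, 512, dtype=torch.float32, requires_grad=True)
x = torch.randn(64, 512)

# The compute-heavy matmul runs on a low-precision cast of the weights.
# bfloat16 is a stand-in here for the FP8 formats discussed above.
y = (x.to(torch.bfloat16) @ master_w.to(torch.bfloat16)).float()

loss = y.square().mean()
loss.backward()  # gradients flow back into the float32 master weights

with torch.no_grad():
    master_w -= 1e-3 * master_w.grad  # the update itself stays full precision
```

Deciding which tensors tolerate the low-precision cast and which do not is exactly the "which has to be in higher precision" judgment call he credits them with.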
19:35 - 19:36
So because necessity is the
19:36 - 19:37
mother of invention, because
19:37 - 19:38
they don't have that much
19:38 - 19:39
memory, that many GPUs,
19:40 - 19:41
they figured out a lot of
19:42 - 19:44
numerical stability stuff
19:44 - 19:45
that makes their training
19:45 - 19:46
work. And they claimed in
19:46 - 19:48
the paper that the majority
19:48 - 19:49
of the training was stable.
19:50 - 19:51
Which means what? They can
19:51 - 19:53
always rerun those training
19:53 - 19:57
runs again on more data
19:57 - 19:59
or better data. And then,
20:00 - 20:01
it only trained for 60 days.
20:02 - 20:03
So that's pretty amazing.
20:04 - 20:05
Safe to say you were surprised?
20:05 - 20:06
So I was definitely
20:06 - 20:08
surprised. Usually the
20:08 - 20:11
wisdom or, like I wouldn't
20:11 - 20:12
say, wisdom, the myth, is
20:12 - 20:14
that Chinese are just good
20:14 - 20:16
at copying. So if we
20:16 - 20:18
stop writing research papers
20:18 - 20:20
in America, if we stop
20:20 - 20:22
describing the details of
20:22 - 20:23
our infrastructure or
20:23 - 20:25
architecture, and stop open
20:25 - 20:27
sourcing, they're not going
20:27 - 20:29
to be able to catch up. But
20:29 - 20:30
the reality is, some of the
20:30 - 20:33
details in Deepseek v3 are
20:33 - 20:34
so good that I wouldn't be
20:34 - 20:36
surprised if Meta took a
20:36 - 20:38
look at it and incorporated
20:38 - 20:38
some of that. –Tried to copy
20:38 - 20:41
them? –Right.
20:41 - 20:42
I wouldn't necessarily say
20:42 - 20:43
copy. It's all like,
20:43 - 20:44
you know, sharing science,
20:45 - 20:47
engineering, but the point
20:47 - 20:48
is like, it's changing.
20:48 - 20:50
Like, it's not like China is
20:50 - 20:51
just a copycat. They're also innovating.
20:52 - 20:53
We don't know exactly the
20:53 - 20:55
data that it was trained on
20:55 - 20:56
right? Even though it's open
20:56 - 20:57
-source, we know some of the
20:57 - 20:59
ways it was
20:59 - 20:59
trained, but not
20:59 - 21:01
everything. And there's this
21:01 - 21:02
idea that it was trained on
21:02 - 21:05
public ChatGPT outputs,
21:05 - 21:06
which would mean it just was
21:06 - 21:07
copied. But you're saying it
21:07 - 21:08
goes beyond that? There's
21:08 - 21:09
real innovation in there?
21:09 - 21:11
look, I mean, they've
21:11 - 21:13
trained it on 14.8 trillion
21:13 - 21:15
tokens. The internet has so
21:15 - 21:16
much ChatGPT. If you
21:16 - 21:18
actually go to any LinkedIn
21:18 - 21:19
post or X post.
21:19 - 21:21
Now, most of the comments
21:21 - 21:22
are written by AI. You can
21:22 - 21:24
just see it, like people are
21:24 - 21:25
just trying to write. In
21:25 - 21:28
fact, even within X,
21:28 - 21:30
there's like a Grok tweet
21:30 - 21:31
enhancer, or in LinkedIn
21:31 - 21:32
there's an AI enhancer,
21:33 - 21:37
or in Google Docs and Word.
21:37 - 21:38
There are AI tools to like
21:38 - 21:40
rewrite your stuff. So if
21:40 - 21:41
you do something there and
21:41 - 21:43
copy paste somewhere on the
21:43 - 21:44
internet, it's naturally
21:44 - 21:45
going to have some elements
21:45 - 21:48
of ChatGPT-like training,
21:48 - 21:49
right? And there's a lot of
21:49 - 21:51
people who don't even bother
21:51 - 21:53
to strip away that "I'm a
21:53 - 21:55
language model" part, right?
21:55 - 21:56
So, they just paste
21:56 - 21:58
it somewhere and it's very
21:58 - 21:59
difficult to control for
21:59 - 22:01
this. I think xAI has spoken
22:01 - 22:02
about this too, so I
22:02 - 22:04
wouldn't like disregard
22:04 - 22:05
their technical
22:05 - 22:07
accomplishment just because
22:07 - 22:08
like, for some prompts like
22:08 - 22:10
"who are you," or "which
22:10 - 22:11
model are you," it responds
22:11 - 22:12
like that. It doesn't even
22:12 - 22:13
matter in my opinion.
22:13 - 22:14
For a long time we thought, I don't
22:14 - 22:15
know if you agreed with us,
22:15 - 22:17
China was behind in AI,
22:17 - 22:18
what does this do to that
22:18 - 22:20
race? Can we say that China
22:20 - 22:22
is catching up, or has it caught up?
22:23 - 22:25
I mean, like if we say
22:25 - 22:27
Meta is catching up to
22:27 - 22:28
OpenAI and Anthropic,
22:28 - 22:31
if you make that claim,
22:31 - 22:32
then the same claim can be
22:32 - 22:33
made for China catching up to America.
22:34 - 22:36
A lot of papers from China
22:36 - 22:37
have tried to replicate
22:37 - 22:39
o1, in fact, I saw more
22:39 - 22:41
papers from China after o1
22:42 - 22:43
announcement that tried to
22:43 - 22:44
replicate it than from
22:44 - 22:46
America. Like,
22:46 - 22:47
and the amount of compute
22:48 - 22:50
Deepseek has access to is
22:50 - 22:52
roughly similar to what PhD
22:52 - 22:54
students in the U.S.
22:54 - 22:55
have access to. By the way,
22:55 - 22:56
this is not meant to
22:56 - 22:57
criticize others; even
22:57 - 22:59
for ourselves, like,
22:59 - 23:00
you know, for Perplexity,
23:00 - 23:01
we decided not to train
23:01 - 23:02
models because we thought
23:02 - 23:03
it's like a very expensive
23:03 - 23:07
thing. And we thought like,
23:07 - 23:08
there's no way to catch up
23:08 - 23:09
with the rest.
23:09 - 23:10
But will you incorporate
23:10 - 23:12
Deepseek into Perplexity?
23:12 - 23:13
Oh, we already are beginning to.
23:15 - 23:16
I think they have an API,
23:16 - 23:18
and they also have
23:16 - 23:18
open-source weights, so we
23:18 - 23:20
can host it ourselves, too.
23:20 - 23:21
And it's good to, like,
23:21 - 23:22
try to start using that
23:23 - 23:24
because it actually
23:24 - 23:25
allows us to do a lot of the
23:25 - 23:27
things at lower cost.
23:27 - 23:28
But what I'm kind of
23:28 - 23:30
thinking is beyond that,
23:30 - 23:31
which is like, okay, if
23:31 - 23:33
these guys actually could
23:33 - 23:34
train such a great model
23:34 - 23:37
with a good team,
23:37 - 23:38
then there's no excuse
23:38 - 23:39
anymore for companies in the
23:39 - 23:41
U.S., including ourselves,
23:41 - 23:42
to like, not try to do
23:42 - 23:43
something like that.
23:43 - 23:44
You hear a lot in public
23:44 - 23:45
from a lot of, you know,
23:45 - 23:46
thought leaders in
23:46 - 23:47
generative AI, both on the
23:47 - 23:48
research side, on the
23:48 - 23:50
entrepreneurial side,
23:50 - 23:51
like Elon Musk and others
23:51 - 23:53
say that China can't catch
23:53 - 23:55
up. Like, the stakes are
23:55 - 23:56
too big. The geopolitical
23:56 - 23:58
stakes, whoever dominates AI
23:58 - 23:59
is going to kind of dominate
23:59 - 24:01
the economy, dominate the
24:01 - 24:02
world. You know,
24:02 - 24:03
it's been talked about in
24:03 - 24:04
those massive terms. Are you
24:04 - 24:06
worried about what China
24:06 - 24:07
proved it was able to do?
24:08 - 24:09
Firstly, I don't know if
24:09 - 24:10
Elon ever said China can't catch up.
24:11 - 24:13
I'm not – just the threat of
24:13 - 24:14
China. He's only identified
24:14 - 24:16
the threat of letting China win,
24:16 - 24:17
and you know, Sam Altman has
24:17 - 24:18
said similar things, we
24:18 - 24:20
can't let China win the race.
24:20 - 24:22
You know, I think
24:22 - 24:25
you got to decouple what
24:25 - 24:26
someone like Sam says from
24:26 - 24:27
like what is in his
24:27 - 24:29
self-interest. Right?
24:30 - 24:34
Look, I think my point
24:34 - 24:37
is, like, whatever you did
24:37 - 24:38
to not let them catch up
24:38 - 24:40
didn't even matter. They
24:40 - 24:42
ended up catching up anyway.
24:42 - 24:43
Necessity is the mother of invention,
24:44 - 24:46
like you said. And you know
24:44 - 24:46
what's actually
24:46 - 24:48
more dangerous than trying
24:48 - 24:49
to do all the things to not
24:49 - 24:51
let them catch up, and
24:51 - 24:52
all this stuff?
24:52 - 24:54
What's more dangerous is
24:55 - 24:56
they have the best
24:56 - 24:57
open-source model. And all
24:57 - 24:59
the American developers are
24:59 - 25:00
building on that. Right.
25:00 - 25:02
That's more dangerous
25:02 - 25:05
because then they get to own
25:05 - 25:06
the mindshare, the...
25:07 - 25:09
If the entire American AI
25:09 - 25:10
ecosystem – look,
25:10 - 25:12
in general, it's known that
25:12 - 25:13
once open-source has caught
25:12 - 25:13
up to or improved over
25:13 - 25:15
closed-source software, all
25:18 - 25:19
developers migrate to that.
25:20 - 25:21
It's historically known.
25:21 - 25:23
When Llama was being built
25:23 - 25:24
and becoming more widely
25:24 - 25:25
used, there was this
25:25 - 25:26
question: should we trust
25:26 - 25:27
Zuckerberg? But now the
25:27 - 25:29
question is, should we trust
25:29 - 25:30
China? That's a very– You
25:30 - 25:31
trust open-source. That's
25:31 - 25:33
the, like, it's not about who,
25:33 - 25:35
is it Zuckerberg, or is it...
25:35 - 25:36
Does it matter then if it's
25:37 - 25:37
Chinese, if it's
25:37 - 25:38
open-source?
25:39 - 25:41
Look, it doesn't matter in
25:41 - 25:43
the sense that you still
25:43 - 25:44
have full control.
25:45 - 25:46
You run it as your own,
25:47 - 25:48
like set of weights on your
25:48 - 25:50
own computer, you are in
25:50 - 25:52
charge of the model. But,
25:52 - 25:54
it's not a great look for
25:54 - 25:56
our own, like, talent to
25:57 - 25:58
rely on software built by...
26:00 - 26:01
Even if it's open-source,
26:01 - 26:04
there's always, like, a
26:04 - 26:05
point where open-source can
26:05 - 26:07
stop being open-source, too,
26:07 - 26:09
right? So the licenses are
26:09 - 26:10
very favorable today,
26:10 - 26:11
but if – you can close it –
26:11 - 26:13
exactly, over time,
26:13 - 26:15
they can always change the
26:15 - 26:16
license. So, it's important
26:16 - 26:18
that we actually have people
26:18 - 26:20
here in America building,
26:20 - 26:21
and that's why Meta is so
26:21 - 26:23
important. Like, look, I
26:23 - 26:25
still think Meta will build
26:25 - 26:26
a better model than
26:25 - 26:26
Deepseek v3 and open-source it,
26:28 - 26:29
and they'll call it Llama 4
26:29 - 26:31
or 3 point something,
26:31 - 26:33
doesn't matter, but I think
26:33 - 26:35
what is more key is that we
26:35 - 26:38
don't try to focus all our
26:38 - 26:41
energy on banning them,
26:41 - 26:42
stopping them, and just try
26:42 - 26:43
to outcompete and beat them.
26:43 - 26:44
That's just the
26:44 - 26:45
American way of doing things
26:46 - 26:47
better. And it feels like
26:47 - 26:48
there's, you know, we hear a
26:48 - 26:49
lot more about these Chinese
26:49 - 26:51
companies who are developing
26:51 - 26:52
in a similar way, a lot more
26:52 - 26:53
efficiently, a lot more cost
26:53 - 26:55
effectively, right? –Yeah,
26:55 - 26:56
again, like, look,
26:56 - 26:58
it's hard to fake scarcity,
26:58 - 27:01
right? If you raise $10
27:01 - 27:02
billion and you decide to
27:02 - 27:04
spend 80% of it on a compute
27:04 - 27:06
cluster, it's hard for you
27:06 - 27:07
to come up with the exact
27:07 - 27:08
same solution that someone
27:08 - 27:10
with $5 million would do.
27:10 - 27:12
And there's no point,
27:13 - 27:14
no need to, like, sort of
27:14 - 27:15
berate those who are putting
27:15 - 27:17
more money. They're trying
27:17 - 27:18
to do it as fast as they can.
27:18 - 27:19
When we say open-source,
27:19 - 27:20
there's so many different
27:20 - 27:21
versions. Some people
27:21 - 27:22
criticize Meta for not
27:22 - 27:23
publishing everything,
27:23 - 27:24
and even Deepseek itself
27:24 - 27:26
isn't totally transparent.
27:26 - 27:27
Yeah, you can go to the
27:27 - 27:28
limits of open-source and
27:28 - 27:30
say, I should exactly be
27:30 - 27:31
able to replicate your
27:31 - 27:33
training run. But first of
27:33 - 27:34
all, how many people even
27:34 - 27:36
have the resources to do
27:36 - 27:40
that? And I think the amount
27:40 - 27:41
of detail they've shared in
27:41 - 27:42
the technical report,
27:43 - 27:44
actually Meta did that too,
27:44 - 27:46
by the way, Meta's Llama 3.3
27:46 - 27:47
technical report is
27:47 - 27:48
incredibly detailed,
27:48 - 27:50
and very great for science.
27:51 - 27:52
So the amount of detail
27:51 - 27:52
these people are
27:53 - 27:54
sharing is already a lot
27:54 - 27:56
more than what the other
27:56 - 27:57
companies are doing right now.
27:57 - 27:58
When you think about how
27:58 - 27:59
much it costs Deepseek to do
27:59 - 28:01
this, less than $6 million,
28:01 - 28:03
I think about what OpenAI
28:03 - 28:05
has spent to develop GPT
28:05 - 28:07
models. What does that mean
28:07 - 28:08
for the closed-source model
28:09 - 28:10
ecosystem trajectory,
28:10 - 28:12
momentum? What does it mean...
28:13 - 28:15
I mean, it's very clear that
28:15 - 28:16
we'll have an open-source
28:17 - 28:19
version of GPT-4o, or even better
28:19 - 28:21
than that, and much cheaper
28:21 - 28:22
than that, open-source,
28:22 - 28:24
like completely this year.
28:24 - 28:25
Made by OpenAI?
28:26 - 28:27
Probably not. Most likely
28:27 - 28:29
not. And I don't think they
28:29 - 28:30
care if it's not made by
28:30 - 28:32
them. I think they've
28:32 - 28:33
already moved to a new
28:33 - 28:34
paradigm called the o1
28:34 - 28:38
family of models.
28:38 - 28:41
Like,
28:41 - 28:42
Ilya Sutskever came and
28:42 - 28:44
said, pre-training is a
28:44 - 28:45
wall, right?
28:45 - 28:48
So, I mean, he didn't
28:48 - 28:49
exactly use the word, but he
28:49 - 28:50
clearly said–yeah–the age of
28:50 - 28:51
pre-training is over.
28:51 - 28:51
–many people have said that
28:52 - 28:55
Right? So, that doesn't mean
28:55 - 28:56
scaling has hit a wall.
28:56 - 28:58
I think we're scaling on
28:58 - 28:59
different dimensions now.
28:59 - 29:00
The amount of time the model
29:00 - 29:01
spends thinking at test
29:01 - 29:03
time. Reinforcement
29:03 - 29:04
learning, like trying to,
29:04 - 29:06
like, make the model,
29:06 - 29:07
okay, if it doesn't know
29:07 - 29:09
what to do for a new prompt,
29:09 - 29:10
it'll go and reason and
29:10 - 29:11
collect data and interact
29:11 - 29:13
with the world,
29:13 - 29:14
use a bunch of tools.
29:14 - 29:15
I think that's where things
29:15 - 29:16
are headed, and I feel like
29:16 - 29:17
OpenAI is more focused on
29:17 - 29:19
that right now. Yeah.
29:19 - 29:20
–Instead of just the
29:20 - 29:21
bigger, better model?
29:21 - 29:22
Correct. –Reasoning
29:22 - 29:23
capacities. But didn't you
29:23 - 29:24
say that Deepseek is likely
29:24 - 29:25
to turn their attention to reasoning?
29:26 - 29:27
100%, I think they will.
29:28 - 29:31
And that's why I'm pretty
29:31 - 29:32
excited about what they'll
29:32 - 29:33
produce next.
29:34 - 29:35
I guess that's then my
29:35 - 29:37
question is sort of what's
29:37 - 29:38
OpenAI's moat now?
29:39 - 29:41
Well, I still think that
29:41 - 29:42
no one else has produced a
29:42 - 29:45
system similar to the o1
29:45 - 29:47
yet, exactly.
29:47 - 29:49
I know that there's debates
29:49 - 29:50
about whether o1 is actually
29:50 - 29:53
worth it. You know,
29:53 - 29:54
on maybe a few prompts,
29:54 - 29:55
it's really better. But like
29:55 - 29:56
most of the time, it's not
29:56 - 29:57
producing any differentiated
29:57 - 29:58
output from Sonnet.
29:59 - 30:01
But, at least the results
30:01 - 30:03
they showed in o3, where
30:03 - 30:05
they had like,
30:05 - 30:06
competitive coding
30:06 - 30:08
performance and almost like
30:08 - 30:09
an AI software engineer...
30:10 - 30:11
Isn't it just a matter of
30:11 - 30:12
time, though, before the
30:12 - 30:13
internet is filled with
30:13 - 30:16
reasoning data that
30:17 - 30:19
–yeah– Deepseek could train on?
30:17 - 30:19
Again, it's possible.
30:19 - 30:20
Nobody knows yet.
30:20 - 30:22
Yeah. So until it's done,
30:23 - 30:24
it's still uncertain right?
30:24 - 30:26
Right. So maybe that
30:26 - 30:27
uncertainty is their moat.
30:27 - 30:28
That, like, no one else has
30:28 - 30:30
the same reasoning
30:30 - 30:32
capability yet,
30:32 - 30:34
but will, by the end of this
30:34 - 30:36
year, there be multiple
30:36 - 30:37
players even in the
30:37 - 30:38
reasoning arena?
30:38 - 30:40
I absolutely think so.
30:40 - 30:41
So are we seeing the
30:41 - 30:43
commoditization of large
30:43 - 30:44
language models?
30:44 - 30:45
I think we will see a
30:45 - 30:47
similar trajectory,
30:49 - 30:50
just like how
30:49 - 30:50
pre-training and
30:50 - 30:50
post-training got
30:51 - 30:52
commoditized, this year
30:52 - 30:54
there will
30:54 - 30:57
be a lot more
30:57 - 30:57
commoditization there.
30:59 - 31:00
I think the reasoning kind
31:00 - 31:02
of models will go through a
31:02 - 31:04
similar trajectory where in
31:04 - 31:05
the beginning, 1 or 2
31:05 - 31:06
players really know how to
31:06 - 31:07
do it, but over time
31:07 - 31:08
–That's...
31:08 - 31:10
And who knows, right? Because
31:10 - 31:11
OpenAI could make another
31:11 - 31:12
advancement to focus on.
31:13 - 31:14
But right now reasoning is...
31:14 - 31:16
By the way, if advancements
31:16 - 31:18
keep happening again and
31:18 - 31:19
again and again, like,
31:20 - 31:21
I think the meaning of the
31:21 - 31:23
word advancement also loses
31:23 - 31:24
some of its value, right?
31:24 - 31:25
Totally. Even now it's very
31:25 - 31:26
difficult, right. Because
31:26 - 31:27
there's pre-training
31:27 - 31:28
advancements. Yeah.
31:28 - 31:29
And then we've moved into a
31:29 - 31:30
different phase.
31:30 - 31:32
Yeah, so what is guaranteed
31:32 - 31:33
to happen is whatever models
31:33 - 31:36
exist today, that level of
31:36 - 31:37
reasoning, that level of
31:37 - 31:40
multimodal capability in
31:40 - 31:41
like 5 or 10x cheaper
31:41 - 31:43
models, open source,
31:43 - 31:45
all that's going to happen.
31:45 - 31:46
It's just a matter of time.
31:46 - 31:49
What is unclear is if
31:49 - 31:50
something like a model that
31:50 - 31:53
reasons at test time will be
31:53 - 31:55
cheap enough that
31:55 - 31:56
we can just run it on our
31:56 - 31:58
phones. I think that's not
31:58 - 31:58
clear to me yet.
31:58 - 31:59
It feels like so much of the
31:59 - 32:00
landscape has changed with
32:00 - 32:01
what Deepseek was able to
32:02 - 32:03
prove. Could you call it
32:03 - 32:05
China's ChatGPT moment?
32:07 - 32:10
I mean, I think it certainly
32:10 - 32:11
probably gave them a lot of
32:11 - 32:13
confidence that, like,
32:14 - 32:16
you know, we're not really
32:16 - 32:17
behind no matter what you do
32:17 - 32:19
to restrict our compute.
32:19 - 32:21
Like, we can always figure
32:21 - 32:22
out some workarounds.
32:22 - 32:23
And, yeah, I'm sure the team
32:23 - 32:24
feels pumped about the...
32:26 - 32:27
How does this change,
32:27 - 32:28
like the investment
32:28 - 32:30
landscape, the hyperscalers
32:30 - 32:32
that are spending tens of
32:32 - 32:33
billions of dollars a year
32:33 - 32:34
on CapEx have just ramped it
32:34 - 32:36
up huge. And OpenAI and
32:36 - 32:37
Anthropic that are raising
32:37 - 32:38
billions of dollars for
32:38 - 32:39
GPUs, essentially.
32:39 - 32:41
But what Deepseek told us is
32:41 - 32:42
you don't need, you don't
32:42 - 32:44
necessarily need that.
32:45 - 32:47
I mean, look, I think it's
32:47 - 32:48
very clear that they're
32:48 - 32:50
going to go even harder on
32:50 - 32:53
reasoning because they
32:53 - 32:53
understand that, like,
32:53 - 32:54
whatever they were building
32:54 - 32:56
in the previous two years is
32:56 - 32:57
getting extremely cheap,
32:57 - 32:58
that it doesn't make sense
32:58 - 33:01
to go justify raising that–
33:01 - 33:02
Is the spending
33:02 - 33:03
proposition the same? Do
33:03 - 33:05
they need the same amount
33:05 - 33:07
of, you know, high end GPUs,
33:07 - 33:08
or can you reason using the
33:08 - 33:09
lower-end ones that...
33:10 - 33:11
Again, it's hard to say no
33:11 - 33:13
until proven it's not.
33:14 - 33:17
But I guess, like in the
33:17 - 33:19
spirit of moving fast,
33:19 - 33:20
you would want to use the
33:20 - 33:22
high end chips, and you
33:22 - 33:24
would want to, like, move
33:24 - 33:24
faster than your
33:24 - 33:26
competitors. I think,
33:26 - 33:27
like the best talent still
33:27 - 33:28
wants to work in the team
33:28 - 33:31
that made it happen first.
33:31 - 33:32
You know, there's always
33:32 - 33:33
some glory to like, who did
33:33 - 33:34
this, actually? Like, who's
33:34 - 33:36
the real pioneer? Versus
33:36 - 33:38
who's the fast follower, right?
33:38 - 33:39
That was like kind of like
33:39 - 33:41
Sam Altman's tweet kind of
33:41 - 33:43
veiled response to what
33:43 - 33:44
Deepseek has been able to do;
33:44 - 33:45
he kind of implied that they
33:45 - 33:46
just copied, and anyone can...
33:47 - 33:48
Right? Yeah, but then you
33:48 - 33:50
can always say that, like,
33:50 - 33:51
everybody copies everybody
33:51 - 33:53
in this field.
33:53 - 33:54
You can say Google did the
33:54 - 33:56
transformer first. It's not
33:56 - 33:57
OpenAI, and OpenAI just
33:57 - 33:59
copied it. Google built the
33:59 - 34:01
first large language models.
34:01 - 34:02
They didn't productize it,
34:02 - 34:04
but OpenAI did it in a
34:04 - 34:05
productized way. So you can
34:06 - 34:09
say all this in many ways,
34:09 - 34:09
it doesn't matter.
34:09 - 34:11
I remember asking you being
34:11 - 34:12
like, you know, why don't
34:12 - 34:13
you want to build the model?
34:13 - 34:14
Yeah, that's,
34:14 - 34:16
you know, the glory. And a
34:16 - 34:18
year later, just one year
34:18 - 34:19
later, you look very,
34:19 - 34:21
very smart to not engage in
34:21 - 34:23
that extremely expensive
34:23 - 34:24
race that has become so
34:24 - 34:25
competitive. And you kind of
34:25 - 34:27
have this lead now in what
34:27 - 34:28
everyone wants to see now,
34:28 - 34:30
which is like real world
34:30 - 34:31
applications, killer
34:31 - 34:33
applications of generative
34:33 - 34:35
AI. Talk a little bit about
34:35 - 34:37
like that decision and how
34:37 - 34:38
that's sort of guided you, and
34:39 - 34:40
where you see Perplexity
34:40 - 34:41
going from here.
34:41 - 34:43
Look, one year ago,
34:43 - 34:45
I don't even think we had
34:45 - 34:47
something like,
34:47 - 34:51
this is what, like 2024
34:51 - 34:54
beginning, right? I feel
34:54 - 34:54
like we didn't even have
34:54 - 34:56
something like Sonnet 3.5,
34:56 - 34:58
right? We had GPT-4,
34:58 - 35:00
I believe, and it was kind
35:00 - 35:01
of, nobody else was able to
35:01 - 35:03
catch up to it. Yeah.
35:03 - 35:05
But there was no multimodal,
35:05 - 35:08
nothing, and my sense was
35:08 - 35:09
like, okay, if people with
35:09 - 35:10
way more resources and way
35:10 - 35:12
more talent cannot catch up,
35:12 - 35:14
it's very difficult to play
35:14 - 35:15
that game. So let's play a
35:15 - 35:17
different game. Anyway,
35:17 - 35:18
people want to use these
35:18 - 35:21
models. And there's one use
35:21 - 35:22
case of asking questions and
35:22 - 35:23
getting accurate answers
35:23 - 35:25
with sources, with real time
35:25 - 35:27
information, accurate
35:27 - 35:28
information.
35:28 - 35:30
There's still a lot of work
35:30 - 35:31
there to do outside the
35:31 - 35:33
model, and making sure the
35:33 - 35:34
product works reliably,
35:34 - 35:36
keep scaling it up to usage.
35:36 - 35:38
Keep building custom UIs,
35:38 - 35:39
there's just a lot of work
35:39 - 35:40
to do, and we would focus on
35:40 - 35:42
that, and we would benefit
35:42 - 35:43
from all the tailwinds of
35:43 - 35:44
models getting better and
35:44 - 35:46
better. That's essentially
35:46 - 35:47
what happened, in fact, I
35:47 - 35:50
would say, Sonnet 3.5 made
35:50 - 35:51
our product so good,
35:51 - 35:54
in the sense that if you use
35:54 - 35:56
Sonnet 3.5 as the model
35:56 - 35:58
choice within Perplexity,
35:59 - 36:00
it's very difficult to find
36:00 - 36:01
a hallucination. I'm not
36:01 - 36:03
saying it's impossible,
36:04 - 36:06
but it dramatically reduced
36:06 - 36:08
the rate of hallucinations,
36:08 - 36:10
which meant, the problem of
36:10 - 36:11
question-answering,
36:11 - 36:12
asking a question, getting
36:12 - 36:13
an answer, doing fact
36:13 - 36:15
checks, research, going and
36:15 - 36:16
asking anything out there
36:16 - 36:17
because almost all the
36:17 - 36:18
information is on the
36:18 - 36:21
web, was such a big unlock.
36:22 - 36:24
And that helped us grow 10x
36:24 - 36:24
over the course of the year
36:24 - 36:25
in terms of usage.
36:25 - 36:26
And you've made huge strides
36:27 - 36:28
in terms of users,
36:28 - 36:29
and you know, we hear on
36:29 - 36:30
CNBC a lot, like big
36:30 - 36:32
investors who are huge fans.
36:32 - 36:33
Yeah. Jensen Huang himself
36:33 - 36:34
right? He mentioned it the
36:34 - 36:35
other night, in his keynote.
36:35 - 36:37
Yeah. The other night.
36:37 - 36:38
He's a pretty regular user,
36:38 - 36:39
actually, he's not just
36:39 - 36:40
saying it. He's actually a
36:40 - 36:41
pretty regular user.
36:42 - 36:43
So, a year ago we weren't
36:43 - 36:44
even talking about
36:44 - 36:45
monetization because you
36:45 - 36:46
guys were just so new and
36:46 - 36:48
you wanted to, you know,
36:48 - 36:49
get yourselves out there and
36:49 - 36:50
build some scale, but now
36:50 - 36:51
you are looking at things
36:51 - 36:53
like that, increasingly an
36:53 - 36:54
ad model, right?
36:54 - 36:55
Yeah, we're experimenting.
36:56 - 36:58
I know there's some
36:58 - 37:00
controversy on like,
37:00 - 37:01
why should we do ads?
37:01 - 37:03
Whether you can have a
37:03 - 37:04
truthful answer engine
37:04 - 37:05
despite having ads.
37:06 - 37:08
And in my opinion,
37:08 - 37:10
we've been pretty
37:10 - 37:11
proactively thoughtful about
37:11 - 37:12
it where we said,
37:13 - 37:14
okay, as long as the answer
37:14 - 37:15
is always accurate,
37:15 - 37:17
unbiased and not corrupted
37:17 - 37:19
by someone's advertising
37:19 - 37:21
budget, you only get to see
37:21 - 37:23
some sponsored questions,
37:23 - 37:24
and even the answers to
37:24 - 37:25
those sponsored questions
37:25 - 37:27
are not influenced by them,
37:27 - 37:30
and questions are also not
37:30 - 37:31
picked in a way where it's
37:31 - 37:33
manipulative. Sure,
37:34 - 37:35
there are some things that
37:35 - 37:36
the advertiser also wants,
37:36 - 37:37
which is they want you to
37:37 - 37:38
know about their brand, and
37:38 - 37:39
they want you to know the
37:39 - 37:41
best parts of their brand,
37:41 - 37:42
just like how you go,
37:42 - 37:43
and if you're introducing
37:43 - 37:44
yourself to someone you want
37:44 - 37:45
to, you want them to see the
37:45 - 37:47
best parts of you, right?
37:47 - 37:48
So that's all there.
37:48 - 37:50
But you still don't have to
37:50 - 37:51
click on a sponsored
37:51 - 37:53
question. You can ignore it.
37:53 - 37:54
And we're only charging them
37:54 - 37:55
CPM right now.
37:55 - 37:57
So we ourselves
37:57 - 37:58
are not even incentivized to
37:58 - 37:59
make you click yet.
38:00 - 38:02
So I think considering all
38:02 - 38:03
this, we're actually trying
38:03 - 38:05
to get it right long term.
38:05 - 38:06
Instead of going the Google
38:06 - 38:08
way of forcing you to click
38:08 - 38:08
on links. I remember when
38:08 - 38:09
people were talking about
38:09 - 38:10
the commoditization of
38:10 - 38:11
models a year ago and you
38:11 - 38:12
thought, oh, it was
38:12 - 38:14
controversial, but now it's
38:14 - 38:15
not controversial. It's kind
38:15 - 38:16
of like that's happening and
38:16 - 38:17
you keeping your eye on
38:17 - 38:19
that is smart.
38:19 - 38:20
By the way, we benefit a lot
38:20 - 38:22
from model commoditization,
38:22 - 38:23
except we also need to
38:23 - 38:24
figure out something to
38:24 - 38:26
offer to the paid users,
38:26 - 38:27
like a more sophisticated
38:27 - 38:29
research agent that can do
38:29 - 38:30
like multi-step reasoning,
38:30 - 38:31
go and like do like 15
38:31 - 38:32
minutes worth of searching
38:32 - 38:34
and give you like an
38:34 - 38:35
analysis, an analyst type of
38:35 - 38:37
answer. All that's going to
38:37 - 38:38
come, all that's going to
38:38 - 38:39
stay in the product. Nothing
38:39 - 38:41
changes there. But there's a
38:41 - 38:43
ton of questions every free
38:43 - 38:45
user asks on a day-to-day basis
38:43 - 38:45
that need quick,
38:46 - 38:48
fast answers, like it
38:48 - 38:49
shouldn't be slow,
38:49 - 38:51
and all that will be free,
38:51 - 38:52
whether you like it or not,
38:52 - 38:53
it has to be free. That's
38:53 - 38:54
what people are used to.
38:55 - 38:57
And that means like figuring
38:57 - 38:58
out a way to make that free
38:58 - 39:00
traffic also monetizable.
39:00 - 39:01
So you're not trying to
39:01 - 39:02
change user habits. But it's
39:02 - 39:03
interesting because you are
39:03 - 39:04
kind of trying to teach new
39:04 - 39:05
habits to advertisers.
39:05 - 39:07
They can't have everything
39:07 - 39:08
that they have in a Google
39:08 - 39:09
ten blue links search.
39:09 - 39:10
What's the response been
39:10 - 39:11
from them so far? Are they
39:11 - 39:12
willing to accept some of
39:12 - 39:13
the trade offs?
39:13 - 39:14
Yeah, I mean that's why they
39:14 - 39:17
are trying stuff. Like, Intuit
39:17 - 39:18
is working with us.
39:18 - 39:20
And then there's many other
39:20 - 39:23
brands. Dell, like all these
39:23 - 39:24
people are working with us
39:24 - 39:25
to test, right?
39:26 - 39:27
They're also excited about,
39:28 - 39:30
look, everyone knows that,
39:30 - 39:31
like, whether you like it or
39:31 - 39:33
not, 5 or 10 years from now,
39:33 - 39:34
most people are going to be
39:34 - 39:36
asking AIs most of the
39:36 - 39:37
things, and not on the
39:37 - 39:38
traditional search engine,
39:38 - 39:40
everybody understands that.
39:40 - 39:43
So everybody wants to be
39:43 - 39:45
early adopters of the new
39:45 - 39:47
platforms, new UX,
39:47 - 39:48
and learn from it,
39:48 - 39:49
and build things together.
39:49 - 39:51
Not like they're not viewing
39:51 - 39:52
it as like, okay, you guys
39:52 - 39:53
go figure out everything
39:53 - 39:54
else and then we'll come...
39:55 - 39:56
I'm smiling because it goes
39:56 - 39:57
back perfectly to the point
39:57 - 39:58
you made when you first sat
39:58 - 40:00
down today, which is
40:00 - 40:01
necessity is the mother of
40:01 - 40:03
all invention,
40:03 - 40:03
right? And that's what
40:03 - 40:04
advertisers are essentially
40:04 - 40:05
looking at. They're saying
40:05 - 40:06
this field is changing.
40:06 - 40:07
We have to learn to adapt
40:07 - 40:09
with it. Okay,
40:09 - 40:10
Aravind, I took up so much of
40:10 - 40:11
your time. Thank you so much
40:11 - 40:12
for taking the time.