00:00 - 00:01

China's latest AI

00:01 - 00:03

breakthrough has leapfrogged

00:03 - 00:04

the world.

00:04 - 00:05

I think we should take the

00:05 - 00:06

development out of China

00:06 - 00:07

very, very seriously.

00:08 - 00:09

A game changing move that

00:09 - 00:11

does not come from OpenAI,

00:11 - 00:13

Google or Meta.

00:13 - 00:14

There is a new model that

00:14 - 00:16

has all of the valley

00:16 - 00:16

buzzing.

00:17 - 00:18

But from a Chinese lab

00:18 - 00:20

called DeepSeek.

00:20 - 00:22

It's opened a lot of eyes of

00:22 - 00:23

like what is actually

00:23 - 00:24

happening in AI in China.

00:25 - 00:26

What took Google and OpenAI

00:26 - 00:27

years and hundreds of

00:27 - 00:28

millions of dollars to

00:28 - 00:30

build... DeepSeek says took

00:30 - 00:32

it just two months and less

00:32 - 00:34

than $6 million.

00:34 - 00:35

They have the best open

00:35 - 00:37

source model, and all the

00:37 - 00:38

American developers are

00:38 - 00:39

building on that.

00:39 - 00:40

I'm Deirdre Bosa with the

00:40 - 00:42

TechCheck take... China's

00:42 - 00:43

AI breakthrough.

00:53 - 00:55

It was a technological leap

00:55 - 00:57

that shocked Silicon Valley.

00:57 - 00:59

A newly unveiled free,

00:59 - 01:01

open-source AI model that

01:01 - 01:02

beats some of the most

01:02 - 01:03

powerful ones on the market.

01:03 - 01:04

But it wasn't a new launch

01:04 - 01:06

from OpenAI or model

01:06 - 01:07

announcement from Anthropic.

01:07 - 01:09

This one was built in the

01:09 - 01:11

East by a Chinese research

01:11 - 01:12

lab called DeepSeek.

01:13 - 01:14

And the details behind its

01:14 - 01:16

development stunned top AI

01:16 - 01:17

researchers here in the U.S.

01:17 - 01:19

First, the cost.

01:19 - 01:20

The AI lab reportedly spent

01:20 - 01:22

just $5.6 million to

01:22 - 01:24

build DeepSeek-V3.

01:24 - 01:26

Compare that to OpenAI,

01:26 - 01:27

which is spending $5 billion

01:27 - 01:28

a year, and Google,

01:28 - 01:30

which expects capital

01:30 - 01:32

expenditures in 2024 to soar

01:32 - 01:34

to over $50 billion.

01:34 - 01:35

And then there's Microsoft

01:35 - 01:36

that shelled out more than

01:36 - 01:39

$13 billion just to invest

01:39 - 01:40

in OpenAI.

01:40 - 01:42

But even more stunning: how

01:42 - 01:43

DeepSeek's scrappier model

01:43 - 01:45

was able to outperform the

01:45 - 01:46

lavishly-funded American

01:46 - 01:47

ones.

01:47 - 01:48

To see the DeepSeek

01:49 - 01:51

new model, it's super

01:51 - 01:52

impressive in terms of both

01:52 - 01:53

how they have really

01:53 - 01:55

effectively done an

01:55 - 01:56

open-source model that does

01:56 - 01:58

this inference-time

01:58 - 01:59

compute, and it's super

01:59 - 02:00

compute efficient.

02:00 - 02:02

It beat Meta's Llama,

02:02 - 02:04

OpenAI's GPT-4o and

02:04 - 02:05

Anthropic's Claude 3.5

02:05 - 02:07

Sonnet on accuracy on

02:07 - 02:08

wide-ranging tests.

02:08 - 02:09

A subset of 500 math

02:09 - 02:11

problems, an AI math

02:11 - 02:12

evaluation, coding

02:12 - 02:14

competitions, and a test of

02:14 - 02:16

spotting and fixing bugs in

02:16 - 02:18

code. Quickly following that

02:18 - 02:19

up with a new reasoning

02:19 - 02:20

model called R1,

02:20 - 02:22

which just as easily

02:22 - 02:23

outperformed OpenAI's

02:23 - 02:25

cutting-edge o1 in some of

02:25 - 02:26

those third-party tests.

02:26 - 02:29

Today we released Humanity's

02:29 - 02:31

Last Exam, which is a new

02:31 - 02:32

evaluation or benchmark of

02:32 - 02:34

AI models that we produced

02:34 - 02:36

by getting math,

02:36 - 02:37

physics, biology,

02:37 - 02:39

chemistry professors to

02:39 - 02:40

provide the hardest

02:40 - 02:41

questions they could

02:41 - 02:42

possibly imagine. DeepSeek,

02:42 - 02:44

which is the leading Chinese

02:44 - 02:47

AI lab, their model is

02:47 - 02:48

actually the top performing,

02:48 - 02:50

or roughly on par with the

02:50 - 02:51

best American models.

02:51 - 02:52

They accomplished all that

02:52 - 02:53

despite the strict

02:53 - 02:54

semiconductor restrictions

02:54 - 02:55

that the U.S. government

02:55 - 02:57

has imposed on China,

02:57 - 02:58

which has essentially

02:58 - 02:59

shackled the amount of

02:59 - 03:01

computing power. Washington

03:01 - 03:02

has drawn a hard line

03:02 - 03:03

against China in the AI

03:03 - 03:05

race. Cutting the country

03:05 - 03:06

off from receiving America's

03:06 - 03:08

most powerful chips like...

03:08 - 03:10

Nvidia's H100 GPUs.

03:10 - 03:11

Those were once thought to

03:11 - 03:13

be essential to building a

03:13 - 03:14

competitive AI model.

03:15 - 03:16

With startups and big tech

03:16 - 03:17

firms alike scrambling to

03:17 - 03:18

get their hands on any

03:18 - 03:20

available. But DeepSeek

03:20 - 03:21

turned that on its head.

03:21 - 03:22

Side-stepping the rules by

03:22 - 03:24

using Nvidia's less

03:24 - 03:27

performant H800s to build

03:27 - 03:29

the latest model and showing

03:29 - 03:30

that the chip export

03:30 - 03:31

controls were not the

03:31 - 03:33

chokehold D.C. intended.

03:33 - 03:33

They

03:33 - 03:34

were able to take whatever

03:34 - 03:36

hardware they were trained

03:36 - 03:37

on, but use it way more

03:37 - 03:38

efficiently.

03:38 - 03:40

But just who's behind

03:40 - 03:42

DeepSeek anyway? Despite its

03:42 - 03:43

breakthrough, very,

03:43 - 03:45

very little is known about

03:45 - 03:46

its lab and its founder,

03:46 - 03:47

Liang Wenfeng.

03:48 - 03:49

According to Chinese media

03:49 - 03:50

reports, Deepseek was born

03:50 - 03:52

out of a Chinese hedge fund

03:52 - 03:53

called High-Flyer Quant.

03:53 - 03:55

That manages about $8

03:55 - 03:56

billion in assets.

03:56 - 03:57

The mission on its

03:57 - 03:58

developer site reads

03:58 - 04:00

simply: "unravel the mystery

04:00 - 04:02

of AGI with curiosity.

04:03 - 04:04

Answer the essential

04:04 - 04:06

question with long-termism."

04:06 - 04:08

The leading American AI

04:08 - 04:09

startups, meanwhile – OpenAI

04:09 - 04:11

and Anthropic – they have

04:11 - 04:12

detailed charters and

04:12 - 04:13

constitutions that lay out

04:13 - 04:14

their principles and their

04:14 - 04:15

founding missions,

04:15 - 04:17

like these sections on AI

04:17 - 04:19

safety and responsibility.

04:19 - 04:20

Despite several attempts to

04:20 - 04:22

reach someone at DeepSeek,

04:22 - 04:24

we never got a response.

04:24 - 04:26

How did they actually

04:26 - 04:27

assemble this talent?

04:27 - 04:28

How did they assemble all

04:28 - 04:29

the hardware? How did they

04:29 - 04:30

assemble the data to do all

04:30 - 04:32

this? We don't know, and

04:32 - 04:33

it's never been publicized,

04:33 - 04:34

and hopefully we can learn

04:34 - 04:34

that.

04:35 - 04:36

But the mystery brings into

04:36 - 04:38

sharp relief just how urgent

04:38 - 04:40

and complex the AI face-off

04:40 - 04:42

against China has become.

04:42 - 04:43

Because it's not just

04:43 - 04:44

DeepSeek. Other,

04:44 - 04:45

more well-known Chinese AI

04:45 - 04:47

models have carved out

04:47 - 04:48

positions in the race with

04:48 - 04:49

limited resources as well.

04:50 - 04:51

Kai-Fu Lee, he's one of the

04:51 - 04:53

leading AI researchers in

04:53 - 04:54

China, formerly leading

04:54 - 04:55

Google's operations there.

04:55 - 04:57

Now, his startup,

04:57 - 04:59

01.AI, it's

04:59 - 05:00

attracting attention,

05:00 - 05:01

becoming a unicorn just

05:01 - 05:02

eight months after founding

05:02 - 05:04

and bringing in almost $14

05:04 - 05:06

million in revenue in 2024.

05:06 - 05:08

The thing that shocks my

05:08 - 05:09

friends in the Silicon

05:09 - 05:10

Valley is not just our

05:10 - 05:12

performance, but that we

05:12 - 05:14

trained the model with only

05:14 - 05:17

$3 million, and GPT-4 was

05:17 - 05:18

trained with $80 to $100

05:18 - 05:18

million.

05:19 - 05:20

Trained with just three

05:20 - 05:22

million dollars. Alibaba's

05:22 - 05:23

Qwen, meanwhile, cut costs

05:23 - 05:25

by as much as 85% on its

05:25 - 05:27

large language models in a

05:27 - 05:27

bid to attract more

05:27 - 05:29

developers and signaling

05:29 - 05:35

that the race is on.

05:37 - 05:38

China's breakthrough

05:38 - 05:40

undermines the lead that our

05:40 - 05:42

AI labs were once thought to

05:42 - 05:44

have. In early 2024,

05:44 - 05:45

former Google CEO Eric

05:45 - 05:46

Schmidt predicted China

05:46 - 05:48

was 2 to 3 years behind the

05:48 - 05:50

U.S. in AI.

05:50 - 05:51

But now, Schmidt is singing

05:51 - 05:52

a different tune.

05:52 - 05:54

Here he is on ABC's "This

05:54 - 05:54

Week."

05:55 - 05:56

I used to think we were a

05:56 - 05:57

couple of years ahead of

05:57 - 05:59

China, but China has caught

05:59 - 06:01

up in the last six months in

06:01 - 06:02

a way that is remarkable.

06:02 - 06:03

The fact of the matter is

06:03 - 06:06

that a couple of the Chinese

06:06 - 06:07

programs, one,

06:07 - 06:08

for example, is called

06:08 - 06:10

DeepSeek, looks like they've

06:10 - 06:11

caught up.

06:11 - 06:12

It raises major questions

06:12 - 06:15

about just how wide

06:15 - 06:16

OpenAI's moat really is.

06:16 - 06:17

Back when OpenAI released

06:17 - 06:19

ChatGPT to the world in

06:19 - 06:20

November of 2022,

06:21 - 06:22

it was unprecedented and

06:22 - 06:24

uncontested.

06:24 - 06:25

Now, the company faces not

06:25 - 06:26

only the international

06:26 - 06:27

competition from Chinese

06:27 - 06:29

models, but fierce domestic

06:29 - 06:30

competition from Google's

06:30 - 06:32

Gemini, Anthropic's Claude,

06:32 - 06:33

and Meta's open-source Llama

06:33 - 06:35

model. And now the game has

06:35 - 06:36

changed. The widespread

06:36 - 06:38

availability of powerful

06:38 - 06:40

open-source models allows

06:40 - 06:41

developers to skip the

06:41 - 06:43

demanding, capital-intensive

06:43 - 06:45

steps of building and

06:45 - 06:46

training models themselves.

06:46 - 06:48

Now they can build on top of

06:48 - 06:49

existing models,

06:49 - 06:51

making it significantly

06:51 - 06:52

easier to jump to the

06:52 - 06:53

frontier, that is, the front

06:53 - 06:55

of the race, with a smaller

06:55 - 06:57

budget and a smaller team.

06:57 - 06:59

In the last two weeks,

06:59 - 07:01

AI research teams have

07:01 - 07:04

really opened their eyes and

07:04 - 07:05

have become way more

07:05 - 07:07

ambitious on what's possible

07:07 - 07:08

with a lot less capital.

07:09 - 07:10

So previously,

07:11 - 07:12

to get to the frontier,

07:13 - 07:13

you would have to think

07:13 - 07:14

about hundreds of millions

07:14 - 07:16

of dollars of investment and

07:16 - 07:17

perhaps a billion dollars of

07:17 - 07:18

investment. What DeepSeek

07:18 - 07:19

has now done here in Silicon

07:19 - 07:21

Valley is it's opened our

07:21 - 07:22

eyes to what you can

07:22 - 07:24

actually accomplish with 10,

07:24 - 07:26

15, 20, or 30 million

07:26 - 07:26

dollars.

07:27 - 07:28

It also means any company,

07:28 - 07:30

like OpenAI, that claims the

07:30 - 07:32

frontier today... could lose

07:32 - 07:34

it tomorrow. That's how

07:34 - 07:35

DeepSeek was able to catch

07:35 - 07:36

up so quickly. It started

07:36 - 07:37

building on the existing

07:37 - 07:39

frontier of AI,

07:39 - 07:40

its approach focusing on

07:40 - 07:41

iterating on existing

07:41 - 07:43

technology rather than

07:43 - 07:44

reinventing the wheel.

07:44 - 07:47

They can take a really good

07:47 - 07:49

big model and use a process

07:49 - 07:50

called distillation. And

07:50 - 07:51

what distillation is,

07:51 - 07:53

basically you use a very

07:53 - 07:56

large model to help your

07:56 - 07:57

small model get smart at the

07:57 - 07:58

thing that you want it to

07:58 - 07:59

get smart at. And that's

07:59 - 08:00

actually very cost

08:00 - 08:01

efficient.
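
The distillation idea he describes can be sketched in a few lines of Python. This is only an illustration of the concept, not DeepSeek's actual training code; the logits and temperature are invented, and a real system would backpropagate through the student:

```python
import math

def softmax(logits, temperature=1.0):
    # Convert raw scores to probabilities; a higher temperature
    # softens the distribution, exposing more of the teacher's knowledge.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Cross-entropy between the teacher's softened output distribution
    # and the student's: the small model is trained to match the big
    # model's outputs rather than learning everything from raw data.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(p_teacher, p_student))

# A student whose outputs track the teacher's incurs a lower loss.
teacher = [4.0, 1.0, -2.0]
close_student = [3.5, 1.2, -1.8]
far_student = [-2.0, 1.0, 4.0]
assert distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student)
```

The cost efficiency comes from the fact that only the small student needs training; the large teacher just runs inference to produce targets.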

08:01 - 08:03

It closed the gap by using

08:03 - 08:04

available data sets,

08:04 - 08:06

applying innovative tweaks,

08:06 - 08:07

and leveraging existing

08:07 - 08:09

models. So much so,

08:09 - 08:10

that DeepSeek's model has

08:10 - 08:12

run into an identity crisis.

08:13 - 08:14

It's convinced that it's

08:14 - 08:16

ChatGPT. When you ask it

08:16 - 08:17

directly, "what model are

08:17 - 08:19

you?" DeepSeek responds...

08:19 - 08:20

I'm an AI language model

08:20 - 08:21

created by OpenAI,

08:22 - 08:23

specifically based on the

08:23 - 08:25

GPT-4 architecture.

08:25 - 08:26

Leading OpenAI CEO Sam

08:26 - 08:28

Altman to post in a thinly

08:28 - 08:30

veiled shot at DeepSeek just

08:30 - 08:31

days after the model was

08:31 - 08:33

released. "It's relatively

08:33 - 08:34

easy to copy something that

08:34 - 08:35

you know works.

08:35 - 08:36

It's extremely hard to do

08:36 - 08:37

something new,

08:37 - 08:39

risky, and difficult when

08:39 - 08:40

you don't know if it will

08:40 - 08:42

work." But that's not

08:42 - 08:43

exactly what DeepSeek did.

08:44 - 08:45

It emulated GPT by

08:45 - 08:47

leveraging OpenAI's existing

08:47 - 08:48

outputs and architecture

08:48 - 08:49

principles, while quietly

08:49 - 08:50

introducing its own

08:50 - 08:51

enhancements, really

08:51 - 08:53

blurring the line between

08:53 - 08:54

itself and ChatGPT.

08:55 - 08:56

It all puts pressure on a

08:56 - 08:57

closed source leader like

08:57 - 08:58

OpenAI to justify its

08:58 - 09:00

costlier model as

09:00 - 09:01

potentially nimbler

09:01 - 09:02

competitors emerge.

09:02 - 09:03

Everybody copies everybody

09:03 - 09:05

in this field.

09:05 - 09:07

You can say Google did the

09:07 - 09:08

transformer first. It's not

09:08 - 09:10

OpenAI and OpenAI just

09:10 - 09:12

copied it. Google built the

09:12 - 09:13

first large language models.

09:13 - 09:14

They didn't productize it,

09:14 - 09:16

but OpenAI did it in a

09:16 - 09:19

productized way. So you can

09:19 - 09:21

say all this in many ways.

09:21 - 09:22

It doesn't matter.

09:22 - 09:24

So if everyone is copying

09:24 - 09:25

one another, it raises the

09:25 - 09:28

question, is massive spend

09:28 - 09:31

on individual LLMs even a

09:31 - 09:32

good investment anymore?

09:32 - 09:34

Now, no one has as much at

09:34 - 09:35

stake as OpenAI.

09:35 - 09:36

The startup raised over $6

09:36 - 09:38

billion in its last funding

09:38 - 09:39

round alone. But,

09:39 - 09:41

the company has yet to turn

09:41 - 09:43

a profit. And with its core

09:43 - 09:44

business centered on

09:44 - 09:45

building the models -

09:45 - 09:46

it's much more exposed than

09:46 - 09:47

companies like Google and

09:47 - 09:49

Amazon, who have cloud and

09:49 - 09:51

ad businesses bankrolling

09:51 - 09:53

their spend. For OpenAI,

09:53 - 09:54

reasoning will be key.

09:54 - 09:56

A model that thinks before

09:56 - 09:57

it generates a response,

09:57 - 09:58

going beyond pattern

09:58 - 09:59

recognition to analyze,

09:59 - 10:01

draw logical conclusions,

10:01 - 10:02

and solve really complex

10:02 - 10:04

problems. For now,

10:04 - 10:05

the startup's o1 reasoning

10:05 - 10:07

model is still cutting edge.

10:08 - 10:09

But for how long?

10:09 - 10:10

Researchers at Berkeley

10:10 - 10:11

showed that they could build

10:11 - 10:13

a reasoning model for $450

10:13 - 10:15

just last week. So you can

10:15 - 10:16

actually create these models

10:16 - 10:18

that do thinking for much,

10:18 - 10:19

much less. You don't need

10:19 - 10:21

those huge amounts of compute to

10:21 - 10:22

pre-train the models. So I

10:22 - 10:24

think the game is shifting.

10:24 - 10:26

It means that staying on top

10:26 - 10:27

may require as much

10:27 - 10:29

creativity as capital.

10:29 - 10:31

DeepSeek's breakthrough also

10:31 - 10:32

comes at a very tricky time

10:32 - 10:33

for the AI darling.

10:33 - 10:35

Just as OpenAI is moving to

10:35 - 10:37

a for-profit model and

10:37 - 10:39

facing unprecedented brain

10:39 - 10:41

drain. Can it raise more

10:41 - 10:42

money at ever higher

10:42 - 10:43

valuations if the game is

10:43 - 10:44

changing? As Chamath

10:44 - 10:46

Palihapitiya puts it...

10:46 - 10:47

let me say the quiet part

10:47 - 10:49

out loud: AI model building

10:49 - 10:51

is a money trap.

10:58 - 10:59

Those chip restrictions from

10:59 - 11:00

the U.S. government, they

11:00 - 11:03

were intended to slow down

11:03 - 11:04

the race. To keep American

11:04 - 11:06

tech on American ground,

11:06 - 11:07

to stay ahead in the race.

11:07 - 11:08

What we want to do is we

11:08 - 11:09

want to keep it in this

11:09 - 11:10

country. China is a

11:10 - 11:11

competitor and others are

11:11 - 11:12

competitors.

11:12 - 11:14

So instead, the restrictions

11:14 - 11:15

might have been just what

11:15 - 11:16

China needed.

11:16 - 11:17

Necessity is the mother of

11:17 - 11:19

invention.

11:19 - 11:22

Because they had to go

11:22 - 11:24

figure out workarounds,

11:25 - 11:25

they actually ended up

11:25 - 11:26

building something a lot

11:26 - 11:27

more efficient.

11:27 - 11:28

It's really remarkable the

11:28 - 11:29

amount of progress they've

11:29 - 11:31

made with as little capital

11:32 - 11:33

as it's taken them to make

11:33 - 11:34

that progress.

11:34 - 11:35

It drove them to get

11:35 - 11:36

creative. With huge

11:36 - 11:38

implications. DeepSeek is an

11:38 - 11:39

open-source model, meaning

11:39 - 11:41

that developers have full

11:41 - 11:42

access and they can

11:42 - 11:43

customize its weights or

11:43 - 11:44

fine-tune it to their

11:44 - 11:44

liking.

11:45 - 11:46

It's known that once open

11:46 - 11:47

source has caught up to or

11:48 - 11:49

improved over closed-source

11:49 - 11:52

software, all developers

11:52 - 11:53

migrate to that.

11:53 - 11:53

But

11:53 - 11:55

the key is that it's also

11:55 - 11:57

inexpensive. The lower the

11:57 - 11:58

cost, the more attractive it

11:58 - 12:00

is for developers to adopt.

12:00 - 12:01

The bottom line is our

12:01 - 12:03

inference cost is 10 cents

12:03 - 12:05

per million tokens,

12:05 - 12:07

and that's 1/30th of what

12:07 - 12:08

the typical comparable model

12:08 - 12:10

charges. Where's it going?

12:10 - 12:11

Well, the 10 cents

12:11 - 12:14

would lead to building apps

12:14 - 12:15

for much lower costs.

12:15 - 12:17

So if you wanted to build a

12:17 - 12:19

You.com or Perplexity or some

12:19 - 12:21

other app, you can either

12:21 - 12:23

pay OpenAI $4.40 per million

12:23 - 12:26

tokens, or if you have our

12:26 - 12:27

model, it costs you just 10

12:27 - 12:27

cents.
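
The pricing gap he quotes is straightforward arithmetic. A quick sketch using only the figures from the interview ($4.40 vs. 10 cents per million tokens; the monthly token volume is an invented example):

```python
# Dollars per million tokens, as quoted in the interview.
OPENAI_PRICE = 4.40
DEEPSEEK_PRICE = 0.10

def app_cost(tokens_in_millions, price_per_million):
    # Inference bill for an app processing this many million tokens.
    return tokens_in_millions * price_per_million

# Hypothetical app processing 1 billion tokens a month:
monthly_tokens = 1_000  # in millions
print(app_cost(monthly_tokens, OPENAI_PRICE))    # 4400.0 dollars
print(app_cost(monthly_tokens, DEEPSEEK_PRICE))  # 100.0 dollars
print(round(OPENAI_PRICE / DEEPSEEK_PRICE))      # 44x gap on these figures
```

On these quoted prices the gap is roughly 44x; the "1/30th" figure in the interview presumably compares against a different model's pricing.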

12:28 - 12:29

It could mean that the

12:29 - 12:30

prevailing model in global

12:30 - 12:32

AI may be open-source,

12:32 - 12:34

as organizations and nations

12:34 - 12:35

come around to the idea that

12:35 - 12:36

collaboration and

12:36 - 12:37

decentralization,

12:37 - 12:38

those things can drive

12:38 - 12:40

innovation faster and more

12:40 - 12:41

efficiently than

12:41 - 12:42

proprietary, closed

12:42 - 12:44

ecosystems. A cheaper,

12:44 - 12:45

more efficient, widely

12:45 - 12:47

adopted open-source model

12:47 - 12:49

from China that could lead

12:49 - 12:51

to a major shift in

12:51 - 12:51

dynamics.

12:52 - 12:53

That's more dangerous,

12:54 - 12:56

because then they get to own

12:56 - 12:58

the mindshare, the

12:58 - 12:58

ecosystem.

12:59 - 13:00

In other words, the adoption

13:00 - 13:01

of a Chinese open-source

13:01 - 13:02

model at scale that could

13:02 - 13:04

undermine U.S. leadership

13:04 - 13:06

while embedding China more

13:06 - 13:07

deeply into the fabric of

13:07 - 13:09

global tech infrastructure.

13:09 - 13:10

There's always a point where

13:10 - 13:12

open source can stop being

13:12 - 13:13

open-source, too,

13:13 - 13:15

right? So, the licenses are

13:15 - 13:16

very favorable today,

13:16 - 13:17

but... it could close it.

13:17 - 13:19

Exactly, over time,

13:19 - 13:20

they can always change the

13:20 - 13:22

license. So, it's important

13:22 - 13:24

that we actually have people

13:24 - 13:26

here in America building,

13:26 - 13:27

and that's why Meta is so

13:27 - 13:28

important.

13:28 - 13:29

Another consequence of

13:29 - 13:31

China's AI breakthrough is

13:31 - 13:32

giving its Communist Party

13:32 - 13:34

control of the narrative.

13:34 - 13:35

AI models built in China,

13:35 - 13:36

they're forced to adhere to a

13:36 - 13:37

certain set of rules set by

13:37 - 13:39

the state. They must embody

13:39 - 13:41

"core socialist values."

13:41 - 13:42

Studies have shown that

13:42 - 13:44

models created by Tencent

13:44 - 13:45

and Alibaba, they will

13:45 - 13:46

censor historical events

13:46 - 13:48

like Tiananmen Square,

13:48 - 13:49

deny human rights abuses,

13:50 - 13:51

and filter criticism of

13:51 - 13:53

Chinese political leaders.

13:53 - 13:54

That contest is about

13:54 - 13:55

whether we're going to have

13:55 - 13:56

democratic AI informed by

13:56 - 13:58

democratic values,

13:58 - 14:00

built to serve democratic

14:00 - 14:01

purposes, or we're going to

14:01 - 14:03

end up with autocratic

14:03 - 14:03

AI.

14:03 - 14:04

If developers really begin

14:04 - 14:06

to adopt these models en

14:06 - 14:07

masse because they're more

14:07 - 14:08

efficient, that could have a

14:08 - 14:10

serious ripple effect,

14:10 - 14:11

trickling down to even

14:11 - 14:12

consumer-facing AI

14:12 - 14:13

applications, and influence

14:13 - 14:15

how trustworthy those

14:15 - 14:16

AI-generated responses from

14:16 - 14:18

chatbots really are.

14:18 - 14:19

And there's really only two

14:19 - 14:20

countries right now in the

14:20 - 14:22

world that can build this at

14:22 - 14:23

scale, you know,

14:23 - 14:25

and that is the U.S.

14:25 - 14:27

and China, and so,

14:27 - 14:28

you know, the consequences

14:28 - 14:30

of the stakes in and around

14:30 - 14:32

this are just enormous.

14:32 - 14:33

Enormous stakes,

14:33 - 14:35

enormous consequences,

14:35 - 14:37

and hanging in the balance:

14:37 - 14:38

America's lead.

14:42 - 14:44

For a topic so complex and

14:44 - 14:45

new, we turn to an expert

14:45 - 14:47

who's actually building in

14:47 - 14:48

the space, and is

14:48 - 14:50

model-agnostic: Perplexity

14:50 - 14:51

co-founder and CEO Arvind

14:51 - 14:52

Srinivas – who you heard

14:52 - 14:53

from throughout our piece.

14:54 - 14:55

He sat down with me for more

14:55 - 14:56

than 30 minutes to discuss

14:56 - 14:57

DeepSeek and its

14:57 - 14:59

implications, as well as

14:59 - 15:00

Perplexity's roadmap.

15:00 - 15:01

We think it's worth

15:01 - 15:02

listening to that whole

15:02 - 15:04

conversation, so here it is.

15:04 - 15:05

So first I want to know what

15:05 - 15:07

the stakes are. What,

15:07 - 15:09

like describe the AI race

15:09 - 15:11

between China and the U.S.

15:11 - 15:12

and what's at stake.

15:13 - 15:14

Okay, so first of all,

15:14 - 15:16

China has a lot of

15:16 - 15:18

disadvantages in competing

15:18 - 15:21

with the U.S. Number one is,

15:21 - 15:22

the fact that they don't get

15:22 - 15:24

access to all the hardware

15:24 - 15:26

that we have access to here.

15:27 - 15:28

So they're kind of working

15:28 - 15:30

with lower-end GPUs than us.

15:31 - 15:32

It's almost like working

15:32 - 15:33

with the previous generation

15:33 - 15:35

GPUs, scrappily.

15:35 - 15:38

So, and the fact that the

15:38 - 15:39

bigger models tend to be

15:39 - 15:42

smarter naturally puts

15:42 - 15:43

them at a disadvantage.

15:43 - 15:46

But the flip side of this is

15:46 - 15:47

that necessity is the mother

15:47 - 15:51

of invention, because they

15:51 - 15:52

had to go figure out

15:53 - 15:55

workarounds. They actually

15:55 - 15:56

ended up building something

15:56 - 15:58

a lot more efficient.

15:58 - 15:59

It's like saying, "hey look,

15:59 - 16:01

you guys really got to get a

16:01 - 16:04

top-notch model, and I'm not

16:04 - 16:05

going to give you resources;

16:05 - 16:07

go figure out something,"

16:07 - 16:08

right? Unless it's

16:08 - 16:09

impossible, unless it's

16:09 - 16:11

mathematically provable

16:11 - 16:13

that it's impossible

16:13 - 16:14

to do so, you can always try

16:14 - 16:15

to like come up with

16:15 - 16:17

something more efficient.

16:17 - 16:20

But that is likely to make

16:20 - 16:21

them come up with a more

16:21 - 16:22

efficient solution than

16:22 - 16:24

America. And of course,

16:24 - 16:25

they have open -sourced it,

16:25 - 16:27

so we can still adopt

16:27 - 16:28

something like that here.

16:28 - 16:30

But that kind of talent

16:30 - 16:32

they're building to do that

16:32 - 16:33

will become an edge for them

16:33 - 16:34

over time, right?

16:35 - 16:36

The leading open-source

16:36 - 16:38

model in America is Meta's

16:38 - 16:40

Llama family. It's really

16:40 - 16:41

good. It's kind of like a

16:41 - 16:42

model that you can run on

16:42 - 16:43

your computer.

16:43 - 16:45

But even though it got

16:45 - 16:47

pretty close to GPT-4,

16:48 - 16:50

and at the time of its

16:50 - 16:51

release, the model that was

16:51 - 16:54

closest in quality was the

16:54 - 16:56

giant 405B, not the 70B that

16:56 - 16:56

you could run on your

16:56 - 16:59

computer. And so there was

16:59 - 17:01

still not a small,

17:01 - 17:02

cheap, fast, efficient,

17:02 - 17:04

open-source model that

17:04 - 17:06

rivaled the most powerful

17:06 - 17:07

closed models from OpenAI,

17:07 - 17:09

Anthropic. Nothing from

17:09 - 17:11

America, nothing from

17:11 - 17:12

Mistral AI either.

17:12 - 17:13

And then these guys come

17:13 - 17:16

out, with like a crazy model

17:16 - 17:17

that's like 10x cheaper in

17:17 - 17:19

API pricing than GPT-4 and

17:19 - 17:21

15x cheaper than Sonnet,

17:21 - 17:23

I believe. Really fast,

17:23 - 17:24

60

17:24 - 17:25

tokens per second,

17:26 - 17:29

and pretty much equal or

17:29 - 17:30

better in some benchmarks

17:30 - 17:31

and worse in some others.

17:31 - 17:32

But like roughly in that

17:32 - 17:34

ballpark of GPT-4o's quality.

17:35 - 17:36

And they did it all with

17:36 - 17:39

like approximately just

17:39 - 17:41

2,048 H800 GPUs, which is

17:41 - 17:42

actually equivalent to like

17:42 - 17:44

somewhere around

17:44 - 17:47

1,000 to 1,500 H100 GPUs.

17:47 - 17:50

That's like 20 to 30x lower

17:50 - 17:52

than the amount of GPUs that

17:52 - 17:53

GPT-4 is usually trained

17:53 - 17:56

on, and roughly $5 million

17:56 - 17:58

in total compute budget.

17:59 - 18:00

They did it with so little

18:00 - 18:02

money and such an amazing

18:02 - 18:04

model, gave it away for

18:04 - 18:04

free, wrote a technical

18:04 - 18:06

paper, and definitely it

18:06 - 18:09

makes us all question like,

18:09 - 18:10

"okay, like if we have the

18:10 - 18:12

equivalent of DOGE for, like,

18:12 - 18:14

model training,

18:14 - 18:15

this is an example of that,

18:15 - 18:16

right?"

18:16 - 18:17

Right. Yeah. Efficiency,

18:18 - 18:19

is what you're getting at.

18:19 - 18:20

So, fraction of the price,

18:21 - 18:22

fraction of the time.

18:22 - 18:23

Yeah. Dumbed-down GPUs

18:23 - 18:25

essentially. What was your

18:25 - 18:27

surprise when you understood

18:27 - 18:28

what they had done?

18:28 - 18:30

So my surprise was that when

18:30 - 18:31

I actually went through the

18:31 - 18:33

technical paper,

18:33 - 18:35

the amount of clever

18:35 - 18:37

solutions they came up with,

18:38 - 18:39

first of all, they train a

18:39 - 18:40

mixture of experts model.

18:40 - 18:42

It's not that easy to train,

18:43 - 18:44

there's a lot of like,

18:44 - 18:46

the main reason people find

18:46 - 18:46

it difficult to catch up

18:46 - 18:48

with OpenAI, especially on

18:48 - 18:49

the MoE architecture,

18:49 - 18:51

is that there's a lot of,

18:52 - 18:54

irregular loss spikes.

18:54 - 18:56

The numerics are not stable,

18:56 - 18:57

so often, like,

18:57 - 18:59

you've got to restart the

18:59 - 19:00

training checkpoint again,

19:00 - 19:01

and a lot of infrastructure

19:01 - 19:03

needs to be built for that.

19:03 - 19:04

And they came up with very

19:04 - 19:06

clever solutions to balance

19:06 - 19:07

that without adding

19:07 - 19:09

additional hacks.
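
The mixture-of-experts architecture he describes routes each token to a small subset of "expert" sub-networks, so only a fraction of the parameters run per token. A toy top-1 router, with every function and number invented purely for illustration (a real router is a learned layer, and its load-balancing is exactly where the instabilities he mentions arise):

```python
# Two stand-in "expert" networks; in a real MoE layer each would be
# a feed-forward block with its own parameters.
def expert_double(x):
    return 2 * x

def expert_negate(x):
    return -x

EXPERTS = [expert_double, expert_negate]

def gate_scores(x):
    # A real gate is a learned linear projection over the token's
    # hidden state; here a fixed rule stands in for it.
    return [x, -x]

def moe_layer(x):
    # Top-1 routing: only the highest-scoring expert runs,
    # which is what makes MoE inference cheap per token.
    scores = gate_scores(x)
    top = max(range(len(EXPERTS)), key=lambda i: scores[i])
    return EXPERTS[top](x)

assert moe_layer(3.0) == 6.0    # positive input routed to expert_double
assert moe_layer(-2.0) == 2.0   # negative input routed to expert_negate
```

Because the routing decision is discrete, gradients through the gate are noisy, which is one reason MoE training is prone to the loss spikes described above.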

19:09 - 19:12

They also figured out

19:12 - 19:13

floating-point 8-bit (FP8)

19:13 - 19:15

training, at least for some

19:15 - 19:17

of the numerics. And they

19:17 - 19:18

cleverly figured out which

19:18 - 19:19

has to be in higher

19:19 - 19:20

precision, which has to be

19:20 - 19:22

in lower precision. To my

19:22 - 19:24

knowledge, I think FP8

19:24 - 19:26

training is not that

19:26 - 19:27

well understood. Most of the

19:27 - 19:28

training in America is still

19:28 - 19:30

running in FP16.

19:30 - 19:31

Maybe OpenAI and some of the

19:31 - 19:32

people are trying to explore

19:32 - 19:33

that, but it's pretty

19:33 - 19:35

difficult to get it right.
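
The higher-precision/lower-precision split he describes can be illustrated with a crude fixed-point simulation. This is not real FP8 arithmetic; the bit-width is arbitrary, and the point is only that some quantities (like running accumulators) are far more sensitive to reduced precision than others:

```python
def quantize(value, fraction_bits):
    # Round to a fixed number of fractional bits, simulating storage
    # in a low-precision format (a stand-in, not actual FP8).
    scale = 2 ** fraction_bits
    return round(value * scale) / scale

# Accumulating many tiny gradient-like updates: at 8 fractional bits
# each 0.001 update rounds to zero, so the low-precision sum never moves.
updates = [0.001] * 1000

low_precision_sum = 0.0
for u in updates:
    low_precision_sum = quantize(low_precision_sum + quantize(u, 8), 8)

high_precision_sum = sum(updates)  # full-precision accumulation

print(low_precision_sum)   # stays at 0.0: every addend rounded away
print(high_precision_sum)  # close to 1.0
```

This is why mixed-precision schemes keep accumulators and other numerically sensitive values in higher precision while storing bulk tensors in fewer bits.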

19:35 - 19:36

So because necessity is the

19:36 - 19:37

mother of invention, because

19:37 - 19:38

they don't have that much

19:38 - 19:39

memory, that many GPUs.

19:40 - 19:41

They figured out a lot

19:41 - 19:42

of

19:42 - 19:44

numerical stability stuff

19:44 - 19:45

that makes their training

19:45 - 19:46

work. And they claimed in

19:46 - 19:48

the paper that the majority

19:48 - 19:49

of the training was stable.

19:50 - 19:51

Which means what? They can

19:51 - 19:53

always rerun those training

19:53 - 19:57

runs again on more data

19:57 - 19:59

or better data. And then,

20:00 - 20:01

it only trained for 60 days.

20:02 - 20:03

So that's pretty amazing.

20:04 - 20:05

Safe to say you were

20:05 - 20:05

surprised.

20:05 - 20:06

So I was definitely

20:06 - 20:08

surprised. Usually the

20:08 - 20:11

wisdom or, like I wouldn't

20:11 - 20:12

say, wisdom, the myth, is

20:12 - 20:14

that Chinese are just good

20:14 - 20:16

at copying. So if we

20:16 - 20:18

stop writing research papers

20:18 - 20:20

in America, if we stop

20:20 - 20:22

describing the details of

20:22 - 20:23

our infrastructure or

20:23 - 20:25

architecture, and stop open

20:25 - 20:27

sourcing, they're not going

20:27 - 20:29

to be able to catch up. But

20:29 - 20:30

the reality is, some of the

20:30 - 20:33

details in DeepSeek-V3 are

20:33 - 20:34

so good that I wouldn't be

20:34 - 20:36

surprised if Meta took a

20:36 - 20:38

look at it and incorporated

20:38 - 20:38

some of that, tried to copy

20:38 - 20:41

them. Right.

20:41 - 20:42

I wouldn't necessarily say

20:42 - 20:43

copy. It's all like,

20:43 - 20:44

you know, sharing science,

20:45 - 20:47

engineering, but the point

20:47 - 20:48

is like, it's changing.

20:48 - 20:50

Like, it's not like China is

20:50 - 20:51

just copycat. They're also

20:51 - 20:52

innovating.

20:52 - 20:53

We don't know exactly the

20:53 - 20:55

data that it was trained on

20:55 - 20:56

right? Even though it's open

20:56 - 20:57

-source, we know some of the

20:57 - 20:59

ways and things that was

20:59 - 20:59

trained up, but not

20:59 - 21:01

everything. And there's this

21:01 - 21:02

idea that it was trained on

21:02 - 21:05

public ChatGPT outputs,

21:05 - 21:06

which would mean it was just

21:06 - 21:07

copied. But you're saying it

21:07 - 21:08

goes beyond that? There's

21:08 - 21:09

real innovation in there?

21:09 - 21:09

Yeah,

21:09 - 21:11

look, I mean, they've

21:11 - 21:13

trained it on 14.8 trillion

21:13 - 21:15

tokens. The internet has so

21:15 - 21:16

much ChatGPT. If you

21:16 - 21:18

actually go to any LinkedIn

21:18 - 21:19

post or X post.

21:19 - 21:21

Now, most of the comments

21:21 - 21:22

are written by AI. You can

21:22 - 21:24

just see it, like people are

21:24 - 21:25

just trying to write. In

21:25 - 21:28

fact, even with an X,

21:28 - 21:30

there's like a Grok tweet

21:30 - 21:31

enhancer, or in LinkedIn

21:31 - 21:32

there's an AI enhancer,

21:33 - 21:37

or in Google Docs and Word.

21:37 - 21:38

There are AI tools to like

21:38 - 21:40

rewrite your stuff. So if

21:40 - 21:41

you do something there and

21:41 - 21:43

copy paste somewhere on the

21:43 - 21:44

internet, it's naturally

21:44 - 21:45

going to have some elements

21:45 - 21:48

of a ChatGPT-like training,

21:48 - 21:49

right? And there's a lot of

21:49 - 21:51

people who don't even bother

21:51 - 21:53

to strip away the "I'm a

21:53 - 21:55

language model" part, right?

21:55 - 21:56

So, they just paste

21:56 - 21:58

it somewhere and it's very

21:58 - 21:59

difficult to control for

21:59 - 22:01

this. I think xAI has spoken

22:01 - 22:02

about this too, so I

22:02 - 22:04

wouldn't like disregard

22:04 - 22:05

their technical

22:05 - 22:07

accomplishment just because

22:07 - 22:08

like for some prompts like

22:08 - 22:10

who are you, or like which

22:10 - 22:11

model are you, and responses

22:11 - 22:12

like that? It doesn't even

22:12 - 22:13

matter in my opinion.

22:13 - 22:13

For a long

22:13 - 22:14

time we thought, I don't

22:14 - 22:15

know if you agreed with us,

22:15 - 22:17

China was behind in AI,

22:17 - 22:18

what does this do to that

22:18 - 22:20

race? Can we say that China

22:20 - 22:22

is catching up or has it

22:22 - 22:22

caught up?

22:23 - 22:25

I mean, like if we say the

22:25 - 22:27

Meta is catching up to

22:27 - 22:28

OpenAI and Anthropic,

22:28 - 22:31

if you make that claim,

22:31 - 22:32

then the same claim can be

22:32 - 22:33

made for China catching up

22:33 - 22:34

to America.

22:34 - 22:36

A lot of papers from China

22:36 - 22:37

that have tried to replicate

22:37 - 22:39

o1, in fact, I saw more

22:39 - 22:41

papers from China after o1

22:42 - 22:43

announcement that tried to

22:43 - 22:44

replicate it than from

22:44 - 22:46

America. Like,

22:46 - 22:47

and the amount of compute

22:48 - 22:50

Deepseek has access to is

22:50 - 22:52

roughly similar to what PhD

22:52 - 22:54

students in the U.S.

22:54 - 22:55

have access to. By the way,

22:55 - 22:56

this is not meant to

22:56 - 22:57

criticize others like even

22:57 - 22:59

for ourselves, like,

22:59 - 23:00

you know, for Perplexity,

23:00 - 23:01

we decided not to train

23:01 - 23:02

models because we thought

23:02 - 23:03

it's like a very expensive

23:03 - 23:07

thing. And we thought like,

23:07 - 23:08

there's no way to catch up

23:08 - 23:09

with the rest.

23:09 - 23:10

But will you incorporate

23:10 - 23:12

Deepseek into Perplexity?

23:12 - 23:13

Oh, we already are beginning

23:13 - 23:15

to use it.

23:15 - 23:16

I think they have an API,

23:16 - 23:18

and they also have

23:18 - 23:18

open-source weights, so we

23:18 - 23:20

can host it ourselves, too.

23:20 - 23:21

And it's good to, like,

23:21 - 23:22

try to start using that

23:23 - 23:24

because it actually

23:24 - 23:25

allows us to do a lot of the

23:25 - 23:27

things at lower cost.
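The two integration paths mentioned here, calling a hosted API or serving the open weights yourself, can be sketched as one request builder with a switchable endpoint. The URLs and model name below are illustrative placeholders, not confirmed values from the interview:

```python
# Hypothetical endpoints: a hosted API versus a self-hosted server running
# the open weights (e.g. a local OpenAI-compatible inference server).
HOSTED_API_URL = "https://api.deepseek.example/v1/chat/completions"
SELF_HOSTED_URL = "http://localhost:8000/v1/chat/completions"

def build_request(question: str, self_hosted: bool = False) -> dict:
    """Build an OpenAI-style chat request; flipping one flag moves the same
    payload between the hosted API and a self-hosted copy of the weights."""
    return {
        "url": SELF_HOSTED_URL if self_hosted else HOSTED_API_URL,
        "payload": {
            "model": "deepseek-chat",  # assumed model identifier
            "messages": [{"role": "user", "content": question}],
        },
    }

print(build_request("What did Deepseek v3 cost to train?", self_hosted=True)["url"])
```

Because open weights let you host the model on your own hardware, the same request shape works either way, which is what makes switching providers (or bringing inference in-house for cost reasons) a one-line change.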

23:27 - 23:28

But what I'm kind of

23:28 - 23:30

thinking is beyond that,

23:30 - 23:31

which is like, okay, if

23:31 - 23:33

these guys actually could

23:33 - 23:34

train such a great model

23:34 - 23:37

with a good team, then

23:37 - 23:38

there's no excuse

23:38 - 23:39

anymore for companies in the

23:39 - 23:41

U.S., including ourselves,

23:41 - 23:42

to like, not try to do

23:42 - 23:43

something like that.

23:43 - 23:44

You hear a lot in public

23:44 - 23:45

from a lot of, you know,

23:45 - 23:46

thought leaders in

23:46 - 23:47

generative AI, both on the

23:47 - 23:48

research side, on the

23:48 - 23:50

entrepreneurial side,

23:50 - 23:51

like Elon Musk and others

23:51 - 23:53

say that China can't catch

23:53 - 23:55

up. Like it's the stakes are

23:55 - 23:56

too big. The geopolitical

23:56 - 23:58

stakes, whoever dominates AI

23:58 - 23:59

is going to kind of dominate

23:59 - 24:01

the economy, dominate the

24:01 - 24:02

world. You know,

24:02 - 24:03

it's been talked about in

24:03 - 24:04

those massive terms. Are you

24:04 - 24:06

worried about what China

24:06 - 24:07

proved it was able to do?

24:08 - 24:09

Firstly, I don't know if

24:09 - 24:10

Elon ever said China can't

24:10 - 24:11

catch up.

24:11 - 24:13

I'm not – just the threat of

24:13 - 24:14

China. He's only identified

24:14 - 24:16

the threat of letting China,

24:16 - 24:17

and you know, Sam Altman has

24:17 - 24:18

said similar things, we

24:18 - 24:20

can't let China win the

24:20 - 24:20

race.

24:20 - 24:22

You know, it's all I think

24:22 - 24:25

you got to decouple what

24:25 - 24:26

someone like Sam says to

24:26 - 24:27

like what is in his

24:27 - 24:29

self-interest. Right?

24:30 - 24:34

Look, I think my point

24:34 - 24:37

is, like, whatever you did

24:37 - 24:38

to not let them catch up

24:38 - 24:40

didn't even matter. They

24:40 - 24:42

ended up catching up anyway.

24:42 - 24:43

Necessity is the mother of

24:43 - 24:44

invention

24:44 - 24:46

like you said. And it's

24:46 - 24:48

actually, you know what's

24:48 - 24:49

more dangerous than trying

24:49 - 24:51

to do all the things to not

24:51 - 24:52

let them catch up and, you

24:52 - 24:54

know, all this stuff is

24:54 - 24:55

what's more dangerous is

24:55 - 24:56

they have the best

24:56 - 24:57

open-source model. And all

24:57 - 24:59

the American developers are

24:59 - 25:00

building on that. Right.

25:00 - 25:02

That's more dangerous

25:02 - 25:05

because then they get to own

25:05 - 25:06

the mindshare, the

25:06 - 25:07

ecosystem.

25:07 - 25:09

If the entire American AI

25:09 - 25:10

ecosystem – look,

25:10 - 25:12

in general, it's known that

25:12 - 25:13

once open-source is caught

25:13 - 25:15

up or improved over closed

25:15 - 25:18

source software, all

25:18 - 25:19

developers migrate to that.

25:20 - 25:21

It's historically known,

25:21 - 25:21

right?

25:21 - 25:23

When Llama was being built

25:23 - 25:24

and becoming more widely

25:24 - 25:25

used, there was this

25:25 - 25:26

question should we trust

25:26 - 25:27

Zuckerberg? But now the

25:27 - 25:29

question is should we trust

25:29 - 25:30

China? That's a very–You

25:30 - 25:30

should

25:30 - 25:31

trust open-source, that's

25:31 - 25:33

the like it's not about who,

25:33 - 25:35

is it Zuckerberg, or is it.

25:35 - 25:36

Does it matter then if it's

25:37 - 25:37

Chinese, if it's

25:37 - 25:38

open-source?

25:39 - 25:41

Look, it doesn't matter in

25:41 - 25:43

the sense that you still

25:43 - 25:44

have full control.

25:45 - 25:46

You run it as your own,

25:47 - 25:48

like set of weights on your

25:48 - 25:50

own computer, you are in

25:50 - 25:52

charge of the model. But,

25:52 - 25:54

it's not a great look for

25:54 - 25:56

our own, like, talent to

25:57 - 25:58

rely on software built by

25:59 - 26:00

others.

26:00 - 26:01

Even if it's open-source,

26:01 - 26:04

there's always, like, a

26:04 - 26:05

point where open-source can

26:05 - 26:07

stop being open-source, too,

26:07 - 26:09

right? So the licenses are

26:09 - 26:10

very favorable today,

26:10 - 26:11

but if – you can close it –

26:11 - 26:13

exactly, over time,

26:13 - 26:15

they can always change the

26:15 - 26:16

license. So, it's important

26:16 - 26:18

that we actually have people

26:18 - 26:20

here in America building,

26:20 - 26:21

and that's why Meta is so

26:21 - 26:23

important. Like I look I

26:23 - 26:25

still think Meta will build

26:25 - 26:26

a better model than Deepseek

26:26 - 26:28

v3 and open-source it,

26:28 - 26:29

and they'll call it Llama 4

26:29 - 26:31

or 3 point something,

26:31 - 26:33

doesn't matter, but I think

26:33 - 26:35

what is more key is that we

26:35 - 26:38

don't try to focus all our

26:38 - 26:41

energy on banning them,

26:41 - 26:42

stopping them, and just try

26:42 - 26:43

to outcompete and win them.

26:43 - 26:44

That's just that's just the

26:44 - 26:45

American way of doing things

26:45 - 26:46

just be

26:46 - 26:47

better. And it feels like

26:47 - 26:48

there's, you know, we hear a

26:48 - 26:49

lot more about these Chinese

26:49 - 26:51

companies who are developing

26:51 - 26:52

in a similar way, a lot more

26:52 - 26:53

efficiently, a lot more cost

26:53 - 26:55

effectively right? –Yeah,

26:55 - 26:56

again, like, look,

26:56 - 26:58

it's hard to fake scarcity,

26:58 - 27:01

right? If you raise $10

27:01 - 27:02

billion and you decide to

27:02 - 27:04

spend 80% of it on a compute

27:04 - 27:06

cluster, it's hard for you

27:06 - 27:07

to come up with the exact

27:07 - 27:08

same solution that someone

27:08 - 27:10

with $5 million would do.

27:10 - 27:12

And there's no point,

27:13 - 27:14

no need to, like, sort of

27:14 - 27:15

berate those who are putting

27:15 - 27:17

more money. They're trying

27:17 - 27:18

to do it as fast as they

27:18 - 27:18

can.

27:18 - 27:19

When we say open-source,

27:19 - 27:20

there's so many different

27:20 - 27:21

versions. Some people

27:21 - 27:22

criticize Meta for not

27:22 - 27:23

publishing everything,

27:23 - 27:24

and even Deepseek itself

27:24 - 27:26

isn't totally transparent.

27:26 - 27:27

Yeah, you can go to the

27:27 - 27:28

limits of open-source and

27:28 - 27:30

say, I should exactly be

27:30 - 27:31

able to replicate your

27:31 - 27:33

training run. But first of

27:33 - 27:34

all, how many people even

27:34 - 27:36

have the resources to do

27:36 - 27:40

that? And I think the amount

27:40 - 27:41

of detail they've shared in

27:41 - 27:42

the technical report,

27:43 - 27:44

actually Meta did that too,

27:44 - 27:46

by the way, Meta's Llama 3.3

27:46 - 27:47

technical report is

27:47 - 27:48

incredibly detailed,

27:48 - 27:50

and very great for science.

27:51 - 27:52

So the amount of details

27:52 - 27:53

that these people are

27:53 - 27:54

sharing is already a lot

27:54 - 27:56

more than what the other

27:56 - 27:57

companies are doing right

27:57 - 27:57

now.

27:57 - 27:58

When you think about how

27:58 - 27:59

much it costs Deepseek to do

27:59 - 28:01

this, less than $6 million,

28:01 - 28:03

I think about what OpenAI

28:03 - 28:05

has spent to develop GPT

28:05 - 28:07

models. What does that mean

28:07 - 28:08

for the closed source model,

28:09 - 28:10

ecosystem trajectory,

28:10 - 28:12

momentum? What does it mean

28:12 - 28:13

for OpenAI?

28:13 - 28:15

I mean, it's very clear that

28:15 - 28:16

we'll have an open-source

28:17 - 28:19

version of 4o, or even better

28:19 - 28:21

than that, and much cheaper

28:21 - 28:22

than that open-source,

28:22 - 28:24

like completely this year.

28:24 - 28:25

Made by OpenAI?

28:26 - 28:27

Probably not. Most likely

28:27 - 28:29

not. And I don't think they

28:29 - 28:30

care if it's not made by

28:30 - 28:32

them. I think they've

28:32 - 28:33

already moved to a new

28:33 - 28:34

paradigm called the o1

28:34 - 28:38

family of models.

28:38 - 28:41

Look, I can't – like,

28:41 - 28:42

Ilya Sutskever came and

28:42 - 28:44

said, pre-training is a

28:44 - 28:45

wall, right?

28:45 - 28:48

So, I mean, he didn't

28:48 - 28:49

exactly use the word, but he

28:49 - 28:50

clearly said–yeah–the age of

28:50 - 28:51

pre-training is over.

28:51 - 28:51

–many people have said that.

28:52 - 28:55

Right? So, that doesn't mean

28:55 - 28:56

scaling has hit a wall.

28:56 - 28:58

I think we're scaling on

28:58 - 28:59

different dimensions now.

28:59 - 29:00

The amount of time a model

29:00 - 29:01

spends thinking at test

29:01 - 29:03

time. Reinforcement

29:03 - 29:04

learning, like trying to,

29:04 - 29:06

like, make the model,

29:06 - 29:07

okay, if it doesn't know

29:07 - 29:09

what to do for a new prompt,

29:09 - 29:10

it'll go and reason and

29:10 - 29:11

collect data and interact

29:11 - 29:13

with the world,

29:13 - 29:14

use a bunch of tools.

29:14 - 29:15

I think that's where things

29:15 - 29:16

are headed, and I feel like

29:16 - 29:17

OpenAI is more focused on

29:17 - 29:19

that right now. Yeah.
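The test-time-scaling idea described here can be illustrated with a toy best-of-N loop: draw several candidate answers, score each with a verifier, and keep the best, so spending more inference compute (a larger N) buys a better answer. Every function here is a stand-in for illustration, not any lab's actual system:

```python
import random

def sample_answer(rng: random.Random) -> int:
    """Stand-in for one model sample: a noisy guess at the true answer 42."""
    return 42 + rng.randint(-5, 5)

def verifier_score(answer: int) -> float:
    """Stand-in verifier: higher score for answers closer to 42."""
    return -abs(answer - 42)

def best_of_n(n: int, seed: int = 0) -> int:
    """More test-time compute (larger n) means more candidates to pick from."""
    rng = random.Random(seed)
    candidates = [sample_answer(rng) for _ in range(n)]
    return max(candidates, key=verifier_score)

print(best_of_n(1), best_of_n(16))
```

Because the n=16 candidate pool contains the n=1 sample (same seed, same first draw), the best-of-16 answer can never score worse, which is the basic trade the reasoning models make: more inference compute for a better output.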

29:19 - 29:20

–Instead of just the

29:20 - 29:21

bigger, better model?

29:21 - 29:22

Correct. –Reasoning

29:22 - 29:23

capacities. But didn't you

29:23 - 29:24

say that Deepseek is likely

29:24 - 29:25

to turn their attention to

29:25 - 29:26

reasoning?

29:26 - 29:27

100%, I think they will.

29:28 - 29:31

And that's why I'm pretty

29:31 - 29:32

excited about what they'll

29:32 - 29:33

produce next.

29:34 - 29:35

I guess that's then my

29:35 - 29:37

question is sort of what's

29:37 - 29:38

OpenAI's moat now?

29:39 - 29:41

Well, I still think that,

29:41 - 29:42

no one else has produced a

29:42 - 29:45

system similar to the o1

29:45 - 29:47

yet, exactly.

29:47 - 29:49

I know that there's debates

29:49 - 29:50

about whether o1 is actually

29:50 - 29:53

worth it. You know,

29:53 - 29:54

on maybe a few prompts,

29:54 - 29:55

it's really better. But like

29:55 - 29:56

most of the times, it's not

29:56 - 29:57

producing any differentiated

29:57 - 29:58

output from Sonnet.

29:59 - 30:01

But, at least the results

30:01 - 30:03

they showed in o3 where,

30:03 - 30:05

they had like,

30:05 - 30:06

competitive coding

30:06 - 30:08

performance and almost like

30:08 - 30:09

an AI software engineer

30:09 - 30:10

level.

30:10 - 30:11

Isn't it just a matter of

30:11 - 30:12

time, though, before the

30:12 - 30:13

internet is filled with

30:13 - 30:16

reasoning data that–

30:16 - 30:17

–yeah– Deepseek.

30:17 - 30:19

Again, it's possible.

30:19 - 30:20

Nobody knows yet.

30:20 - 30:22

Yeah. So until it's done,

30:23 - 30:24

it's still uncertain right?

30:24 - 30:26

Right. So maybe that

30:26 - 30:27

uncertainty is their moat.

30:27 - 30:28

That, like, no one else has

30:28 - 30:30

the same reasoning

30:30 - 30:32

capability yet,

30:32 - 30:34

but will by end of this

30:34 - 30:36

year, will there be multiple

30:36 - 30:37

players even in the

30:37 - 30:38

reasoning arena?

30:38 - 30:40

I absolutely think so.

30:40 - 30:41

So are we seeing the

30:41 - 30:43

commoditization of large

30:43 - 30:44

language models?

30:44 - 30:45

I think we will see a

30:45 - 30:47

similar trajectory,

30:49 - 30:50

just like how in

30:50 - 30:50

pre-training and

30:51 - 30:52

post-training, that sort

30:52 - 30:54

of system started getting

30:54 - 30:57

commoditized, this year there will

30:57 - 30:57

be a lot more

30:57 - 30:59

commoditization there.

30:59 - 31:00

I think the reasoning kind

31:00 - 31:02

of models will go through a

31:02 - 31:04

similar trajectory where in

31:04 - 31:05

the beginning, 1 or 2

31:05 - 31:06

players really know how to

31:06 - 31:07

do it, but over time

31:07 - 31:08

–That's.

31:08 - 31:10

and who knows right? Because

31:10 - 31:11

OpenAI could make another

31:11 - 31:12

advancement to focus on.

31:13 - 31:14

But right now reasoning is

31:14 - 31:14

their moat.

31:14 - 31:16

By the way, if advancements

31:16 - 31:18

keep happening again and

31:18 - 31:19

again and again, like,

31:20 - 31:21

I think the meaning of the

31:21 - 31:23

word advancement also loses

31:23 - 31:24

some of its value, right?

31:24 - 31:25

Totally. Even now it's very

31:25 - 31:26

difficult, right. Because

31:26 - 31:27

there's pre-training

31:27 - 31:28

advancements. Yeah.

31:28 - 31:29

And then we've moved into a

31:29 - 31:30

different phase.

31:30 - 31:32

Yeah, so what is guaranteed

31:32 - 31:33

to happen is whatever models

31:33 - 31:36

exist today, that level of

31:36 - 31:37

reasoning, that level of

31:37 - 31:40

multimodal capability in

31:40 - 31:41

like 5 or 10x cheaper

31:41 - 31:43

models, open source,

31:43 - 31:45

all that's going to happen.

31:45 - 31:46

It's just a matter of time.

31:46 - 31:49

What is unclear is if

31:49 - 31:50

something like a model that

31:50 - 31:53

reasons at test time will be

31:53 - 31:55

cheap enough that

31:55 - 31:56

we can just run it on our

31:56 - 31:58

phones. I think that's not

31:58 - 31:58

clear to me yet.

31:58 - 31:59

It feels like so much of the

31:59 - 32:00

landscape has changed with

32:00 - 32:01

what Deepseek was able to

32:02 - 32:03

prove. Could you call it

32:03 - 32:05

China's ChatGPT moment?

32:07 - 32:07

Possible,

32:07 - 32:10

I mean, I think it certainly

32:10 - 32:11

probably gave them a lot of

32:11 - 32:13

confidence that, like,

32:14 - 32:16

you know, we're not really

32:16 - 32:17

behind no matter what you do

32:17 - 32:19

to restrict our compute.

32:19 - 32:21

Like, we can always figure

32:21 - 32:22

out some workarounds.

32:22 - 32:23

And, yeah, I'm sure the team

32:23 - 32:24

feels pumped about the

32:25 - 32:26

results.

32:26 - 32:27

How does this change,

32:27 - 32:28

like the investment

32:28 - 32:30

landscape, the hyperscalers

32:30 - 32:32

that are spending tens of

32:32 - 32:33

billions of dollars a year

32:33 - 32:34

on CapEx have just ramped it

32:34 - 32:36

up huge. And OpenAI and

32:36 - 32:37

Anthropic that are raising

32:37 - 32:38

billions of dollars for

32:38 - 32:39

GPUs, essentially.

32:39 - 32:41

But what Deepseek told us is

32:41 - 32:42

you don't need, you don't

32:42 - 32:44

necessarily need that.

32:44 - 32:45

Yeah.

32:45 - 32:47

I mean, look, I think it's

32:47 - 32:48

very clear that they're

32:48 - 32:50

going to go even harder on

32:50 - 32:53

reasoning because they

32:53 - 32:53

understand that, like,

32:53 - 32:54

whatever they were building

32:54 - 32:56

in the previous two years is

32:56 - 32:57

getting extremely cheap,

32:57 - 32:58

that it doesn't make sense

32:58 - 33:01

to go justify raising that–

33:01 - 33:02

Is the spending

33:02 - 33:03

proposition the same? Do

33:03 - 33:05

they need the same amount

33:05 - 33:07

of, you know, high end GPUs,

33:07 - 33:08

or can you reason using the

33:08 - 33:09

lower end ones that

33:09 - 33:09

Deepseek–

33:10 - 33:11

Again, it's hard to say no

33:11 - 33:13

until proven it's not.

33:14 - 33:17

But I guess, like in the

33:17 - 33:19

spirit of moving fast,

33:19 - 33:20

you would want to use the

33:20 - 33:22

high end chips, and you

33:22 - 33:24

would want to, like, move

33:24 - 33:24

faster than your

33:24 - 33:26

competitors. I think,

33:26 - 33:27

like the best talent still

33:27 - 33:28

wants to work in the team

33:28 - 33:31

that made it happen first.

33:31 - 33:32

You know, there's always

33:32 - 33:33

some glory to like, who did

33:33 - 33:34

this, actually? Like, who's

33:34 - 33:36

the real pioneer? Versus

33:36 - 33:38

who's the fast follow right?

33:38 - 33:39

That was like kind of like

33:39 - 33:41

Sam Altman's tweet, kind of a

33:41 - 33:43

veiled response to what

33:43 - 33:44

Deepseek has been able to do,

33:44 - 33:45

he kind of implied that they

33:45 - 33:46

just copied, and anyone can

33:46 - 33:46

copy.

33:47 - 33:48

Right? Yeah, but then you

33:48 - 33:50

can always say that, like,

33:50 - 33:51

everybody copies everybody

33:51 - 33:53

in this field.

33:53 - 33:54

You can say Google did the

33:54 - 33:56

transformer first. It's not

33:56 - 33:57

OpenAI and OpenAI just

33:57 - 33:59

copied it. Google built the

33:59 - 34:01

first large language models.

34:01 - 34:02

They didn't productize it,

34:02 - 34:04

but OpenAI did it in a

34:04 - 34:05

productized way. So you can

34:06 - 34:09

say all this in many ways,

34:09 - 34:09

it doesn't matter.

34:09 - 34:11

I remember asking you being

34:11 - 34:12

like, you know, why don't

34:12 - 34:13

you want to build the model?

34:13 - 34:14

Yeah, that's that's,

34:14 - 34:16

you know, the glory. And a

34:16 - 34:18

year later, just one year

34:18 - 34:19

later, you look very,

34:19 - 34:21

very smart to not engage in

34:21 - 34:23

that extremely expensive

34:23 - 34:24

race that has become so

34:24 - 34:25

competitive. And you kind of

34:25 - 34:27

have this lead now in what

34:27 - 34:28

everyone wants to see now,

34:28 - 34:30

which is like real world

34:30 - 34:31

applications, killer

34:31 - 34:33

applications of generative

34:33 - 34:35

AI. Talk a little bit about

34:35 - 34:37

like that decision and how

34:37 - 34:38

that's sort of guided you

34:39 - 34:40

where you see Perplexity

34:40 - 34:41

going from here.

34:41 - 34:43

Look, one year ago,

34:43 - 34:45

I don't even think we had

34:45 - 34:47

something like,

34:47 - 34:51

this is what, like 2024

34:51 - 34:54

beginning, right? I feel

34:54 - 34:54

like we didn't even have

34:54 - 34:56

something like Sonnet 3.5,

34:56 - 34:58

right? We had GPT-4,

34:58 - 35:00

I believe, and it was kind

35:00 - 35:01

of nobody else was able to

35:01 - 35:03

catch up to it. Yeah.

35:03 - 35:05

But there was no multimodal

35:05 - 35:08

nothing, and my sense was

35:08 - 35:09

like, okay, if people with

35:09 - 35:10

way more resources and way

35:10 - 35:12

more talent cannot catch up,

35:12 - 35:14

it's very difficult to play

35:14 - 35:15

that game. So let's play a

35:15 - 35:17

different game. Anyway,

35:17 - 35:18

people want to use these

35:18 - 35:21

models. And there's one use

35:21 - 35:22

case of asking questions and

35:22 - 35:23

getting accurate answers

35:23 - 35:25

with sources, with real time

35:25 - 35:27

information, accurate

35:27 - 35:28

information.

35:28 - 35:30

There's still a lot of work

35:30 - 35:31

there to do outside the

35:31 - 35:33

model, and making sure the

35:33 - 35:34

product works reliably,

35:34 - 35:36

keep scaling it up to usage.

35:36 - 35:38

Keep building custom UIs,

35:38 - 35:39

there's just a lot of work

35:39 - 35:40

to do, and we would focus on

35:40 - 35:42

that, and we would benefit

35:42 - 35:43

from all the tailwinds of

35:43 - 35:44

models getting better and

35:44 - 35:46

better. That's essentially

35:46 - 35:47

what happened, in fact, I

35:47 - 35:50

would say, Sonnet 3.5 made

35:50 - 35:51

our product so good,

35:51 - 35:54

in the sense that if you use

35:54 - 35:56

Sonnet 3.5 as the model

35:56 - 35:58

choice within Perplexity,

35:59 - 36:00

it's very difficult to find

36:00 - 36:01

a hallucination. I'm not

36:01 - 36:03

saying it's impossible,

36:04 - 36:06

but it dramatically reduced

36:06 - 36:08

the rate of hallucinations,

36:08 - 36:10

which meant, the problem of

36:10 - 36:11

question-answering,

36:11 - 36:12

asking a question, getting

36:12 - 36:13

an answer, doing fact

36:13 - 36:15

checks, research, going and

36:15 - 36:16

asking anything out there

36:16 - 36:17

because almost all the

36:17 - 36:18

information is on the

36:18 - 36:21

web, was such a big unlock.

36:22 - 36:24

And that helped us grow 10x

36:24 - 36:24

over the course of the year

36:24 - 36:25

in terms of usage.

36:25 - 36:26

And you've made huge strides

36:27 - 36:28

in terms of users,

36:28 - 36:29

and you know, we hear on

36:29 - 36:30

CNBC a lot, like big

36:30 - 36:32

investors who are huge fans.

36:32 - 36:33

Yeah. Jensen Huang himself

36:33 - 36:34

right? He mentioned it the

36:34 - 36:35

other, in his keynote.

36:35 - 36:37

Yeah. The other night.

36:37 - 36:38

He's a pretty regular user,

36:38 - 36:39

actually, he's not just

36:39 - 36:40

saying it. He's actually a

36:40 - 36:41

pretty regular user.

36:42 - 36:43

So, a year ago we weren't

36:43 - 36:44

even talking about

36:44 - 36:45

monetization because you

36:45 - 36:46

guys were just so new and

36:46 - 36:48

you wanted to, you know,

36:48 - 36:49

get yourselves out there and

36:49 - 36:50

build some scale, but now

36:50 - 36:51

you are looking at things

36:51 - 36:53

like that, increasingly an

36:53 - 36:54

ad model, right?

36:54 - 36:55

Yeah, we're experimenting

36:55 - 36:56

with it.

36:56 - 36:58

I know there's some

36:58 - 37:00

controversy on like,

37:00 - 37:01

why should we do ads?

37:01 - 37:03

Whether you can have a

37:03 - 37:04

truthful answer engine

37:04 - 37:05

despite having ads.

37:06 - 37:08

And in my opinion,

37:08 - 37:10

we've been pretty

37:10 - 37:11

proactively thoughtful about

37:11 - 37:12

it where we said,

37:13 - 37:14

okay, as long as the answer

37:14 - 37:15

is always accurate,

37:15 - 37:17

unbiased and not corrupted

37:17 - 37:19

by someone's advertising

37:19 - 37:21

budget, only you get to see

37:21 - 37:23

some sponsored questions,

37:23 - 37:24

and even the answers to

37:24 - 37:25

those sponsored questions

37:25 - 37:27

are not influenced by them,

37:27 - 37:30

and questions are also not

37:30 - 37:31

picked in a way where it's

37:31 - 37:33

manipulative. Sure,

37:34 - 37:35

there are some things that

37:35 - 37:36

the advertiser also wants,

37:36 - 37:37

which is they want you to

37:37 - 37:38

know about their brand, and

37:38 - 37:39

they want you to know the

37:39 - 37:41

best parts of their brand,

37:41 - 37:42

just like how you go,

37:42 - 37:43

and if you're introducing

37:43 - 37:44

yourself to someone you want

37:44 - 37:45

to, you want them to see the

37:45 - 37:47

best parts of you, right?

37:47 - 37:48

So that's all there.

37:48 - 37:50

But you still don't have to

37:50 - 37:51

click on a sponsored

37:51 - 37:53

question. You can ignore it.

37:53 - 37:54

And we're only charging them

37:54 - 37:55

CPM right now.

37:55 - 37:57

So we're not we ourselves

37:57 - 37:58

are not even incentivized to

37:58 - 37:59

make you click yet.
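CPM billing, as described here, charges per thousand impressions, so revenue does not depend on whether anyone clicks, which is the incentive point being made. A minimal sketch of the arithmetic (the rates below are made up for illustration):

```python
def cpm_revenue(impressions: int, cpm_rate: float) -> float:
    """CPM = cost per mille (per 1,000 impressions); clicks don't enter the formula."""
    return impressions / 1000 * cpm_rate

# 2.5 million sponsored-question impressions at a hypothetical $20 CPM:
print(cpm_revenue(2_500_000, 20.0))  # 50000.0
```

Under a cost-per-click model, by contrast, revenue would scale with click-through rate, which is exactly the pressure toward click-maximizing design that the CPM choice avoids.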

38:00 - 38:02

So I think considering all

38:02 - 38:03

this, we're actually trying

38:03 - 38:05

to get it right long term.

38:05 - 38:06

Instead of going the Google

38:06 - 38:08

way of forcing you to click

38:08 - 38:08

on links. I remember when

38:08 - 38:09

people were talking about

38:09 - 38:10

the commoditization of

38:10 - 38:11

models a year ago and you

38:11 - 38:12

thought, oh, it was

38:12 - 38:14

controversial, but now it's

38:14 - 38:15

not controversial. It's kind

38:15 - 38:16

of like that's happening and

38:16 - 38:17

you're keeping your eye on

38:17 - 38:19

that is smart.

38:19 - 38:20

By the way, we benefit a lot

38:20 - 38:22

from model commoditization,

38:22 - 38:23

except we also need to

38:23 - 38:24

figure out something to

38:24 - 38:26

offer to the paid users,

38:26 - 38:27

like a more sophisticated

38:27 - 38:29

research agent that can do

38:29 - 38:30

like multi-step reasoning,

38:30 - 38:31

go and like do like 15

38:31 - 38:32

minutes worth of searching

38:32 - 38:34

and give you like an

38:34 - 38:35

analysis, an analyst type of

38:35 - 38:37

answer. All that's going to

38:37 - 38:38

come, all that's going to

38:38 - 38:39

stay in the product. Nothing

38:39 - 38:41

changes there. But there's a

38:41 - 38:43

ton of questions every free

38:43 - 38:45

user asks on a day-to-day basis

38:45 - 38:46

that need to be quick,

38:46 - 38:48

fast answers, like it

38:48 - 38:49

shouldn't be slow,

38:49 - 38:51

and all that will be free,

38:51 - 38:52

whether you like it or not,

38:52 - 38:53

it has to be free. That's

38:53 - 38:54

what people are used to.

38:55 - 38:57

And that means like figuring

38:57 - 38:58

out a way to make that free

38:58 - 39:00

traffic also monetizable.

39:00 - 39:01

So you're not trying to

39:01 - 39:02

change user habits. But it's

39:02 - 39:03

interesting because you are

39:03 - 39:04

kind of trying to teach new

39:04 - 39:05

habits to advertisers.

39:05 - 39:07

They can't have everything

39:07 - 39:08

that they have in a Google

39:08 - 39:09

ten blue links search.

39:09 - 39:10

What's the response been

39:10 - 39:11

from them so far? Are they

39:11 - 39:12

willing to accept some of

39:12 - 39:13

the trade offs?

39:13 - 39:14

Yeah, I mean that's why they

39:14 - 39:17

are trying stuff like Intuit

39:17 - 39:18

is working with us.

39:18 - 39:20

And then there's many other

39:20 - 39:23

brands. Dell, like all these

39:23 - 39:24

people are working with us

39:24 - 39:25

to test, right?

39:26 - 39:27

They're also excited about,

39:28 - 39:30

look, everyone knows that,

39:30 - 39:31

like, whether you like it or

39:31 - 39:33

not, 5 or 10 years from now,

39:33 - 39:34

most people are going to be

39:34 - 39:36

asking AIs most of the

39:36 - 39:37

things, and not on the

39:37 - 39:38

traditional search engine,

39:38 - 39:40

everybody understands that.

39:40 - 39:43

So everybody wants to be

39:43 - 39:45

early adopters of the new

39:45 - 39:47

platforms, new UX,

39:47 - 39:48

and learn from it,

39:48 - 39:49

and build things together.

39:49 - 39:51

Not like they're not viewing

39:51 - 39:52

it as like, okay, you guys

39:52 - 39:53

go figure out everything

39:53 - 39:54

else and then we'll come

39:54 - 39:54

later.

39:55 - 39:56

I'm smiling because it goes

39:56 - 39:57

back perfectly to the point

39:57 - 39:58

you made when you first sat

39:58 - 40:00

down today, which is

40:00 - 40:01

necessity is the mother of

40:01 - 40:03

all invention,

40:03 - 40:03

right? And that's what

40:03 - 40:04

advertisers are essentially

40:04 - 40:05

looking at. They're saying

40:05 - 40:06

this field is changing.

40:06 - 40:07

We have to learn to adapt

40:07 - 40:09

with it. Okay,

40:09 - 40:10

Arvind, I took up so much of

40:10 - 40:11

your time. Thank you so much

40:11 - 40:12

for taking the time.

Article Title: Deepseek: China's AI Breakthrough and Its Implications for the Global Landscape

In the realm of artificial intelligence (AI), a groundbreaking new open-source AI model developed by a Chinese research lab called Deepseek has positioned China as a major player in the global AI race. Developed for less than $6 million, the model outperforms some of the most powerful models created by American tech giants like OpenAI, Google, and Meta. Deepseek's success challenges the traditional approach of investing billions in closed-source models and puts a new premium on innovation and efficiency, with major implications for the AI landscape.

Insights into Deepseek's Impact:

Deepseek's ability to accomplish so much with limited resources highlights the value of creativity, efficient use of hardware, and open-source development. Through cost-effective strategies and innovative approaches to model training, Deepseek has not only caught up to but, by some measures, surpassed many established players in the AI field.

The Rise of Open-Source Models:

The success of Deepseek underscores the growing importance of open-source models in driving AI innovation. Developers now have access to advanced models at a fraction of the previous cost, fueling a wave of new applications across industries. This shift toward open source may redefine the AI landscape, as collaboration and decentralization begin to drive faster and more efficient innovation than closed ecosystems can offer.

Navigating the AI Competition:

With China rapidly closing the gap in AI development, the competition between China and the US has escalated. The need for continuous innovation and adaptability in the face of evolving technologies is becoming increasingly vital. As Deepseek's achievements demonstrate, necessity can drive groundbreaking solutions that challenge the status quo and reshape industry norms.

Implications for AI Investment and Development:

The emergence of cost-effective, high-performing models like Deepseek raises questions about the future trajectory of AI investment. Where established labs have spent billions on closed-source models, Deepseek shows that remarkable results can be achieved at a fraction of the cost. This could shift investment patterns toward more efficient and innovative approaches to AI development.

Conclusion:

As the AI landscape evolves, models like Deepseek are paving the way for a new era of innovation and efficiency in artificial intelligence. By demonstrating the power of open-source collaboration and cost-effective development, Deepseek has set a new benchmark for the industry. As the global AI race intensifies, the ability to adapt, innovate, and leverage emerging technologies will be key to maintaining a competitive edge.