
[Music]

Sam Altman: Hello, welcome to the 12 Days of OpenAI. We're going to try something that, as far as we know, no tech company has done before, which is that every day for the next 12 weekdays we are going to launch or demo some new thing that we've built. We think we've got some great stuff for you starting today that we hope you'll really love. We'll try to make this fun and fast and not take too long, but it'll be a way to show you what we've been working on, and a little holiday present from us.

So we'll jump right into this first day. Today we

00:47 - 00:51

actually have two things to launch the

00:49 - 00:53

first one is the full version of 01 we

00:51 - 00:54

have been very hard at work we've

00:53 - 00:56

listened to your feedback you want uh

00:54 - 00:59

you like ow1 preview but you want it to

00:56 - 01:00

be smarter and faster and be multimodal

00:59 - 01:02

and be better instruction following a

01:00 - 01:04

bunch of other things so we' put a lot

01:02 - 01:06

of work into this and for scientists

01:04 - 01:08

engineers coders we think they will

01:06 - 01:10

really love this new model uh I'd like

to show you quickly how it performs. You can see the jump from GPT-4o to o1-preview across competition math, competition coding, and GPQA Diamond, and you can see that o1 is a pretty big step forward. It's also much better in a lot of other ways, but raw intelligence is something that we care about, and coding performance in particular is an area where people are using the model a lot. So in just a minute these guys

will demo some things about o1. They'll show you how it does at speed, how it does at really hard problems, and how it does with multimodality. But first I want to talk just for a minute about the second thing we're launching today. A lot of people are power users of ChatGPT at this point; they really use it a lot, and they want more compute than $20 a month can buy. So we're launching a new tier,

ChatGPT Pro. Pro has unlimited access to our models, and also things like Advanced Voice Mode. It also has a new thing called o1 pro mode. o1 is the smartest model in the world now, except for o1 being used in pro mode, and for the hardest problems that people have, o1 pro mode lets you do even a little bit better. You can see it at competition math, and you can see it at GPQA Diamond. These boosts may look small, but in complex workflows where you're really pushing the limits of these models, it's pretty significant.

I'll show you one more thing about pro mode. One thing that people have really said they want is reliability, and here you can see how the reliability of an answer from pro mode compares to o1; this is an even stronger delta. Again, for our Pro users, we've heard a lot about how much people want this. ChatGPT Pro is $200 a month and launches today. Over the course of these 12 days we have some other things to add to it that we think you'll also really love, but it starts with unlimited model use and this new o1 pro mode. So I want

to jump right in, and we'll show some of those demos that we talked about. These are some of the guys that helped build o1, with many other people behind them on the team.

"Thanks, Sam. Hi, I'm Hyung Won." "I'm Jason." "And I'm Max." We're all research scientists who worked on building o1. o1 is really distinctive

because it's the first model we've trained that thinks before it responds, meaning it gives much better, and often more detailed and more correct, responses than other models you might have tried. o1 is being rolled out today to all Plus, and soon to Pro, subscribers on ChatGPT, replacing o1-preview.

The o1 model is faster and smarter than the o1-preview model, which we launched in September. After the launch, many people asked about multimodal input, so we added that. So now the o1 model live today is able to reason through both images and text

jointly.

As Sam mentioned, today we're also going to launch a new tier of ChatGPT called ChatGPT Pro. ChatGPT Pro offers unlimited access to our best models, like o1, GPT-4o, and Advanced Voice. ChatGPT Pro also has a special way of using o1 called o1 pro mode. With o1 pro mode, you can ask the model to use even more compute to think even harder on some of the most difficult

problems.

We think the audience for ChatGPT Pro will be the power users of ChatGPT, those who are already pushing the models to the limits of their capabilities on tasks like math, programming, and writing. It's been amazing to see how much people are pushing o1-preview, how much people who do technical work all day get out of this, and we're really excited to let them push it further.

Yeah, we also really

think that o1 will be much better for everyday use cases, not necessarily just really hard math and programming problems. In particular, one piece of feedback we received about o1-preview constantly was that it was way too slow. It would think for 10 seconds if you said "hi" to it, and we fixed that. "That was really annoying." "It was kind of funny, honestly. It really cared; it really thought hard about saying hi back." Yeah, and so we fixed that: o1 will now

think much more intelligently. If you ask it a simple question, it'll respond really quickly, and if you ask it a really hard question, it'll think for a really long time. We ran a pretty detailed suite of human evaluations for this model, and what we found was that it made major mistakes about 34% less often than o1-preview, while thinking fully about 50% faster. We think this will be a really, really noticeable difference

for all of you.

I really enjoy just talking to these models. I'm a big history buff, and I'll show you a really quick demo of, for example, the sort of question that I might ask one of these models. So right here, on the left I have o1, and on the right I have o1-preview, and I'm just asking a really simple history question: list the Roman emperors of the second century, and tell me about their dates and what they did. Not hard, but, you know, GPT-4o actually gets this wrong a

reasonable fraction of the time. I've asked o1 this, and I've asked o1-preview this. I tested this offline a few times, and I found that o1 on average responded about 60% faster than o1-preview. This could be a little bit variable, because right now we're in the process of swapping all our GPUs from o1-preview to o1. So o1 actually thought for about 14 seconds; o1-preview is still going. "There's a lot of Roman emperors."

There are a lot of Roman emperors, yeah. GPT-4o actually gets this wrong a lot of the time; there are a lot of folks who ruled for like six days, twelve days, a month, and it sometimes forgets those. "Can you do them all from memory, including the six-day people?" No. Yep, so here we go: o1 thought for

about 14 seconds, and o1-preview thought for about 33 seconds. These should both be faster once we finish deploying, but we wanted this to go live right now. "Exactly." So yeah, we think you'll really enjoy talking to this model. We found that it gave great responses and it thought much faster; it should just be a much better user experience for everyone.

One other feature we know that people really wanted for everyday use cases, and that we've had requested a lot, is multimodal inputs and image understanding, and Hyung Won is going to talk about that now.

Hyung Won: Yep. To illustrate

the multimodal input and reasoning, I created this toy problem with some hand-drawn diagrams and so on. So here it is. It's hard to see, so I already took a photo of this; let's look at this photo on a laptop. Once you upload the image into ChatGPT, you can click on it to see the zoomed-in version. So this is a system of a data center in space. Maybe in the future we might want to train AI models in space. "I think we should do that, but the power number looks a little low. One gigawatt?" Okay, but the general idea... "Rookie numbers." Rookie numbers, okay. Yeah, so we

have a sun right here, taking in power on this solar panel, and then there's a small data center here. "That's exactly what they look like." Yeah, a GPU rack, and then a nice pump here. One interesting thing about operation in space is that on Earth we can do air cooling or water cooling to cool the GPUs, but in space there's nothing there, so we have to radiate this heat into deep space, and that's why we need this giant radiator cooling panel. This problem is about finding a lower-bound estimate of the cooling panel area required to operate this 1-gigawatt data center. "Probably going to be very big." Yeah, let's see how big it is. So

that's the problem, and I'm going to type this prompt. Yeah, this is essentially asking for that. So let me hit go, and the model will think for a few seconds.

"By the way, most people don't know this, but I've been working with Hyung Won for a long time. Hyung Won actually has a PhD in thermodynamics, which is totally unrelated to AI, and you always joke that you haven't been able to use your PhD work in your job until today. So you can trust Hyung Won on this analysis."

Finally, finally. Thanks for hyping it up; now I really have to get this right. Okay, so the model finished thinking in only 10 seconds; it's a simple problem. So let's see how the model did it. So, power input: first of all, this one gigawatt was only drawn in the paper, and the model was able to pick that up nicely. And then, radiative heat transfer only: that's the thing I mentioned, in space there's nothing else. And then some simplifying choices. One critical thing is that I intentionally made this problem under-specified, meaning that the critical parameter, the temperature of the cooling panel, was left out, so that we can test the model's ability to handle ambiguity and so on. So the

model was able to recognize that this is actually an unspecified but important parameter, and it actually picked the right range of temperature, which is about room temperature. With that, it continues to the analysis, does a whole bunch of things, and then finds the area, which is 2.42 million square meters. Just to get a sense of how big this is: it's about 2% of the land area of San Francisco. "This is huge." Not that bad, not that bad. Yeah.
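The estimate in this demo follows directly from the Stefan-Boltzmann law: a panel at temperature T radiates at most sigma * T^4 watts per square meter, so dumping power P needs area A = P / (sigma * T^4). A minimal sketch of that arithmetic, assuming (as the demo suggests but does not fully show) a panel near room temperature and ideal emissivity:

```python
# Lower-bound radiator area for a data center in space,
# from the Stefan-Boltzmann law: P = emissivity * sigma * A * T^4.
SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiator_area(power_w: float, panel_temp_k: float, emissivity: float = 1.0) -> float:
    """Smallest panel area able to radiate `power_w` at `panel_temp_k`."""
    return power_w / (emissivity * SIGMA * panel_temp_k ** 4)

area = radiator_area(1e9, 290.0)        # 1 GW, panel near room temperature
print(f"{area / 1e6:.2f} million m^2")  # close to the 2.42 million m^2 in the demo
print(f"{area / 121e6:.0%} of San Francisco's ~121 km^2 of land")
```

A real panel radiates from both faces and sits above room temperature, which is why this is only a lower-bound, order-of-magnitude check rather than a design number.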

Okay. Yeah, so I guess this is reasonable. I'll skip through the rest of the details, but I think the model did a great job making nice, consistent assumptions that make the required area as small as possible. So this is the demonstration of multimodal reasoning, and this is a simple problem, but o1 is actually very strong: on standard benchmarks like MMMU and MathVista, o1 actually has state-of-the-art

performance. Now Jason will showcase pro mode.

Jason: Great. So I want to give a short demo of ChatGPT o1 pro mode. People will find o1 pro mode the most useful for, say, hard math, science, or programming problems. So here I have a pretty challenging chemistry problem that o1-preview usually gets incorrect, and so I will let the model start

thinking um one thing we've learned with

11:24 - 11:29

these models is that uh for these very

11:27 - 11:31

challenging problems the model can think

11:29 - 11:32

for up to a few minutes I think for this

11:31 - 11:35

problem the model usually thinks

11:32 - 11:37

anywhere from 1 minute to up to 3

11:35 - 11:39

minutes um and so we have to provide

11:37 - 11:41

some entertainment for for people while

11:39 - 11:43

the model is thinking so I'll describe

11:41 - 11:45

the problem a little bit and then if the

11:43 - 11:48

model is still thinking when I'm done

11:45 - 11:51

I've prepared a dad joke for for us uh

11:48 - 11:52

to fill the rest of the time um so I

11:51 - 11:56

hope it thinks for a long

11:52 - 11:59

time you can see uh the problem asks for

11:56 - 12:01

a protein that fits a very specif

11:59 - 12:04

specific set of criteria so uh there are

12:01 - 12:06

six criteria and the challenge is each

12:04 - 12:08

of them ask for pretty chemistry domain

12:06 - 12:09

specific knowledge that the model would

12:08 - 12:11

have to

12:09 - 12:14

recall and the other thing to know about

12:11 - 12:16

this problem uh is that none of these

12:14 - 12:18

criteria actually give away what the

12:16 - 12:20

correct answer is so for any given

12:18 - 12:23

criteria there could be dozens of

12:20 - 12:24

proteins that might fit that criteria

12:23 - 12:26

and so the model has to think through

12:24 - 12:27

all the candidates and then check if

12:26 - 12:30

they fit all the

12:27 - 12:33

criteria okay so you could see the model

12:30 - 12:36

actually was faster this time uh so it

12:33 - 12:38

finished in 53 seconds you can click and

12:36 - 12:40

see some of the thought process that the

12:38 - 12:42

model went through to get the answer uh

12:40 - 12:44

you could see it's uh thinking about

12:42 - 12:46

different candidates like neural Lian

12:44 - 12:49

initially um and then it arrives at the

12:46 - 12:51

correct answer which is uh retino chisen

12:49 - 12:54

uh which is

12:51 - 12:59

great um okay so to summarize um we saw

12:54 - 13:02

from Max that o1 is smarter and faster

12:59 - 13:05

than uh o1 preview we saw from hangan

13:02 - 13:08

that oan can now reason over both text

13:05 - 13:11

and images and then finally we saw with

13:08 - 13:15

Chach BT Pro mode uh you can use o1 to

13:11 - 13:17

think about uh the the to to to to

13:15 - 13:20

reason about the hardest uh science and

13:17 - 13:23

math problems yep there's more to come

for the ChatGPT Pro tier. We're working on even more compute-intensive tasks to power longer and bigger tasks, for those who want to push the model even further. And we're still working on adding tools to the o1 model, such as web browsing, file uploads, and things like that. We're also hard at work to bring o1 to the API. We're going to be adding some new features for developers: structured outputs, function calling, developer messages, and API image understanding, which we think you'll really enjoy. We expect this to be a great model for developers, and to really unlock a whole new frontier of agentic things you guys can build. We hope you love it as much as we

do.

Sam Altman: That was great. Thank you guys so much, and congratulations to you and the team on getting this done. We really hope that you'll enjoy o1 and pro mode, or the Pro tier. We have a lot more stuff to come: tomorrow we'll be back with something great for developers, and we'll keep going from there. Before we wrap up, can we hear your joke?

Jason: Yes.

So I made this joke this morning. The joke is this: Santa was trying to get his large language model to do a math problem, and he was prompting it really hard, but it wasn't working. How did he eventually fix it? "No idea." He used reindeer-enforcement learning.

Thank you very much. Thank you.


[Music]

Enhancing AI Capabilities for Better User Experience

In its "12 Days of OpenAI" series, OpenAI is launching or demoing something new every day for twelve days. This article summarizes day one, which introduced two releases aimed at improving how users interact with its AI models.

Unveiling the Next Generation: the o1 Model and ChatGPT Pro

OpenAI opened with two launches. The full version of the o1 model takes center stage, with improved intelligence, speed, and multimodal capabilities. Compared with GPT-4o and o1-preview, it posts clear gains on benchmarks spanning competition math, competition coding, and GPQA Diamond.

Additionally, OpenAI introduced ChatGPT Pro, a $200-per-month tier for power users who need more than the standard subscription offers. It includes unlimited access to models such as o1, GPT-4o, and Advanced Voice, plus o1 pro mode, which directs the model to spend more compute thinking about the hardest problems.
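The pro mode pitch is less about one-shot scores than about reliability: getting the answer right every time, not just once. One common way to score that is to grade a question as solved only if the model answers correctly on all of several attempts. A hypothetical sketch of that stricter metric (the grading scheme and sample data below are illustrative, not OpenAI's actual evaluation code):

```python
def pass_at_1(attempts_per_q, correct):
    """Fraction of questions whose first attempt is correct."""
    return sum(a[0] == c for a, c in zip(attempts_per_q, correct)) / len(correct)

def strict_reliability(attempts_per_q, correct):
    """Fraction of questions answered correctly on *every* attempt (e.g. 4 of 4)."""
    return sum(all(a == c for a in attempts)
               for attempts, c in zip(attempts_per_q, correct)) / len(correct)

# Illustrative data: four attempts per question.
correct = ["A", "B", "C"]
attempts = [
    ["A", "A", "A", "A"],  # always right
    ["B", "B", "X", "B"],  # right on the first try, but flaky
    ["C", "C", "C", "C"],  # always right
]
print(pass_at_1(attempts, correct))          # 1.0
print(strict_reliability(attempts, correct)) # 2/3: one question fails the all-attempts bar
```

Under a metric like this, a model can look perfect on first attempts yet score much lower on strict reliability, which is why small average-accuracy boosts can matter a lot in practice.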

Navigating the Evolution: Demos and User Experience Enhancements

To showcase the upgraded models, the OpenAI team ran live demonstrations. o1 thinks about 50% faster than o1-preview and makes major mistakes about 34% less often, a noticeable improvement for everyday use. Multimodal input lets the model reason jointly over images and text, demonstrated with a hand-drawn diagram of a 1-gigawatt data center in space.

The spotlight then shifted to o1 pro mode, designed for hard science, math, and programming problems. In the demo, it identified a protein matching six domain-specific criteria, reasoning through dozens of candidate proteins and checking each against all of the criteria.
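The structure of that protein demo, with many candidates, several independent criteria, and only candidates satisfying every criterion surviving, is essentially a conjunctive filter over a knowledge base. A toy sketch of the search pattern (the candidate names and predicates below are made up for illustration):

```python
from typing import Callable, Iterable

# Hypothetical knowledge base: candidate name -> known properties.
CANDIDATES = {
    "protein_a": {"secreted": True,  "binds_metal": True,  "kda": 24},
    "protein_b": {"secreted": True,  "binds_metal": False, "kda": 24},
    "protein_c": {"secreted": False, "binds_metal": True,  "kda": 90},
}

# Each criterion is a predicate over a candidate's properties.
CRITERIA: list[Callable[[dict], bool]] = [
    lambda p: p["secreted"],
    lambda p: p["binds_metal"],
    lambda p: p["kda"] < 50,
]

def matching(candidates: dict, criteria: Iterable[Callable]) -> list[str]:
    """Keep only the candidates that satisfy *every* criterion."""
    return [name for name, props in candidates.items()
            if all(check(props) for check in criteria)]

print(matching(CANDIDATES, CRITERIA))  # ['protein_a']
```

The hard part for the model is not this final filter but recalling the candidate set and each candidate's properties from domain knowledge; the demo's six criteria each narrow the field without any single one revealing the answer.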

Embracing the Future: Ongoing Developments and Developer Tools

Looking ahead, OpenAI is working on more compute-intensive features for ChatGPT Pro, tools for o1 such as web browsing and file uploads, and an API release with developer-facing features including structured outputs, function calling, developer messages, and image understanding.

In conclusion, day one of the "12 Days of OpenAI" delivered the full o1 model and ChatGPT Pro, pairing a smarter, faster, multimodal model with a tier built for the users who push it hardest. More announcements follow over the remaining days.
