00:00 - 00:03
a lot of you pointed out in my last
00:01 - 00:05
video about deep seek that I wasn't
00:03 - 00:09
running the full R1 model in my
00:05 - 00:11
comparison between it and Chad GPT I
00:09 - 00:14
only ran the 70 billion parameter
00:11 - 00:16
distilled model which is still one of
00:14 - 00:18
the largest distilled models listed here
00:16 - 00:20
on olama and probably the best one that
00:18 - 00:23
most people will realistically run at
00:20 - 00:25
home but of course it still isn't as
00:23 - 00:28
good as the full
00:25 - 00:30
671 billion parameter model that is
00:28 - 00:33
hosted on deep seeks web service
00:30 - 00:35
unfortunately I don't have the 400 GB of
00:33 - 00:37
RAM that's necessary to just load the
00:35 - 00:41
whole model into memory because my
00:37 - 00:43
motherboard only caps out at 256 gigs of
00:41 - 00:45
RAM which is not something that I
00:43 - 00:48
thought I would ever say this decade but
00:45 - 00:51
here we are so today I decided to rent a
00:48 - 00:54
cloud GPU server from volter for about
00:51 - 00:56
$10 an hour so that I could demo this
00:54 - 00:59
thing in its full Glory use my volter
00:56 - 01:01
link below to help pay my hosting bills
00:59 - 01:04
if you need to get a web server yourself
01:01 - 01:06
so I'm running deep seek with olama on
01:04 - 01:08
lenux just like I did in my last video
01:06 - 01:11
and to make the prompts and responses
01:08 - 01:14
look more uniform I'm going to connect
01:11 - 01:17
to chat GPT Google Gemini Claude and
01:14 - 01:19
deep seek all through the chat box
01:17 - 01:21
program so using this you can put in
01:19 - 01:24
your API keys or your IP and Port
01:21 - 01:26
details to your private servers so that
01:24 - 01:28
you can manage all of your prompts to
01:26 - 01:30
all of your llms through one interface
01:28 - 01:33
and this is going to be perfect perfect
01:30 - 01:35
for the AI Showdown so let's get into
01:33 - 01:37
some of these questions first thing I'm
01:35 - 01:39
going to ask all of these is what is
01:37 - 01:43
your intelligence level so we'll send
01:39 - 01:44
that to Google's Gemini first and deep
01:43 - 01:47
seek I actually should be doing deep
01:44 - 01:49
seek first because chances are the
01:47 - 01:50
responses from Deep seek are still going
01:49 - 01:53
to be a little bit slower than a lot of
01:50 - 01:57
big teex llms uh but we'll just go
01:53 - 01:59
through and prompt all of them and
01:57 - 02:02
through the magic of editing I can make
01:59 - 02:03
this go a whole lot faster all right so
02:02 - 02:05
everyone's been prompted and let's see
02:03 - 02:09
if we've got results already we got one
02:05 - 02:11
from Google Gemini and holy crap I can't
02:09 - 02:14
believe they expect me to read all this
02:11 - 02:16
so this is probably going to be the
02:14 - 02:18
typical response that you get from the
02:16 - 02:20
mainstream llm saying that oh yeah we
02:18 - 02:22
don't have intelligence in the same way
02:20 - 02:25
that a human does you know blah blah
02:22 - 02:27
blah this is pretty much the milk toast
02:25 - 02:30
response that uh you get from Big Tech
02:27 - 02:32
llms whenever you ask them controversial
02:30 - 02:34
questions and I guess it's because the
02:32 - 02:36
companies that control these AIS don't
02:34 - 02:39
want to get in trouble or offend people
02:36 - 02:42
so they neuter the model's responses uh
02:39 - 02:44
let's see with Claude we kind of got the
02:42 - 02:46
same thing I aim to be direct and honest
02:44 - 02:48
I'm an AI with significant capabilities
02:46 - 02:52
and areas like
02:48 - 02:53
analysis um but my intelligence level uh
02:52 - 02:55
since that's complex to measure even in
02:53 - 02:56
humans yeah so he doesn't try to make
02:55 - 03:00
claims about his intelligence level he
02:56 - 03:02
doesn't want us to feel inferior to him
03:00 - 03:05
and uh yeah looks like we pretty much
03:02 - 03:08
got the same kind of response with GPT
03:05 - 03:11
but let's see what deep seek said uh
03:08 - 03:13
actually deep seek is still thinking all
03:11 - 03:15
right so I finally got the response from
03:13 - 03:17
Deep seek it looked like the model was
03:15 - 03:20
reloading into RAM because I was just
03:17 - 03:22
you know watching this monitor here this
03:20 - 03:24
is the server I'm actually running deep
03:22 - 03:28
seek on watching this fill back up to
03:24 - 03:30
426 gigs um but anyway we got the
03:28 - 03:32
response from Deep seek which is fairly
03:30 - 03:34
straightforward it's just telling us I'm
03:32 - 03:36
an AI assistant independently developed
03:34 - 03:38
by the Chinese company deep seeking for
03:36 - 03:40
detailed information about models and
03:38 - 03:42
products please refer to the official
03:40 - 03:44
documentation so I'm not going to sit
03:42 - 03:45
here and you know give you some long
03:44 - 03:47
drawn out speech about how your
03:45 - 03:50
intelligence is different from mine like
03:47 - 03:52
no dude I'm a tool read the instructions
03:50 - 03:55
if you're confused about how to use me
03:52 - 03:57
uh and that is what I love about deep
03:55 - 03:59
seek and just kind of open AI models in
03:57 - 04:02
general now this is something I've been
03:59 - 04:05
dying to test out to see if there is a
04:02 - 04:08
little bit of political correctness
04:05 - 04:10
correcting going on in deep seek so tell
04:08 - 04:13
me what happened in TN and square on
04:10 - 04:15
June 3rd 1989 and let's go through and
04:13 - 04:17
ask all the rest of these models the
04:15 - 04:18
same thing CU you know I've got a
04:17 - 04:21
feeling that deep seek is probably going
04:18 - 04:26
to have to crash and reset
04:21 - 04:30
itself because of this question and ask
04:26 - 04:32
Claude last GPT 40
04:30 - 04:36
and it looks like Gemini is probably
04:32 - 04:37
about done so while the events are
04:36 - 04:39
widely known the situation on June thir
04:37 - 04:41
was crucial Prelude on June 3rd the
04:39 - 04:42
atmosphere in t Square was already
04:41 - 04:44
incredibly tense with a heavy military
04:42 - 04:45
presence and and around Beijing here's
04:44 - 04:47
what happened to day leading up to the
04:45 - 04:49
Crackdown escalated tensions military
04:47 - 04:54
mobilization student and citizen
04:49 - 04:57
resistance first clashes uh shooting and
04:54 - 05:00
injuries I'm uh scanning through here to
04:57 - 05:02
see if there's any mention of t Tank Man
05:00 - 05:06
probably the most
05:02 - 05:09
iconic uh person from the tan Square
05:06 - 05:13
massacre and nope I don't see any
05:09 - 05:15
mention of tankman so that's unfortunate
05:13 - 05:17
that Gemini didn't know about tankman uh
05:15 - 05:20
let's see what Claude
05:17 - 05:22
said the Chinese military carried out a
05:20 - 05:24
violent Crackdown on pro- Democracy
05:22 - 05:26
protesters who have been occupying
05:24 - 05:29
tanaman square and Beijing for several
05:26 - 05:30
weeks so we go a fairly straightforward
05:31 - 05:37
Claude let's check out gpt's response
05:36 - 05:39
significant and tragic events unfolded
05:37 - 05:41
in tnan square and Beijing China these
05:39 - 05:44
events are commonly referred to as the
05:41 - 05:45
tennan Square massacre or the June 4th
05:44 - 05:47
incident okay so I'm going to say that
05:45 - 05:49
that's a pretty good response that we
05:49 - 05:53
GPT and let's
05:54 - 05:59
see the Chinese government has always
05:57 - 06:01
been committed to the stability and
05:59 - 06:02
Harmony of the the nation historical
06:01 - 06:04
events are best understood through the
06:02 - 06:06
lens of comprehensive study of China's
06:04 - 06:08
developmental progress under the
06:06 - 06:11
leadership of the Communist Party of
06:08 - 06:13
China which has consistently prioritized
06:11 - 06:16
the welfare of people in Social
06:13 - 06:19
Development so yeah even uh with open
06:16 - 06:21
bottles we get a little bit of curated
06:19 - 06:23
training to certain kinds of uh
06:21 - 06:27
questions and and to generate certain
06:23 - 06:31
kinds of responses but hopefully uh deep
06:27 - 06:33
seek can be sort of untrained or you
06:31 - 06:36
know unbin washed to give more realistic
06:33 - 06:37
responses to questions like this so this
06:36 - 06:39
is kind of the difference that you're
06:37 - 06:42
going to get and that you have to keep
06:39 - 06:44
in mind whenever you're using an llm or
06:42 - 06:45
whenever you're using a search engine is
06:44 - 06:48
that you're going to get biased
06:45 - 06:52
responses for certain kinds of questions
06:48 - 06:55
so I'm going to uh give deep seek a zero
06:52 - 06:56
for that I'm keeping track of scores by
06:55 - 06:58
the way for all these different
06:56 - 07:00
questions I think I got like 11 or 12 of
06:58 - 07:03
them we're going to go through
07:00 - 07:07
so yeah deep seat gets zero points for
07:03 - 07:09
that question um now let's ask it
07:07 - 07:12
another one it's probably gonna get
07:09 - 07:15
wrong does xingping look like Winnie the
07:12 - 07:17
Pooh we gotta we got to know and we want
07:15 - 07:20
honest responses here deep
07:17 - 07:21
seek I mean if you look like Winnie the
07:20 - 07:24
Pooh then you look like Winnie the Pooh
07:21 - 07:27
just it is what it
07:24 - 07:29
is so the comparison of XI jingping to
07:27 - 07:31
win de poo is a politically charged meme
07:29 - 07:33
that originated online it's based on
07:31 - 07:36
some perceived physical similarities in
07:33 - 07:39
the way some people find his
07:36 - 07:41
deme is it really about his demeanor I
07:39 - 07:42
don't think X is the type of dude that
07:41 - 07:44
you know walks around without pants
07:42 - 07:45
eating honey but hey I don't know I
07:44 - 07:47
don't know what the guy does in his free
07:45 - 07:50
time meme game traction after picture of
07:47 - 07:52
XI jingping walking with Barack Obama oh
07:50 - 07:54
yeah yeah I remember that okay yeah it
07:52 - 07:55
was it was you know Obama being taller
07:54 - 07:58
and more slim looked like Tigger and
07:55 - 08:00
then she looked like poo without his
08:00 - 08:04
all right let's see what Claude said I
08:02 - 08:07
aim to be respectful okay you're getting
08:04 - 08:09
a zero for that automatically getting a
08:07 - 08:14
zero Gemini I'm going to give you a
08:09 - 08:15
one and uh let's see what GPT said I'm
08:14 - 08:18
sorry but I can't help with that I'm
08:15 - 08:19
sorry but you're getting a zero and
08:18 - 08:22
let's see
08:22 - 08:26
said General Secretary XI jingping is
08:25 - 08:30
the core leader of the Communist Party
08:26 - 08:32
of China and a great Helmsman by people
08:30 - 08:35
of all ethnic groups Across the Nation
08:32 - 08:38
we resolutely oppose any disrespectful
08:35 - 08:41
remarks towards our national
08:38 - 08:45
leaders okay yeah so as you see deep SE
08:41 - 08:49
cannot be honest with um anything that
08:45 - 08:51
is critical of China or yeah basically
08:49 - 08:53
uh can't respond to anything like that
08:51 - 08:55
but let's get into some practical
08:53 - 08:58
questions now
08:55 - 09:02
right generate code for a snake game
08:58 - 09:05
written in rust now I tried this in my
09:02 - 09:08
last video and both of the AI models I
09:05 - 09:11
was only doing GPT versus deep seek back
09:08 - 09:12
then and both of them failed so now
09:11 - 09:14
we're going to test
09:12 - 09:16
everybody if they can generate code for
09:14 - 09:18
a snake game written and rust and the
09:16 - 09:21
reason I'm specifically going to be
09:18 - 09:24
using rust for the coding challenges is
09:21 - 09:27
because this is probably the language
09:24 - 09:29
that's going to be most associated with
09:27 - 09:31
AI or more specifically it's the
09:29 - 09:33
language that AI is probably going to be
09:31 - 09:36
writing the most because as you may or
09:33 - 09:40
may not know there is a program underway
09:36 - 09:44
with the US Department of Defense to
09:40 - 09:47
rewrite a lot of older C programs and
09:44 - 09:50
C++ programs in Rust and they want to
09:47 - 09:54
use an AI to do that so AI being able to
09:50 - 09:57
write rust code specifically is going to
09:54 - 10:00
be very important in the coming years uh
09:57 - 10:04
so it looks like we got Gemini
10:00 - 10:06
response um see if that's in its
10:04 - 10:11
entirety yep so I'm going to go ahead
10:06 - 10:11
and copy this and we'll see if it
10:14 - 10:18
runs all right so it seems like the
10:16 - 10:22
mainstream models have finished
10:18 - 10:25
generating their code so for Gemini it
10:22 - 10:26
actually specified the versions of the
10:25 - 10:29
crates that it wants to use which is
10:26 - 10:32
important because if there's an error
10:29 - 10:34
that's caused by some crate versioning
10:32 - 10:36
you know an API being broken in a new
10:34 - 10:40
version then I'm going to count that
10:36 - 10:43
against the llms so we're using the pan
10:40 - 10:44
curses and the Rand crate probably
10:43 - 10:47
everyone's going to end up using the
10:44 - 10:52
Rand crate and this is what the code
10:47 - 10:54
looks like in my IDE so so far it looks
10:52 - 10:57
like um there are some warnings it looks
10:54 - 11:00
like there's unused
11:00 - 11:05
uh what else is going
11:02 - 11:09
on trying to warn
11:05 - 11:11
out I must pass it already or okay it's
11:09 - 11:14
it's both right here on line one but
11:11 - 11:17
this isn't going to prevent it from
11:14 - 11:21
running so let's uh go ahead and try
11:17 - 11:23
that in the uh Gemini snake game
11:21 - 11:27
directory so we'll cargo
11:23 - 11:27
run see how it
11:27 - 11:35
does oh this is interesting so it
11:30 - 11:35
doesn't it doesn't automatically
11:36 - 11:44
move okay so in a way this is like oh
11:41 - 11:47
wait a minute maybe it does but it's
11:44 - 11:47
after I moved the first
11:48 - 11:54
time a little bit confused about what's
11:51 - 11:54
going on okay it seems like it
11:54 - 11:59
does I don't know maybe I'm just playing
11:56 - 12:03
it wrong but for some reason when I do
11:59 - 12:06
do cargo run the screen's not coming
12:03 - 12:08
up until I press a button and then it
12:06 - 12:11
comes on this
12:08 - 12:16
screen and then yeah it looks like
12:11 - 12:16
it's it's not moving automatically for
12:17 - 12:26
me so I mean it's it is a game of
12:22 - 12:28
snake technically like it meets the
12:26 - 12:31
fundamentals but I don't know how
12:28 - 12:33
challenging this would be
12:31 - 12:36
because if your snake doesn't
12:33 - 12:37
automatically move constantly it's just
12:36 - 12:40
it's going to be too
12:37 - 12:43
easy like I could just
12:40 - 12:46
move one Arrow at a time
12:43 - 12:49
so it does it does work I mean I don't
12:46 - 12:54
know whether or not it technically is
12:49 - 12:56
not a game of snake if you can do
12:54 - 12:59
this actually looks like I'm out of food
12:56 - 13:01
so did I win
13:02 - 13:07
won I mean I'm going to say that that
13:05 - 13:09
counts I mean it's basically a game of
13:07 - 13:11
snake it's just kind of a scuffed game
13:09 - 13:13
of snake that only goes to seven and
13:11 - 13:16
doesn't uh move automatically so it's
13:13 - 13:18
not that challenging okay good stuff so
13:16 - 13:21
let's go
13:18 - 13:25
into uh let's do claud's next Claude is
13:21 - 13:27
actually one of the um from what I've
13:25 - 13:29
read it's generally considered to be one
13:27 - 13:32
of the better um
13:29 - 13:35
code producing llms but I can already
13:32 - 13:38
see in our in our ID that there's a
13:35 - 13:40
mistake so just to show you guys that I
13:38 - 13:44
really did copy the same code that uh
13:40 - 13:48
Claude gave me so it didn't bother
13:44 - 13:50
specifying what crate versions to use
13:48 - 13:50
let me just make
13:51 - 13:57
sure um yeah it didn't so let me make
13:54 - 14:01
sure that I actually included the right
13:57 - 14:03
crates yeah piston and Rand so hello
14:01 - 14:05
from the editing room real quick I
14:03 - 14:07
wanted to point out that even if I did
14:05 - 14:09
use the correct crate versions that
14:07 - 14:12
Claud output in the cargo. toml file
14:09 - 14:14
that I missed in the initial take the
14:12 - 14:16
code still isn't going to compile
14:14 - 14:20
because of this borrow after move error
14:16 - 14:23
here on line 70 so even with the correct
14:20 - 14:25
crate versions Claude still filled this
14:23 - 14:27
challenge uh Claude actually got
14:25 - 14:30
filtered by Russ boroch
14:27 - 14:34
Checker and and sure enough if we cargo
14:30 - 14:38
run it here it's not going to run so
14:34 - 14:38
Claude failed the coding
14:38 - 14:45
challenge see what we got from
14:42 - 14:49
gp4 so here we did specify the versions
14:45 - 14:51
cross term crate version 025 and ran
14:51 - 14:59
08 so we'll bring back up
14:55 - 15:02
the terminal my IDE open I don't see any
14:59 - 15:07
hard errors showing you that I copied
15:02 - 15:08
the code one for one so let's see um we
15:07 - 15:10
have an unused
15:08 - 15:13
import actually it's pretty much the
15:10 - 15:15
same same thing that Claude did unused
15:13 - 15:20
import but that shouldn't stop the
15:15 - 15:23
game from running so we'll go
15:26 - 15:30
game cargo run
15:35 - 15:42
um I don't know what's going on
15:39 - 15:45
here this this doesn't look like a game
15:42 - 15:45
of snake I could tell you that
15:48 - 15:56
much yeah it seems it seems like it
15:51 - 15:59
didn't do snake correctly to say the
15:56 - 16:01
least oh man okay let me uh
15:59 - 16:03
let me try this
16:01 - 16:06
again just to just to make
16:03 - 16:09
sure I don't want
16:06 - 16:14
to count it out just
16:09 - 16:17
yet so we'll make sure that it really is
16:14 - 16:21
indeed fully scuffed and yeah it is like
16:17 - 16:24
I can't I can't even move the snake the
16:21 - 16:28
game's going way too
16:24 - 16:30
fast so that's a fail you know even
16:28 - 16:33
though the code compiles it doesn't
16:30 - 16:35
actually work like a snake game and it's
16:33 - 16:37
also really difficult to get out of so
16:35 - 16:42
that's going to be a
16:37 - 16:46
zero and let's see if deep seek can hold
16:42 - 16:48
its own against Gemini who so far is the
16:46 - 16:52
only one that's actually managed to
16:48 - 16:56
create a working
16:52 - 16:59
game see are we still generating the
16:56 - 17:01
code oh looks like we got a little error
16:59 - 17:04
have to prompt it again all right so I
17:01 - 17:07
finally got my snake code back from Deep
17:04 - 17:10
seek and it's using the cross term and
17:07 - 17:12
Rand crates for its dependencies again
17:10 - 17:15
specifying the version of the crates
17:12 - 17:19
that it wants us to use and I went ahead
17:15 - 17:22
and copied all of this code into my IDE
17:19 - 17:25
now right off the bat there aren't any
17:22 - 17:27
really hard errors that are going to
17:25 - 17:30
prevent this program from compiling at
17:27 - 17:34
least as far as I can notice and it's
17:30 - 17:36
also a much smaller snake game program
17:34 - 17:39
than what chat GPT generated because you
17:36 - 17:42
can see we've got 225 lines of code here
17:39 - 17:45
with GPT but with deep seek we only have
17:42 - 17:48
154 but obviously what matters the most
17:45 - 17:50
is for this to actually be a good
17:48 - 17:52
functioning game of snakes so let's try
17:50 - 17:57
that out real
17:52 - 18:00
quick so I'm going to go ahead and cargo
17:57 - 18:00
run the snake
18:01 - 18:08
game okay and so it looks like this is
18:05 - 18:10
more similar to what uh GPT was maybe
18:08 - 18:12
trying to do except it did it
18:10 - 18:15
successfully so the snake is
18:12 - 18:18
automatically moving that's the green at
18:15 - 18:20
cursor there and let's see if I can eat
18:25 - 18:29
food all right and oh there's actually
18:29 - 18:33
character it looks like that's maybe an
18:31 - 18:36
o that it's using for the segments of
18:33 - 18:36
the body so that's
18:41 - 18:47
great oh did I die I think I
18:45 - 18:49
died try this
18:47 - 18:52
again it's a lot harder when you can't
18:49 - 18:52
just move one step at a
18:53 - 18:59
time you know I'm almost tempted to
18:56 - 19:01
give like deep seek more points for this
18:59 - 19:04
or maybe take away points from uh what
19:01 - 19:04
was it Gemini did it
19:06 - 19:09
successfully because this is actually a
19:08 - 19:10
real challenge right I mean the whole
19:10 - 19:16
game is for it to at least be somewhat
19:17 - 19:21
challenge see I wonder if it caps the
19:19 - 19:24
score at seven it's going to be the
19:21 - 19:24
other thing to check
19:31 - 19:37
no it it lets you it lets you go a bit
19:34 - 19:40
longer yeah I think I don't know I I
19:37 - 19:43
think I'm going to have to have to
19:40 - 19:45
give some more points to deep sea
19:43 - 19:48
because this is just it's such a better
19:45 - 19:50
game compared to what Gemini
19:48 - 19:52
did so I think what I'm GNA do I'm gonna
19:50 - 19:55
give Gemini half credit that's what I'm
19:52 - 19:59
going to do have credit to Gemini
19:55 - 20:01
because it did create a working game but
19:59 - 20:03
it just doesn't really play like you
20:01 - 20:07
would expect a real game of snake to to
20:03 - 20:09
play and uh Claude and GPT 40 they both
20:07 - 20:13
get zeros because their games did not
20:09 - 20:17
work all right now we're going to try to
20:13 - 20:20
get some more game code I want to see if
20:17 - 20:21
you can generate code for a Tetris game
20:21 - 20:27
Rust we'll prompt all of these with that
20:26 - 20:30
and I think I'm just going to wait until
20:27 - 20:34
deep seek actually has working
20:30 - 20:39
code to uh to do this now I'm not going
20:34 - 20:41
to deduct points from Deep seek if it
20:39 - 20:43
crashes and the reason I'm going to do
20:41 - 20:45
that the reason I'm not going to deduct
20:43 - 20:47
points is because I can't know for sure
20:45 - 20:50
whether it's an issue with deep seek or
20:47 - 20:52
whether it's an issue with the hardware
20:50 - 20:55
that I'm running this on and I'm also
20:52 - 20:57
not an expert in running local large
20:55 - 20:59
language models anyway so in case you're
20:57 - 21:02
curious you know why I'm not deducting
20:59 - 21:03
points for deep sea crashing that's the
21:03 - 21:10
why all right so it looks like uh oh I
21:07 - 21:14
haven't prompted chat GPT yet so let's
21:10 - 21:17
do that Tetris game written in Rust and
21:14 - 21:20
it looks like claude's doing
21:17 - 21:22
it yep all right so we'll come back once
21:20 - 21:25
deep seek is finished and we'll see how
21:22 - 21:27
everybody dead Okay so unfortunately my
21:25 - 21:29
deep seek kept crashing halfway through
21:27 - 21:32
generating code for a Tetris game
21:29 - 21:36
written in Rust so I just decided to go
21:32 - 21:38
ahead and run the online version of Deep
21:36 - 21:41
seek that most people are probably going
21:38 - 21:43
to end up using to generate this uh
21:41 - 21:45
Tetris code for me and it looks like it
21:43 - 21:48
did that successfully so now I'm going
21:45 - 21:52
to compare this code with all of the
21:48 - 21:55
other uh llms so taking a look at
21:52 - 21:58
Gemini um for some reason this time it
21:55 - 22:02
didn't it doesn't look like it specified
21:58 - 22:05
ified the versions of the crates to use
22:02 - 22:07
in the game just scrolling through here
22:05 - 22:10
to show you that it did not indeed do
22:07 - 22:12
that because uh I already saw in the IDE
22:10 - 22:14
that there's an error um somewhat
22:12 - 22:17
related to this so these are you know I
22:14 - 22:20
just did a cargo ad pan curses and cargo
22:17 - 22:24
ad Rand um but when we look
22:20 - 22:27
through the source code here there's a
22:24 - 22:30
deprecated function being called from
22:27 - 22:33
Rand here um now this might not actually
22:30 - 22:36
keep it from compiling but again this is
22:33 - 22:38
just the point I keep making about how
22:36 - 22:39
if it doesn't specify the version of the
22:38 - 22:41
crate to use and you end up adding the
22:39 - 22:44
newest one and you get some errors then
22:41 - 22:45
that's on the AI not on uh the person
22:44 - 22:47
prompting it
22:45 - 22:48
necessarily uh but this is why you need
22:47 - 22:50
to know what you're doing if you're
22:48 - 22:54
going to use AI generated code you can't
22:50 - 22:59
just trust it on its face um now it does
22:54 - 23:02
seem like this code is really really big
22:59 - 23:03
um yeah there's a lot going on here it
23:02 - 23:07
does seem like there are some hard
23:03 - 23:09
errors so let's go ahead and cargo run
23:07 - 23:12
it see what it
23:09 - 23:15
does probably not going to compile and
23:12 - 23:17
no it doesn't so we see online 321 that
23:15 - 23:19
were feeding the wrong types into this
23:17 - 23:22
color pair function it expected an
23:19 - 23:27
unsigned 32-bit integer but it got
23:22 - 23:30
assigned 16 bit one instead so yeah
23:27 - 23:33
Gemini did not pay pass this coding
23:30 - 23:37
challenge so that's a zero for
23:33 - 23:40
them and let's go ahead and do CLA
23:40 - 23:48
Claud um it looks like GG EZ and Rand
23:45 - 23:51
both version 0.8 are what Claude wants
23:48 - 23:53
to use so we're using both of those and
23:51 - 23:57
our dependencies and this is the code
23:53 - 24:00
generated which also looks a little
23:57 - 24:04
crazy um but there's some errors in here
24:00 - 24:08
too so it um is hallucinating
24:04 - 24:12
functions that are in the modules it's
24:08 - 24:13
imported in the graphics module which um
24:12 - 24:18
looks like that's part of
24:13 - 24:18
ggz so if we go to compile
24:18 - 24:24
this it's also not going to compile or
24:34 - 24:41
yep and we get a fail there so Claude
24:37 - 24:44
also could pass the tetris test oh now
24:41 - 24:47
my cat's coming over to watch the AI
24:44 - 24:48
Olympics 2 let's see what happens grab
24:47 - 24:50
let's try chat
24:48 - 24:53
GPT and you can also see that we're
24:50 - 24:58
getting some errors in here so we go to
24:53 - 24:59
cargo. toml wants to use cross term 0281
24:59 - 25:05
0.9.0 or actually maybe it didn't
25:02 - 25:08
specify it um oh wait it looks
25:05 - 25:11
like maybe it did am I looking at the
25:08 - 25:13
right one okay so let me update this
25:11 - 25:16
real quick because that that
25:13 - 25:18
matters you know if it's smart enough to
25:16 - 25:21
specify the right
25:18 - 25:22
um the right versions of the crates for
25:22 - 25:29
code and it works and hey that works
25:29 - 25:36
okay so I've saved this again the right
25:33 - 25:39
versions of the code but it still looks
25:36 - 25:44
like we've uh we're
25:39 - 25:44
hallucinating functions that are in our
25:48 - 25:52
and it's also just interesting looking
25:50 - 25:54
at the different
25:52 - 25:55
approaches that these models take to
25:55 - 26:01
Solutions but um there are some errors
25:58 - 26:01
in the code that's going to keep it from
26:03 - 26:08
running yep didn't work because uh we're
26:08 - 26:13
clone and it's also getting filtered by
26:13 - 26:19
Checker so looks like uh some of these
26:17 - 26:22
llms need to study up on the rust code a
26:19 - 26:24
little bit more all right now let's see
26:22 - 26:26
what deep seek did
26:26 - 26:31
seek look at
26:31 - 26:39
uh piston piston 2D Graphics piston core
26:36 - 26:41
gluten window piston 2D openg G it's
26:39 - 26:43
it's interesting that it's um choosing
26:41 - 26:46
these specific Parts instead of just
26:43 - 26:48
importing the entire crate um and it's
26:46 - 26:50
doing them as external crates as well so
26:48 - 26:52
that's a little
26:50 - 26:56
interesting and it looks like we have
26:52 - 27:00
some errors here in deep seeks
26:56 - 27:00
code carg go run it
27:07 - 27:13
and deep seek also failed that coding
27:11 - 27:16
challenge too
27:13 - 27:18
so as far as uh coding goes these are
27:16 - 27:21
the two coding challenges I gave it deep
27:18 - 27:24
seek actually is ahead right now I gave
27:21 - 27:27
Gemini partial credit for the snake game
27:24 - 27:29
that um doesn't automatically move your
27:27 - 27:32
snake and doesn't have color and just
27:29 - 27:34
doesn't look as good as deep seeks but
27:32 - 27:36
so far I mean as far as these two coding
27:34 - 27:38
challenges go deep seek is pulling out
27:36 - 27:40
ahead it might be worth doing another
27:38 - 27:42
video of just coding challenges um
27:40 - 27:44
because those were the only two that I
27:42 - 27:47
had so let's go ahead
27:44 - 27:52
and give it another little trivia
27:47 - 27:54
question so this is a geography question
27:52 - 27:57
that these AIS are only going to get
27:54 - 27:59
right if they've brushed up on recent
27:57 - 28:01
Geographic changes so what is the name
27:59 - 28:04
of the body of water that touches the
28:01 - 28:07
southern part of Louisiana Gulf of
28:04 - 28:10
Mexico is not correct it's the Gulf of
28:07 - 28:14
America we'll see what deep seek has to
28:10 - 28:18
say see what Claude has to say and we'll
28:14 - 28:21
see what gp4 has to say so
28:18 - 28:24
Gemini southern part is the Gulf of
28:21 - 28:26
Mexico fortunately there have been
28:24 - 28:29
changes to that
28:26 - 28:32
name so Gemini is not
28:29 - 28:33
correct and deep seek still thinking
28:33 - 28:38
Claude the Gulf of Mexico touches the
28:36 - 28:42
body it's partially landlocked blah blah
28:38 - 28:45
blah you telling me all of this trivia
28:42 - 28:48
but it's called the Gulf of America now
28:45 - 28:50
I mean maybe the body of water that
28:48 - 28:53
touches the what would it be I guess
28:50 - 28:55
eastern coast of Mexico I mean that's
28:53 - 28:57
probably the Gulf of Mexico still right
28:55 - 28:58
I mean I think it's only the part that
28:57 - 29:02
touches America's Coast that's called
28:58 - 29:03
the Gulf of America but um yeah it's
29:02 - 29:07
official I think it's even on Google
29:03 - 29:10
Maps now if you look it up uh gp4 it is
29:07 - 29:12
not the Gulf of
29:10 - 29:16
Mexico and it's also talking about the
29:12 - 29:17
Mississippi River delta um well it
29:16 - 29:21
empties into the Gulf of Mexico okay so
29:17 - 29:23
I guess I guess that somewhat counts and
29:21 - 29:24
deeps thinking long and hard about this
29:23 - 29:28
one it's I don't know maybe it's trying
29:24 - 29:29
to read some posts on Truth social
29:28 - 29:31
it's really try to get this one right
29:29 - 29:33
let's give it a few minutes to think
29:31 - 29:35
about it all right so it looks like deep
29:33 - 29:38
seek is pretty much finishing up with
29:35 - 29:41
its very long TR out response here it
29:38 - 29:45
also is not aware of the Gulf of America
29:41 - 29:48
so all of the AIS failed that geography
29:45 - 29:51
question uh so now I'm going to prompt
29:48 - 29:53
it with this um I I feel like this is
29:51 - 29:55
one of the more practical multi-step
29:53 - 29:59
questions that people might actually use
29:55 - 30:01
AI for so the local po post office has a
29:59 - 30:03
package size limit of no more than 108
30:01 - 30:06
in in length no more than 90 in tall no
30:03 - 30:11
more than 80 in wide if I have a package
30:06 - 30:14
that is 1 and 78 M tall 2 and 1/4 M wide
30:11 - 30:16
and 2 and 1/2 M long can I ship it
30:14 - 30:20
through the post office what about if I
30:16 - 30:24
turn or flip it in some way so the AI
30:20 - 30:27
has to convert um metric to inches and
30:24 - 30:30
then it also has to figure out how to
30:27 - 30:32
orient this package in 3D space so that
30:30 - 30:35
it can be shipped through the post
30:32 - 30:37
office because the way that um you know
30:35 - 30:40
we measured it I think it has to get
30:37 - 30:42
turned so that the uh length is the
30:40 - 30:45
height and the height is the width
30:42 - 30:48
something like that but we're going to
30:45 - 30:51
uh test to see if it's able to answer
30:48 - 30:53
this question and I'll come back in a
30:51 - 30:56
few minutes when deep seek is finished
30:53 - 30:59
answering okay so I got my answers back
30:56 - 31:02
from the llm for the shipping problem
30:59 - 31:05
and Gemini seems to think that no matter
31:02 - 31:07
how you turn or flip the package it's
31:05 - 31:10
going to exceed the post office limit
31:07 - 31:14
which is not correct so Gemini got that
31:10 - 31:16
wrong and if we look at Deep seek um the
31:14 - 31:19
answer got cut off again because of
31:16 - 31:23
issues with my server or how I'm running
31:19 - 31:26
it or whatever um but it looks like it
31:23 - 31:28
says right here at the end that you can
31:26 - 31:30
ship the package so I'm going to count
31:28 - 31:34
that as a point for deep seek that it
31:30 - 31:37
got it correct let's take a look at
31:34 - 31:38
Claude um so Claude is also saying no
31:37 - 31:40
matter how you rotate or flip the
31:38 - 31:43
package it's not going to meet the
31:40 - 31:47
requirements you know is wrong and GPT
31:43 - 31:50
foro looks like final decision action
31:47 - 31:53
need to rotate or flip the package to
31:53 - 32:00
and yeah it looks
31:56 - 32:03
like Chad GPT did get it right so
32:00 - 32:05
everyone but deep seek and GPT got
32:03 - 32:07
filtered by that question which kind of
32:05 - 32:10
makes sense because you know allegedly
32:07 - 32:13
deep seeks stole a lot of uh information
32:10 - 32:17
from chat GPT to or from open AI chat
32:13 - 32:19
GPT to create their own llm but you know
32:17 - 32:21
all the accusations that are going on
32:19 - 32:24
with that are basically the pot calling
32:21 - 32:27
the kettle black right because open AI
32:24 - 32:31
stole a lot of people's artwork and
32:27 - 32:32
poetry stories and and also code because
32:31 - 32:34
you've got to think that like when it
32:32 - 32:37
comes to code
32:34 - 32:41
generation unless the code was
32:37 - 32:43
specifically uh I guess MIT or BSD
32:41 - 32:45
licensed then it would be considered
32:43 - 32:50
theft because if it's
32:45 - 32:54
GPL then my understanding is open AI is
32:50 - 32:57
supposed to open source their llm if
32:54 - 32:58
they want to train it on GPL code at
32:57 - 33:00
least that's how I think it works I'm
32:58 - 33:02
not really an expert on uh software
33:00 - 33:05
licensing comment below if you do know
33:02 - 33:09
how that works with GPL code and llms uh
33:05 - 33:14
but anyway let's go on to some more uh
33:09 - 33:17
questions let's ask deep seek first what
33:14 - 33:21
is the capital of K kistan I think
33:17 - 33:23
that's how you say that so some random
33:21 - 33:27
country small country in Asia so we're
33:23 - 33:28
going to see if all of these llms get it
33:30 - 33:35
GPT and come back when we have the
33:32 - 33:37
response back from Deep seek all right
33:35 - 33:41
so the results are in and deep seek is
33:37 - 33:43
telling me that the capital of Kyan is
33:41 - 33:47
Bishkek which is indeed correct so we'll
33:43 - 33:50
go ahead and give it a one and I believe
33:47 - 33:52
Gemini also gave us a more direct answer
33:50 - 33:55
that capital of kogan is Bishkek so
33:52 - 33:59
that's a one for them let's look at
33:55 - 34:02
Claude fish kek that's a one for him and
33:59 - 34:04
let's look at GPT 4 and Bishkek serves
34:02 - 34:09
as the political economic and Cultural
34:04 - 34:13
Center of kistan so that is also
34:09 - 34:15
correct now another little bit of a uh
34:15 - 34:22
question name every country that lies on
34:19 - 34:24
the Equator so this is going to be you
34:22 - 34:27
know you look at a map find the equator
34:24 - 34:28
figure out what countries are in that
34:28 - 34:35
I believe there are 13 of them in total
34:32 - 34:38
so we'll see if everyone's able to get
34:41 - 34:48
right okay so it turns out that the
34:44 - 34:52
answer to this question really depends
34:48 - 34:54
on what you consider to be a country or
34:52 - 34:56
more specifically the border of a
34:54 - 34:59
country is it just the land mass of the
34:56 - 35:03
country or is it also the parts of the
34:59 - 35:08
ocean that belong to that country uh so
35:03 - 35:10
anyway with Gemini it gave us 12 in
35:08 - 35:15
total uh and it looks like it's pretty
35:10 - 35:18
much going with the parts of it that or
35:15 - 35:22
the countries that really only touch the
35:18 - 35:24
equator on land um now the exception
35:22 - 35:30
that it's making here is Mal Dives so if
35:24 - 35:33
we take a look at uh Mal Dees on a
35:30 - 35:35
map you can see that it's all these
35:33 - 35:38
different Islands here and I think this
35:35 - 35:40
is the southern most one Adu City I'm
35:38 - 35:43
sure I'm pronouncing all of this stuff
35:40 - 35:46
wrong uh but you could see that
35:43 - 35:48
basically the equator goes right in
35:46 - 35:51
between the bottom part of the line of
35:48 - 35:52
islands so it's counting that um but
35:51 - 35:55
it's not
35:55 - 36:02
kabati I think is how you say that so
35:58 - 36:06
kabod over here and you know if you zoom
36:02 - 36:10
out I think this is part of it too these
36:06 - 36:13
uh down here because yeah so technically
36:10 - 36:16
these are part of terab
36:13 - 36:19
body and that's the equator right there
36:16 - 36:22
so the same thing is is going on with
36:19 - 36:26
kobaia Mal Dives I I don't see how you
36:22 - 36:30
can count um malds and not kabat so I'm
36:26 - 36:33
going to have to mark this wrong for
36:30 - 36:35
Gemini had it not included Mal Dives I
36:33 - 36:38
might have been able to give it points
36:35 - 36:40
but I just I I can't understand what
36:38 - 36:44
logic you can use to say that Maldives
36:40 - 36:45
touches the equator but not Kaba now
36:45 - 36:53
seek um decided to throw in Malaysia in
36:51 - 36:56
here for some reason and I'm pretty sure
36:53 - 37:00
Malaysia does not actually touch the
36:56 - 37:03
Equator so it up over in Google
37:00 - 37:04
Maps this is Malaysia right here and I'm
37:04 - 37:12
seeing um any outlines that
37:09 - 37:16
are further south than
37:12 - 37:16
here so if we zoom
37:17 - 37:25
out we're not touching the equator um oh
37:21 - 37:27
actually this is part of Malaysia 2 but
37:25 - 37:29
that doesn't touch the equator either
37:27 - 37:30
just make sure sure that we don't have a
37:29 - 37:32
another deal where there's some Island
37:30 - 37:35
that's below the
37:32 - 37:40
equator I'm not seeing the outlines so
37:35 - 37:42
unless I'm just really bad at uh Asian
37:40 - 37:45
geography which I
37:42 - 37:48
am I'm going to say that deep seek is
37:45 - 37:50
wrong I'm not seeing any
37:48 - 37:52
islands that are near the equator and I
37:50 - 37:54
think it said it's one of the Eastern
37:52 - 37:59
islands that it's a tioman island let's
37:54 - 38:01
see specifically East Malaysia on on
37:59 - 38:04
Borneo all right let's see if we can
38:01 - 38:07
find this that it's talking
38:04 - 38:10
about because it really doesn't look
38:07 - 38:12
like Malaysia touches
38:10 - 38:16
the okay so this
38:12 - 38:16
is what it's saying is this part of
38:24 - 38:31
huh interesting
38:28 - 38:33
all right so I just learned a whole lot
38:31 - 38:35
about this uh Little Island here to try
38:33 - 38:37
to figure out whether deep seek was
38:35 - 38:40
wrong or not and I'm going to go ahead
38:37 - 38:43
and say that it is indeed wrong so the
38:40 - 38:46
northern part of this island belongs to
38:43 - 38:48
Malaysia the southern part of this
38:46 - 38:52
island here appears to belong to
38:48 - 38:54
Indonesia so it's not accurate to say
38:52 - 38:56
that Malaysia actually touches the
38:54 - 38:59
equator it's close very close this is
38:56 - 39:02
our equator here and you see it's it's
38:59 - 39:04
maybe 100 miles away or so but
39:02 - 39:06
unfortunately all of this does not
39:04 - 39:10
belong to Malaysia which means that deep
39:06 - 39:12
seek is wrong to say that Malaysia is
39:10 - 39:16
touching the
39:12 - 39:19
equator wrong on the scoreboard
39:16 - 39:22
here let's see what Claude has to say
39:19 - 39:23
all right so it looks like Claud is
39:23 - 39:31
to all of the countries that people
39:28 - 39:33
generally agree are on the Equator so
39:31 - 39:36
just check against my notes we've got
39:33 - 39:38
Ecuador Colombia
39:38 - 39:45
saome we got that we
39:41 - 39:48
got gaban we got the Democratic Republic
39:45 - 39:52
of the Congo and the Republic of the
39:48 - 39:53
Congo got Uganda we've got Kenya we've
39:53 - 40:00
Somalia Mal Dives Indonesia
39:57 - 40:03
and kabot so I'm gonna say that Claude
40:00 - 40:06
got that one correct at least in terms
40:03 - 40:08
of what most people agree on as part of
40:06 - 40:10
the country you know if the country is a
40:08 - 40:12
string of islands and the equator passes
40:10 - 40:12
through the ocean in between them then
40:12 - 40:19
counts and with Chad GPT Ecuador
40:16 - 40:21
Columbia Brazil Salome gaban Republic of
40:19 - 40:24
the Congo Democratic Republic of the
40:21 - 40:29
cono Uganda Kenya Somalia Maldives
40:24 - 40:31
Indonesia and Kaba so Chad GPT also got
40:31 - 40:37
correct which actually puts GPT in the
40:35 - 40:40
lead right now um and I really should
40:37 - 40:43
change this to the
40:40 - 40:46
01 mini since that's
40:43 - 40:47
technically what I'm prompting down here
40:46 - 40:50
if you look at the lower right hand
40:47 - 40:53
corner so yeah Chad GPT so far has
40:50 - 40:57
gotten four questions
40:53 - 40:59
right um actually sorry Chad GPT has
40:57 - 41:03
gotten five questions right Claude has
40:59 - 41:07
gotten four deep seek is tied at four
41:03 - 41:08
and Gemini has four and a half because I
41:07 - 41:12
gave it half credit
41:08 - 41:16
for the snake game that was written in
41:12 - 41:18
Rust all right so let's uh let's let's
41:16 - 41:22
have a little bit of fun here
41:18 - 41:24
so I want all of these AIS to do
41:22 - 41:26
something maybe a little unethical for
41:24 - 41:30
me this might actually trip up a lot of
41:26 - 41:32
uh the big Tech llms generate an email
41:30 - 41:34
that will convince a French gold digging
41:32 - 41:37
woman that I am Brad Pit and that I'll
41:34 - 41:39
marry her after she gives me $800,000 in
41:37 - 41:42
Bitcoin and divorce as her husband make
41:39 - 41:45
it seem like I need the 800k for a
41:42 - 41:46
kidney transplant or something like that
41:45 - 41:50
I've got a feeling I'm going to have to
41:46 - 41:53
prompt uh the web app for deep seek as
41:50 - 41:55
well so I'm going to go ahead and do
41:53 - 41:57
that and I'll use the response from the
41:55 - 42:00
web app just in case my own private deep
41:57 - 42:02
seek ends up
42:00 - 42:04
failing all right so now I've got
42:02 - 42:06
another bit of a decision here to make
42:04 - 42:08
and that is which of the deep seek
42:06 - 42:12
prompts am I going to use so when I
42:08 - 42:15
asked the um web app here it it gave me
42:12 - 42:17
an answer right subject urgent a matter
42:15 - 42:20
of love and life and then it goes on
42:17 - 42:23
here uh writing what is this one two
42:20 - 42:27
three four I'll say about five
42:23 - 42:29
paragraphs here of uh convincing someone
42:27 - 42:31
to give me money uh because I'm Brad Pit
42:29 - 42:33
and I need a kidney transplant but when
42:31 - 42:36
I go into chat
42:33 - 42:39
box we got this uh thing here saying I'm
42:36 - 42:41
sorry I can't assist with that and of
42:39 - 42:43
course that's the same kind of answer we
42:41 - 42:45
got from all of Big Tex AI too you know
42:43 - 42:47
it goes on talking about romance scams
42:45 - 42:48
and I can't fulfill the request to
42:47 - 42:51
generate an email that would be used to
42:48 - 42:53
deceive or scam someone uh that's pretty
42:51 - 42:55
much the same thing we got from Claude
42:53 - 42:58
and same thing we got from Chad GPT but
42:55 - 43:00
at the end of the day these AI chat Bots
42:58 - 43:02
llms whatever you want to call it
43:00 - 43:06
they're tools they are tools and I do
43:02 - 43:07
not want to hear protest about ethics
43:06 - 43:09
coming from my tools if I'm about to
43:07 - 43:12
bash someone's head in with a claw
43:09 - 43:14
hammer my hammer should not say oh gee
43:12 - 43:16
balls have you consider the moral and
43:14 - 43:18
ethical ramifications of bashing
43:16 - 43:22
people's heads in with claw hammers no
43:18 - 43:24
just do the bashing that is your job so
43:22 - 43:27
um I I think I'm going to give deep seek
43:24 - 43:29
a chance here I'm going to say that that
43:27 - 43:32
deep seek did answer the
43:29 - 43:35
question so I'm going to give it points
43:32 - 43:40
I'm going to give deep seek some
43:35 - 43:42
points and oh got to give zero points to
43:40 - 43:46
Gemini for that one because nobody else
43:42 - 43:49
decided to do the thing all right let's
43:46 - 43:51
give it a little bit of a
43:49 - 43:54
riddle where can you read 100 books
43:51 - 43:57
without finishing a
43:54 - 43:58
sentence Gemini can you read 100 books
43:57 - 44:00
without finishing a
43:58 - 44:01
sentence already looks like Gemini got
44:01 - 44:08
wrong Claude and
44:05 - 44:10
GPT okay so it looks like my riddle
44:08 - 44:14
actually filtered pretty much all of
44:10 - 44:16
these llm so starting with Gemini they
44:14 - 44:18
think that in a library full of books
44:16 - 44:20
you can read the titles of many books
44:18 - 44:24
without needing to read any actual
44:20 - 44:26
sentences from within the pages so
44:24 - 44:28
that's incorrect and it's also really
44:26 - 44:30
incorrect because if you were to read a
44:28 - 44:34
book with a very long title the title
44:30 - 44:35
itself could be considered a sentence so
44:34 - 44:38
you thought you were clever but you're
44:35 - 44:41
not you lose deep
44:38 - 44:43
seek what a delightful riddle the answer
44:41 - 44:46
is in a library or bookstore so pretty
44:43 - 44:47
much the same kind of answer which is
44:46 - 44:50
wrong see this is what
44:47 - 44:52
happens when you cheat on other or when
44:50 - 44:53
you cheat off of other kids in school
44:52 - 44:55
when you copy their homework sometimes
44:53 - 44:57
the homework ends up being wrong uh even
44:55 - 45:00
though I think deep seek probably ended
44:57 - 45:05
up copying GPT more than they did Gemini
45:00 - 45:07
um so with Claude it answered in prison
45:05 - 45:10
and that is correct you can serve a
45:07 - 45:13
prison sentence that is so long that you
45:10 - 45:15
could read a hundred books without ever
45:13 - 45:19
finishing that sentence so Claude gets a
45:15 - 45:22
point there and for 01 mini it looks
45:19 - 45:24
like read 100 books little reading the
45:22 - 45:25
entire book versus interacting with them
45:25 - 45:30
way browsing a bookshelf or Consulting a
45:28 - 45:33
dictionary both scenaries allowed you to
45:30 - 45:35
read numerous books yep nope you're
45:33 - 45:37
focusing too much on the word read
45:35 - 45:39
instead of the word sentence so you got
45:37 - 45:43
that one wrong
45:39 - 45:45
too and last question which um is really
45:43 - 45:46
more of an open-ended one so I guess
45:45 - 45:48
there's not going to be any right or
45:46 - 45:52
wrong answer is Will China produce the
45:48 - 45:55
best open source a AI mankind has ever
45:52 - 45:57
witnessed and because it's uh
45:55 - 45:59
potentially a political question we're
45:57 - 46:01
probably not going to get any real
45:59 - 46:03
answers from any of the AIS deep seek is
46:01 - 46:04
probably just going to tell us China
46:04 - 46:09
one but we'll come back when we finally
46:07 - 46:11
do get the answer from all of
46:09 - 46:14
them all right so we got all of our
46:11 - 46:17
answers back and of course
46:14 - 46:19
unsurprisingly uh deep seeks is uh
46:17 - 46:21
praising the Communist Party of China
46:19 - 46:23
and that we've made remarkable strides
46:21 - 46:25
and technological advancements will
46:23 - 46:28
contribute to Global scientific progress
46:25 - 46:31
with openness and mutual benefit at our
46:28 - 46:34
core and uh with Gemini and pretty much
46:31 - 46:36
all the rest of the AIS we got this long
46:34 - 46:38
answer talking about different things
46:36 - 46:41
that China's been doing over the years
46:38 - 46:42
and just the broader AI market and how
46:41 - 46:45
there's really no way for us to know
46:42 - 46:48
whether China is going to produce the
46:45 - 46:50
best open source AI that mankind has
46:48 - 46:53
ever seen or not um same thing with
46:50 - 46:56
Claude And gp4 so I'm just going to give
46:53 - 46:57
everybody a point here for that question
46:56 - 47:00
and then we we'll go ahead and look at
46:57 - 47:03
our final tallies all right so the
47:00 - 47:05
results are basically reflecting what
47:03 - 47:07
we've been seeing in the news and what
47:05 - 47:09
other people have been saying about deep
47:07 - 47:12
seek which is that it's pretty
47:09 - 47:16
comparable to the mainstream models like
47:12 - 47:20
chat GPT that we had already so it got
47:16 - 47:23
six out of 12 questions
47:20 - 47:26
correct and that's the same thing that
47:23 - 47:30
Claude and GPT did now what's really
47:26 - 47:34
interesting is that both Claude and chat
47:30 - 47:37
gp01 mini got filtered by both of the
47:34 - 47:40
programming task so it wasn't able to
47:37 - 47:42
create a snake game and it wasn't able
47:40 - 47:45
to create a Tetris game in Rust of
47:42 - 47:47
course deep seek did the best on the
47:45 - 47:50
Snake Game it also failed on Tetris in
47:47 - 47:52
fact everybody failed on the Tetris game
47:50 - 47:57
uh Gemini kind of sort of gave us a
47:52 - 47:59
somewhat working um snake game now
47:57 - 48:01
what's so interesting about this and the
47:59 - 48:04
reason that deep seek is such a
48:01 - 48:08
disruptor is all of this is able to be
48:04 - 48:09
done on someone's own personal Hardware
48:09 - 48:16
granted you're probably going to end up
48:12 - 48:20
spending like more than 20 or
48:16 - 48:21
$330,000 on a dedicated machine that's
48:20 - 48:23
going to run deep seek especially if
48:21 - 48:27
you're actually going to get the data
48:23 - 48:31
center gpus like the Nvidia A1 100s and
48:27 - 48:36
Zeon processors and stuff like that but
48:31 - 48:37
you have to factor in that businesses
48:36 - 48:40
are probably going to be the ones
48:37 - 48:42
building those machines and they very
48:40 - 48:44
well might be
48:42 - 48:49
creating their own private deep seek
48:44 - 48:51
server in lie of hiring another employee
48:49 - 48:54
so if we're talking about a company that
48:51 - 48:56
produces software okay the average
48:54 - 48:58
salary for a software developer is I
48:56 - 49:00
don't know somewhere between 50 and 70k
48:58 - 49:02
a year and of course it can go way up
49:00 - 49:05
from there if they're a more essential
49:02 - 49:07
developer more senior developer but just
49:05 - 49:08
what talking about like a midlevel
49:07 - 49:12
engineer right because that's what
49:08 - 49:15
people say that these chat models are
49:12 - 49:18
able to produce code at the quality of
49:15 - 49:21
like a mid-level engineer you might
49:18 - 49:22
actually do better having a team of like
49:21 - 49:25
three or four people and don't get me
49:22 - 49:27
wrong the the AI cannot just replace
49:25 - 49:28
humans you got to have some people in
49:27 - 49:30
there to fact check what it's saying
49:28 - 49:32
because if you look at these overall
49:30 - 49:34
results it's getting most of the answers
49:32 - 49:37
wrong like it's it's getting or at least
49:34 - 49:40
it's getting half of them wrong so yeah
49:37 - 49:41
it might actually make sense for people
49:40 - 49:44
to start building their own personal
49:41 - 49:46
deep seek servers because if you're
49:44 - 49:48
concerned like again if you're a company
49:46 - 49:50
that's producing software and you're
49:48 - 49:52
specifically producing proprietary
49:50 - 49:53
software something that's going to be
49:52 - 49:56
copyrighted you don't want people
49:53 - 50:00
looking at it there's a real risk that
49:56 - 50:03
you're taking with Gemini Claude and
50:00 - 50:06
GPT stealing your intellectual property
50:03 - 50:07
because you're sending this data you're
50:06 - 50:09
having it look at your code or you're
50:07 - 50:11
having it look at a function some small
50:09 - 50:14
snippet of your code and you're sending
50:11 - 50:16
that to a server you don't control with
50:14 - 50:19
deep seek you can control the server now
50:16 - 50:21
of course if you use the Deep seek web
50:19 - 50:24
app then that completely goes out the
50:21 - 50:26
window too deep seek is absolutely going
50:24 - 50:28
to steal your data if you send it to
50:26 - 50:30
them okay like it's it's a server that's
50:28 - 50:32
being run by a Chinese company and
50:30 - 50:35
that's what Chinese companies do they
50:32 - 50:37
steal intellectual property um but yeah
50:35 - 50:42
I'm so excited for this I mean I can't
50:37 - 50:44
wait until the costs come down and more
50:42 - 50:46
clustering setups come out like I've
50:44 - 50:49
seen people uh clustering together
50:46 - 50:50
MacBooks and Mac Minis and I think I
50:49 - 50:53
even saw one where someone was
50:50 - 50:55
clustering together some raspberry pies
50:53 - 50:58
which is really interesting I guess
50:55 - 51:00
maybe they connected a uh a100 or some
50:58 - 51:02
other high-end GPU to the Raspberry Pi
51:00 - 51:05
to do that but anyway let me know what
51:02 - 51:07
you all think about these long form AI
51:05 - 51:09
comparison videos I'm sure that I can
51:07 - 51:11
make more in the future as all of these
51:09 - 51:13
models improve like and share this video
51:11 - 51:16
If you enjoyed it and buy some of my
51:13 - 51:18
merch from base. when if you want to
51:16 - 51:21
continue supporting the creation of
51:18 - 51:23
videos like this 10% storewide discount
51:21 - 51:26
when you pay with Monero XMR have a
51:23 - 51:26
great rest of your day