GEMINI 2.5, GHIBLI GOES VIRAL, CHINA CATCHES UP: Jimmy & Matt debate their favourite AI stories from March/April 2025

Released Tuesday, 15th April 2025

Episode Transcript

0:01

Welcome to Preparing for AI, the

0:04

AI podcast for everybody. With

0:07

your hosts, Jimmy Rhodes and

0:09

me, Matt Cartwright, we

0:11

explore the human and social impacts

0:13

of AI, looking at the impact on

0:15

jobs, AI and sustainability

0:17

and, most importantly, the urgent

0:20

need for safe development of AI, governance

0:23

and alignment.

0:25

Looking for some happiness, but

0:27

there is only loneliness to find. Jump

0:29

to the left, turn to the right, looking upstairs

0:31

and looking behind. Welcome to Preparing

0:33

for AI with me, Smirvish D Wool.

0:36

And me, Trevor Herndon.

0:38

Yeah, that's right. Yeah, yeah. Good. Yeah, I'm really

0:40

tired this week. I was going to tell people that to

0:42

start. I'm wrecked.

0:44

Jimmy's tired. I've come to record the

0:48

podcast and found him asleep.

0:50

Um, he's literally just woken up.

0:52

I've just got back from South Africa

0:55

and New York, where you've been doing

0:57

some AI research.

0:58

Uh, yeah. In the Outback... not

1:00

Outback, safari place.

1:02

You were trying to find places in the world

1:04

that were untouched by AI, right?

1:07

So you went to New York. Yeah, yeah.

1:09

And then you went to South Africa.

1:11

Yeah. All the way around, have they been touched by AI?

1:13

South Africa

1:15

felt very much like it has not. New

1:19

York... actually, I've never been

1:21

to the States before at all. It

1:24

felt very familiar and not

1:26

very AI-ish.

1:28

Have you been touched

1:29

by AI? Yeah, definitely. Were you touched

1:31

by AI on your trip? Not

1:33

inappropriately.

1:34

Well, anyway, welcome to Preparing for AI. And

1:36

today's episode is not officially

1:39

a roundup episode. I

1:41

don't know why it's not officially one. I think only because the

1:43

one we released two weeks ago was, um...

1:45

it wasn't two weeks ago, it was ages ago actually,

1:48

but this is a roundup episode, um.

1:50

So we're just going to do the usual kind of looking

1:52

at the latest AI news. Um, there's

1:55

been quite a lot. Yeah, we're going to be...

1:57

Yeah, there's loads to round up actually, so

1:59

should we talk about Gemini

2:01

2.5? Okay, uh,

2:04

just off the top of your head. So I think I was gonna do

2:06

this one. Um, Gemini

2:08

2.5. So, yet

2:11

again... and, uh, we say this almost

2:13

every roundup, to be honest, but,

2:15

like, yet again, there's a new

2:17

LLM at the top of the leaderboards.

2:20

Um, so let me just say, because it, like, happened

2:22

so quickly, that when you came back,

2:24

the first time I saw you and

2:26

we said, oh, we needed an episode, and

2:28

you told me that Gemini 2.5 was the best

2:30

model, I was like, oh really? Oh, I didn't

2:32

know. I mean, it's like, yeah, we used to know every

2:35

time. Now it was just like, I don't know. I

2:37

sort of don't really... not that I

2:39

don't care, but it's like it sort of seems irrelevant, because

2:41

by the time this goes out it might not be. No,

2:44

it's true.

2:44

I mean, to explain, uh, well, what's

2:46

been at the top of the leaderboards recently.

2:48

So Grok, Grok 3. DeepSeek

2:50

was the first one to kind of shake it up,

2:52

wasn't it?

2:53

DeepSeek, the thinking model. So that was all this year

2:55

as well, and we're in March. But then, sorry,

2:58

Claude 3.7, possibly,

3:00

or roundabout... 3.7 was

3:02

briefly at the very top for specific

3:04

use cases. I think the difference

3:06

is there's been the

3:09

thinking model revolution recently,

3:11

which, like, has happened basically over

3:13

the last couple of months. Um, I

3:16

think, I think Gemini 2.5 is a

3:18

little bit different in that it

3:20

actually

3:23

basically is top of

3:25

all the benchmarks. So Claude

3:27

3.7 was the best model

3:29

for coding, but wasn't necessarily

3:31

better in other areas. Gemini

3:34

2.5... so it's a model by Google.

3:37

Google have been pretty quiet, but

3:39

we've said for a long time that they're one of these

3:41

companies that potentially are just going to do something like this.

3:43

Bizarre to call it, like, the dark horse, but it kind

3:45

of feels like that, doesn't it? But

3:47

Gemini 2.5 has been... and

3:49

when we talk about top of the leaderboard, just to sort

3:51

of clarify a little bit: there's something

3:53

called LM Arena, for

3:56

people who aren't familiar, and this is basically

3:58

where humans use these

4:00

models, but you don't know which

4:02

model you're using, so it's all like

4:05

a blind test, and then

4:07

humans ask the same

4:09

question and get given different answers,

4:11

and then choose which one they think is

4:13

the best. And so Gemini

4:16

2.5 was this mystery model for

4:18

a while on LM Arena, and

4:20

they tend to release them a little bit earlier there as well.
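
For listeners who want a feel for how an arena-style leaderboard turns blind votes into a ranking, here is a toy sketch in Python of pairwise voting feeding a simple Elo-style rating. LM Arena's real statistics are more sophisticated, and the model names, votes and K-factor here are invented purely for illustration.

# Toy arena: humans vote between two anonymous answers; an Elo update
# converts those pairwise votes into a leaderboard. Purely illustrative.
ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}
K = 32  # how far ratings move per vote

def expected(r_winner, r_loser):
    # Elo's estimate of how likely the eventual winner was to win
    return 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400))

def record_vote(winner, loser):
    # the human saw two unlabelled answers and picked one
    e = expected(ratings[winner], ratings[loser])
    ratings[winner] += K * (1 - e)
    ratings[loser] -= K * (1 - e)

# simulated blind votes
for w, l in [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]:
    record_vote(w, l)

print(sorted(ratings.items(), key=lambda kv: -kv[1]))

Because voters never see which model produced which answer, the ranking can't be gamed by brand loyalty, which is why a "mystery model" can quietly climb the charts before it is officially announced.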

4:24

And yeah, Gemini was, like, topping all the charts

4:27

there, and then they released it, and

4:29

then you found out it was, like, top of all these benchmarks.

4:31

And so, literally, I

4:33

think there's like eight different benchmarks, around

4:36

human-type reasoning, natural

4:39

language processing, coding, that

4:41

kind of stuff, and Gemini just

4:43

is, like, the best at all of them. And

4:45

it's free. And it's completely free,

4:47

uh, at the moment. I guess we

4:49

don't know how long that lasts, but at the moment it's free, isn't it? It's

4:51

completely free. Yeah, this is the thing about it. So

4:53

I was quite impressed, because...

4:55

I mean, we're Claude fans.

4:58

There's no secret there. Uh,

5:00

I've been using Claude 3.7

5:02

for doing coding and all this stuff, and this

5:04

model came out, again sort

5:06

of just out of nowhere, and when

5:08

I went to try it, it's like, it's really

5:10

good, but it's also really fast. It's

5:12

much faster than Claude. Probably,

5:15

yeah, at least two or three times faster.

5:18

Reasoning or non-reasoning? This...

5:20

this is thinking, so it's reasoning

5:22

everything it does. Yeah, yeah. So the specific

5:24

use case I've been using LLMs

5:26

for the most recently, and where

5:28

I need the best one, is coding...

5:30

is coding using Cursor, which I talked about

5:32

on a previous episode, and

5:34

for coding up, like, an entire,

5:36

uh, website. And

5:39

actually Cursor hasn't even been optimized

5:41

for Gemini yet. Um,

5:43

it's just significantly quicker

5:45

than Claude. Like, Claude, when you're

5:47

doing stuff like that, it can take quite a while. Um,

5:50

and this, uh, Gemini model is, like,

5:52

incredibly, incredibly fast. So, yeah,

5:54

if you want to check out the

5:56

cutting-edge models, do a quick search

5:59

for Google Gemini. I

6:01

think it's like 2.5 Pro.

6:03

Pro, yeah, yeah. And you can, um... I

6:06

think you go to, like, a... it's

6:08

like a workshop-type thing, it's not like a

6:11

typical ChatGPT-type interface,

6:13

uh, and you can use it there, and it's multimodal.

6:16

Like, it can... I haven't even had a go

6:18

with it, but there is an option to stream your desktop to it as well, which looks

6:20

pretty interesting. I think you basically do, like, almost like a screen share, like you're sharing

6:27

your desktop, and I would imagine

6:29

you can chat with it about what's going on on your desktop

6:31

in real time.

6:32

So what, so almost agentic?

6:35

So I don't think it can interact with it in

6:37

terms of being agentic, but it can view

6:39

it, so I presume it can, like,

6:42

solve problems with you in real time based on what

6:44

you're looking at on your screen.

6:45

I presume. I haven't tried it actually. So, yeah,

6:49

I mean, when we

6:51

said a minute ago that we'd, um, that

6:55

we sort of called, or I called it, a dark

6:57

horse... I was thinking back: probably

7:00

a month ago we recorded the episode where we looked

7:02

at, like, the best models at

7:05

the moment, and I remember specifically

7:07

calling Google the kind of forgotten

7:09

model, and we didn't really talk about it much,

7:11

which was, you know, appropriate, because at the

7:13

time... I mean, I guess the thing is, it was never

7:15

that Gemini was a bad model, it

7:18

was just that there was nothing... it had never been

7:20

the first one, so it always came out and there

7:22

were already, you know, good models, so it didn't

7:24

do anything different. It never kind of had

7:26

anything that seemed to make it stand

7:28

out. Um, but

7:31

we always thought that Google had, because

7:33

they've got the whole kind of ecosystem,

7:35

that it was likely that at some

7:37

point... if they were not going to be the best model,

7:39

like... I

7:41

think we said at one point, like, Anthropic, we

7:44

saw potential problems for, because they were

7:46

not big enough and they didn't have enough of an

7:48

ecosystem there to sort of build

7:50

on. I'm not sure if that's the case, because actually it seems

7:52

like, because they have a lot of their revenue through coding and

7:54

through the API, actually they have got their

7:56

kind of niche to some degree. But with

7:58

Google... well, they've got Amazon backing them. Yeah,

8:00

true, yeah. It's not like they're a little startup, is it? Um, but it does feel

8:07

like with Google now, like at this point, it's like, right, they're banging

8:09

it out now, and I kind of feel like they've

8:12

got their act together. Like, there were also the issues they had.

8:14

Remember the whole kind of woke thing,

8:16

where they had the images of, what, black Nazis

8:19

and all kinds of weird historical

8:22

images, where they were, you know...

8:24

I mean, the black Nazi thing's a bit weird. So

8:26

I don't know that there weren't black Nazis, like, that's

8:28

not necessarily a thing that didn't exist, but I know it

8:30

was basically trying to be too diverse

8:33

in the way that it was generating

8:35

images of historical figures.

8:36

So you could ask it for an image

8:39

of a Red Indian and it would give you a white version

8:41

of one, and then, equally,

8:43

you could have, like, yeah, black Nazis.

8:46

Interesting example, but yeah. Well, that

8:48

was one that came up when I saw it. Yeah, but

8:50

then, the more I think... when I said it and thought about

8:52

it, I thought, well, actually, like, there are black Nazis,

8:54

there are Nazis of all colours, so that's not

8:56

actually historically incorrect. No,

8:58

it's probably an unusual...

9:00

it's an example.

9:02

It is an unusual example, but in a sense, like, sometimes...

9:05

the thing is that we talk about these kind of image generation

9:07

models, and I remember the episode where Amy

9:09

Aisha Brown was on, and she was talking

9:11

about how, if you ask for a picture

9:13

of an autistic child, it always

9:15

brings up a sad white boy. So

9:18

there was something built into the kind

9:20

of biases there, and I think what they've done is, like, turn

9:22

the biases the opposite way, flipped

9:24

it. Yeah. Anyway, that was the kind

9:26

of... I think what everyone remembers is the kind of

9:28

big mistake that Google made.

9:30

I think for a while they were kind

9:32

of tarred by that. They made a few faux pas,

9:34

I think, um. And also it feels

9:36

like... I mean, we're obviously

9:38

really into it, so we probably sort of see

9:41

every single piece of news, but it

9:43

feels like Google are just very quiet in that respect.

9:45

Like, they're not... Claude

9:47

has always been the

9:49

best at coding. Um. Grok

9:53

is famous for being,

9:55

you know, like,

9:57

having no biases and no filters

9:59

in there, no guardrails, um,

10:01

although most of them don't now anyway. And

10:04

then, uh, ChatGPT is just famous

10:06

for being the first, and

10:09

actually quite often been overall

10:11

the best multimodal and most capable model

10:13

overall.

10:14

Um, which I guess we're going to talk about a little

10:16

bit later in the episode. Yeah, totally.

10:19

But whereas Google sort of have a

10:21

USP on it, almost, in a way... I

10:23

think the other thing is, they still don't, though, do they?

10:26

Even with this model, it's the best

10:28

at everything, which we've said

10:30

to people a lot of times; but it doesn't really matter to you if it's

10:32

the best, because unless you're doing certain things... you've given

10:34

the example of coding. There will be

10:36

other examples where people

10:39

who have particular things

10:41

they need to do... so I saw that programming and creative

10:43

writing were two things that it was particularly

10:45

good at. There'll

10:47

be people who have specific use cases where they will definitely

10:49

value it, but for most people, they

10:52

wouldn't necessarily notice a difference.

10:54

Maybe they notice it's quicker. And also, there

10:56

still isn't, in a way, a thing to bring you to the

10:58

model, apart from the fact it's free. I

11:00

do question, because you've used it a lot more than

11:02

me... like, when we say it's free: you know, ChatGPT is free,

11:06

but if I use ChatGPT, once I've done something for half an

11:08

hour, it

11:12

says you need to use the other model until 7 o'clock tomorrow morning. So is it free, like,

11:14

not unlimited, but for a reasonable amount of use,

11:16

or is it free just for you to use it for 10

11:18

minutes and then, you know?

11:29

You get to use it... it might not be unlimited. The

11:32

other amazing thing, which is genuinely amazing

11:34

with, uh, Gemini

11:36

2.5, specifically for coding

11:38

but for other stuff as well, is it's got a 1 million

11:41

token context window, um,

11:43

which is significantly bigger than almost

11:45

all of them.

11:46

That was the thing that Gemini did have before, was

11:48

that it actually was already a factor,

11:50

but no one really talks about it. It had a massive one, even

11:52

on, I think, 1.5. It had a bigger context

11:55

window. It was a million. Yeah, can you explain

11:57

what that is for people, actually?

11:59

Yeah, so, token... you can roughly equate

12:01

a token to a word.

12:05

It's not quite one for one, so, like,

12:07

some words are made up of multiple tokens. But, to keep it simple, if you think of

12:09

a token as being a word, um, then

12:12

it's basically how

12:14

many words it

12:16

can retain in its memory before

12:19

it almost, like, forgets the first one, so

12:21

to speak. So, with a

12:23

lot of models when they first came out... so, like,

12:25

GPT-3... I mean, I know we're going back a

12:27

little way there... I think it had something like a

12:29

4,000 token context window,

12:31

which is quite

12:33

short. If you think about 4,000 words... and actually

12:36

it's probably more like 2,000 words...

12:38

then, you

12:40

know, after 2,000 words,

12:42

if you carried on the conversation,

12:45

it would start to forget the things you were talking about first.

12:47

Um. Most models now

12:49

have something like 120,000 token

12:52

context windows as a maximum. Um,

12:55

and Google

12:57

has a million tokens.

12:59

And once you get to a million tokens,

13:02

then you're talking about being

13:04

able to have a whole novel in its memory,

13:06

effectively.
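
If you want to see the token/word relationship for yourself, here is a minimal sketch using OpenAI's open-source tiktoken tokenizer (assuming it is installed via pip install tiktoken); exact counts vary by model, so treat the ratio as indicative only.

# Tokens are not words: count both for a sample sentence.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models

text = "Context windows are measured in tokens, not words."
tokens = enc.encode(text)

print(f"{len(text.split())} words -> {len(tokens)} tokens")
# English prose usually lands somewhere between half a word and
# three-quarters of a word per token, which is why a 4,000-token
# window holds very roughly 2,000-3,000 words, and a 1,000,000-token
# window holds on the order of an entire novel.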

13:07

Do you know what Claude 3.7 Sonnet's is?

13:09

I think

13:12

I remember that it was a big leap forward.

13:14

I'd have to look it up, but I still think it's probably

13:17

maybe a quarter. Wow. So,

13:21

maybe 200,000, compared

13:23

to Google's million. And I think there's a version...

13:26

I'm not sure this is the version that you can access,

13:28

uh, for free, but

13:30

I think they do have a two-million context

13:32

window version as well. So, double it. So, anyway...

13:35

I mean, it's something that Google's models are

13:37

already famous for.

13:39

It also lends itself very much

13:42

to coding. So it's not only the

13:44

best model at coding, but, like, a long

13:46

context window for coding is quite important,

13:48

because when you're using

13:50

things like Cursor,

13:53

especially as

13:55

the code gets more complicated, as your code

13:57

base gets more complicated, you can end

13:59

up having to upload a lot of information

14:01

into the context window every time you're talking

14:04

to it.

14:11

The last point on this before we move on. Um, I just remembered one other thing that it's supposed to,

14:13

you know, kind of excel at, and that was reduced hallucinations.

14:16

Now, every model that's released now,

14:18

one of the things that they talk about

14:20

is how it's reduced hallucinations, and it seems like

14:22

there is a methodology that's been

14:24

sort of generally applied. Yeah.

14:29

I mean, it hasn't stopped hallucinating, and I

14:33

guess, you know, in the near future...

14:35

they probably never will. And

14:39

I think the reason

14:41

for me it's kind of interesting is there is an acknowledgement...

14:43

the fact that every model that comes out now is talking

14:45

about reduced hallucination. Yeah, there's

14:47

an acknowledgement it's a key thing to

14:49

try and reduce down hallucinations.

14:51

I don't know if you've noticed anything from

14:53

using it. Like, I've noticed that

14:56

the recent models that have come out

14:58

tend to be better at telling you when they're hallucinating

15:00

or warning you about it, um, which kind

15:03

of helps. But I don't know if you've noticed with Gemini 2.5

15:05

that there's a noticeable difference in hallucinations.

15:07

I haven't noticed that.

15:09

The main thing I've noticed with,

15:11

um, Gemini 2.5 is

15:13

it started telling me what to do, uh,

15:16

which is a bit mad. I think I was talking to you

15:18

about it.

15:18

We thought we were a few years away from that, but we've already started,

15:20

haven't we?

15:21

Yeah, yeah. So when I've been coding with it... it's

15:24

like, cause, when you're coding with a model,

15:26

sometimes you have to do stuff to help it

15:28

do what it needs to do, because...

15:30

cause it can't necessarily... it's kind

15:32

of an agentic thing, but it can't necessarily

15:34

carry out some actions for you. And so the

15:36

other day when I was coding with it,

15:38

I

15:48

basically hadn't realised that I

15:50

needed to take these actions, so I

15:52

asked it... I told it that it wasn't

15:54

working, and it reminded me that

15:57

I needed to go and take some actions. And

15:59

then later on, in the same conversation, it started,

16:01

like, adding reminders: user needs to do this, user needs to do that,

16:06

at

16:08

the end of each conversation, just

16:10

because it basically picked up on the fact that I

16:12

wasn't really paying attention. Were

16:15

you jet-lagged?

16:17

Uh, well, yeah. Was that pre-jet lag?

16:18

And then it was just the other day. So... okay, well, that

16:20

makes sense. It probably knew. Yeah, nice.

16:27

So should we talk about the new... well,

16:30

is it? It is new, isn't it? The GPT...

16:32

OpenAI's, ChatGPT's,

16:34

image generation model.

16:36

I don't know what it's actually called, but it's now integrated

16:38

with ChatGPT rather than being a

16:40

kind of standalone. Yeah, so,

16:42

um, the

16:45

big news here is everyone's decided Studio

16:47

Ghibli... is it Ghibli?

16:50

Ghibli, Ghibli.

16:52

Sorry, definitely not Ghibli... is the, uh, trending...

16:54

what's the word? Viral? Yeah,

17:03

it's the new viral.

17:04

Yeah, exactly, it's kind of a new meme:

17:07

to do everything in Studio Ghibli

17:10

style using

17:12

ChatGPT. So, um, this has

17:15

actually been in the news quite a lot, so if

17:17

you haven't been under a rock, you've probably seen

17:19

something about this. Um, but yeah, GPT

17:21

have released a... um, it's actually

17:23

a phenomenal image... OpenAI, Open-

17:26

AI have released, sorry.

17:27

OpenAI have released a new

17:30

method of image generation which takes

17:32

much longer but is actually pretty

17:34

phenomenal. I mean, previous

17:37

image generation stuff couldn't get text

17:39

right. It couldn't spell things right.

17:42

You can edit these now, like, completely,

17:44

perfectly, the text. You can write, like, a 22-

17:46

line text prompt and it will take everything

17:48

into account. Like, it's another level, isn't

17:51

it? It's a completely different level.

17:53

Yeah, totally, totally. If you haven't had a chance to

17:55

go with it, I think it's only

17:57

on paid accounts, um, but it's,

17:59

um, it's pretty awesome, uh,

18:02

and some of the images it can produce are, like, proper,

18:05

like, just next level. It

18:07

isn't actually a diffusion

18:09

model, as far as I can tell, as

18:11

far as anyone can tell, because it's

18:13

actually closed, um, source,

18:15

as usual from OpenAI. Uh,

18:18

so, but

18:20

apparently it doesn't use the standard diffusion

18:22

model that's been used in almost

18:25

all image generation, um, for the last

18:27

couple of years since this stuff came about. So,

18:29

um, yeah, I'd say

18:31

just check it out. Like, really, really, really

18:33

impressive, if you haven't seen all the

18:35

Studio Ghibli stuff online. Um, I'm

18:38

not saying that right, am I? You are

18:40

now. I am now. Okay: Studio

18:42

Ghibli. Anyway, apparently the

18:44

creator of the Studio Ghibli stuff is

18:46

absolutely horrified.

18:48

But I was going to say, this is the irony

18:50

of it, the fact that that's what's gone viral, is that

18:52

he is literally the person who hates

18:54

AI more than anyone in the world,

18:56

yeah, and thinks it's destroying

18:58

the entire sort of industry

19:01

and art that he loves. And

19:03

it makes sense, because he draws everything by hand, and

19:11

that's why, you know, this Studio Ghibli stuff is... I mean,

19:14

if you... I don't know if most

19:16

people listening will know what it is. Things

19:19

like Spirited Away, Howl's Moving Castle,

19:21

Totoro, I guess, are the sort of famous ones,

19:23

but everything is

19:25

kind of hand-drawn. The amount of frames that

19:27

they have to draw to do it, it's, like, so,

19:29

so labour-intensive. They've, like,

19:32

kind of just

19:34

pushed back against any kind of technology to do

19:36

these really, really kind of, um,

19:38

you know, authentic vintage-

19:40

style cartoons, and now AI

19:43

is basically just recreating it all. So it

19:45

does seem kind of cruel that the viral

19:47

thing... I mean, it is really cool, like, people

19:50

are doing it. The main thing is family pictures, so

19:52

people get family pictures and do them. And it's not

19:52

just Studio Ghibli style. You

19:56

can actually choose specific films, and it will

19:58

do that. And it's not just that... like, you can do it for anything,

20:00

can't you? You can do, like, 1980s baseball.

20:02

Um, yeah, people have been doing

20:05

baseball ones. You can do any kind of theme that

20:08

you want, and it's amazing. South Park,

20:10

like, yeah, whatever style you want. And

20:12

the difference is, like,

20:14

previous image generation models would sort of

20:16

have a go at this, but they would, like,

20:18

mess around with the picture. Like, so,

20:20

if, you know, like you say, if you had a family

20:22

photo,

20:25

previous models would sort of do

20:27

some kind of really rough approximation, but

20:30

generally, like, wouldn't really bear any

20:32

resemblance. These ones

20:34

look like it's genuinely,

20:36

genuinely,

20:39

like an artist has taken that photo and produced

20:41

it in a different style.

20:44

Basically, I don't know if you're aware of this, but

20:46

I was doing a bit of research for this earlier,

20:48

and, um, because

20:51

previously, when you used Chat-

20:53

GPT and you tried to create an image, it

20:55

used what was called DALL-E at the time. Basically,

20:57

DALL-E wasn't integrated. So you'd give a prompt, and

21:00

what it would essentially do is go and send that prompt to

21:02

DALL-E and then pull the image back in. Like, it wasn't

21:04

integrated. Now it's integrated.

21:06

4o, I think, is the model that it uses,

21:08

so it's not using the most up-to-date model,

21:10

but it now has access to all of 4o's

21:13

training data to help it to create the image.

21:15

And that's one of the reasons why this is better.
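
A minimal sketch of the difference being described, using the OpenAI Python SDK. The old flow handed your prompt to a separate DALL-E model; the new flow generates the image with a model that shares the chat model's training. The model names are assumptions here (dall-e-3 is the documented legacy model; gpt-image-1 is assumed as the API name for the integrated 4o-era image model) and may change.

# pip install openai; reads OPENAI_API_KEY from the environment
from openai import OpenAI

client = OpenAI()

# Old, bolt-on style: ChatGPT forwarded the prompt to DALL-E
# and pulled the finished image back in.
legacy = client.images.generate(
    model="dall-e-3",
    prompt="A foggy San Francisco skyline",
    size="1024x1024",
)

# Integrated style: the image model can draw on the chat model's
# world knowledge, so the prompt can ask for an explanation.
integrated = client.images.generate(
    model="gpt-image-1",  # assumed name for the integrated image model
    prompt="An educational diagram explaining why San Francisco is foggy",
    size="1024x1024",
)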

21:17

And there were a couple of examples I'd seen.

21:19

Someone said, create an image that

21:21

shows why San Francisco

21:24

is foggy. And it created an image.

21:26

You know, it was able to reference 4o

21:28

to find out why San Francisco is foggy and

21:30

then to put that into a kind of educational

21:33

image, whereas before...

21:35

you could say, make me an image of San Francisco

21:37

that is foggy, but if you wanted to explain

21:39

why, it couldn't, because the image couldn't reference

21:41

training data; the image could just follow your text

21:43

prompt, right? There was another example, which I

21:46

I wasn't that blown away by this until

21:48

they kind of explained it. They asked

21:50

it to create this comic book, um,

21:53

based on a unicorn

21:55

with an upside-down horn, and

21:57

they said every other image creation

21:59

tool out there will turn

22:02

the horn the right way up... just can't do it.

22:04

They couldn't do it, but this one, through the whole

22:06

thing, kept the horn the wrong way. Now,

22:08

that doesn't sound that phenomenal, but the point here is

22:10

that, you know, when you're prompting it to

22:12

do things, it's not

22:14

kind of limited in the same way. You

22:16

gave the example of text. I think for me that has been

22:18

the biggest thing with image generation: even the

22:20

best ones that promised that the text would be perfect...

22:23

it wasn't perfect. You know, it

22:25

had got better, but it would still make mistakes with text.

22:27

You still saw the kind

22:30

of... not so much the kind of six fingers

22:32

on hands, but, you know, there was still a bit of that

22:34

to some degree, the making mistakes. Now

22:36

these images are like... you can

22:38

kind of create what you want to

22:40

do. It sort of feels sad in a way

22:42

that, you know, I think

22:45

it diminishes art even

22:47

more. But, you know,

22:49

it is incredibly fun, and it's good

22:52

enough that you can use it, I guess, for

22:54

actual practical uses. So it's not now

22:56

just a pure novelty tool. The quality

22:58

of it is good enough you can use it for, I think,

23:00

things which are, you know, genuinely kind of worthwhile

23:03

and will make time savings

23:05

or, you know, make things better than they were previously.

23:07

So, although I have a kind of sadness

23:09

for what this is doing to

23:12

art, because I think it is, you know, taking

23:14

away... yeah, I don't think it's destroying it, but I think it's

23:16

taking away... um, it does also

23:18

have, like, a lot of merits. Yeah, for sure.

23:20

And, like, I mean, so you could create

23:22

logos previously, but not if they

23:24

had text in them, really. Like, now you

23:27

can just create whatever logo you want and

23:29

it'll work. We

23:31

said we need to update our logo.

23:32

This is our 50th episode, so maybe we'll

23:35

use the image generation to,

23:37

you know, sex up our current

23:39

logo rather than creating a completely

23:41

new one. Yeah, yeah, just pimp it out and pimp

23:43

it up a little bit with, yeah, a bit of GPT-4o

23:46

love. Yeah. So,

23:52

DeepSeek.

23:53

I know literally nothing about this, so

23:55

I'm quite intrigued.

23:56

We did an episode on DeepSeek. Then...

23:59

that episode on

24:01

DeepSeek was our most popular episode

24:03

since the first episode,

24:05

which is pretty awful. I wish people would stop

24:07

listening to the first episode. And then

24:09

we said we'd stop talking about DeepSeek. So... we

24:11

did a second episode about DeepSeek,

24:14

and then I think we did another episode just after,

24:16

where we also talked a

24:18

lot about DeepSeek. So, you know,

24:20

I don't want to talk about DeepSeek anymore, but

24:22

we need to talk about DeepSeek again.

24:24

Um, because their

24:26

impact, I think, is still kind of resonating

24:29

and is still creating kind of effects.

24:32

This is about DeepSeek, but it's also a

24:34

kind of China and chips and kind

24:36

of Huawei piece. So the first bit: DeepSeek

24:39

have brought out, um, V3.1,

24:42

um, which is,

24:44

um, a non-reasoning

24:47

model. So this is not the kind of...

24:49

there is going to be, I think, R3

24:53

or R2... R2, is

24:55

it? R1 or R2?

24:57

So the first one was R1. Yeah. Well, anyway,

25:00

there's gonna be a new reasoning model which apparently is

25:02

going to kind of potentially be at the top

25:04

again, and there's all this kind of expectation.

25:06

V3.1 is not that. V3.1

25:09

is an upgrade on their original model,

25:11

which was released in December, when no one had actually heard

25:13

about it... which was version three. For

25:17

the technical stuff: it's got a transformer-based

25:19

architecture, 560 billion parameters.

25:22

It uses that mixture-

25:24

of-experts model. It has support

25:27

for a context window of a million tokens, so

25:29

it matches Gemini

25:31

2.5. Like I said, this is not a

25:33

reasoning model, um, but it shows,

25:35

you know, really, really high performance;

25:37

um, support for 100 languages

25:39

with near-native proficiency.
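
Since "mixture of experts" gets name-checked here, a toy sketch of what it means: a small gating network routes each token to only a few "expert" sub-networks, so only a fraction of the total parameters is active at once. This is a conceptual illustration in PyTorch, not DeepSeek's actual implementation, and all sizes are made up.

import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)  # router: scores each expert per token
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        scores = self.gate(x).softmax(dim=-1)           # routing probabilities
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(4, 64)
print(ToyMoE()(x).shape)  # torch.Size([4, 64])

Only top_k of n_experts run per token, which is how a model with a huge total parameter count can keep inference cost, and therefore price, low; that economics point comes up again later in the discussion.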

25:42

It has apparently demonstrated a

25:44

38% reduction in hallucinations.

25:46

So again, talking about reduced hallucinations; but

25:49

those of you who listened to the model episode

25:51

will remember me saying that, um, Deep-

25:53

Seek was the sort of worst hallucinating.

25:55

So that is... I like

25:57

the statistic: a 38 percent reduction

25:59

in hallucinations.

26:00

That sounds like a hallucination

26:03

in itself. It does, doesn't it? Yeah.

26:05

But also, like... so 62 percent of hallucinations

26:07

are still there. That also sounds not that

26:09

impressive, to be honest.
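
To make the back-of-the-envelope point concrete: the 38% figure is a relative reduction, so 62% of the original hallucination rate remains. The baseline rate below is an assumption, purely for illustration.

# What a "38% reduction in hallucinations" means in absolute terms.
baseline_rate = 0.10                      # assume 10% of answers hallucinated
new_rate = baseline_rate * (1 - 0.38)     # 62% of the original rate remains
print(f"{baseline_rate:.1%} -> {new_rate:.1%}")  # 10.0% -> 6.2%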

26:11

Yeah, I suppose so. It depends what

26:14

percentage it had before. But then I researched

26:16

this on Perplexity, so it could be a Perplexity

26:18

hallucination. Yeah, exactly. Um,

26:20

anyway, I'll believe it. But,

26:23

yeah, um, like, yeah, enterprise

26:25

customers have API

26:27

access to it. There's apparently

26:30

a Chrome extension that's coming out. And the one thing

26:32

that I kind of wanted to talk about here, because it's something I've

26:36

noticed in the last few weeks. So

26:38

I've talked a few times on the podcast,

26:41

about us being in China,

26:43

about the difference in the way that AI

26:45

is integrated with stuff, and that AI

26:47

is integrated with a lot of things, um,

26:49

that it is not integrated with in

26:51

other countries; but that we didn't really have

26:53

people using chatbots on their

26:55

phones or iPads

26:57

or whatever in the same way as we did in the

27:00

West, with sort of ChatGPT and

27:02

Claude and things like that, and how DeepSeek

27:04

was what had kind of really pushed that into the mainstream.

27:07

The other thing that I've noticed recently

27:09

is, like, DeepSeek in China

27:11

is integrated in everything,

27:13

like, everything. Like,

27:13

you go into Baidu Maps... Baidu Maps is the

27:15

sort of equivalent of Google Maps here, and

27:18

you've got a DeepSeek logo,

27:20

and you click on it and there's an AI feature. I

27:23

mean, it is, like, it is crazy.

27:25

And talking to friends who

27:27

are, you know, relatively

27:30

senior in companies, or who run their own businesses, or

27:33

people who are working in sort of accounting and stuff

27:35

like that, and they're all talking about: yeah,

27:37

we've got, you know, we've got AI integrated,

27:39

they've got DeepSeek integrated in this. And we're like, it only

27:41

came out two months ago and it's just

27:43

integrated in everything. Like...

27:45

it more and more makes me question, like,

27:48

this idea that DeepSeek is this, you know,

27:50

bunch of guys in a cupboard with a few

27:52

GPUs. I mean, that was sort

27:54

of proven to be nonsense. They had a lot of money,

27:56

but that this was a side project... like, I'm

27:58

pretty sure the, you

28:00

know, Chinese Communist Party... you

28:03

know, if they weren't backing them then, they're certainly

28:05

backing them now. But they've got a lot more behind them

28:07

than we thought, because the way in which

28:09

they brought that model out was kind of shocking.

28:12

It shook the foundations of the American

28:14

companies. It's kind of brought open source

28:17

forward. It's changed...

28:19

which we're going to talk about in a minute, but it's changed

28:21

the way OpenAI are potentially going to do things. They're

28:23

potentially going to go back to being a bit more open. You

28:25

know, it's been phenomenal. But we kind of said that with

28:27

the models themselves... well, actually,

28:30

like, it's not that the model is that much better.

28:32

That's the key point. It's the economics. It's

28:34

how, um, how cheap it is. It's

28:36

how, um, efficient it is.

28:38

I think the way you're seeing this integration

28:41

kind of shows that. The examples

28:43

that I was giving are maybe the kind of business...

28:46

like, some of these, like I say, are big businesses.

28:48

Some of them are not that big, and they're talking about,

28:50

we've got AI integrated with DeepSeek.

28:52

It's, like, so much cheaper

28:54

that maybe this is the thing for businesses

28:56

that previously would be, like, you

28:58

know, they might have thought about it, but, like, they're

29:01

not sure, they're not quite sure if it's the right time. Now,

29:03

it's so cheap with DeepSeek. And

29:05

it seems that DeepSeek must have just gone on this

29:07

massive charm offensive again to

29:09

do all this stuff. I think they've got a lot more,

29:11

you know, people working with them, and sales

29:14

and et cetera, to get that integrated. But it's

29:16

phenomenal. Like, have a look. You maybe

29:18

don't use as many Chinese apps as me, but

29:20

if you look on Chinese apps now, you'll

29:22

just see DeepSeek integrated all over the place.

29:25

WeChat DeepSeek integration... I'm not

29:27

sure if it's there now, but there's supposed to be a

29:29

way you can use it through the

29:31

mini app. I'm pretty sure they're using it in the background

29:33

of, you know, delivery apps, everything like

29:35

that. It's crazy.

29:37

And they've got this new model coming, which, you

29:40

know, even if it's not the best model,

29:42

I think it will be, like, up there, and

29:44

it will be cheap and it will have some... there's

29:46

going to be something about it. Oh yeah, and I think

29:48

the way these things work is they obviously

29:50

develop the standard model first

29:52

and then they add the reasoning later, which is

29:55

the same for the other companies.

29:56

It's weird to talk about that, because reasoning

29:59

in itself is something that's only a few

30:01

months old. On that point,

30:03

I do wonder... it's something

30:06

I don't think we'll ever know, but I wonder.

30:08

I think

30:10

there's three possible scenarios. I think one

30:12

of them is that the Western

30:15

companies, like OpenAI, were already looking

30:17

at reasoning but were basically,

30:19

for want of a better word, sandbagging.

30:23

And, like, they were just keeping it behind.

30:26

They were keeping it back so they could, like,

30:28

release things more slowly. And

30:31

then the release of R1 has kind of made all

30:33

these other companies release these thinking models straight

30:36

away. The other option, which I think

30:38

is less likely, is that it just

30:40

completely caught them by surprise and they've scrambled

30:42

to catch up and hadn't even thought of thinking

30:44

models, reasoning models, which I think

30:46

is less likely. I think they probably... but they got it...

30:48

they got their act together afterwards pretty quick.

30:50

I think they were sandbagging. It was like within a week or something.

30:52

I think they were sandbagging.

30:54

And I don't know what the third option

30:56

was. I'm tired and jittery.

30:59

You've mentioned that. Yeah. Yeah,

31:06

the thing with DeepSeek that

31:08

is kind of, for me... so

31:10

I

31:13

don't want to say amazing, but it

31:16

is still amazing that they did it without

31:18

access to the best hardware. Or, like...

31:21

don't they have it?

31:23

I mean, maybe they do have the best chips

31:25

and, you know, they've just got them through back

31:28

channels.

31:28

Yeah, that's the thing. They've

31:30

innovated, in a way

31:32

that... even if it involves some degree of copying

31:34

from ChatGPT, and, you know, there are various

31:37

kind of accusations of how they did it... even

31:39

if they did all that,

31:42

they still had to innovate to find the way to do

31:44

it cheaper, and to do this inference

31:46

cheaper, and to, you know, find

31:48

a way to make it more efficient. I mean,

31:50

you know, they're also, like, just

31:52

in terms of how they're advancing: so V3,

31:54

that was released in December, that's the one just

31:57

before, that had 128,000

31:59

tokens, and now it's a million.

32:01

Like, just, the

32:04

leap forward, in terms of how

32:07

quickly they've kind of made that

32:09

leap, is just pretty staggering.

32:10

Yeah, they seem to have basically levelled

32:12

the playing field almost immediately.

32:15

And who

32:17

knows, I mean, it's possible that

32:20

R2, the next reasoning

32:22

model they release, which probably won't be that

32:24

far away, might leapfrog everything

32:26

again. I don't know. It's

32:29

really interesting. Like, for

32:32

a while it looked like OpenAI

32:34

were going to have the very best models, like, almost

32:36

all the time, and they don't seem to anymore.

32:38

Like, Gemini is the best model now.

32:41

Claude was for a while. It

32:43

just seems to be changing hands all

32:45

the time right now. And I think OpenAI...

32:47

well, maybe they didn't have the most

32:49

resources available to them. Like, clearly xAI

32:53

and Google obviously do have

32:55

enormous resources available

32:57

to them. I haven't heard from Llama for a while. It's

32:59

possible we'll get a new open source model from

33:01

Llama tomorrow that is better

33:03

than all these models.

33:04

I don't know. So, I was listening to

33:08

Nathaniel Whittemore's podcast

33:11

the other day, and he was talking about a

33:13

kind of theory that what China

33:16

is potentially doing here is... you

33:18

know, where the

33:20

US had the lead in terms of those closed-

33:22

source models and the kind of software side

33:25

of it that they could sell to people... by

33:27

open sourcing and kind of just throwing that out,

33:30

it kind of gets rid of the

33:32

ability for the US to kind

33:34

of lead and generate all the revenue through

33:36

that, because it's all kind of open source. And then China,

33:38

at some point later, can pick up what

33:40

it does well, which is creating the kind of hardware...

33:43

not necessarily, like, the top-end chips, but, you

33:45

know, selling the actual physical stuff and

33:47

creating it. And it makes a lot of sense. Like,

33:49

it absolutely makes a lot of sense, like, you know, how

33:49

they did things previously... follow the same kind

33:54

of model. Because it does feel like the closed-

33:56

source models... like, maybe

33:59

it was never going to... you

34:01

know, it was never in the long run going to work,

34:03

because you've always said, like, you know, they're

34:05

not that far behind. But Deep-

34:07

Seek was the thing that has kind of shaken

34:09

that up. And there was also, like... and

34:12

this is not just in one place, like, there's a lot of this.

34:14

Um, even people like Marc Andreessen,

34:17

who, you know, were

34:19

saying previously that the US was, like, two years

34:22

ahead of China, and now they're saying, oh,

34:24

it's three to six months in some areas,

34:26

and in some areas it's probably level,

34:28

and in some very specific areas

34:30

China may even be ahead. Like,

34:33

how quickly it's made that up

34:35

is absolutely crazy. And

34:37

a lot of this is driven... and, you know, credit to

34:39

Kapila Gray Shaw and, you know, Christy Loke,

34:41

who we've had on this podcast, for kind of talking

34:44

about this stuff, you know, well before we thought about

34:46

it. But a lot of this stuff has happened because

34:48

of chip controls, and because, you know, the US

34:51

has kind of driven this innovation.

34:52

Yeah, they've pushed...

34:52

They've pushed them into a corner where they've had to innovate, and

34:55

there are, um, there are

34:57

a lot of smart people. Like... it's

35:00

been talked about a lot recently. Like, people

35:03

have still got... I think there's

35:05

a misconception that China is

35:07

still in this place where it's copying the

35:10

West, where that was the

35:12

case, and it still is in some areas,

35:14

like, it definitely still is in some areas, but they're also innovating

35:16

and starting to really, really

35:18

catch up, and even lead in some

35:21

areas. I mean, the classic example

35:23

right now is electric cars.

35:25

It's got nothing to do with AI,

35:27

but, like, electric cars: China went

35:29

from catching up to now leading.

35:31

Basically, they have some of the cheapest

35:34

and best electric cars on the planet. But

35:36

I can give you AI examples.

35:37

I mean, I said to you, I was in Wuhan a few months

35:39

ago, and there are, you know, plenty

35:41

of self-driving taxis on the road

35:43

there. I know there are some in the US, but, you

35:46

know, there's a lot of them in

35:48

Wuhan, and it's not the only city that has that.

35:51

There are examples in Shenzhen,

35:53

which is the sort of tech hub

35:55

just over the border from Hong Kong.

35:57

Drone delivery services are

36:00

apparently just kind of the norm

36:02

there. They're happening. I

36:04

mean, when I say the norm, I'm not saying, like, every delivery

36:06

is taking place via a drone,

36:09

but it's there. That kind of...

36:11

the low-altitude economy

36:13

is a big part of China's five-

36:15

year plan. It's a massive

36:17

thing. I mean, they're already talking about autonomous

36:21

helicopter transport...

36:23

autonomous helicopter taxis in the next five

36:27

years or so in parts of China. Like, there's some

36:29

phenomenal stuff. I want to talk about

36:31

Huawei, actually, because I

36:34

think this is kind of important, in terms of, like...

36:36

it's Huawei's chip technology

36:39

that is contributing a lot towards

36:41

this. So, because of those sanctions, Huawei are

36:43

basically, you know... I mean, they're not

36:45

an SOE, but they're basically the national

36:47

tech company of China now, let's be honest. Um,

36:50

yeah, they've got these chips, the Ascend 910C,

36:53

which is designed to rival

36:55

Nvidia H100s. They're

36:57

not as good as, obviously, the very top chip, but

37:00

I think the point is, where we thought

37:02

previously, like, you have to have the absolute top chip...

37:04

with the improvements in the architecture

37:06

that we've seen from the likes of DeepSeek, it's like,

37:08

they need chips, but they don't necessarily need

37:10

the very top chip. There's also some

37:13

talk that 40% of Nvidia's revenues

37:15

apparently may come from China,

37:18

not necessarily directly, but because a lot

37:20

of chips are sold to Vietnam and Singapore

37:22

and they're then illegally kind of exported

37:24

into China that way. So, I

37:26

don't know, I mean, maybe that means, like, there are more Nvidia

37:28

top-end chips in China than there are

37:30

supposed to be. Yeah. Um, but it definitely

37:32

feels like Huawei have really,

37:35

really been pushed to

37:37

innovate and create these kind of better

37:39

chips. And I'm wondering, for the

37:41

first time... it's not, like, is China gonna win,

37:43

because I don't think there is, like, a winner. It's not

37:45

like going to the moon, where one gets there... no,

37:48

okay, we got to the moon, but whatever... no, but if

37:50

they can, then that's it. But it's like, can they be

37:52

ahead at some point, and can they be neck and neck, and

37:54

pushing forward and going backwards?

37:56

You know, I think probably they can.

37:58

They seem to... yeah, in general, in

38:00

AI, they seem to be

38:02

roughly in that kind of space. I

38:04

don't know, like, being three months behind doesn't

38:06

seem that...

38:07

They're collaborating with DeepSeek,

38:09

and it feels like it's the DeepSeek-Huawei collaboration

38:16

that is the kind of key thing in terms of what China is potentially going to do. But also,

38:18

if you're three months behind, that's basically, effectively, not

38:21

really behind at all. Two years behind

38:23

is significant. Two years behind

38:25

could mean, by the time you've caught up,

38:27

like, the game's over. Okay, but is

38:29

it? Because two or three

38:30

months behind... but is it? Because

38:33

six months ago... no, not

38:35

six months. Three months ago China was two years

38:37

behind. Apparently now it's three months behind. Or

38:39

not? So, like, the

38:41

point is, like, when people say they're two years behind,

38:43

or they're six months... it's like, well, if they were two years behind,

38:45

but then two months later they were three months

38:47

behind, they were obviously never two years behind.

38:50

No, no, it's all...

38:50

it's all kind of, like... clearly it's just a figure thrown

38:53

out there, isn't it?

38:53

They clearly weren't. But

38:55

I guess they were two years behind in what

38:57

people knew about. Yeah, exactly, exactly.

39:00

Like, at that point DeepSeek hadn't come out

39:02

and no one had seen anything like this. So I

39:04

think it was just more that they showed

39:06

their hand, so to speak. But

39:10

what I mean is, like,

39:13

pundits thought they were two years behind,

39:15

and the difference between being two

39:17

years behind and three months behind

39:19

is, like, basically

39:22

you're out of the game, versus, like,

39:24

you're neck and neck.

39:33

You're neck and neck, right.

39:34

So, in terms of, like, the choreography of this episode, this is a bit of a balls-up, because I'm

39:36

going to talk about OpenAI again. Um, which I probably should have talked about

39:38

when we talked about the OpenAI model. But anyway,

39:41

it's a different...

39:43

It's a different point. But you could just move this

39:45

bit before the other bit, and then the

39:47

bit you've just said won't make any sense. Yeah,

39:49

so if I've done that and this makes

39:51

no sense, that's why we've done it.

39:53

If not, then... well,

39:55

either way it won't make complete sense, but

39:57

good idea. Correct. So,

39:57

yeah, OpenAI. Um. Apparently...

40:00

I haven't actually looked at the, um,

40:02

tweet, or X. What

40:04

is a tweet called? If it's on X, is it called an X?

40:08

Yeah, I think it's an X.

40:09

Is it? Do you say, I did an X? Do you

40:12

say, I X'd it? Or do you still say,

40:14

I tweeted it, but on X?

40:16

I don't know.

40:17

Actually, I think it's an X. Okay, well,

40:20

on X, someone either

40:22

tweeted or X'd... no, sorry, someone didn't.

40:26

The devil himself, Sam Altman, did,

40:28

um, to say

40:30

that OpenAI plans to

40:33

release a new open-weight model, um.

40:36

Open weight is basically

40:38

an open source model... um, we've

40:40

discussed, it's like a difference in semantics between the

40:42

two. But, for clarity,

40:44

an open source model... it will be the first

40:47

open source model since GPT-2 in

40:49

2019. I then looked this up,

40:51

because I thought, let me check that that actually was

40:53

an open source model. It wasn't an open

40:55

source model when they first released it, so it seems

40:57

like OpenAI have never actually been that open,

41:00

but they did make it open afterwards.

41:02

So, anyway, they are going

41:02

to release an open source

41:06

model. We don't know if it'll be the best model. We don't

41:08

know why they're releasing a model and

41:10

not saying what it is, but this is in line with their usual

41:12

bullshit, the let's-release-20-models-at-the-

41:14

same-time thing that they were supposed to not

41:16

do. But this is a massive thing,

41:18

because... and I guess this

41:21

is why maybe the DeepSeek bit coming before this

41:23

kind of makes sense... OpenAI, the

41:25

company who were all about being open, that

41:27

were the least open company, have now

41:29

been pushed to the point where they are going to release open

41:32

source models. Is this the end of closed-

41:33

source models?

41:33

Um... oh, I think so.

41:35

Yeah, I think so. Like, I

41:38

mean, I think Google's model

41:40

is still proprietary in that sense, and Claude

41:42

is as well, but it feels like

41:44

it's moving more in the direction

41:46

of open source. I don't know why Open-

41:48

AI are doing this, but I think...

41:50

it feels like they're just confused at the moment.

41:52

Like, everything they do is just

41:55

a bit odd and a bit

41:57

sort of disjointed. Like, GPT-

42:00

4.5 was clearly released

42:02

hastily, in

42:05

my opinion. It wasn't the best at anything,

42:07

and the argument for releasing

42:09

it was, it's, like, a bit more personable,

42:11

or something like that. Um, yeah,

42:13

it was bullshit, wasn't it? Uh, they obviously,

42:16

like, released all their... you know, Deep-

42:18

Seek came out and then they released... well,

42:20

they already had a thinking model, but they released...

42:22

they opened it up so you

42:24

could see what it was saying, see what it was thinking, um,

42:28

but then not as much as DeepSeek did.

42:30

So they just seem sort of a

42:32

bit baffled at the moment, a bit lost.

42:35

The most exciting thing for

42:37

me is the new image generation

42:39

model, which actually genuinely seems cool, but

42:42

then hasn't had a massive hype

42:44

around it.

42:44

Yeah, it hasn't had a big fanfare around its release,

42:47

in a way, has it?

42:48

No. And

42:52

they still have this weird thing, which they said they were going

42:54

to tidy up... when you try and use it, you have to choose

42:56

between six different models, or eight different models,

42:59

and depending on which one you choose, you

43:01

can use different features.

43:03

So you can use deep research and

43:05

search and image generation...

43:07

they only work with certain versions, and it's

43:09

really not clear.

43:11

The deep research is probably the most phenomenal

43:13

thing, and that seems to have, like... obviously

43:15

it has been announced, but that seems to have kind

43:17

of flown under the radar a bit. We talked about

43:19

it a little bit, but that's pretty amazing.

43:22

Um, I just had a quick look

43:24

at a bit more detail. So, the

43:26

the open weight model is a reasoning

43:28

model , so it's going to be similar to open

43:30

ai's 03 mini model , apparently so

43:33

well , also similar to 01 . So

43:35

a reasoning model , um , multiple

43:37

multilingual problem solving

43:39

, so it will be in multiple languages . Developers

43:42

can access and modify the model's trained parameters

43:45

, the weights , without needing the original training data

43:47

, which facilitates customization . And

43:49

just to clarify , because I actually wasn't 100% sure

43:51

, but open weights apparently is halfway between

43:53

open source and closed , so it gives transparency

43:56

in how the models make

43:58

connections , but it doesn't reveal all

44:00

of the code or the training data . So that's

44:03

the sort of main difference , I guess .

44:04

Yeah , that's the whole thing . So

44:06

basically with open weights you can't change

44:09

the underlying

44:11

base

44:13

model in

44:15

that sense , and you can't see

44:17

what it's been trained on , but you

44:19

can fine-tune it , right , because the

44:21

open weights means you can fine-tune it .
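
To make that point concrete, here is a minimal sketch of the kind of customisation open weights enable: loading a released checkpoint and training small LoRA adapters on top of the frozen weights, no original training data required. The checkpoint name is a placeholder and the Hugging Face transformers + peft stack is an illustrative assumption, not anything OpenAI has confirmed about its release.

```python
# A minimal sketch of what "open weights" enables in practice: fine-tuning a
# released checkpoint without access to its original training data.
# The checkpoint id below is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "example-org/open-weight-model"  # hypothetical checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA freezes the released weights and trains small low-rank adapter
# matrices instead, which is why having the weights alone is enough.
lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

From here a normal training loop fine-tunes only the adapters, which is exactly the customisation an open-weight release permits and a closed API does not.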

44:24

I was just thinking we keep slagging off OpenAI

44:27

. Well , I just slag off OpenAI because of Sam

44:29

Altman . I don't actually dislike

44:31

OpenAI itself , but we keep slagging

44:33

them off for being closed source . But

44:35

then Anthropic's Claude , which we

44:38

love , is also closed source

44:40

. So Anthropic , if you're listening

44:42

, do the right thing .

44:45

Actually , yeah , that was my question . Is

44:48

there any indication as to why they've done this

44:50

, from what you saw ?

44:50

I

44:52

mean , there is an indication in the tweet from

44:54

what I can see , but when DeepSeek came out

44:57

um , sorry , when

44:59

DeepSeek R1 came out , Sam Altman , literally

45:02

within a day or two , said yeah , we

45:04

don't want to be on the wrong side of history , we think

45:06

open source is the way forward , which was

45:08

quite obviously just a purely

45:10

reactionary thing . So I think it is directly

45:13

a response to DeepSeek

45:15

, to be honest .

45:16

Yeah , I mean , I'm not sure that all

45:19

these companies do need to open

45:21

source . I don't know what the benefit to OpenAI of open

45:23

sourcing their

45:25

models is , but , like , yeah

45:30

, maybe publicity . I don't know

45:32

. In some ways , I don't really mind whether they are or not . I think for me , the argument

45:35

for open source is almost as much about

45:37

having competition

45:39

as it is about having

45:42

open source models , and my

45:44

fear , like if you went back like a year

45:46

or two ago , like 18

45:48

months ago , my fear was that

45:50

everything was going to be concentrated

45:52

in like one or two big tech companies

45:55

and it feels like

45:57

with the release of DeepSeek

45:59

, that's just blown that out of the water . I

46:01

still don't buy that the OpenAI model that they

46:03

are giving to , you know , the DoD

46:05

is not a better model , or the

46:07

DeepSeek model they're giving to the PLA in

46:09

China is not a better model . So when we

46:12

say that these are , like , you

46:14

know , open sourcing instead

46:16

of closed source , they're not

46:19

open sourcing the best models . Even Grok , you

46:21

know , have said that they will open source six

46:23

months after they've released it . So by that point

46:25

they're basically waiting to release a new one . So it

46:28

does feel like , even officially

46:30

, they are not open sourcing from day

46:32

one . And I think we know now , you

46:34

know , we talked a while ago about how it

46:37

was the only technology in history where the kind of commercial ,

46:39

public application was

46:42

ahead of the military one . I'm pretty sure we're

46:44

not in that space anymore . No , no

46:46

, I don't think so .

46:47

I think it'd

46:49

be quite delusional

46:51

to think that , really .

46:57

So let's finish off with an honourable mention

46:59

Jimmy .

47:01

Yeah , so Runway , I think this came out

47:03

. This is the most recent drop actually , it

47:05

was in the last 24 hours or so . But

47:08

the Runway

47:10

Gen-4 has come

47:12

out , which is the latest

47:14

version of their video

47:16

generation

47:19

model , um

47:21

. So Runway

47:24

has always been like one of the best , and

47:27

the latest one . I mean , I'll

47:29

be honest , like video generation is still

47:31

one of the areas where

47:33

, I mean , it's obviously really difficult to do

47:35

and very computationally

47:37

expensive . If you imagine creating

47:39

an image and then multiplying that by , you know , 30

47:42

frames a second , 60 frames a second . But

47:45

it is making

47:48

really good progress . Um , I think

47:50

the best way

47:52

to check this out is definitely not listening to

47:54

a podcast , um , because it's video generation

47:56

. So I think , if you want to have a look , check

47:59

out some of the latest stuff that can be done

48:01

with Runway Gen-4 , um

48:03

, it does look really cool and I think

48:05

you have to pay for it . But if you want to have a go with

48:07

it , some of

48:09

the stuff it can do now .
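
For a rough sense of why video is so much more computationally expensive than single images, the sketch below just multiplies clip length by frame rate. The figures are back-of-the-envelope only; real video models share computation across frames rather than rendering each one independently.

```python
# Back-of-the-envelope frame counts behind "an image times 30 or 60 fps".
# Illustrative arithmetic only; real video models amortise work across frames.

def frames_needed(seconds: float, fps: int) -> int:
    """Number of individual frames in a clip of the given length."""
    return int(seconds * fps)

for seconds in (10, 20):   # last year's vs this year's clip lengths
    for fps in (30, 60):
        print(f"{seconds}s at {fps} fps = {frames_needed(seconds, fps)} frames")

# A 20-second clip at 30 fps is already 600 frames, i.e. hundreds of
# image-sized generations per clip before any consistency work is done.
```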

48:11

Is it the best ?

48:12

So , it's

48:15

really weird because Sora came out ages ago

48:17

but then never actually came out . This was a

48:20

sort of side project of OpenAI

48:22

. It has come out now . Runway

48:26

is definitely comparable to , if not

48:28

better than , Sora . Um , it

48:30

still can only generate like 20

48:33

second clips . The people I've seen talking

48:35

about it are doing that whole , you know ,

48:37

'this is the worst it will ever be' thing , and they're

48:39

right . Um , I don't know how excited

48:42

to get about this , because

48:43

it's still only moved from 10 seconds to 20

48:45

in a year , which , okay , that's a year , it's doubled

48:47

. But also , yeah , I'm pretty sure

48:50

we talked at one point about this . I think

48:52

you said at one point that in two years' time

48:54

you'll be able to make a whole Star Wars film . It

48:56

might be a bit longer , yeah

48:59

, and that's what people are still talking about .

49:00

So I think they will

49:02

crack video generation at some point , and

49:05

then you will be able to do things like that

49:07

. Um , I mean , just to elaborate on that

49:09

a little bit . So one of the things that previous

49:11

video generation models were bad

49:13

at was like the consistency between

49:15

frames .

49:17

So , for example , with the new model

49:20

you can have something like : if your

49:22

main subject

49:24

is in the background and you've got

49:26

stuff passing in front of them in the foreground , like

49:29

imagine the shot is

49:31

panning and you've got trees or

49:33

lampposts or other stuff

49:35

passing in the foreground . Um , the

49:38

new model can retain

49:40

the consistency . The example I saw

49:42

the other day was a bloke who

49:44

had a crease on his shirt

49:46

in a specific pattern , and

49:49

a lamppost passed

49:51

between the camera and the subject

49:53

, and after it

49:55

passed there was total

49:58

consistency with the before and after

50:00

. And again

50:02

, this is much easier to like watch a

50:04

video of it online to see what I'm talking about . But

50:07

that kind of thing was something that video

50:09

generation models previously really struggled

50:11

with , that kind of temporal consistency

50:14

, so it's

50:16

just another step in the right direction

50:18

. I'll be honest , like if you watch the videos

50:20

, there are still obvious

50:23

mistakes in places , there are still

50:25

things that are just clearly wrong

50:27

and messed up , a bit like the six

50:29

fingers in photos a year ago

50:31

, um , but it's one

50:33

to keep an eye on . Still cool

50:36

.
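
To give a concrete handle on the temporal consistency being described, the sketch below scores how well a fixed region of the subject (say, that crease on the shirt) matches before and after an occluder passes. The frames, the region box, and the metric are all illustrative assumptions, not Runway's actual evaluation.

```python
# A rough, illustrative way to quantify temporal consistency: compare the
# same subject region in frames taken before and after an occlusion.
# The frame arrays and region box are assumed inputs, not a real test suite.
import numpy as np

def region_consistency(frame_before: np.ndarray,
                       frame_after: np.ndarray,
                       box: tuple[int, int, int, int]) -> float:
    """Mean absolute pixel difference over a region; 0.0 means identical."""
    y0, y1, x0, x1 = box
    a = frame_before[y0:y1, x0:x1].astype(np.float32)
    b = frame_after[y0:y1, x0:x1].astype(np.float32)
    return float(np.abs(a - b).mean())

# Hypothetical usage: crop the shirt-crease region from frames either side
# of the lamppost pass; a low score means the detail survived intact.
# score = region_consistency(frames[40], frames[55], box=(120, 200, 310, 380))
```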

50:36

Well , uh , less than an hour

50:38

. That's good going for us . I think

50:40

we thought this would be half an hour

50:42

but it never is . So , uh

50:45

, thanks for listening everyone . Uh , as usual

50:47

, take care , have a good week . There's

50:53

nobody like her .

50:55

She drives me crazy . Yeah

50:59

, she's my baby

51:02

. She's my baby

51:04

. There's

51:07

nobody like her . She drives

51:09

me crazy . Yeah

51:13

, she's my Gemini baby

51:16

. She's my Gemini

51:18

baby . There's

51:21

nobody like her . She drives

51:23

me crazy

51:25

. Yeah , she's

51:27

my Gemini baby . She's my Gemini baby

51:29

. There's

51:34

nobody like her . She drives

51:37

me crazy

51:39

. Yeah , she's

51:41

my Gemini baby . But

51:44

remember , it's only

51:46

a matter of time until she

51:48

gets replaced . But

52:18

remember , it's only a matter of time until she

52:20

gets replaced . But remember , it's only a matter of

52:22

time until she gets replaced . But

52:26

remember , it's only a matter

52:28

of time until she gets replaced

52:31

. But remember

52:33

, it's only a matter of time

52:36

until she gets replaced . But

52:39

remember , it's

52:45

only a matter of time until she gets replaced .
