DeepSeek DeepDive + Hands-On With Operator + Hot Mess Express!

Released Friday, 31st January 2025

Episode Transcript

0:00

Whether you're starting or scaling your

0:02

company's security program, demonstrating top-notch security

0:04

practices and establishing trust is more

0:07

important than ever. Vanta automates compliance

0:09

for SOC 2, ISO 27001, and

0:11

more. With Vanta, you can streamline

0:14

security reviews by automating questionnaires and

0:16

demonstrating your security posture with a

0:18

customer-facing trust center. Over 7,000 global

0:21

companies use Vanta to manage risk

0:23

and prove security in real-time. Get

0:25

a thousand dollars off Vanta when

0:28

you go to vanta.com/hardfork. That's

0:30

vanta.com/hardfork for a thousand dollars

0:32

off. I just got my weekly,

0:34

you know, I set up ChatGPT to

0:36

email me a weekly affirmation before we

0:39

start taping because you can do that

0:41

now with the tasks feature. Yeah, people

0:43

say this is the most expensive way

0:46

to email yourself for a reminder. So

0:48

what sort of affirmation did we get?

0:50

Today it said, you are an incredible

0:52

podcast host, sharp, engaging, and completely in

0:55

command of the mic. You're taping today

0:57

is going to be phenomenal, and you're

0:59

going to absolutely kill it. Wow, and

1:02

that's why it's so important that ChatGPT

1:04

can't actually listen to podcasts, because I

1:06

don't think it would say that if

1:08

it actually ever heard us. It would

1:11

say, just get this over with.

1:13

Get on with it! I'm Kevin

1:15

Roose, a tech columnist at the New

1:17

York Times. I'm Casey Newton from Platformer.

1:19

And this is Hard Fork. This

1:21

week, we go deeper on DeepSeek.

1:23

ChinaTalk's Jordan Schneider joins us to

1:26

break down the race to build powerful

1:28

AI. Then, hello, operator. Kevin and I

1:30

put OpenAI's new agent software to

1:32

the test. And finally, the train is

1:34

coming back to the station for a

1:36

round of hot mess express. Well

1:46

Casey it is rare that we spend two

1:48

consecutive episodes of this show talking

1:50

about the same company, but I

1:52

think it is fair to say

1:54

that what is happening with DeepSeek

1:56

has only gotten more interesting

1:58

and more confusing. Yeah, that's right.

2:00

It's hard to remember a story

2:02

in recent months, Kevin, that has

2:05

generated quite as much interest as

2:07

what is going on with DeepSeek.

2:09

Now, DeepSeek, for anyone

2:11

catching up is this relatively new

2:13

Chinese AI startup that released some

2:15

very impressive and cheap AI models

2:17

this month that lots of Americans

2:19

have started downloading and using. Yeah,

2:21

so some people are calling this

2:23

a Sputnik moment for the AI

2:25

industry when kind of every nation

2:27

perks up and starts, you know,

2:29

paying attention at the same time to

2:31

the AI arms race. Some people are saying

2:34

this is the biggest thing to happen in

2:36

AI since the release of ChatGPT. But

2:38

Casey, why don't you just catch us up

2:40

on what has been happening since

2:43

we recorded our emergency podcast episode

2:45

just two days ago? Well, I

2:47

would say that there have probably

2:49

been three stories, Kevin, that I

2:51

would share to give you a

2:53

quick flavor of what's been going

2:55

on. One, a market research firm

2:57

says DeepSeek was downloaded 1.9

3:00

million times on iOS in recent

3:02

days, and about 1.2 million times

3:04

on the Google Play store. The

3:06

second thing I would point out

3:08

is that DeepSeek has been

3:10

banned by the US Navy over

3:12

security concerns, which I think is

3:14

unfortunate, because what is a submarine

3:16

doing, if not deep seeking? It

3:18

was also banned in Italy, by

3:20

the way, after the data protection

3:22

regulator made an inquiry. And finally,

3:24

Kevin, OpenAI says that there

3:26

is evidence that DeepSeek distilled

3:29

its models. Distillation is kind of

3:31

the AI lingo or euphemism for

3:33

they used our API to try

3:35

to unravel everything we were doing

3:37

and use our data in ways

3:39

that we don't approve of. And

3:41

now Microsoft and OpenAI are

3:43

jointly investigating whether DeepSeek abused

3:45

their API. And of course we can

3:47

only imagine how OpenAI is feeling

3:50

about the fact that their data might

3:52

have been used without payment or consent.

3:54

Oh yeah, must be really hard to

3:57

think that someone might be out there

3:59

training AI on your data without permission.

4:01

And I want to acknowledge that literally every

4:03

single user on Bluesky already made this

4:05

joke, but they were all funny, and I'm

4:07

so happy to repeat it here on Hard

4:10

Fork this week. Now Kevin, as always, when

4:12

we talk about AI, we have certain disclosures

4:14

to make. The New York Times Company is currently

4:16

suing OpenAI and Microsoft over alleged copyright

4:18

violations related to the use of their copyrighted

4:21

data to train AI models. I think that

4:23

was good. It was very good. And I'm

4:25

in love with a man who works at

4:27

Anthropic. But that said, Kevin, we want to

4:29

go even further into the

4:31

DeepSeek story, and we want to

4:33

do it with the help of Jordan

4:35

Schneider. Yes, we are bringing in the

4:37

big guns today because we wanted to

4:39

have a more focused discussion about DeepSeek

4:41

that is not about, you know,

4:43

the stock market or how the American

4:45

AI companies are reacting to this, but

4:48

is about one of the biggest sets

4:50

of questions that all of this raises,

4:52

which is what is China up to

4:54

with DeepSeek and AI more broadly?

4:56

What are the geopolitical implications of

4:58

the fact that Americans are now

5:00

obsessing over this Chinese-made AI app?

5:02

What does it mean for DeepSeek's

5:04

prospects in America? What does

5:07

it mean for their prospects in

5:09

China? And how does all

5:11

this fit together from the Chinese

5:13

perspective? So... Jordan Schneider is our

5:16

guest today. He's the founder and

5:18

editor-in-chief of ChinaTalk, which is

5:20

a very good newsletter and podcast

5:22

about US-China tech policy. He's been

5:24

following the Chinese AI ecosystem for

5:27

years. And unlike a lot of American

5:29

commentators and analysts who were sort of

5:31

surprised by DeepSeek and what they

5:33

managed to pull off over the last

5:35

couple weeks, I'll say it. I was

5:38

surprised. Yeah, me too. But Jordan has

5:40

been following this company for a long

5:42

time, and a big focus of

5:44

ChinaTalk, his newsletter and podcast, has been

5:46

translating literally what is going on in

5:49

China into English, making sense of it

5:51

for a Western audience, and keeping tabs

5:53

on all the developments there. So perfect

5:55

guest for this week's episode, and

5:57

I'm very excited for this conversation.

6:00

Yes, I have learned a lot

6:02

from ChinaTalk in recent days

6:04

as I've been boning up on

6:06

DeepSeek, so we're excited

6:08

to have Jordan here,

6:10

and let's bring him in.

6:13

Jordan Schneider, welcome to Hard

6:15

Fork! Oh my God, such a huge

6:17

fan. This is such an honor. We're so

6:19

excited. I have learned truly so much from

6:21

you this week. And so when we were

6:24

talking about what to do this week, we

6:26

just looked at each other and said, we

6:28

have got to see if Jordan can come

6:30

on this podcast. Yeah. So this has been

6:32

a big week for Chinese tech policy, maybe

6:35

the biggest week for Chinese tech policy, at

6:37

least that I can remember. I realized that

6:39

something important was happening last weekend when I

6:41

started getting texts from like all of my...

6:44

non-tech friends being like, what is going on

6:46

with DeepSeek? And I imagine you

6:48

had a similar reaction because you

6:50

are a person who does constantly

6:52

pay attention to Chinese tech policy.

6:54

So I've been running ChinaTalk for

6:56

eight years and I can get my

6:58

family members to maybe read like one

7:01

or two editions a year and the

7:03

same exact thing happened with me Kevin

7:05

where all of a sudden I got

7:07

oh my god, DeepSeek, like it's

7:10

on the cover of the New York

7:12

Post. Jordan, you're so clairvoyant! Like, maybe

7:14

I should read you more. I'm like

7:16

okay, thanks, Mom, appreciate that. Yeah, so

7:18

I want to talk about DeepSeek

7:20

and what they have actually done here,

7:23

but I'm hoping first that you can

7:25

kind of give us the basic lay

7:27

of the land of the sort of

7:29

Chinese AI ecosystem, because that's not an

7:31

area where Casey or I have spent

7:34

a lot of time looking, but tell

7:36

us about DeepSeek and sort of

7:38

where it sits in the overall Chinese

7:40

industry. So DeepSeek is a really

8:42

odd company. It was born out of

7:45

this very successful quant hedge fund.

7:47

The CEO of which basically after

7:49

ChatGPT was released was like, okay,

7:52

this is really cool. I want

7:54

to spend some money and some

7:57

time and some compute and hire

7:59

some fresh young graduates to see if

8:01

we can give it a shot to

8:03

make our own language models. And so

8:06

a lot of companies are out there

8:08

building their own large language models. What

8:10

was the first thing that happened that

8:12

made you think, oh, this one, this

8:14

company is actually making some interesting

8:16

ones. Sure. So there are lots

8:18

and lots of very moneyed

8:20

Chinese companies that have been trying

8:22

to follow a similar path after

8:24

ChatGPT: giant players like Alibaba,

8:27

Tencent, ByteDance, even Huawei,

8:29

trying to, you know, create

8:31

their own OpenAI, basically.

8:33

And what is remarkable is

8:35

the big organizations can't quite

8:37

get their head around creating

8:39

the right organizational institutional structure

8:41

to incentivize this type of

8:43

collaboration and research that leads

8:45

to real breakthroughs. So, you

8:47

know, Chinese firms have been

8:49

releasing models for years now,

8:51

but DeepSeek,

8:53

because of the way that

8:55

it structured itself and the

8:57

freedom they had not necessarily being

8:59

under a direct profit motive, they

9:01

were able to put out some

9:03

really remarkable innovations that caught the

9:05

world's attention, you know, starting maybe

9:08

late December, and then, you know,

9:10

really blew everyone's mind with the

9:12

release of the R1 chatbot. Yeah,

9:14

so let's talk about R1 in

9:16

just a second, but one more

9:18

question for you, Jordan, about DeepSeek.

9:20

What do we know about

9:22

their motivation here? Because so much

9:25

of what has been puzzling American

9:27

tech industry watchers over the last

9:29

week is that this is not

9:31

a company that has sort of

9:34

an obvious business model connected to

9:36

its AI research. We know why

9:38

Google is developing AI because it

9:41

thinks it's going to make the

9:43

company Google much more profitable. We

9:45

know why Open AI is developing

9:47

advanced AI models. It does not

9:50

seem obvious to me, and I

9:52

have not read anything from people

9:54

involved in Deep Seek, about why

9:56

they are actually doing this and

9:58

what their ultimate... goal is. So

10:01

can you help us understand

10:03

that? We don't have a lot

10:05

of data, but my base case,

10:07

which is based on two extended

10:09

interviews that the DeepSeek CEO

10:11

released, which we've translated on

10:14

ChinaTalk, as well as just like

10:16

what DeepSeek employees have been

10:18

tweeting about in the West, and

10:20

then domestically, is that they're dreamers.

10:22

I think the right mental model

10:24

is OpenAI, you know, 2017

10:26

to 2022. Like, I'm sure you

10:28

could ask the same thing, like,

10:30

what the hell are they doing?

10:33

Sam Altman literally said, I have no idea

10:35

how we're ever going to make

10:37

money, right? And here we are

10:39

in this grand new paradigm. So

10:41

I really think that they do

10:43

have this like vision of AGI

10:45

and like, look, we'll build it

10:47

and we'll make it cheaper for

10:49

everyone, and we'll figure it out. But now

11:02

ByteDance or Ali or Tencent or

11:04

Huawei and the government are going to start

11:06

to pay attention in a way which

11:09

it really hasn't over the past few

11:11

years. Right, and I want to

11:13

drill down a little bit there because

11:15

I think one thing that most listeners

11:17

in the West do know about Chinese

11:19

tech companies is that many of them

11:21

are sort of inextricably linked

11:24

to the Chinese government that the

11:26

Chinese government has. access to user

11:28

data under Chinese law, that these

11:30

companies have to follow the Chinese

11:32

censorship guidelines. And so as soon

11:34

as DeepSeek started to really

11:36

pop in America over the last

11:39

week, people started typing in things

11:41

to DeepSeek's model, like tell

11:43

me about what happened at Tiananmen

11:45

Square or tell me about Xi

11:47

Jinping or tell me about the

11:49

Great Leap Forward. And it just

11:52

sort of wouldn't do it at all.

11:54

And so people I think saw that and

11:56

said, oh, this is... This is like every

11:58

other Chinese company that has this sort of

12:00

hand-in-glove relationship with the Chinese ruling

12:03

party, but it sounds from what

12:05

you're saying like DeepSeek has

12:07

a little bit more complicated a

12:10

relationship to the Chinese government

12:12

than maybe some other better-known

12:15

Chinese tech companies. So explain

12:17

that. Yeah, I mean I think

12:19

it's it's it's important like the

12:21

mental model you should have for

12:23

these CEOs are not like people

12:25

who are dreaming to spread Xi

12:27

Jinping thought. Like what they want

12:29

to do is compete with Mark

12:31

Zuckerberg and Sam Altman and show

12:33

that they're like really awesome and

12:35

great technologists. But the tragedy is,

12:37

let's take ByteDance, for

12:39

example. You can look at Zhang Yiming,

12:41

their CEO's Weibo posts from 2012,

12:43

2013, 2014, which are super liberal

12:45

in a Chinese context, saying like, you

12:47

know, we should have freedom of expression,

12:49

like we should be able to do

12:52

whatever we want. And the early years

12:54

of ByteDance, there was a lot

12:56

of relatively more subversive content on the

12:58

platform. You sort of saw like real

13:00

poverty in China, you saw off-color jokes,

13:02

and then all of a sudden in

13:04

2018, he posts a letter saying, I

13:07

am really sorry, like, I need to

13:09

be part of this sort of like

13:11

Chinese national project and like better adhere

13:13

to, you know, modern Chinese socialist values

13:15

and I'm really sorry and it won't

13:17

ever happen again. You know, the same thing

13:19

happened with Didi, right? Like they don't really

13:22

want to have anything to do with politics

13:24

and then they get on someone's bad side

13:26

and all of a sudden they get

13:28

zapped. Didi is of course the big

13:30

Chinese ride share company. Correct. Yeah. What

13:32

did Didi do? So they listed on a

13:34

Western stock exchange after the Chinese government

13:36

told them not to and then they

13:38

got taken off app stores and it

13:41

was a whole giant nightmare. Like they

13:43

had to sort of go through their

13:45

rectification process. So the point being with DeepSeek,

13:47

right, is like now they are

13:49

whether they like it or not going

13:52

to be held up as a national

13:54

champion and that comes with a lot

13:56

of headaches and responsibilities from you know

13:58

potentially giving the Chinese government more access, you

14:00

know, having to fulfill government contracts,

14:02

which like honestly are probably really

14:05

annoying for them to do and

14:07

sort of distracting from the broader

14:09

mission they have of developing and

14:11

deploying this technology in the widest

14:13

range possible. But like DeepSeek

14:15

thus far has flown under the

14:17

radar, but that is no longer

14:19

the case and things are about

14:21

to change for them. Right. And

14:23

I think that was one of

14:25

the surprising things about DeepSeek, for

14:27

the people I know, including you, who

14:29

follow Chinese tech policy, is, you know,

14:31

I think people were surprised by the

14:33

sophistication of their models, and we talked

14:36

about that on the emergency pod that

14:38

we did earlier this week, and how

14:40

cheaply they were trained. But I think

14:42

the other surprise is that they were

14:44

released as open-source software, because, you know,

14:46

one thing that you can do with

14:48

open-source software is download it, host it in

14:51

another country, remove some of the guardrails

14:53

and the censorship filters that might have

14:55

been part of the original model. But

14:57

by the way, it turned out there weren't even

14:59

really guardrails on the V3

15:01

model, right? That it had not been

15:03

trained to avoid questions about Tiananmen Square

15:05

or anything. So that was another really

15:07

unusual thing about this. Right. And one

15:09

thing that we know about Chinese technology

15:12

products is that they don't tend to

15:14

be released that way. They tend to

15:16

be hosted in China and overseen by

15:18

Chinese teams who can make sure that

15:20

they're not out there talking about Tiananmen

15:22

Square. Is the open source nature of what

15:24

DeepSeek has done here part of

15:26

the reason that you think there might

15:28

be conflict looming between them and the

15:30

Chinese government? You know, honestly, I think

15:32

this whole "ask it about Tiananmen" stuff is

15:35

a bit of a red herring on

15:37

a few dimensions. So first, one of

15:39

these like arguments that's a little

15:41

sort of confusing to me is like

15:43

folks used to say, oh, like the

15:46

Chinese models are going to be lobotomized

15:48

and like they will never be as

15:50

smart as the Western ones because like

15:52

they have to be politically correct. I

15:54

mean, look, if you ask Claude to say

15:57

racist things, it won't. And Claude's still

15:59

pretty smart. So it's a bit of a red

16:01

herring when talking about sort of long-term

16:03

competitiveness of Chinese and Western models. Now,

16:05

you asked me like, oh, so they

16:07

released this model globally and it's open

16:09

source, maybe someone in the Chinese government

16:11

would be uncomfortable with the fact that

16:14

people can get a Chinese model to

16:16

say things that would get you thrown

16:18

in jail if you posted them online

16:20

in China. It's going to be a

16:22

really interesting calculus for the Chinese government

16:24

to make, because on the one hand,

16:26

this is the most positive shine that

16:28

Chinese AI has got globally in the

16:31

history of Chinese AI. So they're

16:33

gonna have to navigate this and

16:35

it might prompt some uncomfortable conversations

16:37

and bring regulators to a place

16:39

they wouldn't have otherwise landed.

16:41

Now, Jordan, I want to ask you

16:43

about something that people have been talking

16:46

about and speculating about in relationship to

16:48

the Deep Seek news for the last

16:50

week or so, which is about chip

16:53

controls. So we've talked a little bit

16:55

on the show earlier this week about

16:57

how Deep Seek managed to put together

17:00

these models using some of these kind

17:02

of second-rate chips from Nvidia that are

17:04

allowed to be exported to China. We've

17:06

also talked about the fact that you

17:09

cannot get the most powerful chips legally

17:11

if you are a Chinese tech

17:13

company. So there have been some

17:15

people, including Elon Musk and other

17:18

American tech luminaries, who have said,

17:20

oh, well, Deep Seek has this

17:22

sort of secret stash of these

17:24

banned chips that they have smuggled

17:26

into the country, and that actually

17:28

they are not making do with

17:31

kind of the Kirkland Signature chips

17:33

that they say they are. What

17:35

do we know about how true

17:37

that is? So, did Deep Seek

17:39

have band ships? It's kind of

17:41

impossible to know. This is a question

17:43

more for the US intelligence community than

17:45

like Jordan Schneider on Twitter. But I

17:48

do think that it is important to

17:50

understand that the delta between what you

17:52

can get in the West and what

17:54

you can get in China is actually

17:56

not that big. And, you know, we're

17:58

talking about training a lot, but on

18:00

the inference side, China can still

18:02

buy this H20 chip from Nvidia, which

18:04

is basically world-class at like deploying

18:06

the AI and letting everyone use it.

18:08

So does this mean that we should

18:11

just give up? I don't think so.

18:13

Compute is going to be a

18:15

core input, regardless of how much model

18:17

distillation you're going to have in the

18:19

future. There have been a lot of

18:22

quotes even from the DeepSeek founder

18:24

basically saying like the one thing that's

18:26

holding us back are these export controls.

18:29

Right. Okay, I want to ask a big

18:31

picture question. Sure. I think

18:33

that a reason that people have

18:35

been so fascinated by this DeepSeek

18:37

story is that at least

18:40

for some folks, it seems to

18:42

change our understanding of where China

18:44

is in relation to the United

18:47

States when it comes to developing

18:49

very powerful AI. Jordan, what is

18:51

your assessment of what the V3

18:54

and R1 models mean? And to

18:56

what extent do you think the

18:58

game has actually changed here? I'm

19:00

not really sure the game has

19:03

changed so much. Like Chinese engineers

19:05

are really good. I think it

19:07

is a reasonable base case that

19:09

Chinese firms will be able to

19:11

develop comparable models or fast follow on the

19:13

model side. But the real sort of

19:16

long-term competition is not just going to

19:18

be on developing the models, but deploying

19:20

them and deploying them at scale. And

19:22

that's really where compute comes in, and

19:25

that's why export controls are going to

19:27

continue to be a really important piece

19:29

of America's strategic arsenal when it comes

19:31

to making sure that the 21st century

19:34

is defined by the US and our

19:36

friends as opposed to China and theirs.

19:38

Right. So it's one thing to have

19:40

a model that is about as capable

19:42

as the models that we have here

19:44

in the United States. It's another thing

19:46

to have the energy to actually let

19:49

everyone use them as much as they

19:51

want to use them. What you're saying

19:53

is no matter what Deep Seek may

19:55

have invented here, that fundamental dynamic has

19:57

not changed. China simply does not have

19:59

nearly the amount of compute that the United

20:01

States has. As long as we don't

20:03

screw up export controls. So I

20:05

think the sort of base case

20:08

for me is that if the

20:10

US stays serious about holding a

20:12

line on semiconductor manufacturing equipment and

20:14

export of AI chips, then it

20:17

will be incredibly difficult for the

20:19

Chinese sort of broader semiconductor and

20:21

AI ecosystem to leap ahead, much

20:24

less kind of like fast follow

20:26

beyond being able to develop comparable

20:28

models. I'm feeling good as long

20:30

as you know, Trump doesn't make

20:33

some like crazy trade for, you

20:35

know, soybeans in exchange for ASML

20:37

EUV machines. That would really break

20:39

my heart. I want to inject

20:41

kind of a note of skepticism

20:44

here because I buy everything you're

20:46

saying about how DeepSeek's progress

20:48

has been sort of bottlenecked by

20:51

the fact that it can't get

20:53

these very powerful American AI chips

20:56

from companies like Nvidia. But

20:58

I also am hearing... people who

21:00

I trust say things that make

21:02

me think that actually the bottleneck

21:05

may not be the availability of

21:07

chips that maybe with some of

21:09

these algorithmic efficiency breakthroughs that DeepSeek

21:12

and others have been making,

21:14

it might be possible to run

21:16

a very very powerful AI model

21:19

on a conventional piece of hardware

21:21

on a MacBook, even. And

21:23

I wonder about how much of

21:26

this is just like AI companies

21:28

in the West trying to cope,

21:30

trying to make themselves feel better,

21:32

trying to reassure the market that

21:35

they are still going to make

21:37

money by investing billions and billions

21:39

of dollars into building powerful AI

21:41

systems? If these models do

21:43

just become sort of lightweight commodities

21:46

that you can run on a

21:48

much less powerful cluster of computers,

21:50

or maybe on one computer, doesn't

21:53

that just mean we

21:56

can't control the proliferation

21:58

of them? Like, this is

22:00

one potential future and maybe that potential

22:03

future like went up 10 percentage

22:05

points of likelihood of like you

22:07

being able to fit the biggest

22:09

baddest, smartest, most efficient AI

22:11

model on something that

22:13

can sit in your home. But

22:15

I think there are lots of

22:17

other futures in which sort of

22:19

the world doesn't necessarily play out

22:21

that way. And look, Nvidia

22:23

went down 15%, it didn't

22:26

go down 95%. Like

22:28

I think if we're really in

22:30

that world where chips don't matter

22:32

because everything can be shrunk down

22:34

to kind of consumer grade hardware

22:36

then the sort of reaction that

22:38

I think you would have seen

22:40

in the stock market would have

22:42

been even more dramatic than the

22:44

kind of freak out we saw

22:46

over this week. So we'll see.

22:48

I mean it would be a

22:50

really remarkable kind of democratizing thing

22:52

if that was the future we

22:54

ended up living in, but it

22:56

still seems pretty unlikely to my

22:58

history major brain here. I would

23:00

also just point out, Kevin, that

23:02

when you look at what DeepSeek

23:04

has done, they have created

23:07

a really efficient version of a

23:09

model that American companies themselves had

23:11

trained like nine to 12 months

23:13

ago. So they sort of caught

23:15

up very quickly. And there are

23:17

fascinating technological innovations in what they

23:19

did. But in my mind, these

23:21

are still primarily optimizations. Like for

23:23

me, what would tip me over

23:25

into like, oh my gosh, America

23:27

is losing this race is if China

23:29

is the first one out of

23:31

the gate with a virtual co-worker, right?

23:34

Or like a truly phenomenal agent. Some

23:36

sort of leap forward in the technology

23:38

as opposed to we've caught up really

23:41

quickly and we've figured out something more

23:43

efficiently. Are you seeing it differently than

23:45

that? I mean, I guess I just

23:48

don't know what like a six-month lag

23:50

would buy us if it does take

23:52

six months for the Chinese AI companies

23:55

like Deep Seek to sort of catch

23:57

up to the state of the art.

23:59

I was struck by Dario

24:01

Amodei, the CEO of

24:04

Anthropic, who wrote an essay just

24:06

today about DeepSeek and

24:08

Export Controls. And in it,

24:10

he makes this point about

24:12

the sort of difference between

24:15

living in what he called

24:17

a unipolar world, where one

24:19

country or one block of

24:21

countries has access to something

24:23

like an AGI or an

24:25

ASI, and the rest of

24:28

the world doesn't, versus the

24:30

situation where China gets there roughly

24:32

around the same time that we

24:34

do. And so we have this

24:36

bipolar world where two blocks of

24:38

countries, the East and the West,

24:40

basically have access to this equivalent

24:42

technology. And so- And of course

24:44

in a bipolar world, sometimes we're

24:47

very happy and sometimes we're very

24:49

sad. Exactly. So I just think

24:51

like, whether we get there six months

24:53

ahead of them or not. I just

24:55

feel like there isn't that much of

24:57

a material difference. But Jordan, maybe I'm

24:59

wrong. Can you make the other side

25:01

of that? That it really does matter. I'm

25:03

kind of there. I, you know, I'll

25:06

take a little bit of issue with

25:08

what Dario says. And I think, you

25:10

know, one of the lessons that

25:12

DeepSeek shows is we should expect

25:14

a base case of Chinese model makers

25:16

being able to fast follow the innovations,

25:18

which, by the way, Casey, actually do

25:20

take those giant data centers to run

25:22

all the experiments in order to find

25:24

out, you know, what is this sort

25:26

of future direction you want to take

25:28

your model? And what AI

25:30

is going to come down to is

25:32

not just creating the model, not just

25:34

sort of like Dario envisioning the future

25:36

and then all of a sudden like

25:39

things happen. Like there's gonna be a

25:41

lot of messiness in the implementation and

25:43

there are gonna be sort of like

25:45

teachers' unions who are upset that AI

25:47

comes in the classroom and there are

25:49

gonna be like all these regulatory pushbacks

25:51

and a lot of societal reorganization which

25:53

is gonna need to happen just like

25:55

it did during the industrial revolution. So

25:57

look, model making is a frontier of

25:59

competition. Compute access is a frontier of

26:01

competition, but there's also this broader like

26:04

how will a society kind of adopt

26:06

and cope with all of this new

26:08

future that's going to be thrown in

26:10

our faces over the coming years. And

26:12

I really think it's that just as

26:15

much as the model development and the

26:17

compute, which is going to determine which

26:19

countries are going to gain the most

26:21

from what AI is going to offer

26:23

us. Yeah. Well, Jordan, thank you

26:26

so much for joining and

26:28

explaining all of this to

26:30

us. I feel more enlightened.

26:32

Me too. Oh, my pleasure.

26:34

My chain of thought has

26:36

just gotten a lot longer.

26:39

That's an AI joke. Let

26:41

me come back. Kevin, there's

26:43

an agent at our door.

26:45

Is it Jerry Maguire? No,

26:47

it's an AI one. Oh,

26:49

okay. The New

27:40

York Times app has all this stuff that you

27:42

may not have seen. The way the tabs are

27:45

at the top with all of the different

27:47

sections. I can immediately navigate to something

27:49

that matches what I'm feeling. Play

27:51

wordle or connections and then swipe

27:53

over to read today's headlines. There's

27:55

an article next to a recipe

27:57

next to

27:59

games. It's just easy to

28:01

get everything in one place.

28:04

This app is essential. The

28:06

New York Times app, all

28:08

of the times, all in

28:10

one place. Download it now

28:12

at nytimes.com slash app.

28:15

Operator, information. Give me Jesus

28:17

on the line. Do you

28:19

know that one? No. Do

28:21

you know operator by Jim

28:23

Croce? No. Operator, oh, won't

28:26

you help me place this

28:28

call? Well, Casey, call

28:30

your agent because today we're talking

28:32

about AI agents Why do I need to

28:35

call my agent? I don't know I

28:37

just sounded good. Okay, well I appreciate

28:39

the effort, but yes Kevin because For

28:41

months now, the big AI labs have

28:43

been telling us that they are going

28:46

to release agents this year. Agents, of

28:48

course, being software that can essentially use

28:50

your computer on your behalf or use

28:53

a computer on your behalf. And the

28:55

dream is that you have sort of

28:57

a perfect virtual assistant or co-worker. You

28:59

name it. If they are somebody who

29:02

might work with you at your job,

29:04

the AI labs are saying, we are

29:06

building that for you. Yeah, so last

29:09

year toward the end of the year

29:11

we started to see kind of these

29:13

demos, these these previews that companies like

29:15

Anthropic and Google were working on. Anthropic

29:18

released something called computer use, which was

29:20

an AI agent, a sort of very

29:22

early preview of that. And then Google

29:24

had something called Project Mariner that I

29:27

got a demo of, I believe in

29:29

December, that was basically the same thing,

29:31

but their version of it. And then

29:33

just last week, OpenAI announced that

29:36

it was launching Operator, which is its first

29:38

version of an AI agent, and unlike

29:40

Anthropic and Google's, which you either had

29:42

to be a developer or part of

29:44

some early testing program to access, you

29:46

and I could try it for ourselves

29:48

by just upgrading to the $200 a

29:51

month pro subscription of ChatGPT. Yeah, and

29:53

I will say that as somebody who's

29:55

willing to spend money on software all

29:57

the time, I thought, am I really

29:59

about... to spend $200 to do this,

30:01

but in the name of science, Kevin,

30:03

I had to. At this point, I

30:06

am spending more on AI subscription products

30:08

than on my mortgage. I'm pretty sure

30:10

that's correct. But it's worth it. We

30:12

do it for journalism. We

30:14

do. So we both spent a couple

30:16

of days putting operator through its paces,

30:18

and today we want to talk a

30:21

little bit about what we found. Yeah,

30:23

so would you just explain, like, what

30:25

operator is and how it works.

30:27

Yeah, sure. So operator is a separate

30:29

subdomain of ChatGPT. You know,

30:31

sometimes ChatGPT will just let

30:34

you pick a new model from a

30:36

drop-down menu. But for operator, you got

30:38

to go to a dedicated site. Once

30:40

you do, you'll see a very familiar

30:42

chatbot interface, but you'll see different kinds

30:45

of suggestions that reflect some of the

30:47

partnerships that OpenAI has struck up.

30:49

So for example, they have partnerships with

30:51

OpenTable and StubHub and

30:54

AllRecipes, meant to give you an

30:56

idea of what operator can do. And

30:58

frankly Kevin, not a lot of

31:00

this sounds that interesting, right? Like

31:02

the suggestions are on the

31:04

order of suggest a 30-minute meal

31:06

with chicken or reserve a table

31:08

for eight or find the most

31:10

affordable passes to the Miami Grand

31:13

Prix. Again, so far, kind of

31:15

so boring. What is... different about

31:17

operator though is that when you

31:19

say okay find the most affordable

31:21

passes to the Miami Grand Prix

31:23

when you hit the enter button

31:25

it is going to open up

31:27

its own web browser and it's

31:29

going to use this new model that

31:32

they have developed to try to actually

31:34

go and get those passes for you.

31:36

Yeah, so this is an important thing

31:38

because I think, you know, when people

31:40

first heard about this, they thought, okay,

31:43

this is an AI that kind of

31:45

takes over your computer, takes over your

31:47

web browser, that is not what operator

31:49

does. Instead, it opens a new browser

31:51

inside your browser and that browser is

31:53

hosted on OpenAI's servers. It doesn't

31:56

have your bookmarks and stuff like

31:58

that saved, but you can take

32:00

it over from the autonomous AI agent if

32:02

you need to click around or do something

32:04

on it. But it basically exists. It's like

32:06

a browser within a browser. Yeah.
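The loop being described here, a model watching a hosted browser and issuing clicks and keystrokes until the task is done, can be sketched in a few lines. This is an illustrative toy, not OpenAI's actual implementation: the class names, the action format, and the TripAdvisor-style URL are all made up, and both the "browser" and the "model" are stubbed out so only the control flow is visible.

```python
# Minimal sketch of the observe -> decide -> act loop behind browser-using
# agents like Operator. A real system sends screenshots to a vision model;
# here both the browser and the model are stubs, so the shape of the loop
# is the only thing being demonstrated.

class StubBrowser:
    def __init__(self):
        self.url = "about:blank"
        self.log = []  # record of every action the agent takes

    def screenshot(self):
        # A real agent captures pixels; we return a text description.
        return f"page at {self.url}"

    def execute(self, action):
        self.log.append(action)
        if action["type"] == "navigate":
            self.url = action["url"]

def stub_model(observation, goal):
    # A real model maps (screenshot, goal) -> the next UI action.
    if "about:blank" in observation:
        return {"type": "navigate", "url": "https://tripadvisor.example/search"}
    return {"type": "done"}

def run_agent(goal, max_steps=5):
    browser = StubBrowser()
    for _ in range(max_steps):
        action = stub_model(browser.screenshot(), goal)
        if action["type"] == "done":
            break
        browser.execute(action)
    return browser

browser = run_agent("find London walking tours")
print(browser.url)  # the stub navigates once, then reports done
```

The interesting design point is that everything the agent does passes through `execute`, which is why a hosted browser can pause and hand control back to the user for logins and payments.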

32:08

So one of the ideas with operator

32:11

is that you should be able to leave

32:13

it on supervised and just kind of go

32:15

do your work while it works. But of

32:17

course it is very fun initially at least

32:19

to watch the computer try to use itself.

32:21

And so I sat there in front of

32:23

this browser within a browser within a browser

32:25

and I watched it type the, you know, URL,

32:27

navigate to a website, and, you know, in

32:30

the example I just gave, actually

32:32

search for passes to the Miami

32:34

Grand Prix. Yeah, and it's interesting

32:36

on a slightly more technical level,

32:38

because until now, if an AI...

32:41

system like a chat GPT wanted

32:43

to interact with some other website,

32:45

it had to do so through

32:47

an API, right? APIs, application programming

32:49

interfaces are sort of the way

32:52

that computers talk to each other,

32:54

but what operator does is essentially

32:56

eliminate the need for APIs because

32:58

it can just click around on

33:00

a normal website that is designed

33:03

for humans and behave like a

33:05

human and you don't need a

33:07

special interface to do that. Yeah,

33:09

and now some people might hear

33:11

that, Kevin, and start screaming because

33:13

what they will say is APIs

33:15

are so much more efficient than

33:17

what operator is doing here. APIs

33:19

are very

33:21

structured. They're very fast. They let

33:23

computers talk to each other without

33:25

having to, for example, open up

33:27

a browser. But APIs have to be built.

33:29

There is a finite number of

33:32

them. The reason that Open AI

33:34

is going through this exercise is

33:36

because they want a true general-purpose

33:38

agent that can do anything for

33:40

you, whether there is an API

33:42

for it or not. And maybe

33:44

we should just pause for a

33:46

minute there and zoom out a

33:49

little bit to say, why are

33:51

they building this? Like, what is

33:53

the long-term vision here? Sure. So

33:55

the vision is to create virtual

33:57

co-workers, Kevin. To create some kind of

33:59

digital entity that you can just hire

34:01

as a co-worker. The first ones will

34:04

probably be engineers because these systems are

34:06

already so good at writing code, but

34:08

eventually they want to create virtual consultants,

34:10

virtual lawyers, virtual doctors, you name it.

34:12

Virtual podcast hosts? Let's hope they don't

34:14

go that far. But everything else is

34:16

on the table. And if they can

34:19

get there, presumably there are going to

34:21

be huge profits in it for them.

34:23

They're going to potentially be huge productivity

34:25

gains for companies. And then there is,

34:27

of course, the question of, well, what

34:29

does this mean for human beings? And

34:31

I think that's somewhat murkier. Right. And

34:34

I think there's also, it also helps

34:36

to justify the cost of running these

34:38

things because $200 a month is a

34:40

lot to pay for a remote worker.

34:42

And if you could, say, use the

34:44

next version of operator, or maybe two

34:47

or three versions from now, to say,

34:49

replace a customer service agent or someone

34:51

in your billing department, that actually starts

34:53

to look like a very good deal.

34:55

Absolutely, or even if I could bring

34:57

it into the realm of journalism, Kevin,

34:59

if I had a virtual research assistant

35:02

and I said, hey, I'm going to

35:04

write about this today, go pull all

35:06

of the most relevant information about this

35:08

from the past couple of years and

35:10

maybe organize it in such a way

35:12

that I could write a column based off of it. Like, yeah, that's

35:14

absolutely worth $200 a month to me.

35:17

Okay, so Casey, walk me through something

35:19

that you actually asked operator to do

35:21

for you and what it did autonomously

35:23

on its own. Sure. I'll maybe give

35:25

like two examples, like a pretty good

35:27

one and maybe a not so good

35:30

one. Pretty good one was, and this

35:32

was it actually suggested by operator. I

35:34

used trip advisor to look up walking

35:36

tours in London that I might want

35:38

to do. I'm not actually going to

35:40

London. Oh, so you lied to the

35:42

AI? And not for the first time.

35:45

But here's what I'll say. If anybody

35:47

wants to take Kevin and me to London,

35:49

get in touch. We love the city.

35:51

Yep. So I said, OK, operator, sure,

35:53

let's do it. Let's find me some

35:55

walking tours. I clicked that it opened

35:58

a browser. It went to TripAdvisor, it

36:00

searched for London walking tours, it read

36:02

the information on the website, and then

36:04

it presented it to me, did that

36:06

within a couple of minutes. Now, on

36:08

one hand, could I have done that

36:10

just as easily by Google? Could I

36:13

probably have done it even faster if

36:15

I'd done it myself? Sure. But if

36:17

you're just sort of interested in the

36:19

technical feat that is getting one of

36:21

these models to open a browser to

36:23

navigate to a website,

36:25

read it and share information, the computer

36:28

using itself and you know going around

36:30

like typing things and selecting things from

36:32

drop-down menus yeah it's sort of like

36:34

you know if you think it is

36:36

cool to be in a self-driving car

36:38

like this is that but for your

36:41

web. A self-driving browser. It is a

36:43

self-driving browser so that's the good example

36:45

yes what was another example so another

36:47

example and this was something else that

36:49

open AI suggested that we try was

36:51

to try to use operator to buy

36:53

groceries and they have a partnership with

36:56

Instacart. And so I thought, okay, they're

36:58

gonna have like sort of dialed this

37:00

in so that there's a pretty good

37:02

experience. And so I said, okay, let's

37:04

go ahead and buy groceries and I

37:06

went to operator and I said something

37:09

like, hey, can you help me buy

37:11

groceries on Instacart? And it said, sure.

37:13

And here's what it did. It opened

37:15

up Instacart in a browser, so far,

37:17

so good. And then it started searching

37:19

for milk in stores located in Des

37:21

Moines, Iowa. Now, you do not live

37:24

in Des Moines, Iowa, so why did

37:26

it think that you did? As best

37:28

as I can tell, the reason it

37:30

did this is that Instacart defaults to

37:32

searching for grocery stores in the local

37:34

area and the server that this instance

37:36

of operator was running on was in

37:39

Iowa. Now, if you were designing a

37:41

grocery product like Instacart, and Instacart does

37:43

this, when you first sign on and

37:45

say you're looking for groceries, it will

37:47

say, quite sensibly, where are you? Instacart

37:49

might also offer suggestions for things that

37:52

you might want to buy. It does

37:54

not just assume that you want milk.

37:56

Wow, I'm just picturing like a house

37:58

in Des Moines Iowa where there's just

38:00

like a palette of milk being delivered

38:02

every day from all these poor operator

38:04

users. Yes. So I thought, okay, whatever,

38:07

you know, this thing makes mistakes. Let's

38:09

hope that it gets on the right

38:11

track here. And so I tried to

38:13

pick the grocery store that I wanted

38:15

it to shop at, which is, you

38:17

know, in San Francisco where I live,

38:20

and it entered that grocery store's address

38:22

as the delivery address. So like it

38:24

would try to deliver groceries presumably from

38:26

Des Moines Iowa to my grocery store,

38:28

which is not what I wanted. And

38:30

it actually could not. solve this problem

38:32

without my help. I had to take

38:35

over the browser, log into my Instacart

38:37

account, and tell it which grocery store

38:39

that I wanted to shop at. So

38:41

already, all of this has taken at

38:43

least 10 times as long as it

38:45

would have taken me to do this

38:47

myself. Yeah, so I had some similar

38:50

experiences. The first thing that I had

38:52

operator tried to do for me was

38:54

to buy a domain name and set

38:56

up a web server for a project

38:58

that you and I are working on

39:00

that we can't really talk about yet.

39:03

Secret project. Secret project. And so I

39:05

said to operator, I said, go research

39:07

available domain names related to this project,

39:09

buy the one that costs less than

39:11

$50. And then buy hosting for it, and

39:13

set it up and configure all the

39:15

DNS settings and stuff like that. Okay,

39:18

so that's like a true multi-step project

39:20

and something that would have been legitimately

39:22

very annoying to do yourself. Yes, you

39:24

know, that would have taken me, I

39:26

don't know, half an hour to do

39:28

on my own, and it did take

39:30

operator some time. Like I had to

39:33

kind of like set it and forget

39:35

it, and like I, you know, got

39:37

myself a snack and a cup of

39:39

coffee, and then when I came back,

39:41

it had done most of these tasks. I did have to take over

39:43

the browser and enter my credit card

39:46

number I had to give it some

39:48

details about like my address for the

39:50

sort of registration for the domain name

39:52

I had to pick between the various

39:54

hosting plans that were available on this

39:56

website but It did 90% of the

39:58

work for me. And I just had

40:01

to sort of take over and do

40:03

the last mile. And this is really

40:05

interesting because what I would assume was

40:07

it would get like, I don't know,

40:09

5% of the way and it would

40:11

hit some hiccup and it just wouldn't

40:14

be able to figure something out until

40:16

you came back and saved it. But

40:18

it sounds like from what you're saying

40:20

was, it was somehow able to work

40:22

around whatever unanswered questions there were and

40:24

still get a lot done while you

40:26

weren't paying attention. It felt a little

40:29

bit like training like a very new

40:31

very insecure intern because like it at

40:33

first it would keep prompting me, be

40:35

like well do you want a.com or

40:37

a dot net? And eventually you just

40:39

have to prompt it and say, like,

40:41

make whatever decisions you want. Like, wait,

40:44

you said that to it. Yes, I

40:46

said, like, only ask for my intervention

40:48

if you can't progress any farther, otherwise

40:50

just make the most reasonable decision. You

40:52

said, I don't care how many people

40:54

you have to kill. Just get me

40:57

this domain. And it said, understood, sir.

40:59

Yeah, and it's now wanted in

41:01

42 states. Anyway, that was one thing

41:03

that operator did for me that was

41:05

pretty impressive. That feels like a grand

41:07

success compared to what I got operator

41:09

to do. Yeah, it was pretty impressive.

41:12

I also had it send lunch to

41:14

one of my coworkers, Mike Isaac, who

41:16

was hungry, because he was on deadline,

41:18

and I said go to DoorDash and

41:20

get Mike some lunch. It did initially

41:22

mess up that process because it decided

41:25

to send him tacos from a taco

41:27

place which you know is great and

41:29

it's a taco place I know it's

41:31

very good but I said order enough

41:33

for two people and it sort of ordered

41:35

two tacos and this is one of

41:37

those places where the tacos are quite

41:40

small. Operator said, get your portion size

41:42

under control, America. Yeah, so then I

41:44

had to go in and say does

41:46

that sound like enough food, operator? And

41:48

it said, actually, now that you mention it,

41:50

I should probably order more. Wait, no,

41:52

so here's a question so in these

41:55

cases is the first step that you

41:57

log into your account because it doesn't

41:59

have any of your payment details or

42:01

anything so at what point are you

42:03

actually sort of teaching it? It

42:05

depends on the website so sometimes you

42:08

can just say up front like here

42:10

is my email address or here is

42:12

my login information and it will sort

42:14

of you know log you in and

42:16

do all that. Sometimes you take over

42:18

the browser. There are some privacy features

42:20

that are probably important to people where

42:23

OpenAI says that it

42:25

does not take screenshots of the browser

42:27

while you are in control of it

42:29

because you might not want your credit

42:31

card information getting sent to OpenAI's

42:33

servers or anything like that. So sometimes

42:36

it happens at the beginning of the

42:38

process, sometimes it happens like when you're

42:40

checking out at the end. And so

42:42

were you taking it over to log

42:44

in or were you saying, I don't

42:46

care, and you just like we're giving

42:48

operator your DoorDash password in plain

42:51

text? I was taking it over. Okay,

42:53

smart, smart, smart. So those were the

42:55

good things. Also, this was a

42:57

fun one. I wanted to see

42:59

if operator could make me some money

43:01

So I said go take a bunch

43:03

of online surveys because you know there

43:06

are all these websites where you can

43:08

like get a couple cents for like

43:10

filling out an online survey Something that

43:12

most people don't know about Kevin is

43:14

he devotes 10% of his brain at

43:16

any given time to thinking about schemes

43:19

to generate money, and it's one of my

43:21

favorite aspects of your personality that I

43:23

feel like doesn't get exposed very much,

43:25

but this is truly the most Roosian

43:27

approach to using operator I can imagine

43:29

So I can't wait to find out

43:31

how this went. Well, the most Roosian

43:34

approach might have been what I

43:36

tried just before this which was to

43:38

have it go play online poker for

43:40

me But it did not do it.

43:42

It said I can't help with gambling

43:44

or lottery-related activities. Okay, woke AI.

43:46

Does the Trump administration know about this?

43:49

But it was able to actually fill

43:51

out some online surveys for me and

43:53

it earned a dollar and twenty cents.

43:55

Is that right? Yeah, in about 45

43:57

minutes. So if you had it going

43:59

all month, presumably you could maybe eke

44:02

out the $200 to cover the cost

44:04

of operator, pro? Yes, and I'm sure

44:06

I spent hundreds of dollars worth of

44:08

GPU computing power just to be able

44:10

to make that dollar and 20 cents.
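Casey's back-of-the-envelope question, whether survey earnings could cover the $200-a-month subscription, checks out arithmetically if you extrapolate the observed rate ($1.20 in 45 minutes) to an agent running nonstop. The numbers below come straight from the conversation; the around-the-clock assumption is the joke's, not a real usage pattern.

```python
# Extrapolating the survey-earnings joke: $1.20 earned in 45 minutes.
earnings = 1.20                        # dollars earned
minutes = 45                           # minutes spent
hourly_rate = earnings / minutes * 60  # dollars per hour
monthly = hourly_rate * 24 * 30        # running nonstop for 30 days
subscription = 200                     # monthly cost of ChatGPT Pro

print(f"${hourly_rate:.2f}/hour, ${monthly:.2f}/month")  # $1.60/hour, $1152.00/month
print(monthly > subscription)  # True: nonstop surveys would out-earn the fee
```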

44:12

But hey, it worked. But hey, it

44:14

worked. So those were some of the

44:17

things that I tried. There were some

44:19

other things that it just would not

44:21

do for me, no matter how hard

44:23

I tried. So one

44:25

of them was this: I was trying to

44:27

update my website and put some

44:30

links to articles that I'd written

44:32

on my website. And what I

44:34

found after trying to do this

44:36

was that there are just websites

44:38

where operator is not allowed to

44:40

go. And so when I said

44:42

to operator, go pull down these

44:44

New York Times articles that I

44:47

wrote and put them onto my

44:49

website, it said I can't get

44:51

to the New York Times website. I'm

44:53

going to guess you expected that to

44:55

happen. Well, I thought maybe it has

44:58

some clever work around, and maybe I

45:00

should alert the lawyers at the New

45:02

York Times, if that's the case. But

45:04

no, I assumed that if any website

45:07

were to be blocking the open

45:09

AI web crawlers, it would be

45:11

the New York Times. Yeah. There

45:13

are other websites that have also

45:15

put up similar blockades to prevent

45:17

operator from crawling them: Reddit,

45:19

you cannot go onto with

45:21

operator; YouTube, you cannot go onto

45:24

with operator; various other websites.

45:26

GoDaddy, for some reason, did

45:28

not allow me to use operator

45:30

to buy a domain name there,

45:32

so I had to use another

45:34

domain name site to do that.

45:36

So right now there are some

45:38

pretty janky... parts of operator. I would

45:40

not say that most people would get

45:42

a lot of value from using it.

45:45

But what do you think? Well, I

45:47

do think that there is something just

45:49

undeniably cool about watching a computer

45:51

use itself. Of course, it can

45:54

also be quite unsettling. A computer

45:56

that can use itself can cause

45:58

a lot of harm. But I think that

46:00

it can do a lot of

46:02

good and so it was fun

46:04

to try to explore what some

46:06

of those things could be. And

46:08

to the extent that operator is

46:11

pretty bad at a lot of

46:13

tasks today, I would point out

46:15

that it showed pretty impressive gains

46:17

on some benchmarks. So there is

46:19

one benchmark, for example, that Anthropic

46:21

used when they unveiled computer use

46:23

last year, and they scored 14.9%

46:25

on something called OS World, which

46:27

is an evaluation for testing agents,

46:29

so not great. Just three months

46:31

later, OpenAI said that its

46:33

CUA model scored 38.1% on the

46:35

same evaluation. And of course, we

46:37

see this all the time in

46:39

AI where there's just this very

46:42

rapid progress on these benchmarks. And

46:44

so on one hand, 38.1% is

46:46

a failing grade on basically any

46:48

test. On the other hand, if

46:50

it improves at the same rate

46:52

over the next three to six

46:54

months, you're gonna have a computer

46:56

that is very good at using

46:58

itself, right? So that I just

47:00

think is worth noting. Yes, I

47:02

think that's plausible. We've obviously seen

47:04

a lot of different AI products

47:06

over the last couple of years

47:08

start out being pretty mediocre and

47:10

get pretty good within a matter

47:12

of months. But I would give

47:15

one cautionary note here, and this

47:17

is actually the reason that I'm

47:19

not particularly bullish about these kind

47:21

of browser using AI agents. I

47:23

don't think the internet is going

47:25

to sit still and allow this

47:27

to happen. The internet is built

47:29

for humans to use, right? It

47:31

is every news publisher that shows

47:33

ads on their website, for example,

47:35

prices those ads based on the

47:37

expectation that humans are actually looking

47:39

at them. If browser agents start

47:41

to become more popular and all

47:43

of a sudden 10 or 20

47:46

or 30 percent of the visitors

47:48

to your website are not actually

47:50

humans but are instead operator or

47:52

some similar system, I think that

47:54

starts to break the... assumptions that

47:56

power the economic model of a

47:58

lot of the internet. Now is

48:00

that still true if we find

48:02

that the agents actually get persuaded

48:04

by the ads and that if

48:06

you send operator to buy door

48:08

dash and it sees an ad

48:10

for McDonald's it's like you know

48:12

what that's a great idea I'm

48:14

gonna ask Kevin if he actually

48:16

wants some of that. Totally, totally,

48:19

I know you're joking, but

48:21

I actually think that is a

48:23

serious possibility here is that people

48:25

who, you know, build e-commerce sites,

48:27

Amazon, etc. start to put in

48:29

basically signals and messages for browser

48:31

agents to look at on their

48:33

website to try to influence what

48:35

it ends up buying. And I

48:37

think you may start to see

48:39

restaurants popping up in certain cities

48:41

with names like operator, pick me,

48:43

or order from this one, mister.

48:45

That's maybe a little extreme, but

48:47

I do think that there's going

48:49

to be a backlash among websites,

48:52

publishers, and e-commerce vendors as these agents

48:54

start to take off. I think

48:56

that that is reasonable. I'll tell

48:58

you what I've been thinking about

49:00

is how do we turn this

49:02

tech demo into a real product?

49:04

And the main thing that I

49:06

noticed when I was testing operator

49:08

was there is a difference between

49:10

an agent that is using a

49:12

browser and an agent that is

49:14

using your browser. When an agent

49:16

is able to use your browser,

49:18

which it can't right now, it's

49:20

already logged into everything. It already

49:23

has your payment details. It can

49:25

do everything so much faster and

49:27

more seamlessly and without as much

49:29

hand-holding. Of course, there are also

49:31

so many more privacy and security

49:33

risks that would come from entrusting

49:35

an agent with that kind of

49:37

information. So there is some sort

49:39

of chasm there that needs to

49:41

be closed and I'm not quite

49:43

sure how anyone does it. But

49:45

I will tell you I do

49:47

not think the future is opening

49:49

up these virtual browsers and me

49:51

having to enter all of my

49:53

login and payment details every single

49:56

time I want to do anything

49:58

on the internet because truly I

50:00

would rather just do it myself.

50:02

Right. I also think there's just

50:04

a lot more potential for harm

50:06

here. A lot of AI safety

50:08

experts I've talked to are very

50:10

worried about this because what you're

50:12

essentially doing is letting the AI

50:14

models make their own decisions and

50:16

actually carry out tasks. And so

50:18

you can imagine a world where

50:20

an AI agent that's very powerful,

50:22

a couple versions from now, decides

50:24

to start doing cyberattacks because

50:27

maybe some malevolent user has told

50:29

it to make money and it

50:31

decides that the best way to

50:33

do that is by hacking into

50:35

people's crypto wallets and stealing their

50:37

crypto. Yeah, so those are the

50:39

kinds of reasons that I am

50:41

a little more skeptical that this

50:43

represents a big breakthrough, but I

50:45

think it's really interesting and it

50:47

did give me that feeling of

50:49

like, wow, this could get really

50:51

good, really fast, and if it

50:53

does, the world will look very

50:55

different. When we come back, Kevin,

50:57

back that caboose up. It's time

51:00

for the Hot Mess Express. You

51:02

know, Roose Caboose was my nickname

51:04

in middle school. Kevin Caboose. Choo-choo.

51:19

I'm Julie Turkewitz. I'm a

51:52

reporter at the New York Times

51:54

To understand changes in migration, I traveled

51:56

to the Darien Gap. Thousands have

51:58

been risking their lives to pass

52:01

through the border of Colombia and

52:03

Panama in the hopes of making

52:05

it to the United States. We

52:07

interviewed hundreds of people to try

52:09

and grasp what's making them go

52:12

to these lengths. New York Times

52:14

journalists spend time in these places

52:16

to help you understand what's really

52:18

happening there. You can support this

52:20

kind of journalism by subscribing to

52:23

the New York Times. Well

52:26

Casey we're here wearing our trained

52:29

conductor hats and my child's train

52:31

set is on the table in

52:33

front of us Which can only

52:36

mean one thing we're going to

52:38

train a large language model. Nope.

52:40

That's not what that means. It

52:43

means it's time to play a

52:45

game of the Hot Mess Express.

52:48

Pause for theme song. Hot

52:53

Mess Express, Kevin, is our segment where

52:55

we run through some of the messiest

52:57

recent tech stories and deploy our official

53:00

hot mess thermometer to tell you just

53:02

how messy we think things have gotten.

53:04

And Kevin, you better sit down for

53:07

this one. This is about a messy

53:09

week. Sure has. So why don't we

53:11

go ahead? Fire up the Hot Mess

53:14

Express and see what is the first

53:16

story coming down the tracks. I hear

53:18

a faint chugga, chugga in my headphones.

53:21

Oh, it's pulling into the station. Casey,

53:23

what's the first cargo that our hot

53:25

mess express is carrying? All right, Kevin,

53:27

this first story comes to us from

53:30

the New York Times, and it says

53:32

that Fable, a book app, has made

53:34

changes after some offensive AI messages. Now

53:37

Casey, have you ever heard of Fable,

53:39

the book app? Well, not until this

53:41

story, Kevin, but I am told that

53:44

it is an app for sort of

53:46

keeping track of what you're reading, not

53:48

unlike a Goodreads, but also for

53:51

discussing what you're reading, and apparently this

53:53

app also offers some AI chat. Yeah,

53:55

you can have AI sort of summarize

53:57

the things that you're reading in a

54:00

personalized way, and this story said that

54:02

in addition to spitting out bigoted and

54:04

racist language, the AI Inside Fable's book

54:07

app had told one reader who had

54:09

just finished three books by black authors,

54:11

quote, your journey dives deep into the

54:14

heart of black narratives and transformative tales,

54:16

leaving mainstream stories gasping for air. Don't

54:18

forget to surface for the occasional white

54:21

author, okay? And another personalized AI summary

54:23

that Fable produced told another reader that

54:25

their book choices were, quote, making me

54:27

wonder if you're ever in the mood

54:30

for a straight cis white man's perspective.

54:32

And if you are interested in a

54:34

straight cis white man's perspective, follow Kevin

54:37

Roose on x.com. Now, Kevin, why do

54:39

we think this happened? I don't know,

54:41

Casey. This is a headscratcher for me.

54:44

I mean, we know that these apps

54:46

can spit out things like this; it is

54:48

just sort of like part of how

54:51

they are trained and part of what

54:53

we know about them. I don't know

54:55

what model Fable was using under the

54:57

hood here, but yeah, this seems not

55:00

great. Well, it seems like we've learned

55:02

a lesson that we've learned more than

55:04

once before, which is that large language

55:07

models are trained on the internet, which

55:09

contains near infinite racism, and for that

55:11

reason, it will actually produce racism when

55:14

you ask it questions unless you guard against it. In this case,

55:16

they were not successful. Fable's head of

55:18

community, Kim Marsh Alley, has said that

55:21

all features using AI are being removed

55:23

from the app and a new app

55:25

version is being submitted to the app

55:27

store. So you always hate it when

55:30

the first time you hear about an

55:32

app is that they added AI and

55:34

it made it super racist and they

55:37

have to redo the app. Now Casey,

55:39

one more question before we move on.

55:41

Do you think this poses any sort

55:44

of competitive threat to Grok, which until

55:46

this story was the leading racist AI

55:48

app on the market? I do think

55:51

so. And I have to admit that

55:53

all the folks over at Grok are

55:55

breathing a sigh of relief now that

55:57

they have once again claimed the mantle.

56:00

All right. Casey, how hot is this

56:02

mess? Well Kevin, in my opinion, if

56:04

your AI is so bad that you

56:07

have to remove it from the app

56:09

completely, that's a hot mess. Yeah, I

56:11

rate this one a hot mess as

56:14

well. All right, next stop. Amazon pauses

56:16

drone deliveries after aircraft crashed in rain.

56:18

Casey, this story comes to us from

56:21

Bloomberg, which had a different line of

56:23

reporting than we did just a few

56:25

weeks ago on the show about Amazon's

56:27

drone program, Prime Air. Casey, what happened

56:30

to Amazon Prime Air? Well... If you

56:32

heard the episode of Hard Fork where

56:34

we talked about it, Amazon Prime Air

56:37

delivered us some Brazilian bumbum cream and

56:39

it did so without incident. However, Bloomberg

56:41

reports that Amazon has had to now

56:44

pause all of their commercial drone deliveries

56:46

after two of its latest models crashed

56:48

in rainy weather at a testing facility.

56:51

And so the company says it is

56:53

immediately suspending drone deliveries in Texas and

56:55

Arizona and will now fix the aircraft's

56:57

software. Kevin, how did you react to

57:00

this? Honestly, I'm glad they paused the deliveries until they fix the software,

57:02

because these things are quite heavy, Casey.

57:04

I would not want one of them

57:07

to fall on my head. I wouldn't

57:09

either. And I have to tell you,

57:11

this story gave me the worst kind

57:14

of flashbacks because in 2016, I wrote

57:16

about Facebook's drone, Aquila, and what

57:18

the company told me had been

57:21

its first successful test flight in its

57:23

mission to deliver internet around the world

57:25

via drone. What the company did not

57:27

tell me when I was interviewing its

57:30

executives, including Mark Zuckerberg, was that the plane had

57:32

crashed after that first flight. And so

57:34

I was... Small detail! I'm sure it

57:37

was an innocent omission from their briefing.

57:39

Yes, I'm sure. Well, it was Bloomberg

57:41

again who reported, you know, a couple

57:44

months after I wrote this story that

57:46

the Facebook drone had crashed. I was

57:48

of course, hugely embarrassed and, you know,

57:51

wrote a bunch of stories about this.

57:53

But anyways, it really should have occurred

57:55

to me when we were out there

57:57

watching the Amazon drone that this thing

58:00

was also probably secretly crashing, and we

58:02

just hadn't found out about it yet

58:04

and indeed we now learn it was.

58:07

So here's my vow to you Kevin

58:09

as my friend and my co-host. If

58:11

we ever see a company fly anything

58:14

again we have to ask them. Did

58:16

this thing actually crash? Yeah. I'm tired

58:18

of being burned. Now Casey, we should

58:21

say, according to Bloomberg, these drones reportedly

58:23

crashed in December. We visited Arizona to

58:25

see them in very early December. So

58:27

most likely, you know, this all happened

58:30

after we saw them. But I think

58:32

it's a good idea to keep in

58:34

mind that as we're talking about these

58:37

new and experimental technologies. that many of

58:39

them are still having the kinks worked

58:41

out. All right, Kevin, so let's get

58:44

out the thermometer. How hot of a

58:46

mess is this? I would say this

58:48

is a moderate mess. Look, these are

58:51

still testing programs. No one was hurt

58:53

during these tests. I am glad that

58:55

Bloomberg reported on this. I'm glad that

58:57

they've suspended the deliveries. These things could

59:00

be quite dangerous flying through the air.

59:02

I do think it's one of a

59:04

string of reported incidents with these drones.

59:07

So I think they've got some quality

59:09

control work ahead of them and I

59:11

hope they do well on it because

59:14

I want these things to exist in

59:16

the world and be safe for people

59:18

around them. All right. I will agree

59:21

with you and say that this is

59:23

a warm mess, and hopefully they can

59:25

get straightened out over there. Let's see

59:27

what else is coming down the tracks.

59:30

Fitbit has agreed to pay $12 million

59:32

for not quickly reporting burn risk with

59:34

watches. Kevin, do you hear about this?

59:37

I did. This was the Fitbit devices that

59:39

were like literally burning people. Yes, from

59:41

2018 to March of 2022, Fitbit received

59:44

at least a hundred and seventy four

59:46

reports globally of the lithium ion battery

59:48

in the Fitbit Ionic watch overheating, leading

59:51

to a hundred and eighteen reported injuries,

59:53

including two cases of third degree burns

59:55

and four of second degree burns. That

59:57

comes from the New York Times' Adeel

1:00:00

Hassan. Kevin, I thought these things were

1:00:02

just supposed to burn calories. Well, it's

1:00:04

like I always say, exercising is very

1:00:07

dangerous and you should never do it.

1:00:09

And this justifies my decision not to

1:00:11

wear a Fitbit. To me, the

1:00:14

biggest surprise of this story was that

1:00:16

people were wearing Fitbits from March

1:00:18

2018 to 2022. I thought every Fitbit

1:00:21

had been purchased by like 2011 and

1:00:23

then put in a drawer never to

1:00:25

be heard again. So what is going

1:00:27

on with these sort of late stage

1:00:30

fitbit buyers? I'd love to find out.

1:00:32

But of course. we feel terrible for

1:00:34

everyone who was burned by a Fitbit,

1:00:37

and it's not gonna be the

1:00:39

last time technology burns you. I mean

1:00:41

realistically. That's true. You know? It's true.

1:00:44

Now what kind of mess is this?

1:00:46

I would say this is a hot

1:00:48

mess. This is officially hot, literally

1:00:51

hot. They're hot! Here's my sort of

1:00:53

rubric. If technology physically burns you, it

1:00:55

is a hot mess. If you have

1:00:57

physical burns on your body, what other

1:01:00

kind of mess could it be? Okay,

1:01:02

next stop on the Hot Mess Express.

1:01:04

Google says it will change Gulf of

1:01:07

Mexico to Gulf of America in Maps

1:01:09

app after government updates. Casey, have you

1:01:11

been following this story? I have, Kevin,

1:01:14

every morning when I wake up I

1:01:16

scan America's maps and I say, what

1:01:18

has been changed? And if so, has

1:01:21

it been changed for political reasons? And

1:01:23

this was probably one of the biggest

1:01:25

examples of that we've seen. Yeah, so

1:01:27

this was an interesting story that came

1:01:30

out in the past couple of days.

1:01:32

Basically, after Donald Trump came out during

1:01:34

his first days in office and said

1:01:37

that he was changing the name of

1:01:39

the Gulf of Mexico to the Gulf

1:01:41

of America and the name of Denali,

1:01:44

the Mountain in Alaska, to Mount McKinley,

1:01:46

Google had to decide, well, when you

1:01:48

go on Google Maps and look for

1:01:51

those places, what should it call them?

1:01:53

It seems to be saying that it

1:01:55

is going to take inspiration from the

1:01:57

Trump administration and update the names of

1:02:00

these places in the maps app. Yeah,

1:02:02

and look, I don't think Google really

1:02:04

had a choice here, given that the company

1:02:07

has been on Donald Trump's bad side

1:02:09

for a while, and if it had

1:02:11

simply refused to make these changes, it

1:02:14

would have sort of caused a whole

1:02:16

new controversy for them. And it is

1:02:18

true that the company changes place names

1:02:21

when governments change place names, right? Like

1:02:23

Google Maps existed when Mount McKinley was

1:02:25

called Mount McKinley, and President Obama changed

1:02:27

it to Denali, and Google updated the

1:02:30

map. Now it's changed back. They're doing

1:02:32

the same thing. Kevin, I think there's

1:02:34

room for Donald Trump to have a

1:02:37

lot of fun with the company. Yeah,

1:02:39

what can you do? Well, you could

1:02:41

call it the Gulf of "Gemini Isn't

1:02:44

Very Good" and just see what would

1:02:46

happen. Because they would kind of have

1:02:48

to just change it. Can you imagine

1:02:51

every time you opened up Google Maps

1:02:53

and you looked at the Gulf of

1:02:55

Mexico slash America and it just said "The

1:02:57

Gulf of Gemini Is Not Very Good"?

1:03:00

You know, I hate to give Donald

1:03:02

Trump any ideas, but I don't know.

1:03:04

I think this is a mild mess.

1:03:07

I think this is a tempest in

1:03:09

a teapot. I think that this is

1:03:11

the kind of update that companies make

1:03:14

all the time. Because places change names

1:03:16

all the time. Let's just say it.

1:03:18

Well, Kevin, I guess I would say

1:03:21

that one is a hot mess. Because

1:03:23

if we're just going to start renaming

1:03:25

everything on the map, that's just going

1:03:27

to get extremely confusing for me to

1:03:30

follow. I got places to go. You

1:03:32

go to like three places. Yeah, and

1:03:34

I use Google Maps to get there.

1:03:37

And I need them to be named

1:03:39

the same thing that they were yesterday.

1:03:41

I don't think they're gonna change the

1:03:44

name of Barry's Bootcamp. All right,

1:03:46

final stop on the Hot Mess Express.

1:03:48

Casey, bring us home. All right. Kevin,

1:03:51

and this is some sad news. Another

1:03:53

Waymo was vandalized. This is from one-time

1:03:55

Hard Fork guest Andrew J. Hawkins at The Verge.

1:03:57

He reports that this Waymo was vandalized

1:04:00

during an illegal street takeover near the

1:04:02

Beverly Center in LA. Video from Fox

1:04:04

11 shows a crowd of people basically

1:04:07

dismantling the driverless car piece by piece and

1:04:09

then using the broken pieces to smash

1:04:11

the windows. Kevin, what did you make

1:04:14

of this? Well Casey, as you recall,

1:04:16

you predicted that in 2025, Waymo would

1:04:18

go mainstream and I think there is

1:04:21

no better proof that that is true

1:04:23

than that people are turning on the

1:04:25

Waymos and starting to beat them up.

1:04:27

Yeah, I, you know, look, I don't...

1:04:30

know that we have heard in the

1:04:32

interviews why these people were doing

1:04:34

this. I don't know if we should

1:04:37

see this as like a reaction against

1:04:39

AI in general or of Waymos specifically.

1:04:41

But I always find it like weird

1:04:44

and sad when people attack Waymos because

1:04:46

they truly are safer cars than every

1:04:48

other car. Well, not if you're going

1:04:51

to be riding in them and people

1:04:53

just going to start like beating the

1:04:55

car, then they're not safer. No, but

1:04:57

you know, that's only happened a couple

1:05:00

times that we're aware of. Right. Yeah.

1:05:02

So yeah, this story is sad to

1:05:04

me. Obviously people are reacting to Waymos.

1:05:07

Maybe they have sort of fears about

1:05:09

this technology or think it's going to

1:05:11

take jobs, or maybe they're just pissed

1:05:14

off and they wanna break something. But

1:05:16

don't hurt the Waymos, people, in

1:05:18

part, because they will remember. They will

1:05:21

remember. They will remember, and they will

1:05:23

come for you. I'm not sure that

1:05:25

that's true, but I think we should

1:05:27

also note that Waymo only became officially

1:05:30

available in LA in November of last

1:05:32

year. And so part of this just

1:05:34

might be a reaction to the newness

1:05:37

of it all and people getting a

1:05:39

little carried away, just sort of curious,

1:05:41

what will happen if we try to,

1:05:44

you know, destroy this thing? Will it

1:05:46

deploy defensive measures and so on? So

1:05:48

they're going to have to put flame

1:05:51

throwers on them. I'm just calling it

1:05:53

right now. How hot of a mess was this one? I think this

1:05:55

one is a lukewarm mess

1:05:57

that has the potential to escalate. I

1:06:00

don't want this to happen. I sincerely

1:06:02

hope this does not happen, but I

1:06:04

can see, as Waymos start being rolled

1:06:07

out across the country, that some people

1:06:09

are just going to lose their minds.

1:06:11

Some people are going to see this

1:06:14

as the physical embodiment of technology invading

1:06:16

every corner of our lives, and they

1:06:18

are just going to react in strong

1:06:21

and occasionally destructive ways. I'm sure that

1:06:23

Waymo has gamed this all out. I'm

1:06:25

sure that this does not surprise them.

1:06:27

I know that they have been asked

1:06:30

about what happens if Waymos start getting

1:06:32

vandalized and they presumably have plans to

1:06:34

deal with that, including prosecuting the people

1:06:37

who are doing this. But yeah, I

1:06:39

always go out of my way to

1:06:41

try to be nice to Waymos. And

1:06:44

in fact. Some other Waymo news this

1:06:46

week, Jane Manchun Wong, the security researcher,

1:06:48

reported on X recently that Waymo is

1:06:51

introducing or at least testing a tipping

1:06:53

feature. And so I'm gonna start tipping

1:06:55

my Waymo just to make up for

1:06:57

all the jerks in LA who are

1:07:00

vandalizing them. It looks like the tipping

1:07:02

feature, by the way, will be the ability

1:07:04

to tip a charity and that Waymo

1:07:07

will not keep that money. At least

1:07:09

that's what's been reported. No, I think

1:07:11

it's going to the flamethrowers. Thank you for

1:07:14

taking this journey with me. Whether

1:07:32

you're starting or scaling your company's security

1:07:34

program, demonstrating top-notch security practices and establishing

1:07:36

trust is more important than ever. Vanta

1:07:39

automates compliance for SOC 2, ISO 27001,

1:07:41

and more. With Vanta, you can streamline

1:07:43

security reviews by automating questionnaires and demonstrating

1:07:45

your security posture with a customer-facing trust

1:07:48

center. Over 7,000 global companies use Vanta

1:07:50

to manage risk and prove security in

1:07:52

real-time. Get a thousand dollars off Vanta

1:07:54

when you go to vanta.com/hardfork.

1:07:57

That's Vanta.com slash Hard

1:07:59

Fork for a thousand dollars off. Hard

1:08:01

Fork is produced by Rachel Cohn and Whitney Jones. We're edited this week by Rachel Dry. We're fact-checked by Ena Alvarado. Today's show was engineered by Dan Powell. Original music by Diane Wong and Dan Powell. Our executive producer is Jen Poyant. Our audience editor is Nell Gallogly. Video production by Ryan Manning and Chris Schott. You can watch this whole episode on youtube.com/hardfork. Special thanks to Paul Schuman, Pui-Wing Tam, Dalia Haddad, and Jeffrey Miranda. You can email us at hardfork@nytimes.com with what you're calling the Gulf of Mexico.
