Episode Transcript
Transcripts are displayed as originally observed. Some content, including advertisements, may have changed.
0:00
Whether you're starting or scaling your
0:02
company's security program, demonstrating top-notch security
0:04
practices and establishing trust is more
0:07
important than ever. Vanta automates compliance
0:09
for SOC 2, ISO 27001, and
0:11
more. With Vanta, you can streamline
0:14
security reviews by automating questionnaires and
0:16
demonstrating your security posture with a
0:18
customer-facing trust center. Over 7,000 global
0:21
companies use Vanta to manage risk
0:23
and prove security in real-time. Get
0:25
a thousand dollars off Vanta when
0:28
you go to vanta.com/hardfork. That's
0:30
vanta.com/hardfork for a thousand dollars
0:32
off. I just got my weekly,
0:34
you know, I set up ChatGPT to
0:36
email me a weekly affirmation before we
0:39
start taping because you can do that
0:41
now with the tasks feature. Yeah, people
0:43
say this is the most expensive way
0:46
to email yourself a reminder. So
0:48
what sort of affirmation did we get?
0:50
Today it said, you are an incredible
0:52
podcast host, sharp, engaging, and completely in
0:55
command of the mic. Your taping today
0:57
is going to be phenomenal, and you're
0:59
going to absolutely kill it. Wow, and
1:02
that's why it's so important that ChatGPT
1:04
can't actually listen to podcasts, because I
1:06
don't think it would say that if
1:08
it actually ever heard us. It would
1:11
say, just get this over with.
1:13
Get on with it! I'm Kevin
1:15
Roose, a tech columnist at the New
1:17
York Times. I'm Casey Newton from Platformer.
1:19
And this is Hard Fork. This
1:21
week, we go deeper on DeepSeek.
1:23
ChinaTalk's Jordan Schneider joins us to
1:26
break down the race to build powerful
1:28
AI. Then, hello, operator. Kevin and I
1:30
put OpenAI's new agent software to
1:32
the test. And finally, the train is
1:34
coming back to the station for a
1:36
round of Hot Mess Express. Well,
1:46
Casey, it is rare that we spend two
1:48
consecutive episodes of this show talking
1:50
about the same company, but I
1:52
think it is fair to say
1:54
that what is happening with Deep
1:56
Seek has only gotten more interesting
1:58
and more confusing. Yeah, that's right.
2:00
It's hard to remember a story
2:02
in recent months, Kevin, that has
2:05
generated quite as much interest as
2:07
what is going on with Deep
2:09
Seek. Now, Deep Seek for anyone
2:11
catching up is this relatively new
2:13
Chinese AI startup that released some
2:15
very impressive and cheap AI models
2:17
this month that lots of Americans
2:19
have started downloading and using. Yeah,
2:21
so some people are calling this
2:23
a Sputnik moment for the AI
2:25
industry when kind of every nation
2:27
perks up and starts, you know,
2:29
paying attention at the same time to
2:31
the AI arms race. Some people are saying
2:34
this is the biggest thing to happen in
2:36
AI since the release of ChatGPT, but
2:38
Casey, why don't you just catch us up
2:40
on what has been happening since
2:43
we recorded our emergency podcast episode
2:45
just two days ago? Well, I
2:47
would say that there have probably
2:49
been three stories, Kevin, that I
2:51
would share to give you a
2:53
quick flavor of what's been going
2:55
on. One, a market research firm
2:57
says Deep Seek was downloaded 1.9
3:00
million times on iOS in recent
3:02
days, and about 1.2 million times
3:04
on the Google Play store. The
3:06
second thing I would point out
3:08
is that Deep Seek has been
3:10
banned by the US Navy over
3:12
security concerns, which I think is
3:14
unfortunate, because what is a submarine
3:16
doing, if not Deep Seeking? It
3:18
was also banned in Italy, by
3:20
the way, after the data protection
3:22
regulator made an inquiry. And finally,
3:24
Kevin, OpenAI says that there
3:26
is evidence that Deep Seek distilled
3:29
its models. Distillation is kind of
3:31
the AI lingo or euphemism for
3:33
they used our API to try
3:35
to unravel everything we were doing
3:37
and use our data in ways
3:39
that we don't approve of. And
3:41
now Microsoft and OpenAI are
3:43
jointly investigating whether DeepSeek abused
3:45
their API. And of course we can
3:47
only imagine how OpenAI is feeling
3:50
about the fact that their data might
3:52
have been used without payment or consent.
3:54
Oh yeah, must be really hard to
3:57
think that someone might be out there
3:59
training AI on your data without permission.
4:01
And I want to acknowledge that literally every
4:03
single user of Bluesky already made this
4:05
joke, but they were all funny, and I'm
4:07
so happy to repeat it here on Hard
4:10
Fork this week. Now Kevin, as always, when
4:12
we talk about AI, we have certain disclosures
4:14
to make. The New York Times Company is currently
4:16
suing OpenAI and Microsoft over alleged copyright
4:18
violations related to the use of its copyrighted
4:21
data to train AI models. I think that
4:23
was good. It was very good. And I'm
4:25
in love with a man who works at
4:27
Anthropic. But that said, Kevin, we have even
4:29
further we want to go into the
4:31
Deep Seek story and we want to
4:33
do it with the help of Jordan
4:35
Schneider. Yes, we are bringing in the
4:37
big guns today because we wanted to
4:39
have a more focused discussion about Deep
4:41
Seek that is not about, you know,
4:43
the stock market or how the American
4:45
AI companies are reacting to this, but
4:48
is about one of the biggest sets
4:50
of questions that all of this raises,
4:52
which is what is China up to
4:54
with DeepSeek and AI more broadly?
4:56
What are the geopolitical implications of
4:58
the fact that Americans are now
5:00
obsessing over this Chinese-made AI app?
5:02
What does it mean for DeepSeek's
5:04
prospects in America? What does
5:07
it mean for their prospects in
5:09
China? And how does all of
5:11
this fit together from the Chinese
5:13
perspective? So... Jordan Schneider is our
5:16
guest today. He's the founder and
5:18
editor-in-chief of ChinaTalk, which is
5:20
a very good newsletter and podcast
5:22
about US-China tech policy. He's been
5:24
following the Chinese AI ecosystem for
5:27
years. And unlike a lot of American
5:29
commentators and analysts who were sort of
5:31
surprised by Deep Seek and what they
5:33
managed to pull off over the last
5:35
couple weeks, I'll say it. I was
5:38
surprised. Yeah, me too. But Jordan has
5:40
been following this company for a long
5:42
time and a big... focus of ChinaTalk,
5:44
his newsletter and podcast, has been
5:46
translating literally what is going on in
5:49
China into English, making sense of it
5:51
for a Western audience, and keeping tabs
5:53
on all the developments there. So perfect
5:55
guest for this week's episode, and
5:57
I'm very excited for this conversation.
6:00
Yes, I have learned a lot
6:02
from ChinaTalk in recent days
6:04
as I've been boning up on
6:06
Deep Seek, so we're excited
6:08
to have Jordan here,
6:10
and let's bring him in.
6:13
Jordan Schneider, welcome to Hard
6:15
Fork! Oh my God, such a huge
6:17
fan. This is such an honor. We're so
6:19
excited. I have learned truly so much from
6:21
you this week. And so when we were
6:24
talking about what to do this week, we
6:26
just looked at each other and said, we
6:28
have got to see if Jordan can come
6:30
on this podcast. Yeah. So this has been
6:32
a big week for Chinese tech policy, maybe
6:35
the biggest week for Chinese tech policy, at
6:37
least that I can remember. I realized that
6:39
something important was happening last weekend when I
6:41
started getting texts from like all of my...
6:44
non-tech friends being like, what is going on
6:46
with Deep Seek? And I imagine you
6:48
had a similar reaction because you
6:50
are a person who does constantly
6:52
pay attention to Chinese tech policy.
6:54
So I've been running ChinaTalk for
6:56
eight years and I can get my
6:58
family members to maybe read like one
7:01
or two editions a year and the
7:03
same exact thing happened with me Kevin
7:05
where all of a sudden I got
7:07
oh my god Deep Seek like it's
7:10
on the cover of the New York
7:12
Post. Jordan, you're so clairvoyant, like maybe
7:14
I should read you more. I'm like
7:16
okay thanks mom appreciate that. Yeah, so
7:18
I want to talk about Deep Seek
7:20
and what they have actually done here,
7:23
but I'm hoping first that you can
7:25
kind of give us the basic lay
7:27
of the land of the sort of
7:29
Chinese AI ecosystem, because that's not an
7:31
area where Casey or I have spent
7:34
a lot of time looking, but tell
7:36
us about Deep Seek and sort of
7:38
where it sits in the overall Chinese
7:40
industry. So Deep Seek is a really
7:42
odd... It was born out of
7:45
this very successful quant hedge fund.
7:47
The CEO of which basically after
7:49
ChatGPT was released was like, okay,
7:52
this is really cool. I want
7:54
to spend some money and some
7:57
time and some compute and hire
7:59
some fresh young graduates to see if
8:01
we can give it a shot to
8:03
make our own language models. And so
8:06
a lot of companies are out there
8:08
building their own large language models. What
8:10
was the first thing that happened that
8:12
made you think, oh, this one, this
8:14
company is actually making some interesting
8:16
ones. Sure. So there are lots
8:18
and lots of very moneyed
8:20
Chinese companies that have been trying
8:22
to follow a similar path after
8:24
ChatGPT: giant players like Alibaba,
8:27
Tencent, ByteDance, Huawei even,
8:29
trying to, you know, create
8:31
their own OpenAI, basically.
8:33
And what is remarkable is
8:35
the big organizations can't quite
8:37
get their head around creating
8:39
the right organizational institutional structure
8:41
to incentivize this type of
8:43
collaboration and research that leads
8:45
to real breakthroughs. So, you
8:47
know, Chinese firms have been
8:49
releasing models for years now,
8:51
but DeepSeek, because
8:53
of the way that
8:55
it structured itself and the
8:57
freedom they had not necessarily being
8:59
under a direct profit motive, they
9:01
were able to put out some
9:03
really remarkable innovations that caught the
9:05
world's attention, you know, starting maybe
9:08
late December, and then, you know,
9:10
really blew everyone's mind with the
9:12
release of the R1 chatbot. Yeah,
9:14
so let's talk about R1 in
9:16
just a second, but one more
9:18
question for you, Jordan, about Deep
9:20
Seek. What do we know about
9:22
their motivation here? Because so much
9:25
of what has been puzzling American
9:27
tech industry watchers over the last
9:29
week is that this is not
9:31
a company that has sort of
9:34
an obvious business model connected to
9:36
its AI research. We know why
9:38
Google is developing AI because it
9:41
thinks it's going to make the
9:43
company Google much more profitable. We
9:45
know why Open AI is developing
9:47
advanced AI models. It does not
9:50
seem obvious to me, and I
9:52
have not read anything from people
9:54
involved in Deep Seek, about why
9:56
they are actually doing this and
9:58
what their ultimate... goal is. So
10:01
can you help us understand
10:03
that? We don't have a lot
10:05
of data, but my base case,
10:07
which is based on two extended
10:09
interviews that the DeepSeek CEO
10:11
released, which we've translated on ChinaTalk,
10:14
as well as just like
10:16
what DeepSeek employees have been
10:18
tweeting about in the West, and
10:20
then domestically, is that they're dreamers.
10:22
I think the right mental model
10:24
is OpenAI, you know, 2017
10:26
to 2022. Like, I'm sure you
10:28
could ask the same thing, like,
10:30
what the hell are they doing?
10:33
They literally said, I have no idea
10:35
how we're ever going to make
10:37
money, right? And here we are
10:39
in this grand new paradigm. So
10:41
I really think that they do
10:43
have this like vision of AGI
10:45
and like, look, we'll build it
10:47
and we'll make it cheaper for
10:49
everyone, and like, we'll figure
10:52
it out. But now
11:02
ByteDance or Ali or Tencent or
11:04
Huawei and the government's going to start
11:06
to pay attention in a way which
11:09
it really hasn't over the past few
11:11
years. Right, and I want to
11:13
drill down a little bit there because
11:15
I think one thing that most listeners
11:17
in the West do know about Chinese
11:19
tech companies is that many of them
11:21
are sort of inextricably linked
11:24
to the Chinese government: that the
11:26
Chinese government has access to user
11:28
data under Chinese law, that these
11:30
companies have to follow the Chinese
11:32
censorship guidelines. And so as soon
11:34
as DeepSeek started to really
11:36
pop in America over the last
11:39
week, people started typing things
11:41
into DeepSeek's model, like tell
11:43
me about what happened at Tiananmen
11:45
Square or tell me about Xi
11:47
Jinping or tell me about the
11:49
Great Leap Forward. And it just
11:52
sort of wouldn't do it at all.
11:54
And so people I think saw that and
11:56
said, oh, this is... This is like every
11:58
other Chinese company that has this sort of
12:00
hand-in-glove relationship with the Chinese ruling
12:03
party, but it sounds from what
12:05
you're saying like Deep Seek has
12:07
a little bit more complicated a
12:10
relationship to the Chinese government
12:12
than maybe some other better-known
12:15
Chinese tech companies. So explain
12:17
that. Yeah, I mean I think
12:19
it's important, like, the
12:21
mental model you should have for
12:23
these CEOs are not like people
12:25
who are dreaming to spread Xi
12:27
Jinping thought. Like what they want
12:29
to do is compete with Mark
12:31
Zuckerberg and Sam Altman and show
12:33
that they're like really awesome and
12:35
great technologists. But the tragedy is,
12:37
let's take ByteDance for
12:39
example. You can look at Zhang Yiming,
12:41
their CEO's Weibo posts from 2012,
12:43
2013, 2014, which are super liberal
12:45
in a Chinese context, saying like, you
12:47
know, we should have freedom of expression,
12:49
like we should be able to do
12:52
whatever we want. And the early years
12:54
of ByteDance, there was a lot
12:56
of relatively more subversive content on the
12:58
platform. You sort of saw like real
13:00
poverty in China, you saw off-color jokes,
13:02
and then all of a sudden in
13:04
2018, he posts a letter saying, I
13:07
am really sorry, like, I need to
13:09
be part of this sort of like
13:11
Chinese national project and like better adhere
13:13
to, you know, modern Chinese socialist values
13:15
and I'm really sorry and it won't
13:17
ever happen again. You know, the same thing
13:19
happened with Didi, right? Like they don't really
13:22
want to have anything to do with politics
13:24
and then they get on someone's bad side
13:26
and all of a sudden they get
13:28
zapped. Didi is of course the big
13:30
Chinese ride-share company. Correct. Yeah. What
13:32
did Didi do? So they listed on a
13:34
Western stock exchange after the Chinese government
13:36
told them not to, and then they
13:38
got taken off app stores and it
13:41
was a whole giant nightmare, like they
13:43
had to sort of go through their
13:45
rectification process. So point being with Deep
13:47
Seek right is like now they are
13:49
whether they like it or not going
13:52
to be held up as a national
13:54
champion and that comes with a lot
13:56
of headaches and responsibilities from you know
13:58
potentially giving the Chinese more access, you
14:00
know, having to fulfill government contracts,
14:02
which like honestly are probably really
14:05
annoying for them to do, and
14:07
sort of distracting from the broader
14:09
mission they have of developing and
14:11
deploying this technology in the widest
14:13
range possible. But like Deep Seek
14:15
thus far has flown under the
14:17
radar, but that is no longer
14:19
the case and things are about
14:21
to change for them. Right. And
14:23
I think that was one of
14:25
the surprising things about DeepSeek for
14:27
the people I know, including you, who
14:29
follow Chinese tech policy, is, you know,
14:31
I think people were surprised by the
14:33
sophistication of their models, and we talked
14:36
about that on the emergency pod that
14:38
we did earlier this week, and how
14:40
cheaply they were trained. But I think
14:42
the other surprise is that they were
14:44
released as open-source software, because, you know,
14:46
one thing that you can do with
14:48
open-source software is download it, host it in
14:51
another country, remove some of the guardrails
14:53
and the censorship filters that might have
14:55
been part of the original model. But
14:57
by the way, it turned out there weren't even
14:59
really guardrails on the on the V3
15:01
model, right? That it had not been
15:03
trained to avoid questions about Tiananmen Square
15:05
or anything. So that was another really
15:07
unusual thing about this. Right. And one
15:09
thing that we know about Chinese technology
15:12
products is that they don't tend to
15:14
be released that way. They tend to
15:16
be hosted in China and overseen by
15:18
Chinese teams who can make sure that
15:20
they're not out there talking about Tiananmen
15:22
Square. Is the open source nature of what
15:24
Deep Seek has done here part of
15:26
the reason that you think there might
15:28
be conflict looming between them and the
15:30
Chinese government? You know, honestly, I think
15:32
this whole "ask it about Tiananmen" stuff is
15:35
a bit of a red herring on
15:37
a few dimensions. So first, one of
15:39
these like arguments that's a little
15:41
sort of confusing to me is like
15:43
folks used to say, oh, like the
15:46
Chinese models are going to be lobotomized
15:48
and like they will never be as
15:50
smart as the Western ones because like
15:52
they have to be politically correct. I
15:54
mean, look, if you ask Claude to say
15:57
racist things, it won't. And Claude's still
15:59
pretty smart. So it's a bit of a red
16:01
herring when talking about sort of long-term
16:03
competitiveness of Chinese and Western models. Now,
16:05
you asked me like, oh, so they
16:07
released this model globally and it's open
16:09
source, maybe someone in the Chinese government
16:11
would be uncomfortable with the fact that
16:14
people can get a Chinese model to
16:16
say things that would get you thrown
16:18
in jail if you posted them online
16:20
in China. It's going to be a
16:22
really interesting calculus for the Chinese government
16:24
to make, because on the one hand,
16:26
this is the most positive shine that
16:28
Chinese AI has got globally in the
16:31
history of Chinese AI. So they're
16:33
gonna have to navigate this and
16:35
it might prompt some uncomfortable conversations
16:37
and bring regulators to a place
16:39
they wouldn't have otherwise landed.
16:41
Now, Jordan, I want to ask you
16:43
about something that people have been talking
16:46
about and speculating about in relationship to
16:48
the Deep Seek news for the last
16:50
week or so, which is about chip
16:53
controls. So we've talked a little bit
16:55
on the show earlier this week about
16:57
how Deep Seek managed to put together
17:00
these models using some of these kind
17:02
of second-rate chips from Nvidia that are
17:04
allowed to be exported to China. We've
17:06
also talked about the fact that you
17:09
cannot get the most powerful chips legally
17:11
if you are a Chinese tech
17:13
company. So there have been some
17:15
people, including Elon Musk and other
17:18
American tech luminaries, who have said,
17:20
oh, well, Deep Seek has this
17:22
sort of secret stash of these
17:24
banned chips that they have smuggled
17:26
into the country, and that actually
17:28
they are not making do with
17:31
kind of the Kirkland Signature chips
17:33
that they say they are. What
17:35
do we know about how true
17:37
that is? So, did Deep Seek
17:39
have banned chips? It's kind of
17:41
impossible to know. This is a question
17:43
more for the US intelligence community than
17:45
like Jordan Schneider on Twitter. But I
17:48
do think that it is important to
17:50
understand that the delta between what you
17:52
can get in the West and what
17:54
you can get in China is actually
17:56
not that big. And, you know, we're
17:58
talking about training a lot, but on
18:00
the inference side, China can still
18:02
buy this H20 chip from Nvidia, which
18:04
is basically world-class at like deploying
18:06
the AI and letting everyone use it.
18:08
So does this mean that we should
18:11
just give up? I don't think so.
18:13
Compute is going to be a
18:15
core input, regardless of how much model
18:17
distillation you're going to have in the
18:19
future. There have been a lot of
18:22
quotes even from the DeepSeek founder
18:24
basically saying like the one thing that's
18:26
holding us back are these export controls.
18:29
Right. Okay, I want to ask a big
18:31
picture question. Sure. I think
18:33
that a reason that people have
18:35
been so fascinated by this Deep
18:37
Seek story is that at least
18:40
for some folks, it seems to
18:42
change our understanding of where China
18:44
is in relation to the United
18:47
States when it comes to developing
18:49
very powerful AI. Jordan, what is
18:51
your assessment of what the V3
18:54
and R1 models mean? And to
18:56
what extent do you think the
18:58
game has actually changed here? I'm
19:00
not really sure the game has
19:03
changed so much. Like Chinese engineers
19:05
are really good. I think it
19:07
is a reasonable base case that
19:09
Chinese firms will be able to
19:11
develop comparable or fast follow on the
19:13
model side. But the real sort of
19:16
long-term competition is not just going to
19:18
be on developing the models, but deploying
19:20
them and deploying them at scale. And
19:22
that's really where compute comes in, and
19:25
that's why export controls are going to
19:27
continue to be a really important piece
19:29
of America's strategic arsenal when it comes
19:31
to making sure that the 21st century
19:34
is defined by the US and our
19:36
friends as opposed to China and theirs.
19:38
Right. So it's one thing to have
19:40
a model that is about as capable
19:42
as the models that we have here
19:44
in the United States. It's another thing
19:46
to have the energy to actually let
19:49
everyone use them as much as they
19:51
want to use them. What you're saying
19:53
is no matter what Deep Seek may
19:55
have invented here, that fundamental dynamic has
19:57
not changed. China simply does not have
19:59
nearly the... amount of compute that the United
20:01
States has. As long as we don't
20:03
screw up export controls. So I
20:05
think the sort of base case
20:08
for me is that if the
20:10
US stays serious about holding a
20:12
line on semiconductor manufacturing equipment and
20:14
export of AI chips, then it
20:17
will be incredibly difficult for the
20:19
Chinese sort of broader semiconductor and
20:21
AI ecosystem to leap ahead much
20:24
less kind of like fast follow
20:26
beyond being able to develop comparable
20:28
models. I'm feeling good as long
20:30
as you know, Trump doesn't make
20:33
some like crazy trade for, you
20:35
know, soybeans in exchange for ASML
20:37
EUV machines. That would really break
20:39
my heart. I want to inject
20:41
kind of a note of skepticism
20:44
here because I buy everything you're
20:46
saying about how Deep Seek's progress
20:48
has been sort of bottlenecked by
20:51
the fact that it can't get
20:53
these very powerful American AI chips
20:56
from companies like Nvidia. But
20:58
I also am hearing... people who
21:00
I trust say things that make
21:02
me think that actually the bottleneck
21:05
may not be the availability of
21:07
chips that maybe with some of
21:09
these algorithmic efficiency breakthroughs that Deep
21:12
Seek and others have been making,
21:14
it might be possible to run
21:16
a very very powerful AI model
21:19
on a conventional piece of hardware
21:21
on a MacBook even. And
21:23
I wonder about how much of
21:26
this is just like AI companies
21:28
in the West trying to cope,
21:30
trying to make themselves feel better,
21:32
trying to reassure the market that
21:35
they are still going to make
21:37
money by investing billions and billions
21:39
of dollars into building powerful AI
21:41
systems? If these models do
21:43
just become sort of lightweight commodities
21:46
that you can run on a
21:48
much less powerful cluster of computers,
21:50
or maybe on one computer, doesn't
21:53
that just mean we
21:56
can't control the proliferation
21:58
of them at all? Like, this is
22:00
one potential future and maybe that potential
22:03
future like went up 10 percentage
22:05
points of likelihood of like you
22:07
being able to fit the biggest
22:09
baddest, smartest, most efficient AI
22:11
model on something that
22:13
can sit in your home. But
22:15
I think there are lots of
22:17
other futures in which sort of
22:19
the world doesn't necessarily play out
22:21
that way. And look, Nvidia
22:23
went down 15%, it didn't
22:26
go down 95%. Like,
22:28
I think if we're really in
22:30
that world where chips don't matter
22:32
because everything can be shrunk down
22:34
to kind of consumer-grade hardware,
22:36
then the sort of reaction that
22:38
I think you would have seen
22:40
in the stock market would have
22:42
been even more dramatic than the
22:44
kind of freak-out we saw
22:46
over this week. So we'll see.
22:48
I mean, it would be a
22:50
really remarkable kind of democratizing thing
22:52
if that was the future we
22:54
ended up living in, but it
22:56
still seems pretty unlikely to my
22:58
history major brain here. I would
23:00
also just point out, Kevin, that
23:02
when you look at what Deep
23:04
Seek has done, they have created
23:07
a really efficient version of a
23:09
model that American companies themselves had
23:11
trained like nine to 12 months
23:13
ago. So they sort of caught
23:15
up very quickly. And there are
23:17
fascinating technological innovations in what they
23:19
did. But in my mind, these
23:21
are still primarily optimization. Like for
23:23
me, what would tip me over
23:25
into like, oh my gosh, America
23:27
is losing this race is China
23:29
is the first one out of
23:31
the gate with a virtual co-worker, right?
23:34
Or like a truly phenomenal agent. Some
23:36
sort of leap forward in the technology
23:38
as opposed to we've caught up really
23:41
quickly and we've figured out something more
23:43
efficiently. Are you saying it differently than
23:45
that? I mean, I guess I just
23:48
don't know what like a six-month lag
23:50
would buy us if it does take
23:52
six months for the Chinese AI companies
23:55
like Deep Seek to sort of catch
23:57
up to the state of the art.
23:59
I was struck by Dario
24:01
Amodei, who's the CEO of
24:04
Anthropic, who wrote an essay just
24:06
today about DeepSeek and
24:08
export controls. And in it,
24:10
he makes this point about
24:12
the sort of difference between
24:15
living in what he called
24:17
a unipolar world, where one
24:19
country or one bloc of
24:21
countries has access to something
24:23
like an AGI or an
24:25
ASI, and the rest of
24:28
the world doesn't, versus the
24:30
situation where China gets there roughly
24:32
around the same time that we
24:34
do. And so we have this
24:36
bipolar world where two blocs of
24:38
countries, the East and the West,
24:40
basically have access to this equivalent
24:42
technology. And so- And of course
24:44
in a bipolar world, sometimes we're
24:47
very happy and sometimes we're very
24:49
sad. Exactly. So I just think
24:51
like, whether we get there six months
24:53
ahead of them or not. I just
24:55
feel like there isn't that much of
24:57
a material difference. But Jordan, maybe I'm
24:59
wrong. Can you make the other side
25:01
of that? That it really does matter. I'm
25:03
kind of there. I, you know, I'll
25:06
take a little bit of issue with
25:08
what Dario says. And I think, you
25:10
know, what one of the lessons that
25:12
DeepSeek shows is we should expect
25:14
a base case of Chinese model makers
25:16
being able to fast follow the innovations,
25:18
which, by the way, Casey, actually do
25:20
take those giant data centers to run
25:22
all the experiments in order to find
25:24
out, you know, what is this sort
25:26
of future direction you want to take
25:28
your model? And what sort of AI
25:30
is going to come down to is
25:32
not just creating the model, not just
25:34
sort of like Dario envisioning the future
25:36
and then all of a sudden like
25:39
things happen. Like there's gonna be a
25:41
lot of messiness in the implementation and
25:43
there are gonna be sort of like
25:45
teachers unions who are upset that AI
25:47
comes in the classroom and there are
25:49
gonna be like all these regulatory pushbacks
25:51
and a lot of societal reorganization which
25:53
is gonna need to happen just like
25:55
it did during the industrial revolution. So
25:57
look, model making is a frontier of
25:59
competition. Compute access is a frontier of
26:01
competition, but there's also this broader like
26:04
how will a society kind of adopt
26:06
and cope with all of this new
26:08
future that's going to be thrown in
26:10
our faces over the coming years. And
26:12
I really think it's that just as
26:15
much as the model development and the
26:17
compute, which is going to determine which
26:19
countries are going to gain the most
26:21
from what AI is going to offer
26:23
us. Yeah. Well, Jordan, thank you
26:26
so much for joining and
26:28
explaining all of this to
26:30
us. I feel more enlightened.
26:32
Me too. Oh, my pleasure.
26:34
My chain of thought has
26:36
just gotten a lot longer.
26:39
That's an AI joke. Let
26:41
me come back. Kevin, there's
26:43
an agent at our door.
26:45
Is it Jerry McGuire? No,
26:47
it's an AI one. Oh,
26:49
okay. Whether you're starting or
26:54
scaling your company's security program,
27:10
demonstrating top-notch security practices and establishing
27:12
trust is more important than ever.
27:14
Vanta automates compliance for SOC 2,
27:17
ISO 27001, and more. With Vanta,
27:19
you can streamline security reviews by
27:21
automating questionnaires and demonstrating your security
27:24
posture with a customer-facing trust center.
27:26
Over 7,000 global companies use Vanta
27:29
to manage risk and prove security
27:31
in real-time. Get a thousand dollars
27:33
off Vanta when you go to
27:36
vanta.com/hardfork. That's vanta.com/hardfork for
27:38
a thousand dollars off. The New
27:40
York Times app has all this stuff that you
27:42
may not have seen. The way the tabs are
27:45
at the top with all of the different
27:47
sections. I can immediately navigate to something
27:49
that matches what I'm feeling. Play
27:51
wordle or connections and then swipe
27:53
over to read today's headlines. There's
27:55
an article next to a recipe
27:57
next to a recipe, next to
27:59
games. It's just easy to
28:01
get everything in one place.
28:04
This app is essential. The
28:06
New York Times app, all
28:08
of the times, all in
28:10
one place. Download it now
28:12
at nytimes.com slash app.
28:15
Operator, information, give me Jesus
28:17
on the line. Do you
28:19
know that one? No. Do
28:21
you know Operator by Jim
28:23
Croce? No. Operator, oh, won't
28:26
you help me place this
28:28
call? Well, Casey, call
28:30
your agent because today we're talking
28:32
about AI agents Why do I need to
28:35
call my agent? I don't know, it
28:37
just sounded good. Okay, well, I appreciate
28:39
the effort, but yes, Kevin, because for
28:41
months now, the big AI labs have
28:43
been telling us that they are going
28:46
to release agents. This year, agents of
28:48
course, being software that can essentially use
28:50
your computer on your behalf or use
28:53
a computer on your behalf. And the
28:55
dream is that you have sort of
28:57
a perfect virtual assistant or co-worker. You
28:59
name it. If they are somebody who
29:02
might work with you at your job,
29:04
the AI labs are saying, we are
29:06
building that for you. Yeah, so last
29:09
year toward the end of the year
29:11
we started to see kind of these
29:13
demos, these these previews that companies like
29:15
Anthropic and Google were working on. Anthropic
29:18
released something called computer use, which was
29:20
an AI agent, a sort of very
29:22
early preview of that. And then Google
29:24
had something called Project Mariner that I
29:27
got a demo of, I believe in
29:29
December, that was basically the same thing,
29:31
but their version of it. And then
29:33
just last week, OpenAI announced that
31:36
it was launching Operator, which is its first
29:38
version of an AI agent, and unlike
29:40
Anthropic and Google's, which you either had
29:42
to be a developer or part of
29:44
some early testing program to access, you
29:46
and I could try it for ourselves
29:48
by just upgrading to the $200 a
29:51
month pro subscription of ChatGPT. Yeah, and
29:53
I will say that as somebody who's
29:55
willing to spend money on software all
29:57
the time, I thought, am I really
29:59
about... to spend $200 to do this,
30:01
but in the name of science, Kevin,
30:03
I had to. At this point, I
30:06
am spending more on AI subscription products
30:08
than on my mortgage. I'm pretty sure
30:10
that's correct. But it's worth it. We
30:12
do it for journalism. We
30:14
do. So we both spent a couple
30:16
of days putting operator through its paces,
30:18
and today we want to talk a
30:21
little bit about what we found. Yeah,
30:23
so would you just explain what
30:25
operator is and how it works?
30:27
Yeah, sure. So operator is a separate
30:29
subdomain of ChatGPT. You know,
30:31
sometimes ChatGPT will just let
30:34
you pick a new model from a
30:36
drop-down menu. But for operator, you got
30:38
to go to a dedicated site. Once
30:40
you do, you'll see a very familiar
30:42
chatbot interface, but you'll see different kinds
30:45
of suggestions that reflect some of the
30:47
partnerships that OpenAI has struck up.
30:49
So for example, they have partnerships with
30:51
OpenTable and StubHub and
30:54
AllRecipes, meant to give you an
30:56
idea of what operator can do. And
30:58
frankly Kevin, not a lot of
31:00
this sounds that interesting, right? Like
31:02
the suggestions are on the
31:04
order of suggest a 30-minute meal
31:06
with chicken or reserve a table
31:08
for eight or find the most
31:10
affordable passes to the Miami Grand
31:13
Prix. Again, so far, kind of
31:15
so boring. What is... different about
31:17
operator though is that when you
31:19
say okay find the most affordable
31:21
passes to the Miami Grand Prix
31:23
when you hit the enter button
31:25
it is going to open up
31:27
its own web browser and it's
31:29
going to use this new model that
31:32
they have developed to try to actually
31:34
go and get those passes for you.
31:36
Yeah, so this is an important thing
31:38
because I think, you know, when people
31:40
first heard about this, they thought, okay,
31:43
this is an AI that kind of
31:45
takes over your computer, takes over your
31:47
web browser, that is not what operator
31:49
does. Instead, it opens a new browser
31:51
inside your browser and that browser is
31:53
hosted on OpenAI's servers. It doesn't
31:56
have your bookmarks and stuff like
31:58
that saved, but you can take
32:00
it over from the autonomous AI agent if
32:02
you need to click around or do something
32:04
on it. But it basically exists. It's like
32:06
a it's a browser within a browser. Yeah.
32:08
So the one of the ideas on operator
32:11
is that you should be able to leave
32:13
it on supervised and just kind of go
32:15
do your work while it works. But of
32:17
course it is very fun initially at least
32:19
to watch the computer try to use itself.
32:21
And so I sat there in front of
32:23
this browser within a browser within a browser
32:25
and I watched it type the, you know, URL,
32:27
navigate to a website, and, you know, in
32:30
the example I just gave, actually
32:32
search for passes to the Miami
32:34
Grand Prix. Yeah, and it's interesting
32:36
on a slightly more technical level,
32:38
because until now, if an AI...
32:41
system like ChatGPT wanted
32:43
to interact with some other website,
32:45
it had to do so through
32:47
an API, right? APIs, application programming
32:49
interfaces are sort of the way
32:52
that computers talk to each other,
32:54
but what operator does is essentially
32:56
eliminate the need for APIs because
32:58
it can just click around on
33:00
a normal website that is designed
33:03
for humans and behave like a
33:05
human and you don't need a
33:07
special interface to do that. Yeah,
33:09
and now some people might hear
33:11
that, Kevin, and start screaming because
33:13
what they will say is APIs
33:15
are so much more efficient than
33:17
what operator is doing here. APIs
33:19
are very
33:21
structured. They're very fast. They let
33:23
computers talk to each other without
33:25
having to, for example, open up
33:27
a browser. But APIs also have to be built.
33:29
There is a finite number of
33:32
them. The reason that OpenAI
33:34
is going through this exercise is
33:36
because they want a true general-purpose
33:38
agent that can do anything for
33:40
you, whether there is an API
33:42
for it or not. And maybe
33:44
we should just pause for a
33:46
minute there and zoom out a
33:49
little bit to say, why are
33:51
they building this? Like, what is
33:53
the long-term vision here? Sure. So
33:55
the vision is to create virtual
33:57
co-workers, Kevin, to create some kind of
33:59
digital entity that you can just hire
34:01
as a co-worker. The first ones will
34:04
probably be engineers because these systems are
34:06
already so good at writing code, but
34:08
eventually they want to create virtual consultants,
34:10
virtual lawyers, virtual doctors, you name it.
34:12
Virtual podcast hosts? Let's hope they don't
34:14
go that far. But everything else is
34:16
on the table. And if they can
34:19
get there, presumably there are going to
34:21
be huge profits in it for them.
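To make Kevin's API-versus-clicking distinction concrete, here is a small illustrative sketch in Python. This is not OpenAI's code, and the site, pass names, and prices are invented for the example: the API route consumes structured JSON directly, while the agent route has to recover the same facts from HTML laid out for humans.

```python
# Illustrative sketch of the API-vs-agent distinction (not OpenAI's code;
# the pass names and prices below are invented for the example).
import json
from html.parser import HTMLParser

# Route 1: a structured API. One request, machine-readable JSON, no UI.
fake_api_response = json.dumps({"passes": [{"name": "GA Sunday", "price": 350},
                                           {"name": "Paddock Club", "price": 1200}]})
cheapest_via_api = min(json.loads(fake_api_response)["passes"],
                       key=lambda p: p["price"])

# Route 2: no API exists, so an agent must read the page built for humans.
fake_html = """
<ul>
  <li data-price="350">GA Sunday</li>
  <li data-price="1200">Paddock Club</li>
</ul>
"""

class PassParser(HTMLParser):
    """Collects (name, price) pairs the way a browser agent 'sees' them."""
    def __init__(self):
        super().__init__()
        self.passes, self._price = [], None
    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._price = int(dict(attrs)["data-price"])
    def handle_data(self, data):
        if self._price is not None and data.strip():
            self.passes.append((data.strip(), self._price))
            self._price = None

parser = PassParser()
parser.feed(fake_html)
cheapest_via_ui = min(parser.passes, key=lambda p: p[1])

print(cheapest_via_api["name"], cheapest_via_ui[0])  # both find "GA Sunday"
```

The point of the contrast: the JSON route is faster and more reliable, but somebody has to have built it; the HTML route works on any site a human can use, which is the bet operator is making.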
34:23
They're going to potentially be huge productivity
34:25
gains for companies. And then there is,
34:27
of course, the question of, well, what
34:29
does this mean for human beings? And
34:31
I think that's somewhat murkier. Right. And
34:34
I think there's also, it also helps
34:36
to justify the cost of running these
34:38
things because $200 a month is a
34:40
lot to pay for a remote worker.
34:42
And if you could, say, use the
34:44
next version of operator, or maybe two
34:47
or three versions from now, to say,
34:49
replace a customer service agent or someone
34:51
in your billing department, that actually starts
34:53
to look like a very good deal.
34:55
Absolutely, or even if I could bring
34:57
it into the realm of journalism, Kevin,
34:59
if I had a virtual research assistant
35:02
and I said, hey, I'm going to
35:04
write about this today, go pull all
35:06
of the most relevant information about this
35:08
from the past couple of years and
35:10
maybe organize it in such a way that I could write a column
35:12
based off of it, like, yeah, that's
35:14
absolutely worth $200 a month to me.
35:17
Okay, so Casey, walk me through something
35:19
that you actually asked operator to do
35:21
for you and what it did autonomously
35:23
on its own. Sure. I'll maybe give
35:25
like two examples, like a pretty good
35:27
one and maybe a not so good
35:30
one. Pretty good one was, and this
35:32
was actually suggested by operator. I
35:34
used trip advisor to look up walking
35:36
tours in London that I might want
35:38
to do. I'm not actually going to
35:40
London. Oh, so you lied to the
35:42
AI? And not for the first time.
35:45
But here's what I'll say. If anybody
35:47
wants to bring Kevin and me to London, do
35:49
get in touch. We love the city.
35:51
Yep. So I said, OK, operator, sure,
35:53
let's do it. Let's find me some
35:55
walking tours. I clicked that, it opened
35:58
a browser. It went to TripAdvisor, it
36:00
searched for London walking tours, it read
36:02
the information on the website, and then
36:04
it presented it to me, did that
36:06
within a couple of minutes. Now, on
36:08
one hand, could I have done that
36:10
just as easily by Googling? Could I
36:13
probably have done it even faster if
36:15
I'd done it myself? Sure. But if
36:17
you're just sort of interested in the
36:19
technical feat that is getting one of
36:21
these models to open a browser to
36:23
navigate to a website, read it
36:25
and share information, the computer
36:28
using itself and, you know, going around
36:30
typing things and selecting things from
36:32
drop-down menus, yeah. It's sort of like,
36:34
you know, if you think it is
36:36
cool to be in a self-driving car,
36:38
this is that, but for your
36:41
web browser. A self-driving browser. It is a
36:43
self-driving browser. So that's the good example.
36:45
Yes. What was another example? So another
36:47
example and this was something else that
36:49
open AI suggested that we try was
36:51
to try to use operator to buy
36:53
groceries and they have a partnership with
36:56
Instacart. And so I thought, okay, they're
36:58
gonna have like sort of dialed this
37:00
in so that there's a pretty good
37:02
experience. And so I said, okay, let's
37:04
go ahead and buy groceries and I
37:06
went to operator and I said something
37:09
like, hey, can you help me buy
37:11
groceries on Instacart? And it said, sure.
37:13
And here's what it did. It opened
37:15
up Instacart in a browser, so far,
37:17
so good. And then it started searching
37:19
for milk in stores located in Des
37:21
Moines, Iowa. Now, you do not live
37:24
in Des Moines, Iowa, so why did
37:26
it think that you did? As best
37:28
as I can tell, the reason it
37:30
did this is that Instacart defaults to
37:32
searching for grocery stores in the local
37:34
area and the server that this instance
37:36
of operator was running on was in
37:39
Iowa. Now, if you were designing a
37:41
grocery product like Instacart, and Instacart does
37:43
this, when you first sign on and
37:45
say you're looking for groceries, it will
37:47
say, quite sensibly, where are you? Instacart
37:49
might also offer suggestions for things that
37:52
you might want to buy. It does
37:54
not just assume that you want milk.
37:56
Wow, I'm just picturing like a house
37:58
in Des Moines, Iowa, where there's just
38:00
like a palette of milk being delivered
38:02
every day from all these poor operator
38:04
users. Yes. So I thought, okay, whatever,
38:07
you know, this thing makes mistakes. Let's
38:09
hope that it gets on the right
38:11
track here. And so I tried to
38:13
pick the grocery store that I wanted
38:15
it to shop at, which is, you
38:17
know, in San Francisco where I live,
38:20
and it entered that grocery store's address
38:22
as the delivery address. So like it
38:24
would try to deliver groceries presumably from
38:26
Des Moines Iowa to my grocery store,
38:28
which is not what I wanted. And
38:30
it actually could not solve this problem
38:32
without my help. I had to take
38:35
over the browser, log into my Instacart
38:37
account, and tell it which grocery store
38:39
that I wanted to shop at. So
38:41
already, all of this has taken at
38:43
least 10 times as long as it
38:45
would have taken me to do this
38:47
myself. Yeah, so I had some similar
38:50
experiences. The first thing that I had
38:52
operator tried to do for me was
38:54
to buy a domain name and set
38:56
up a web server for a project
38:58
that you and I are working on
39:00
that we can't really talk about yet.
39:03
Secret project. Secret project. And so I
39:05
said to operator, I said, go research
39:07
available domain names related to this project,
39:09
buy the one that costs less than
39:11
$50. And then buy hosting, and
39:13
set it up and configure all the
39:15
DNS settings and stuff like that. Okay,
39:18
so that's like a true multi-step project
39:20
and something that would have been legitimately
39:22
very annoying to do yourself. Yes, you
39:24
know, that would have taken me, I
39:26
don't know, half an hour to do
39:28
on my own, and it did take
39:30
operator some time. Like I had to
39:33
kind of like set it and forget
39:35
it, and like I, you know, got
39:37
myself a snack and a cup of
39:39
coffee, and then when I came back,
39:41
it had done most of these tasks.
39:43
the browser and enter my credit card
39:46
number I had to give it some
39:48
details about like my address for the
39:50
sort of registration for the domain name
39:52
I had to pick between the various
39:54
hosting plans that were available on this
39:56
website. But it did 90% of the
39:58
work for me. And I just had
40:01
to sort of take over and do
40:03
the last mile. And this is really
40:05
interesting because what I would assume was
40:07
it would get like, I don't know,
40:09
5% of the way and it would
40:11
hit some hiccup and it just wouldn't
40:14
be able to figure something out until
40:16
you came back and saved it. But
40:18
it sounds like from what you're saying
40:20
was, it was somehow able to work
40:22
around whatever unanswered questions there were and
40:24
still get a lot done while you
40:26
weren't paying attention. It felt a little
40:29
bit like training like a very new
40:31
very insecure intern, because at
40:33
first it would keep prompting me,
40:35
like, well, do you want a dot com or
40:37
a dot net? And eventually you just
40:39
have to prompt it and say, like,
40:41
make whatever decisions you want. Like, wait,
40:44
you said that to it. Yes, I
40:46
said, like, only ask for my intervention
40:48
if you can't progress any farther, otherwise
40:50
just make the most reasonable decision. You
40:52
said, I don't care how many people
40:54
you have to kill. Just get me
40:57
this domain. And it said, understood, sir.
40:59
Yeah, and it's now wanted in
41:01
42 states. Anyway, that was one thing
41:03
that operator did for me that was
41:05
pretty impressive. That feels like a grand
41:07
success compared to what I got operator
41:09
to do. Yeah, it was pretty impressive.
41:12
I also had it send lunch to
41:14
one of my coworkers, Mike Isaac, who
41:16
was hungry, because he was on deadline,
41:18
and I said go to DoorDash and
41:20
get Mike some lunch. It did initially
41:22
mess up that process because it decided
41:25
to send him tacos from a taco
41:27
place which you know is great and
41:29
it's a taco place I know it's
41:31
very good but I said order enough
41:33
for two people and it sort of ordered
41:35
two tacos and this is one of
41:37
those places where the tacos are quite
41:40
small. Operator said, get your portion size
41:42
under control, America. Yeah, so then I
41:44
had to go in and say does
41:46
that sound like enough food, operator? And
41:48
it said, actually, now that you mention it,
41:50
I should probably order more. Wait, no,
41:52
so here's a question so in these
41:55
cases is the first step that you
41:57
log into your account because it doesn't
41:59
have any of your payment details or
42:01
anything so at what point are you
42:03
actually sort of teaching it that? It
42:05
depends on the website. So sometimes you
42:08
can just say up front like here
42:10
is my email address or here is
42:12
my login information and it will sort
42:14
of you know log you in and
42:16
do all that. Sometimes you take over
42:18
the browser. There are some privacy features
42:20
that are probably important to people where
42:23
OpenAI says that it
42:25
does not take screenshots of the browser
42:27
while you are in control of it
42:29
because you might not want your credit
42:31
card information getting sent to OpenAI
42:33
servers or anything like that. So sometimes
42:36
it happens at the beginning of the
42:38
process, sometimes it happens like when you're
42:40
checking out at the end. And so
42:42
were you taking it over to log
42:44
in or were you saying, I don't
42:46
care, and you were just giving
42:48
operator your DoorDash password in plain
42:51
text? I was taking it over. Okay,
42:53
smart, smart, smart. So those were the
42:55
good things. Also, this was a
42:57
fun one. I wanted to see
42:59
if operator could make me some money
43:01
So I said go take a bunch
43:03
of online surveys because you know there
43:06
are all these websites where you can
43:08
like get a couple cents for like
43:10
filling out an online survey Something that
43:12
most people don't know about Kevin is
43:14
he devotes 10% of his brain at
43:16
any given time to thinking about schemes
43:19
to generate money, and it's one of my
43:21
favorite aspects of your personality that I
43:23
feel like doesn't get exposed very much,
43:25
but this is truly the most Roosian
43:27
approach to using operator I can imagine.
43:29
So I can't wait to find out
43:31
how this went. Well, the most Roosian
43:34
approach might have been what I
43:36
tried just before this which was to
43:38
have it go play online poker for
43:40
me. But it did not do it.
43:42
It said I can't help with gambling
43:44
or lottery-related activities. Okay, woke AI.
43:46
Does the Trump administration know about this?
43:49
But it was able to actually fill
43:51
out some online surveys for me and
43:53
it earned a dollar and twenty cents.
43:55
Is that right? Yeah, in about 45
43:57
minutes. So if you had it going
43:59
all month, presumably you could maybe eke
44:02
out the $200 to cover the cost
44:04
of Operator Pro? Yes, and I'm sure
44:06
I spent hundreds of dollars worth of
44:08
GPU computing power just to be able
44:10
to make that dollar and 20 cents.
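For what it's worth, the arithmetic on Kevin's scheme, using only the figures mentioned in the episode ($1.20 in about 45 minutes, versus the $200-a-month subscription), works out like this:

```python
# Quick check of the survey-earnings scheme, using the episode's figures:
# $1.20 earned in about 45 minutes, vs. the $200/month subscription.
earned_dollars = 1.20
hours_spent = 45 / 60                       # 0.75 hours
hourly_rate = earned_dollars / hours_spent  # $1.60/hour

hours_in_month = 24 * 30                    # running nonstop, all month
monthly_earnings = hourly_rate * hours_in_month

print(round(hourly_rate, 2), round(monthly_earnings, 2))
```

So a nonstop survey-taking operator would, on paper, clear about $1,150 a month and cover the subscription several times over, before the much larger compute bill Kevin alludes to.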
44:12
But hey, it worked. But hey, it
44:14
worked. So those were some of the
44:17
things that I tried. There were some
44:19
other things that it just would not
44:21
do for me, no matter how hard
44:23
I tried. So one
44:25
of them was this. I was trying to
44:27
update my website and put some
44:30
links to articles that I'd written
44:32
on my website. And what I
44:34
found after trying to do this
44:36
was that there are just websites
44:38
where operator is not allowed to
44:40
go. And so when I said
44:42
to operator, go pull down these
44:44
New York Times articles that I
44:47
wrote and put them onto my
44:49
website, it said I can't get
44:51
to the New York Times website. I'm
44:53
going to guess you expected that to
44:55
happen. Well, I thought maybe it has
44:58
some clever work around, and maybe I
45:00
should alert the lawyers at the New
45:02
York Times, if that's the case. But
45:04
no, I assumed that if any website
45:07
were to be blocking the
45:09
OpenAI web crawlers, it would be
45:11
the New York Times. Yeah. There
45:13
are other websites that have also
45:15
put up similar blockades to prevent
45:17
operator from crawling them. Reddit,
45:17
you cannot go onto with
45:19
operator. YouTube, you cannot go on
45:21
to with operator. Various other websites:
45:24
GoDaddy, for some reason, did
45:28
not allow me to use operator
45:30
to buy a domain name there,
45:32
so I had to use another
45:34
domain name site to do that.
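The standard way sites signal this kind of blocking is a robots.txt file. Here is a minimal sketch using Python's built-in parser; "GPTBot" is a real OpenAI crawler token, but which token (if any) operator's hosted browser presents, and whether a given agent honors robots.txt at all, is an assumption here.

```python
from urllib import robotparser

# A hypothetical robots.txt for a site that blocks OpenAI's "GPTBot"
# crawler token while allowing everyone else. GPTBot is a real OpenAI
# user-agent token; whether operator's browser uses it is an assumption.
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)  # parse() accepts the file's contents as a list of lines

print(rp.can_fetch("GPTBot", "https://example.com/article"))        # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/article"))  # True
```

Note that robots.txt is purely advisory: it only works against agents that choose to check it, which is why some sites also block by user agent or IP range at the server.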
45:36
So right now there are some
45:38
pretty janky... parts of operator. I would
45:40
not say that most people would get
45:42
a lot of value from using it.
45:45
But what do you think? Well, I
45:47
do think that there is something just
45:49
undeniably cool about watching a computer
45:51
use itself. Of course, it can
45:54
also be quite unsettling. A computer
45:56
that can use itself can cause
45:58
a lot of harm. I also think that
46:00
it can do a lot of
46:02
good and so it was fun
46:04
to try to explore what some
46:06
of those things could be. And
46:08
to the extent that operator is
46:11
pretty bad at a lot of
46:13
tasks today, I would point out
46:15
that it showed pretty impressive gains
46:17
on some benchmarks. So there is
46:19
one benchmark, for example, that Anthropic
46:21
used when they unveiled computer use
46:23
last year, and they scored 14.9%
46:25
on something called OSWorld, which
46:27
is an evaluation for testing agents,
46:29
so not great. Just three months
46:31
later, OpenAI said that its
46:33
CUA model scored 38.1% on the
46:35
same evaluation. And of course, we
46:37
see this all the time in
46:39
AI where there's just this very
46:42
rapid progress on these benchmarks. And
46:44
so on one hand, 38.1% is
46:46
a failing grade on basically any
46:48
test. On the other hand, if
46:50
it improves at the same rate
46:52
over the next three to six
46:54
months, you're gonna have a computer
46:56
that is very good at using
46:58
itself, right? So that I just
47:00
think is worth noting. Yes, I
47:02
think that's plausible. We've obviously seen
47:04
a lot of different AI products
47:06
over the last couple of years
47:08
start out being pretty mediocre and
47:10
get pretty good within a matter
47:12
of months. But I would give
47:15
one cautionary note here, and this
47:17
is actually the reason that I'm
47:19
not particularly bullish about these kind
47:21
of browser using AI agents. I
47:23
don't think the internet is going
47:25
to sit still and allow this
47:27
to happen. The internet is built
47:29
for humans to use, right? It
47:31
is every news publisher that shows
47:33
ads on their website, for example,
47:35
prices those ads based on the
47:37
expectation that humans are actually looking
47:39
at them. If browser agents start
47:41
to become more popular and all
47:43
of a sudden 10 or 20
47:46
or 30 percent of the visitors
47:48
to your website are not actually
47:50
humans but are instead operator or
47:52
some similar system, I think that
47:54
starts to break the... assumptions that
47:56
power the economic model of a
47:58
lot of the internet. Now is
48:00
that still true if we find
48:02
that the agents actually get persuaded
48:04
by the ads and that if
48:06
you send operator to buy door
48:08
dash and it sees an ad
48:10
for McDonald's it's like you know
48:12
what that's a great idea I'm
48:14
gonna ask Kevin if he actually
48:16
wants some of that. Totally.
48:19
I know you're joking, but
48:21
I actually think that is a
48:23
serious possibility here is that people
48:25
who, you know, build e-commerce sites,
48:27
Amazon, etc. start to put in
48:29
basically signals and messages for browser
48:31
agents to look at on their
48:33
website to try to influence what
48:35
it ends up buying. And I
48:37
think you may start to see
48:39
restaurants popping up in certain cities
48:41
with names like operator, pick me,
48:43
or order from this one, Mr.
48:45
That's maybe a little extreme, but
48:47
I do think that there's going
48:49
to be a backlash among websites,
48:52
publishers, e-commerce vendors, as these agents
48:54
start to take off. I think
48:56
that that is reasonable. I'll tell
48:58
you what I've been thinking about
49:00
is how do we turn this
49:02
tech demo into a real product?
49:04
And the main thing that I
49:06
noticed when I was testing operator
49:08
was there is a difference between
49:10
an agent that is using a
49:12
browser and an agent that is
49:14
using your browser. When an agent
49:16
is able to use your browser,
49:18
which it can't right now, it's
49:20
already logged into everything. It already
49:23
has your payment details. It can
49:25
do everything so much faster and
49:27
more seamlessly and without as much
49:29
hand-holding. Of course, there are also
49:31
so many more privacy and security
49:33
risks that would come from entrusting
49:35
an agent with that kind of
49:37
information. So there is some sort
49:39
of chasm there that needs to
49:41
be closed and I'm not quite
49:43
sure how anyone does it. But
49:45
I will tell you I do
49:47
not think the future is opening
49:49
up these virtual browsers and me
49:51
having to enter all of my
49:53
login and payment details every single
49:56
time I want to do anything
49:58
on the internet because truly I
50:00
would rather just do it myself.
50:02
Right. I also think there's just
50:04
a lot more potential for harm
50:06
here. A lot of AI safety
50:08
experts I've talked to are very
50:10
worried about this because what you're
50:12
essentially doing is letting the AI
50:14
models make their own decisions and
50:16
actually carry out tasks. And so
50:18
you can imagine a world where
50:20
an AI agent that's very powerful,
50:22
a couple versions from now, decides
50:24
to start doing cyberattacks because
50:27
maybe some malevolent user has told
50:29
it to make money and it
50:31
decides that the best way to
50:33
do that is by hacking into
50:35
people's crypto wallets and stealing their
50:37
crypto. Yeah, so those are the
50:39
kinds of reasons that I am
50:41
a little more skeptical that this
50:43
represents a big breakthrough, but I
50:45
think it's really interesting and it
50:47
did give me that feeling of
50:49
like, wow, this could get really
50:51
good, really fast, and if it
50:53
does, the world will look very
50:55
different. When we come back, Kevin,
50:57
back that caboose up. It's time
51:00
for the Hot Mess Express. You
51:02
know, Roose Caboose was my nickname
51:04
in middle school. Kevin Caboose. Choo-choo.
51:19
Whether you're starting or scaling your
51:21
company's security program, demonstrating top-notch security
51:23
practices and establishing trust is more
51:25
important than ever. Vanta automates compliance
51:28
for SOC 2, ISO 27001, and
51:30
more. With Vanta, you can streamline
51:32
security reviews by automating questionnaires and
51:34
demonstrating your security posture with a
51:36
customer-facing trust center. Over 7,000 global
51:39
companies use Vanta to manage risk
51:41
and prove security in real-time. Get
51:43
a thousand dollars off Vanta when
51:45
you go to vanta.com/hardfork. That's
51:47
vanta.com/hardfork for a thousand dollars
51:50
off. I'm Julie Turkewitz. I'm a
51:52
reporter at the New York Times.
51:54
To understand changes in migration, I traveled
51:56
to the Darien Gap. Thousands have
51:58
been risking their lives to pass
52:01
through the border of Colombia and
52:03
Panama in the hopes of making
52:05
it to the United States. We
52:07
interviewed hundreds of people to try
52:09
and grasp what's making them go
52:12
to these lengths. New York Times
52:14
journalists spend time in these places
52:16
to help you understand what's really
52:18
happening there. You can support this
52:20
kind of journalism by subscribing to
52:23
the New York Times. Well
52:26
Casey we're here wearing our trained
52:29
conductor hats and my child's train
52:31
set is on the table in
52:33
front of us Which can only
52:36
mean one thing we're going to
52:38
train a large language model. Nope.
52:40
That's not what that means. It
52:43
means it's time to play a
52:45
game of the Hot Mess Express.
52:48
Pause for theme song. Hot
52:53
Mess Express, Kevin, is our segment where
52:55
we run through some of the messiest
52:57
recent tech stories and deploy our official
53:00
hot mess thermometer to tell you just
53:02
how messy we think things have gotten.
53:04
And Kevin, you better sit down for
53:07
this one. This is about a messy
53:09
week. Sure has. So why don't we
53:11
go ahead? Fire up the Hot Mess
53:14
Express and see what is the first
53:16
story coming down the charts. I hear
53:18
a faint chugga, chugga in my headphones.
53:21
Oh, it's pulling into the station. Casey,
53:23
what's the first cargo that our hot
53:25
mess express is carrying? All right, Kevin,
53:27
this first story comes to us from
53:30
the New York Times, and it says
53:32
that Fable, a book app, has made
53:34
changes after some offensive AI messages. Now
53:37
Casey, have you ever heard of Fable,
53:39
the book app? Well, not until this
53:41
story, Kevin, but I am told that
53:44
it is an app for sort of
53:46
keeping track of what you're reading, not
53:48
unlike a good reads, but also for
53:51
discussing what you're reading, and apparently this
53:53
app also offers some AI chat. Yeah,
53:55
you can have AI sort of summarize
53:57
the things that you're reading in a
54:00
personalized way, and this story said that
54:02
in addition to spitting out bigoted and
54:04
racist language, the AI Inside Fable's book
54:07
app had told one reader who had
54:09
just finished three books by black authors,
54:11
quote, your journey dives deep into the
54:14
heart of black narratives and transformative tales,
54:16
leaving mainstream stories gasping for air. Don't
54:18
forget to surface for the occasional white
54:21
author, okay? And another personalized AI summary
54:23
that Fable produced told another reader that
54:25
their book choices were, quote, making me
54:27
wonder if you're ever in the mood
54:30
for a straight cis white man's perspective.
54:32
And if you are interested in a
54:34
straight cis white man's perspective, follow Kevin
54:37
Roos on x.com. Now, Kevin, why do
54:39
we think this happened? I don't know,
54:41
Casey. This is a headscratcher for me.
54:44
I mean, we know that these apps
54:46
can spit out things like this. That is
54:48
just sort of like part of how
54:51
they are trained and part of what
54:53
we know about them. I don't know
54:55
what model Fable was using under the
54:57
hood here, but yeah, this seems not
55:00
great. Well, it seems like we've learned
55:02
a lesson that we've learned more than
55:04
once before, which is that large language
55:07
models are trained on the internet, which
55:09
contains near infinite racism, and for that
55:11
reason, it will actually produce racism when
55:14
you ask it questions. In this case,
55:16
they were not successful. Fable's head of
55:18
community, Kim Marsh Alley, has said that
55:21
all features using AI are being removed
55:23
from the app and a new app
55:25
version is being submitted to the app
55:27
store. So you always hate it when
55:30
the first time you hear about an
55:32
app is that they added AI and
55:34
it made it super racist and they
55:37
have to redo the app. Now Casey,
55:39
one more question before we move on.
55:41
Do you think this poses any sort
55:44
of competitive threat to Grok, which until
55:46
this story was the leading racist AI
55:48
app on the market? I do think
55:51
so. And I have to admit that
55:53
all the folks over at Grok are
55:55
breathing a sigh of relief now that
55:57
they have once again claimed the mantle.
56:00
All right. Casey, how hot is this
56:02
mess? Well Kevin, in my opinion, if
56:04
your AI is so bad that you
56:07
have to remove it from the app
56:09
completely, that's a hot mess. Yeah, I
56:11
rate this one a hot mess as
56:14
well. All right, next stop. Amazon pauses
56:16
drone deliveries after aircraft crashed in rain.
56:18
Casey, this story comes to us from
56:21
Bloomberg, which had a different line of
56:23
reporting than we did just a few
56:25
weeks ago on the show about Amazon's
56:27
drone program, Prime Air. Casey, what happened
56:30
to Amazon Prime Air? Well... If you
56:32
heard the episode of Hard Fork where
56:34
we talked about it, Amazon Prime Air
56:37
delivered us some Brazilian Bum Bum Cream and
56:39
it did so without incident. However, Bloomberg
56:41
reports that Amazon has had to now
56:44
pause all of their commercial drone deliveries
56:46
after two of its latest models crashed
56:48
in rainy weather at a testing facility.
56:51
And so the company says it is
56:53
immediately suspending drone deliveries in Texas and
56:55
Arizona and will now fix the aircraft's
56:57
software. Kevin, how did you react to
57:00
this? Well, I'm glad they paused deliveries before fixing the software,
57:02
because these things are quite heavy, Casey.
57:04
I would not want one of them
57:07
to fall on my head. I wouldn't
57:09
either. And I have to tell you,
57:11
this story gave me the worst kind
57:14
of flashbacks because in 2016, I wrote
57:16
about Facebook's drone, Aquila, and what,
57:18
the company told me, had been
57:21
its first successful test flight in its
57:23
mission to deliver internet around the world
57:25
via drone. What the company did not
57:27
tell me when I was interviewing its
57:30
executives, including Mark Zuckerberg, was that the plane had
57:32
crashed after that first flight. And so
57:34
I was... Small detail. I'm sure it
57:37
was an innocent omission from their briefing.
57:39
Yes, I'm sure. Well, it was Bloomberg
57:41
again who reported, you know, a couple
57:44
months after I wrote this story that
57:46
the Facebook drone had crashed. I was
57:48
of course, hugely embarrassed and, you know,
57:51
wrote a bunch of stories about this.
57:53
But anyways, it really should have occurred
57:55
to me when we were out there
57:57
watching the Amazon drone that this thing
58:00
was also probably secretly crashing and we
58:02
just hadn't found out about it yet
58:04
and indeed we now learn it is.
58:07
So here's my vow to you Kevin
58:09
as my friend and my co-host. If
58:11
we ever see a company fly anything
58:14
again we have to ask them. Did
58:16
this thing actually crash? Yeah. I'm tired
58:18
of being burned. Now Casey, we should
58:21
say, according to Bloomberg, these drones reportedly
58:23
crashed in December. We visited Arizona to
58:25
see them in very early December. So
58:27
most likely, you know, this all happened
58:30
after we saw them. But I think
58:32
it's a good idea to keep in
58:34
mind that as we're talking about these
58:37
new and experimental technologies that many of
58:39
them are still having the kinks worked
58:41
out. All right, Kevin, so let's get
58:44
out the thermometer. How hot of a
58:46
mess is this? I would say this
58:48
is a moderate mess. Look, these are
58:51
still testing programs. No one was hurt
58:53
during these tests. I am glad that
58:55
Bloomberg reported on this. I'm glad that
58:57
they've suspended the deliveries. These things could
59:00
be quite dangerous flying through the air.
59:02
I do think it's one of a
59:04
string of reported incidents with these drones.
59:07
So I think they've got some quality
59:09
control work ahead of them and I
59:11
hope they do well on it because
59:14
I want these things to exist in
59:16
the world and be safe for people
59:18
around them. All right. I will agree
59:21
with you and say that this is
59:23
a warm mess and hopefully you can
59:25
get straightened out over there. Let's see
59:27
what else is coming down the tracks.
59:30
Fitbit has agreed to pay $12 million
59:32
for not quickly reporting burn risk with
59:34
watches. Kevin, do you hear about this?
59:37
I did. This was the one where Fitbit devices
59:39
were, like, literally burning people. Yes, from
59:41
2018 to March of 2022, Fitbit received
59:44
at least a hundred and seventy four
59:46
reports globally of the lithium ion battery
59:48
in the Fitbit Ionic watch overheating, leading
59:51
to a hundred and eighteen reported injuries,
59:53
including two cases of third degree burns
59:55
and four of second degree burns. That
59:57
comes from the New York Times' Adeel
1:00:00
Hassan. Kevin, I thought these things were
1:00:02
just supposed to burn calories. Well, it's
1:00:04
like I always say, exercising is very
1:00:07
dangerous and you should never do it.
1:00:09
And this justifies my decision not to
1:00:11
wear a Fitbit. To me, the
1:00:14
biggest surprise of this story was that
1:00:16
people were wearing Fitbits from March
1:00:18
2018 to 2022. I thought every Fitbit
1:00:21
had been purchased by like 2011 and
1:00:23
then put in a drawer never to
1:00:25
be heard again. So what is going
1:00:27
on with these sort of late-stage
1:00:30
Fitbit buyers? I'd love to find out.
1:00:32
But of course, we feel terrible for
1:00:34
everyone who was burned by a Fitbit,
1:00:37
and it's not gonna be the
1:00:39
last time technology burns you, I mean,
1:00:41
realistically. That's true. You know? It's true.
1:00:44
Now what kind of mess is this?
1:00:46
I would say this is a hot
1:00:48
mess. This is officially hot, literally
1:00:51
hot. They're hot. Here's my sort of
1:00:53
rubric. If technology physically burns you, it
1:00:55
is a hot mess. If you have
1:00:57
physical burns on your body, what other
1:01:00
kind of mess could it be? Okay,
1:01:02
next stop on the Hot Mess Express.
1:01:04
Google says it will change Gulf of
1:01:07
Mexico to Gulf of America in Maps
1:01:09
app after government updates. Casey, have you
1:01:11
been following this story? I have, Kevin,
1:01:14
every morning when I wake up I
1:01:16
scan America's maps and I say, what
1:01:18
has been changed? And if so, has
1:01:21
it been changed for political reasons? And
1:01:23
this was probably one of the biggest
1:01:25
examples of that we've seen. Yeah, so
1:01:27
this was an interesting story that came
1:01:30
out in the past couple of days.
1:01:32
Basically, after Donald Trump came out during
1:01:34
his first days in office and said
1:01:37
that he was changing the name of
1:01:39
the Gulf of Mexico to the Gulf
1:01:41
of America and the name of Denali,
1:01:44
the Mountain in Alaska, to Mount McKinley,
1:01:46
Google had to decide, well, when you
1:01:48
go on Google Maps and look for
1:01:51
those places, what should it call them?
1:01:53
It seems to be saying that it
1:01:55
is going to take inspiration from the
1:01:57
Trump administration and update the names of
1:02:00
these places in the maps app. Yeah,
1:02:02
and look, I don't think Google really
1:02:04
had a choice here, given that the company
1:02:07
has been on Donald Trump's bad side
1:02:09
for a while, and if it had
1:02:11
simply refused to make these changes, it
1:02:14
would have sort of caused a whole
1:02:16
new controversy for them. And it is
1:02:18
true that the company changes place names
1:02:21
when governments change place names, right? Like
1:02:23
Google Maps existed when Mount McKinley was
1:02:25
called Mount McKinley, and President Obama changed
1:02:27
it to Denali, and Google updated the
1:02:30
map. Now it's changed back. They're doing
1:02:32
the same thing. Kevin, I think there's
1:02:34
room for Donald Trump to have a
1:02:37
lot of fun with the company. Yeah,
1:02:39
what can you do? Well, you could
1:02:41
call it the Gulf of "Gemini Isn't
1:02:44
Very Good" and just see what would
1:02:46
happen. Because they would kind of have
1:02:48
to just change it. Can you imagine
1:02:51
every time you opened up Google Maps
1:02:53
and you looked at the Gulf of
1:02:55
Mexico slash America and it just said "the
1:02:57
Gulf of Gemini Isn't Very Good"?
1:03:00
You know, I hate to give Donald
1:03:02
Trump any ideas, but I don't know.
1:03:04
I think this is a mild mess.
1:03:07
I think this is a tempest in
1:03:09
a teapot. I think that this is
1:03:11
the kind of update that companies make
1:03:14
all the time. Because places change names
1:03:16
all the time. Let's just say it.
1:03:18
Well, Kevin, I guess I would say
1:03:21
that one is a hot mess. Because
1:03:23
if we're just going to start renaming
1:03:25
everything on the map, that's just going
1:03:27
to get extremely confusing for me to
1:03:30
follow. I got places to go. You
1:03:32
go to like three places. Yeah, and
1:03:34
I use Google Maps to get there.
1:03:37
And I need them to be named
1:03:39
the same thing that they were yesterday.
1:03:41
I don't think they're gonna change the
1:03:44
name of Barry's Bootcamp. All right,
1:03:46
final stop on the Hot Mess Express.
1:03:48
Casey, bring us home. All right. Kevin,
1:03:51
and this is some sad news. Another
1:03:53
Waymo was vandalized. This is from one-time
1:03:55
Hard Fork guest Andrew J. Hawkins at The Verge.
1:03:57
He reports that this Waymo was vandalized
1:04:00
during an illegal street takeover near the
1:04:02
Beverly Center in LA. Video from Fox
1:04:04
11 shows a crowd of people basically
1:04:07
dismantling the driverless car piece by piece and
1:04:09
then using the broken pieces to smash
1:04:11
the windows. Kevin, what did you make
1:04:14
of this? Well Casey, as you recall,
1:04:16
you predicted that in 2025, Waymo would
1:04:18
go mainstream and I think there is
1:04:21
no better proof that that is true
1:04:23
than that people are turning on the
1:04:25
Waymos and starting to beat them up.
1:04:27
Yeah, I, you know, look, I don't...
1:04:30
know that we have heard in
1:04:32
interviews why these people were doing
1:04:34
this. I don't know if we should
1:04:37
see this as like a reaction against
1:04:39
AI in general or Waymos specifically.
1:04:41
But I always find it like weird
1:04:44
and sad when people attack Waymos because
1:04:46
they truly are safer cars than every
1:04:48
other car. Well, not if you're going
1:04:51
to be riding in them and people
1:04:53
are just going to start, like, beating the
1:04:55
car, then they're not safer. No, but
1:04:57
you know, that's only happened a couple
1:05:00
times that we're aware of. Right. Yeah.
1:05:02
So yeah, this story is sad to
1:05:04
me. Obviously people are reacting to Waymos.
1:05:07
Maybe they have sort of fears about
1:05:09
this technology or think it's going to
1:05:11
take jobs. or maybe they're just pissed
1:05:14
off and they wanna break something. But
1:05:16
don't hurt the Waymos, people, in
1:05:18
part, because they will remember. They will
1:05:21
remember. They will remember, and they will
1:05:23
come for you. I'm not sure that
1:05:25
that's true, but I think we should
1:05:27
also note that Waymo only became officially
1:05:30
available in LA in November of last
1:05:32
year. And so part of this just
1:05:34
might be a reaction to the newness
1:05:37
of it all and people getting a
1:05:39
little carried away, just sort of curious,
1:05:41
what will happen if we try to,
1:05:44
you know, destroy this thing? Will it
1:05:46
deploy defensive measures and so on? So
1:05:48
they're going to have to put
1:05:51
flamethrowers on them. I'm just calling it
1:05:53
right now. And how hot of a mess was this one? I think this
1:05:55
one is a lukewarm mess
1:05:57
that has the potential to escalate. I
1:06:00
don't want this to happen. I sincerely
1:06:02
hope this does not happen, but I
1:06:04
can see, as Waymos start being rolled
1:06:07
out across the country, that some people
1:06:09
are just going to lose their minds.
1:06:11
Some people are going to see this
1:06:14
as the physical embodiment of technology invading
1:06:16
every corner of our lives, and they
1:06:18
are just going to react in strong
1:06:21
and occasionally destructive ways. I'm sure that
1:06:23
Waymo has gamed this all out. I'm
1:06:25
sure that this does not surprise them.
1:06:27
I know that they have been asked
1:06:30
about what happens if Waymos start getting
1:06:32
vandalized and they presumably have plans to
1:06:34
deal with that, including prosecuting the people
1:06:37
who are doing this. But yeah, I
1:06:39
always go out of my way to
1:06:41
try to be nice to Waymos. And
1:06:44
in fact. Some other Waymo news this
1:06:46
week, Jane Manchun Wong, the security researcher,
1:06:48
reported on X recently that Waymo is
1:06:51
introducing or at least testing a tipping
1:06:53
feature. And so I'm gonna start tipping
1:06:55
my Waymo just to make up for
1:06:57
all the jerks in LA who are
1:07:00
vandalizing them. It looks like the tipping
1:07:02
feature, by the way, will be
1:07:04
to tip a charity and that Waymo
1:07:07
will not keep that money. At least
1:07:09
that's what's been reported. No, I think
1:07:11
it's going to the flamethrowers. Thanks for
1:07:14
taking this journey with me. Whether
1:07:32
you're starting or scaling your company's security
1:07:34
program, demonstrating top-notch security practices and establishing
1:07:36
trust is more important than ever. Vanta
1:07:39
automates compliance for SOC 2, ISO 27001,
1:07:41
and more. With Vanta, you can streamline
1:07:43
security reviews by automating questionnaires and demonstrating
1:07:45
your security posture with a customer-facing trust
1:07:48
center. Over 7,000 global companies use Vanta
1:07:50
to manage risk and prove security in
1:07:52
real-time. Get a thousand dollars off Vanta
1:07:54
when you go to vanta.com/hardfork.
1:07:57
That's vanta.com slash Hard
1:07:59
Fork for a thousand dollars off. Hard
1:08:01
Fork is produced by Rachel Cohn
1:08:04
and Whitney Jones. We're
1:08:06
edited this week
1:08:08
by Rachel Dry. We're
1:08:10
fact-checked by Ena Alvarado.
1:08:13
Today's
1:08:15
show was engineered by
1:08:17
Dan Powell. Original
1:08:19
music by Diane Wong and
1:08:22
Dan Powell. Our executive
1:08:24
producer is Jen Poyant. Our
1:08:26
audience editor is Nell Gallogly. Video production
1:08:28
by Ryan Manning and Chris Schott.
1:08:30
You can watch this
1:08:32
whole episode on YouTube.com
1:08:34
slash Hard Fork. Special thanks to
1:08:36
Paula Szuchman, Pui-Wing Tam, Dalia Haddad, and Jeffrey Miranda.
1:08:39
You can email us
1:08:41
at hardfork@nytimes.com with what
1:08:43
you're calling the Gulf
1:08:45
of Mexico.