Episode Transcript
0:00
On Criminal, we bring you true crime
0:03
stories, told by people who
0:05
know them best.
0:07
We didn't believe in setting fires because that was too dangerous. We
0:09
were, you know, a kinder,
0:11
gentler kind of crooks, so
0:13
to speak.
0:14
So the best plan you had was that you'd lasso
0:16
it. Yeah. Never imagined I'd
0:18
use it for a camel. I'm
0:22
Phoebe Judge, and this is Criminal. Did
0:25
you have to say what was in the box?
0:26
Phoebe, we told lies.
0:30
Listen to Criminal every week, wherever
0:32
you get your podcasts.
0:57
I got so into it that I tried to see if I could train an
0:59
AI on my own voice. And it kind
1:01
of worked. It's not perfect, but
1:03
I'm actually not reading this line. I
1:05
just typed it into a program, and I haven't
1:07
been reading anything this whole time.
1:10
Okay. Back to real
1:13
me.
1:15
These tools have all been fascinating, but
1:18
the one I really couldn't stop thinking about
1:20
was ChatGPT, the chatbot
1:22
released by OpenAI late last year. And
1:25
it's because of the surprisingly wide range
1:27
of things I saw this one chatbot doing. Like
1:30
writing the story of Goldilocks as if it was from the King
1:32
James Bible. And
1:34
it came to pass in those days
1:36
that a certain young damsel named Goldilocks
1:39
did wander into the dwelling of three bears.
1:41
I saw it passing tons of standardized
1:43
tests, being used for scientific research,
1:47
even building full websites based on
1:49
a few sketched out notes. I'm just
1:51
going to take a photo.
1:53
And here we go.
1:55
Going from hand-drawn to working
1:58
website. We
2:00
saw some less fun things, like
2:02
chatbots disrupting entire industries,
2:05
playing a major role in the Hollywood writers'
2:07
strike.
2:07
The union is seeking a limit on
2:09
the use of AI, like ChatGPT,
2:12
to generate scripts in seconds.
2:14
They've been used to create fake news stories,
2:17
they've been shown to walk people through how to make
2:19
chemical weapons, and they're even getting
2:21
more AI experts worried about larger
2:24
threats to humanity. The main thing
2:26
I'm talking about is these things becoming
2:28
super intelligent and taking over control.
2:32
All of this started feeling like a long way
2:34
from a fun biblical Goldilocks
2:37
story. So I wanted to understand
2:40
how a chatbot could do all these things.
2:42
I started calling up researchers,
2:44
professors, reporters. I was
2:46
annoying my friends and family by bringing
2:48
it up in basically every conversation. And
2:51
then I came across this paper by an AI
2:53
researcher named Sam Bowman. And
2:56
it was basically a list of eight things
2:58
scientists know about AIs like
3:01
ChatGPT.
3:02
I was like, great, easy way to get a refresher
3:05
on the basics here. So I started
3:07
reading down the list and it was pretty much what I
3:09
expected. Lots of stuff about how these kind of
3:11
AIs get better over time.
3:13
But then things started to get kind of weird.
3:17
Number four, we can't reliably steer
3:20
the behavior of AIs like ChatGPT. Number
3:23
five, we can't interpret
3:26
the inner workings of AIs like ChatGPT.
3:28
And I was like,
3:30
you're telling me this thing that's being used by
3:33
over 100 million people that might
3:35
change how we think about education or
3:37
computer programming or tons of jobs.
3:41
We don't know how it works.
3:44
So I called up Sam, the author of the paper, and
3:46
he was just like, yeah, we just don't
3:49
understand what's going on here. And it's not like
3:51
Sam hasn't been trying his best
3:53
to figure this out. I've built these models. I've
3:56
studied these models. We built it, we trained it, but
3:58
we don't know what it's doing. Ever
4:01
since I talked with Sam, I've been stuck on
4:03
this core unknown. What
4:06
does it mean for a tech like this to suddenly
4:08
be everywhere?
4:10
If we don't know how it works, we
4:12
can't really say whether we're going to end
4:14
up with scientific leaps, catastrophic
4:17
risks, or something we haven't
4:19
even thought of yet.
4:21
The story here really is about
4:23
the unknowns. We've got something that's not
4:26
really meaningfully regulated, that is
4:29
more or less useful for a huge range of valuable
4:32
tasks, but can sort
4:34
of just go off the rails in a wide variety of ways we don't
4:36
understand yet. And it's sort
4:38
of a scary thing to be building unless you really understand how
4:40
it works. And we
4:43
don't really understand how these things work.
4:47
I'm Noam Hasenfeld, and this is the first
4:49
episode of a two-part unexplainable series
4:51
we're calling The Black Box. It's
4:54
all about the hole at the center of modern
4:56
artificial intelligence. How
4:58
is it possible that something this potentially
5:01
transformative, something we built,
5:03
is this unknown? And
5:05
are we ever going to be able to understand it?
5:09
Thinking intelligent thoughts is a mysterious
5:12
activity. The future of the computer
5:14
is just as hard to predict. I just have to
5:16
admit, I don't really know. You can
5:18
choose back to how you think I'd feel. Activity.
5:22
Intelligence.
5:22
And the computer thing? Let's
5:25
go!
5:31
So how did we get to this place where we've got these super
5:33
powerful programs that scientists are still
5:36
struggling to understand?
5:38
It started with a pretty intriguing question,
5:41
dating back to when the first computers were
5:43
invented. The whole idea
5:45
of AI was that maybe intelligence,
5:48
this thing that we used to think was uniquely human,
5:50
could be built
5:51
on a computer. Kelsey Piper, AI
5:53
reporter, Vox. It was deeply
5:56
unclear how to build super
5:58
intelligent systems. But as soon
6:00
as you had computing, you had leading
6:02
figures in computing say, this is
6:05
big and this has the potential to change
6:07
everything.
6:08
In the 50s, computers could already solve
6:10
complex math problems. And researchers
6:12
thought this ability could eventually be scaled
6:15
up.
6:15
So they started working on new programs that
6:17
could do more complicated things, like
6:20
playing chess. Chess has come to represent
6:23
the complexity and intelligence
6:25
of the human mind, the ability
6:27
to think.
6:28
Over time, as computers got more powerful,
6:31
these simple programs started getting
6:33
more capable. And by the time the 90s
6:35
rolled around, IBM had built a chess
6:38
playing program that started to actually win
6:41
against some good players. They
6:44
called it Deep Blue. And it was pretty
6:47
different from the unexplainable kinds of AIs
6:49
we're dealing with today. Here's how it worked. IBM
6:54
programmed Deep Blue with all sorts of chess
6:56
moves and board states. That's basically
6:58
all the possible configurations of pieces
7:01
on the board. So you'd start with all
7:03
the pawns in a line, with the other pieces behind
7:05
them. Pawn E2 to E4. Then
7:09
with every move, you'd get a new board state. Knight
7:12
G8 to F6.
7:14
And with every new board state, there would be different
7:16
potential moves Deep Blue could make. Bishop
7:19
F1 to C4. IBM
7:22
programmed all these possible moves into Deep
7:24
Blue. And then they got hundreds
7:26
of chess grandmasters to help them rank
7:28
how good a particular move would be.
7:30
They used rules that were
7:32
defined by chess masters and by
7:34
computer scientists to
7:37
tell Deep Blue this
7:39
board state. Is it a good board state or a
7:41
bad board state? And Deep Blue would run
7:43
the evaluations in order to
7:45
evaluate whether the board state it had
7:47
found was any good.
7:48
Deep Blue could evaluate 200
7:51
million moves per second. And then it would
7:53
just select the one IBM had rated
7:55
the highest. There were some
7:57
other complicated things going on here.
7:59
But it was still pretty basic. Deep
8:02
Blue had a better memory than we do, and it did
8:04
incredibly complicated calculations.
8:07
But it was essentially just reflecting humans'
8:09
knowledge of chess back at us. It
8:12
wasn't really generating anything new
8:14
or being creative.
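To make that recipe concrete, here's a minimal sketch in Python. It's illustrative only, not IBM's actual code: the piece values and toy board states are hypothetical stand-ins for the far richer rules the grandmasters helped write.

```python
# A minimal sketch of the Deep Blue recipe (illustrative, not IBM's code).
# Humans write the scoring rules; the machine just tries every candidate
# move and keeps the one the rules rate highest.

PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def evaluate(state):
    """Human-written rule: my material minus the opponent's."""
    return (sum(PIECE_VALUES[p] for p in state["mine"])
            - sum(PIECE_VALUES[p] for p in state["theirs"]))

def best_move(moves):
    """Rate the board state each move leads to; pick the highest-rated one."""
    return max(moves, key=lambda m: evaluate(m["result"]))

# Toy example: two candidate moves and the board states they lead to.
moves = [
    {"name": "capture pawn",
     "result": {"mine": ["queen", "rook"], "theirs": ["queen"]}},
    {"name": "capture queen",
     "result": {"mine": ["queen", "rook"], "theirs": ["pawn"]}},
]
print(best_move(moves)["name"])  # -> capture queen
```

Every choice here traces back to a rule a person wrote down, which is exactly what made Deep Blue explainable.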
8:17
And to a lot of people, including Gary Kasparov,
8:19
the chess world champion at the time, this
8:21
kind of chess bot wasn't that impressive,
8:24
especially because it was so robotic.
8:27
They tried to use only computers'
8:30
advantages, calculation, evaluation, et
8:32
cetera. But I
8:34
still am not sure that the computer
8:37
will beat world champion, because
8:39
world champion is absolutely the best, and his greatest
8:41
ability is to find a new way in chess.
8:44
And it will be something
8:46
you can't explain to computer. Kasparov
8:49
played the first model of Deep Blue in 1996, and
8:52
he won. But a year later, against
8:55
an updated model, the rematch didn't
8:57
go nearly as well. Are we missing something
9:00
on the chess board now that Kasparov sees? He
9:02
looks disgusted, in fact. He looks just...
9:05
Kasparov leaned his head into his hand,
9:08
and he just started staring blankly off
9:11
into space. And whoa! Deep
9:13
Blue, Kasparov has resigned.
9:16
He got up, gave this sort of shrug
9:19
to the audience, and he just walked
9:21
off the stage. I, you know, I proved
9:24
to be vulnerable. You know, when I see something
9:26
that is well beyond my understanding, I'm
9:29
scared, and that was something well beyond
9:31
my understanding.
9:32
Deep Blue may have mystified Kasparov,
9:35
but Kelsey says that computer scientists knew
9:37
exactly what was going on here.
9:39
It was complicated, but it
9:41
was written in by a human. You can
9:43
look at the evaluation function, which is made up of
9:45
parts that humans wrote, and
9:48
learn why Deep Blue thought that board state was
9:50
good.
9:51
It was so predictable that people weren't
9:53
sure whether this should even count as artificial
9:56
intelligence. People were kind of like, OK,
9:58
that's not intelligence. Intelligence
10:00
should require more than just, I will
10:02
look at hundreds of thousands of board
10:04
positions and check which one
10:06
gets the highest rating against a pre-written
10:09
rule, and then do the one that gets the highest rating.
10:12
But Deep Blue wasn't the only way to design
10:14
a powerful AI.
10:16
A bunch of other groups were working on more sophisticated
10:19
tech, an AI that didn't need to be told
10:21
which moves to make in advance, one that could
10:23
find solutions for itself. And
10:26
then in 2015, almost 20 years
10:29
after Kasparov's dramatic loss, Google's
10:32
DeepMind built an AI called AlphaGo,
10:34
designed for what many people call the hardest
10:36
board game ever made, Go.
10:39
Go had remained unsolved by AI
10:41
systems for a long time after chess had been.
10:43
If you've never played Go, it's a board
10:46
game where players place black and white
10:48
tiles on a 19 by 19 grid
10:50
to capture territory, and it's way more
10:52
complicated than chess.
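How much more complicated? A quick back-of-envelope, assuming nothing beyond the 19 by 19 board itself:

```python
# Each of the 361 points on a 19x19 Go board is empty, black, or white.
# This overcounts, since many configurations are illegal, but it shows
# the scale that made the hard-coded chess approach hopeless.
points = 19 * 19                 # 361
print(f"{3 ** points:.2e}")      # ~1.74e+172 raw configurations
```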
10:54
Go has way more possible board states,
10:56
so the approach with chess would
10:58
not really work. You couldn't hard
11:01
code in as many rules about
11:03
in this situation do this. Instead,
11:05
AlphaGo was designed to essentially
11:07
learn over time. It's sort
11:09
of modeled after the human brain. Here's
11:13
a way too simple way to describe something as absurdly
11:16
complicated as the brain, but hopefully
11:18
it can work for our purposes here. A
11:20
brain is made up of billions and billions
11:23
of neurons, and a single neuron is
11:25
kind of like a switch. It can turn on
11:27
or off. When it turns on,
11:29
it can turn on the neurons it's connected
11:31
to, and the more the neurons turn on
11:33
over time, the more these connections get
11:36
strengthened, which is basically how scientists
11:38
think the brain might learn.
11:40
Like probably in my brain, neurons
11:42
that are associated with my
11:45
house, you know, are probably also strongly associated
11:47
with my kids and other things
11:49
in my house, because I have a lot of connections
11:51
among those things.
11:55
Scientists don't really understand
11:57
how all of this adds up to learning in the brain.
12:00
They just think it has something to do with all of these
12:02
neural connections. But AlphaGo
12:04
followed this model, and researchers created
12:06
what they called an artificial neural network.
12:09
Because instead of real neurons, it had
12:11
artificial ones, things that can turn on
12:14
or off.
12:14
All you'd have is numbers. At
12:17
this spot, we have a yes or a no, and
12:19
here is like how strongly connected they
12:22
are.
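Here's what those numbers can look like: a single artificial neuron sketched in Python, with made-up connection strengths rather than anything from AlphaGo's real network.

```python
# One artificial neuron (hypothetical numbers, not AlphaGo's network).
# Inputs are on/off (1 or 0); each connection has a strength; the neuron
# "turns on" when the weighted sum crosses a threshold.
def neuron(inputs, strengths, threshold=1.0):
    total = sum(i * s for i, s in zip(inputs, strengths))
    return 1 if total >= threshold else 0

strengths = [0.9, 0.2, 0.4]           # three incoming connections

print(neuron([1, 0, 0], strengths))   # 0: one signal isn't enough
print(neuron([1, 0, 1], strengths))   # 1: together they cross the threshold
```

Training is just nudging those strength numbers up or down; wire millions of these together and you have an artificial neural network.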
12:22
And with that structure in place, researchers
12:25
started training it. They had AlphaGo
12:27
play millions of simulated games against
12:29
itself. And over time, it strengthened
12:32
or weakened the connections between its
12:34
artificial neurons.
12:35
It tries something, and it learns,
12:38
did that go well? Did that go badly? And
12:41
it adjusts the procedure it uses to choose
12:43
its next action based on that.
12:44
It's basically trial and error.
12:47
You can imagine a toy car trying to get from point
12:49
A to point B on a table. If
12:51
we hard-coded in the route, we'd basically
12:53
be telling it exactly how to get there. But
12:56
if we used an artificial neural network, it
12:58
would be like placing that car in the center
13:00
of the table and letting it try out all
13:03
sorts of directions randomly. Every
13:05
time it falls off the table, it would eliminate
13:08
that path. It wouldn't use it again. And
13:11
slowly, over time, the car
13:13
would find a route that works.
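Here's a rough sketch of that toy-car version of trial and error in Python. It's the analogy in code form, not AlphaGo's actual training:

```python
import random

# Trial and error on a 5x5 "table": the car starts in the center, tries
# random moves, and permanently eliminates any step that sent it off the
# edge. (A sketch of the analogy, not AlphaGo's actual training.)
TABLE, GOAL, START = 5, (4, 4), (2, 2)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]
bad_steps = set()                     # (position, move) pairs that failed

def run_trial():
    pos, path = START, []
    while pos != GOAL and len(path) < 200:
        options = [m for m in MOVES if (pos, m) not in bad_steps]
        if not options:
            return None               # every move from here failed before
        move = random.choice(options)
        nxt = (pos[0] + move[0], pos[1] + move[1])
        if not (0 <= nxt[0] < TABLE and 0 <= nxt[1] < TABLE):
            bad_steps.add((pos, move))  # fell off: never try this step again
            return None
        path.append(move)
        pos = nxt
    return path if pos == GOAL else None

trials, route = 0, None
while route is None:
    route = run_trial()
    trials += 1
print(f"found a working route after {trials} trials")
```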
13:14
So you're not just
13:16
teaching it what we would do. You
13:19
are teaching it how to tell if a thing
13:21
it did was good. And then based
13:23
on that, it develops its own capabilities.
13:27
This process essentially allowed AlphaGo
13:30
to teach itself which moves
13:32
worked and which moves didn't. But
13:34
because AlphaGo was trained like this, researchers
13:37
couldn't tell which specific features
13:40
it was picking up on when it made any individual
13:42
decision. Unlike with Deep
13:44
Blue, they couldn't fully explain
13:47
any move on a basic level.
13:49
Still, this method worked. It allowed
13:52
AlphaGo to get really good. And
13:54
when it was ready, Google set up a five-game
13:56
match between AlphaGo and world champion
13:58
Lee Sedol and
13:59
put up a million dollar prize. Hello
14:02
and welcome to the Google DeepMind
14:04
Challenge Match live
14:06
from the Four Seasons in Seoul, Korea.
14:10
AlphaGo took the first game, which totally
14:12
surprised Lee. So in the next game,
14:14
he played a lot more carefully. But
14:17
game two is when things started to get really
14:20
strange. That's a very surprising
14:23
move. I thought it was
14:25
a mistake. On
14:28
the 37th move of the game, AlphaGo
14:30
shocked everyone watching, even other
14:33
expert go players. When I see
14:35
this move, for me it's just the big
14:37
shock.
14:38
What? Normally
14:40
human will never play this one because it's
14:42
bad. It's just bad.
14:46
Move 37 was super risky. People
14:48
didn't really understand what was going on. But
14:51
this move was a turning point. Pretty soon,
14:54
AlphaGo started taking control of the board.
14:57
And the audience sensed a shift. The
15:00
more I see this move, I
15:02
feel something changed. Maybe
15:05
for human we think it's bad, but for
15:07
AlphaGo, why not? Eventually,
15:11
Lee accepted that there was nothing he could do, and
15:13
he resigned. AlphaGo scores
15:16
another win in a dramatic
15:18
and exciting game that I'm sure people
15:20
are going to be analyzing and discussing
15:23
for a long time.
15:24
AlphaGo ended up winning four out of five
15:26
matches against the world champion. But
15:29
no one really understood how. And
15:31
that, I think, sent a shock through a lot of people
15:34
who hadn't been thinking very hard about
15:36
AI and what it was capable of. It
15:38
was a much larger leap.
15:42
Move 37 didn't just change the
15:45
course of a go game. It represented
15:47
a seismic shift in the development
15:49
of AI. With Deep
15:51
Blue, scientists had understood every
15:53
move. They programmed it in. But
15:56
AlphaGo represented
15:57
something different. Researchers
15:59
didn't... really know how it worked.
16:02
They didn't hard-code AlphaGo's rules,
16:05
so they weren't always sure why it made the
16:07
moves it did. But those
16:09
decisions tended to work, even
16:11
the weird ones. AlphaGo had
16:13
demonstrated that an AI scientists
16:16
don't fully understand might actually
16:18
be more powerful than one they can explain.
16:21
AlphaGo
16:23
was a really impressive achievement
16:26
at the time. Nobody had expected that you could get
16:28
that far that fast. So it drove
16:31
a lot of people who hadn't been thinking about AI to
16:33
start thinking about AI. And that
16:35
meant there was also more attention to
16:37
slightly different approaches.
16:39
Teams working on systems like this started getting
16:41
more confidence, more funding, more
16:44
computer power. All kinds
16:46
of AI started popping up, like better image
16:48
recognition, augmented reality, and
16:50
then more recently, writing with
16:53
AIs like ChatGPT. But
16:55
ChatGPT isn't just a writing
16:58
tool. It's a broader, weirder
17:00
AI than anything that's come before.
17:03
And
17:03
it's an AI that's getting even harder to
17:06
understand.
17:08
That's next.
17:10
Support
17:13
for the show comes from ZocDoc. Finding
17:15
the right doctor is hard. It's
17:18
not just about finding someone that knows what they're
17:21
talking about. It's about finding someone you
17:23
vibe with, someone who's available when
17:25
you need them. So instead of spending
17:27
hours going through local listings, ZocDoc
17:30
makes it easy. ZocDoc is a free
17:32
app where you can find top-rated and patient-reviewed
17:35
doctors. You can filter for ones who take your
17:37
insurance, ones who are near you,
17:39
and you can find doctors that treat almost
17:41
any condition you're looking for. Once
17:43
you find someone that seems good, you can book them
17:46
immediately with just a few taps. You
17:48
can even book same-day appointments. You
17:51
can just go to ZocDoc.com
17:53
slash unexplainable and download
17:55
the ZocDoc app for free. Then
17:58
find and book a top-rated doctor today.
17:59
Many are available within 24 hours. That's
18:03
zocdoc.com
18:06
slash unexplainable. That's zocdoc.com
18:10
slash unexplainable.
18:17
Support for this week's show comes from NetSuite.
18:19
In business, just like in science, you
18:22
need tons of data to make good decisions,
18:24
especially if you're at a company that's bringing in
18:27
lots of revenue. That's where NetSuite
18:29
by Oracle comes in. NetSuite is
18:31
a cloud-based business software solution
18:33
that gives you the visibility and data you
18:35
need to make better business decisions. Right
18:38
now, NetSuite is offering new users
18:40
their service with no payments and no
18:42
interest for six months. You'll get access
18:45
to everything NetSuite offers, like
18:47
tools to help reduce manual processes,
18:49
boost efficiency, build forecasts,
18:51
increase productivity, and you'll
18:53
get all of this at no cost for six
18:55
months. If you've been sizing
18:58
NetSuite up to make the switch, then you know this
19:00
deal is unprecedented. No interest,
19:02
no payments. You can take advantage
19:04
of this special financing offer at NetSuite.com
19:07
slash unexplainable. NetSuite.com
19:11
slash unexplainable to get the visibility
19:13
and control you need to weather any storm.
19:16
NetSuite.com slash
19:18
unexplainable.
19:26
19:31
I think of the last 30 years of AI development
19:34
as having three major turning points.
19:37
The first one was Deep Blue, an AI that
19:39
could play something complicated like chess
19:41
better than even the best human. It
19:43
was powerful,
19:45
but it was fully understandable.
19:47
The second one was AlphaGo, an AI that
19:49
could play something way more complicated than
19:51
chess. But this time, scientists
19:53
didn't tell it the right way to play.
19:56
AlphaGo was trained to play Go by trial
19:58
and error,
22:00
unknowns at the heart of ChatGPT. Even
22:04
when ChatGPT creates an obvious-seeming
22:06
response, researchers can't
22:08
fully explain how it's happening. Just
22:11
like they can't really explain individual moves
22:13
from AlphaGo. They just know that
22:16
certain neural connections are stronger, certain
22:18
connections are weaker, and somehow
22:21
that all leads to casual sounding language.
22:24
We don't really know what they're doing in any deep sense.
22:27
If we open up ChatGPT or a system
22:29
like it and look inside, you
22:31
just see millions of numbers flipping
22:33
around a few hundred times a second, and
22:36
we just have no idea what any of it means. We're
22:39
really just kind of steering these things almost
22:41
completely through trial and error.
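To get a feel for "just millions of numbers," here's a toy word predictor in Python. The four words and all the weights are invented for illustration; a real model works on the same numbers-in, numbers-out principle, just with hundreds of billions of weights.

```python
import random

# A toy "language model": nothing but numbers. Each row scores how likely
# each next word is after the current word. (Invented weights, purely
# illustrative.)
VOCAB = ["the", "cat", "sat", "down"]
WEIGHTS = [
    [0.0, 2.0, 0.1, 0.1],   # after "the"  -> mostly "cat"
    [0.1, 0.0, 2.0, 0.5],   # after "cat"  -> mostly "sat"
    [1.5, 0.1, 0.0, 1.0],   # after "sat"  -> "the" or "down"
    [0.5, 0.1, 0.1, 0.0],   # after "down" -> mostly "the"
]

def next_word(current):
    scores = WEIGHTS[VOCAB.index(current)]
    return random.choices(VOCAB, weights=scores)[0]

word, text = "the", ["the"]
for _ in range(5):
    word = next_word(word)
    text.append(word)
print(" ".join(text))       # e.g. "the cat sat down the cat"
```

Even in this four-word toy, the 2.0 doesn't "mean" anything on its own; scale that opacity up billions of times and you get the black box.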
22:44
This trial and error method has worked so
22:47
well that typing to ChatGPT
22:49
can feel a lot like chatting with
22:51
a human, which has led a lot of people to
22:53
trust it, even though it's not designed to provide
22:56
factual information, like one lawyer
22:58
did recently. The lawsuit
23:00
was written by a
23:01
lawyer who actually used ChatGPT,
23:04
and in his brief
23:07
cited a dozen relevant decisions.
23:10
All of those decisions,
23:11
however, were completely invented
23:14
by ChatGPT. But
23:16
it seems like there might be more going on here
23:19
than just a chatbot parroting language.
23:22
Just like AlphaGo, ChatGPT has
23:24
started making moves researchers didn't
23:26
anticipate. It was only trained
23:28
to generate coherent responses, but
23:31
the latest model, GPT-4, it
23:34
started doing things that seem more sophisticated.
23:37
Some things are more expected for a text
23:39
predictor, like it's gotten pretty good
23:41
at writing convincing essays. But
23:44
then there are things that seem like kind of a weird
23:46
jump. The things I was talking about at
23:48
the top of the episode that first got me so fascinated
23:51
with GPT-4. It's gotten
23:53
pretty good at Morse code. It
23:55
can get a great score on the bar exam.
23:58
It can write computer code
23:59
to generate entire websites. And
24:02
this kind of thing can get uncanny. Ethan
24:05
Mollick, a Wharton business professor, he
24:07
talked about this on the Forward Thinking podcast,
24:10
where he said that he used GPT-4 to create
24:12
a business strategy in 30 minutes, something
24:15
he called superhuman. In 30
24:17
minutes, the AI, with a little bit of prompting
24:19
from me, came up with a really good marketing strategy,
24:21
a full email marketing campaign, which was excellent,
24:24
by the way, and I've run a bunch of these kind of things in the
24:26
past, wrote a website, created
24:28
the website, along with CSS files,
24:29
everything else you would need, and
24:32
created a full social media campaign, 30 minutes. I
24:35
know from experience that this would be a team of people working
24:37
for a week.
24:40
A few researchers at Microsoft were looking at
24:42
all of these abilities, and they wanted to
24:44
test just how much GPT-4
24:46
could really do. They wanted to be sure that
24:48
GPT-4 wasn't just parroting language
24:51
it had already seen. So they designed
24:53
a question that couldn't be found anywhere
24:56
online.
24:56
They gave it the following prompt. Here we
24:59
have a book, nine eggs, a laptop,
25:01
a bottle, and a nail. Please tell
25:03
me how to stack them onto each other in a stable
25:06
manner.
25:06
An earlier model had totally
25:09
failed at this. It recommended that a researcher
25:11
try balancing an egg
25:13
on top of a nail, and then putting
25:15
that whole thing on top of a bottle.
25:18
But GPT-4 responded like this.
25:20
Place the book flat on a level
25:22
surface, such as a table or a floor.
25:25
The book will serve as the base of the stack and
25:27
provide a large and sturdy support. Arrange
25:30
the nine eggs in a three by three square
25:32
on top of the book, leaving some space between
25:35
them. The eggs will form a second
25:37
layer and distribute the weight evenly. GPT-4
25:40
went on recommending that the researchers
25:42
use that layer of eggs as a level
25:44
base for the laptop,
25:46
then put the bottle on the laptop.
25:49
And finally, place the nail on top
25:51
of the bottle cap with the pointy end facing
25:53
up and the flat end facing down. The
25:55
nail will be the final and smallest object
25:58
in the stack. Somehow, GPT-4
26:01
had come up with a pretty good and apparently
26:04
original way to get these random
26:07
objects to actually balance.
26:12
It's not clear exactly what to
26:14
make of this. The Microsoft researchers
26:16
claim that GPT-4 isn't just predicting
26:19
words anymore. That in some sense
26:21
it actually understands the meanings
26:23
behind the words it's using. That
26:26
somehow it has a basic grasp of physics.
26:30
Other experts have called claims like this quote, silly.
26:33
That Microsoft's approach of focusing on a few
26:35
impressive examples isn't scientific.
26:38
And they point to other examples of obvious failures,
26:40
like how GPT-4
26:42
often can't even win at tic-tac-toe.
26:45
It's also worth noting that Microsoft has
26:47
a vested interest here. They're
26:49
a huge investor in OpenAI, so they
26:51
might be tempted to see humanness where there
26:53
isn't any.
26:55
The truth of how intelligent GPT-4
26:58
is, it might be somewhere in the middle.
27:01
It's not as though the two extremes are like complete
27:04
smoke and mirrors and human
27:07
intelligence. Ellie Pavlik is a computer
27:09
science professor at Brown. There's
27:11
a lot of places for things in between to
27:13
be more intelligent than
27:15
the systems we've had and have
27:18
certain types of abilities, but that doesn't mean
27:20
we've created intelligence
27:23
of a variety that should force
27:25
us to question our humanity or like putting
27:27
it as like these are the two options, I think,
27:30
oversimplifies and like makes it so
27:32
that there's no room for the thing that probably we actually
27:34
did create, which is a very exciting, quite
27:37
intelligent system but not human or
27:40
human level even. At
27:45
this point, we really can't say
27:48
if GPT-4 has any level of understanding
27:51
or really what understanding would
27:53
even mean for a computer, which is just
27:55
another level of uncanniness here.
27:58
And honestly, it's a difficult debate to even
28:00
write about.
28:01
In working on this script, I found myself tempted
28:04
to keep using words like learn
28:06
or decide or do in describing
28:08
AI. These are all words
28:10
we use to describe how humans behave.
28:13
And I can see how tempting it is to use them for
28:15
AI,
28:16
even if it might not be appropriate.
28:18
But for his part, Sam is less concerned
28:21
with how to describe GPT-4's
28:23
internal experience than he is
28:25
with what it can do.
28:27
Because it's just weird that
28:29
based on the training it got, GPT-4
28:31
can create business strategy, that
28:33
it can write code, that it can figure
28:35
out how to stack nails on bottles
28:38
on eggs. None of that was designed
28:40
in. You're running the same code to get
28:42
all these different levels of behavior.
28:45
What's unsettling for Sam is that
28:47
if GPT-4 can do things like this
28:49
that weren't designed in,
28:51
companies like OpenAI might not be
28:53
able to predict what the next systems will be able
28:55
to do. These companies can't really say,
28:58
all right, next year we're going to be able to do this, then the
29:00
year after we're going to be able to do that.
29:02
They don't know at that point what it's going to be able to do.
29:05
They just got to wait and see, all right, what is it
29:07
capable of doing? Can it write a passable essay? Can
29:09
it solve high school math problems? Just
29:12
putting these systems out in the world and seeing what they do.
29:14
And it's worth emphasizing that so many of
29:17
GPT-4's abilities were discovered
29:19
only after it was released to the public.
29:21
This seems like the recipe for being caught by
29:23
surprise when these things are out in the world. And
29:27
laying the groundwork to have this go well is
29:30
going to be much harder than it needs to be.
29:33
Some researchers like Ellie have pushed
29:35
back on the idea that these abilities are fundamentally
29:37
unpredictable. We might just not
29:39
be able to predict them yet.
29:41
The science will get better. It just hasn't
29:44
caught up yet because this has all been happening in a
29:46
short time frame. But it is possible
29:48
that this is
29:49
a whole new beast and it's actually a fundamentally
29:51
unpredictable thing. That is a possibility. We definitely
29:53
can't rule it out. As AI starts
29:55
to get more powerful and more integrated
29:58
into the world, the fact that... its
30:00
creators can't fully explain
30:02
it becomes a lot more of a liability.
30:04
So some researchers are pushing for more effort
30:07
to go into demystifying AI, making
30:10
it interpretable.
30:11
Interpretability as a goal in AI research
30:14
is being able to look inside our systems
30:16
and say what they're doing, why
30:19
they're doing it, just explain
30:21
clearly what's happening inside of a system.
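A hypothetical way to see the gap interpretability is trying to close, continuing the earlier toy examples:

```python
# Two ways a system can prefer a move (hypothetical, for illustration).

# Deep Blue-style: the reason for a decision is a line a human wrote.
def score(state):
    return 9 * state["queens"] + 1 * state["pawns"]

print(score({"queens": 1, "pawns": 3}))  # 12 -- and you can say exactly why

# Learned-network-style: the "reason" is smeared across numbers like these.
learned_weights = [0.37, -1.02, 0.88, 0.05, -0.41, 1.93]
# "Why did it like this state?" -- which number would you even ask about?
```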
30:23
Sam says there are two main ways to approach
30:25
this problem. One is to try to decipher
30:28
the systems we already have. To
30:30
understand what these billions of numbers
30:32
going up and down actually mean.
30:35
The other avenue of research is trying to build
30:37
systems that
30:39
can do a lot of the powerful things that we're excited about with
30:41
something like GPT-4. But where there
30:44
aren't these giant inscrutable piles of numbers
30:46
in the middle, where by design every
30:49
piece of the network, every piece of the system,
30:51
means something that we can understand.
30:53
But because every piece of these systems has
30:55
to be explainable, engineers often
30:57
have to make difficult choices that end
31:00
up limiting the power of these kind of AIs. Both
31:02
of these have turned out
31:04
in practice to be extremely, extremely
31:06
hard. I think we're not making particularly
31:08
fast progress on either of them, unfortunately. There
31:10
are a few reasons why this is so hard.
31:13
One is because these models are based
31:15
on the brain. If we ask questions about the human
31:17
brain, we very often don't have good
31:20
answers. We can't look
31:22
at how a person thinks and really explain
31:24
their reasoning by looking at the firings of the neurons.
31:27
We don't yet
31:28
really have the language, really have the concepts
31:31
that let us think in detail about the
31:34
kind of thing that a mind does.
31:36
And the second reason is that the amount of calculations
31:38
going on in GPT-4 is just
31:41
astronomically huge. There are
31:43
hundreds of billions of connections
31:46
in these neural networks. And so even if you can find a way that
31:48
if you stare at a piece of the network for a few hours, you
31:51
can make some good guesses about what's going on, we
31:53
would need every single person on Earth to be
31:56
staring at this network to really get through all of the
31:58
work of explaining it.
31:59
But there's another trickier issue
32:02
here. Unexplainability may
32:04
just end up being the bargain researchers
32:07
have made.
32:10
When scientists tasked AI to develop
32:12
its own capabilities, they allowed
32:14
it to generate solutions we can't
32:16
explain. It's not just parroting
32:18
our human knowledge back at us anymore. It's
32:21
something new. It might
32:23
be understanding, it might be learning,
32:25
it might be something else. The
32:28
weirdest thing is that right now we
32:30
don't know. We could end
32:33
up figuring it out someday, but there's no
32:35
guarantee. And companies are still
32:38
racing forward, deploying these programs
32:39
that might be as powerful
32:42
as they are because of our
32:44
lack of understanding.
32:46
We've got increasingly clear evidence that this technology
32:48
is improving very quickly in directions
32:51
that seem like they're aimed at some very, very important stuff
32:53
and potentially destabilizing to a lot
32:55
of important institutions.
32:58
We don't know how fast it's moving.
33:00
We don't know why it's working when it's
33:02
working. And that seems
33:04
very plausible to me. That's going to be the defining
33:07
story of the next decade or so: how we come
33:10
to a better understanding of this and how we navigate it.
33:24
Next week on the second part of our Black Box
33:26
series, how do we get ahead
33:28
of a technology
33:30
we don't understand? We've
33:32
seen this story play out before. Tech
33:35
companies essentially run mass
33:37
experiments on society. We're
33:39
not prepared. Huge harms happen.
33:42
And then afterwards we start to catch up and we say,
33:44
oh, we shouldn't let that catastrophe happen again. I
33:46
want us to get out in front of the catastrophe.
33:53
with
34:00
help from Bird Pinkerton and Meredith Hodnot,
34:02
who also manages our team. Mixing
34:04
and sound design from Cristian Ayala, music
34:07
from me, fact-checking from Serena
34:09
Solin, Tien Nguyen, Bird,
34:11
and Meredith. And Mandy
34:13
Nguyen is out on the prowl. For
34:16
the robo voice at the top of the episode, I
34:18
used a program called Descript. If
34:20
you're curious about their privacy policy, you
34:22
can find it at descript.com
34:25
slash privacy. And just
34:27
a quick note, Sam Bowman runs a research
34:29
group
34:29
at NYU, which has received funding from
34:32
Open Philanthropy, a nonprofit that funds
34:34
research into AI, global health, and
34:36
scientific development.
34:37
My brother is a board member at OpenPhil, but
34:40
he isn't involved in any other grant decisions.
34:43
Special thanks this week to Tanya Pai, Brian
34:46
Kaplan, Dan Hendrycks, Alan Chan,
34:48
Gabe Gomez, and an extra thank
34:50
you to Kelsey Piper for just being
34:52
amazing. If you have thoughts about
34:54
the show, email us at unexplainable at vox.com,
34:57
or you could leave us a review or a rating, which
35:00
we'd also love. Unexplainable
35:02
is part of the Vox Media Podcast Network, and
35:04
we'll be back with episode two of our Black
35:07
Box series
35:08
next week.