Episode Transcript
0:00
Our AI algorithms today are not particularly efficient. A GPU, most of the time when it's doing inference, is 5 or 7% utilized. That means it's 95 or 93% wasted. We won't be as dependent on transformers in three years or five years as we are now. 100%. The fundamental architecture of the GPU with off-chip memory is not great for inference. Now, they will continue to do well in inference, but it can be beaten, and I think they know it.
0:29
This is 20VC with me, Harry Stebbings. Now, we did a show with Jonathan Ross at Groq and it blew all numbers out of the water. Millions of plays, everyone loved it, and everyone said that we had to get Andrew Feldman from Cerebras on the show. So I'm so excited to make this episode happen today. Joining us in the hot seat is Andrew Feldman, co-founder and CEO of Cerebras, the fastest AI inference and training platform in the world. Now, in September 2024, the company filed to go public off the back of a rumoured one-billion-dollar deal with G42 in the UAE in the inference market. Andrew is the leading expert for all things inference. This show was incredible. I have the best job in the world: I sit down with the smartest people and learn from them, and this show is exactly that. But before we dive in today...
1:18
Turning your back-of-a-napkin idea into a billion-dollar startup requires countless hours of collaboration and teamwork. It can be really difficult to build a team that's aligned on everything from values to workflow, but that's exactly what Coda was made to do. Coda is an all-in-one collaborative workspace that started as a napkin sketch. Now, just five years since launching, Coda has helped 50,000 teams all over the world get on the same page. Now, at 20VC, we've used Coda to bring structure to our content planning and episode prep, and it's made a huge difference. Instead of bouncing between different tools, we can keep everything from guest research to scheduling and notes all in one place. It saves us so much time. With Coda you get the flexibility of docs, the structure of spreadsheets, and the power of applications, all built for enterprise. And it's got the intelligence of AI, which makes it even more awesome. If you're a startup team looking to increase alignment and agility, Coda can help you move from planning to execution in record time. To try it for yourself, go to coda.io/20VC today and get six free months of the team plan for startups. That's coda.io/20VC to get started for free and get six free months of the team plan.
2:36
Now that your team is aligned and collaborating, let's talk expense reports. You know, those receipts that seem to multiply like rabbits in your wallet, the endless email chains asking, can you approve this? Don't even get me started on the month-end panic when you realize you have to reconcile it all. Well, Pleo offers smart company cards, physical, virtual and vendor-specific, so teams can buy what they need while finance stays in control. Automate your expense reports, process invoices seamlessly and manage reimbursements effortlessly, all in one platform. With integrations to tools like Xero, QuickBooks and NetSuite, Pleo fits right into your workflow, saving time and giving you full visibility over every entity, payment and subscription. Join over 37,000 companies already using Pleo to streamline their finances. Try Pleo today. It's like magic, but with fewer rabbits. Find out more at pleo.io/20VC.
3:26
And don't forget to revolutionize how your team works together. Roam. A company of tomorrow runs at hyperspeed with quick drop-in meetings. A company of tomorrow is globally distributed and fully digitized. The company of tomorrow instantly connects human and AI workers. A company of tomorrow is in a Roam virtual office. See a visualization of your whole company: the live presence, the drop-in meetings, the AI summaries, the chats. It's an incredible view to see. Roam is a breakthrough workplace experience loved by over 500 companies of tomorrow, for a fraction of the cost of Zoom and Slack. Visit Roam, that's ro.am, for an instant demo of Roam today. Nobody knows what the future holds, but I do know this: it's going to be built in a Roam virtual office. Hopefully by you. That's Roam, ro.am, for an instant demo. You have now arrived at your destination.
4:20
Andrew, it is such a pleasure to meet, man. I've wanted to do this one for a while. I've heard so many good things from Eric for a long time. So thank you so much for joining me.
Harry, thank you for having me. I appreciate it.
Not at all, this will be a fantastic conversation. I have my pen ready. I feel like this is going to be a learning experience for me.
4:45
I want to go back to 2015. What did you and the team see in the AI landscape in 2015 that led to the founding of Cerebras?
This is every computer architect's dream: we saw a new problem to solve. What that means is maybe you can build a new machine better suited to that problem. And so in 2015, and the credit goes to Gary and Sean and JP and Michael, my co-founders, they saw on the horizon the rise of AI. And what that meant was there'd be a new problem for computers: what the AI software would ask from the underlying chip, the processor, would be different. We came to believe that we could build a better machine for that problem. That's what we saw. You know, obviously we didn't see it exactly right. I underestimated it. You know, this is my fifth startup, and it's the first time I underestimated the size of the market, by a lot. But what we did get right was that this was going to be big, and that it would put a different type of pressure on a processor: that it would put pressure on the memory bandwidth, that it would put pressure on the communication structure. That's what we saw, we dove in, and it's been an extraordinary nine years.
6:02
How does the movement into an age of AI change the requirements, from a chip perspective, of what is needed from a provider, and how did that then shape how you built Cerebras?
The way to think about a chip is that it does two things: it does calculations and it moves data. This is what a chip does; sometimes, along the way, it stores data. And so what AI presented was a very unusual combination of challenges. First, the underlying calculation is trivial. It's a matrix multiplication, and an FMAC, a fused multiply-accumulate unit, can be developed by any second-year electrical engineering student. So you say to yourself, holy cow, this has a huge number of very, very simple calculations. The hard part of the AI work is that results and intermediate results have to be moved a lot. Therein is the most complicated part. They have to be moved to memory and from memory, and they have to be broken up and moved among GPUs. And what we saw was that this was going to be the hard problem, and that if we could solve that problem, we would build an AI computer that was faster and used less power.
7:12
When we think about how we're going to build and what we're building for, to me there are a couple of core elements, which is like: where are you going to focus? Are you focusing on fine-tuning? Are you focusing on training? Are you focusing on inference?
Three.
You chose all three.
Yeah.
Why? And I'm sorry for my basic questions, but I thought GPUs were specialized towards training and weren't specialized towards inference. Can you have a mono-architecture that does all three best?
7:44
The first step in computer architecture is deciding what you're not going to do. What are we not going to be good at is really the first important question. To answer your question, you say: is the computational work for training from scratch different from fine-tuning? And the answer is it's not different. It's approximately the same. Now, inference and training have some different requirements. And generative inference in particular has some very challenging requirements on exactly the communication dimension that I mentioned. In generative inference, you have to move all the weights from memory to compute to generate a single word. And you have to move them again to generate the next word. And again. So if you have a 70-billion-parameter model, not a giant model, and each weight is 16 bits, you're moving, what, 140 gigabytes of data to generate one word. This is an enormous amount of data movement across memory, and what it consumes, what it needs, is memory bandwidth. If you have an architecture like we saw in the GPU, that is your fundamental limitation.
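Here is that arithmetic as a back-of-the-envelope sketch. The 70B and 16-bit figures are his; the bandwidth number is an illustrative assumption, not a quoted spec:

```python
# Why generative inference is memory-bandwidth bound: every weight must
# stream from memory once per generated token.
params = 70e9          # 70-billion-parameter model
bytes_per_weight = 2   # 16-bit weights

bytes_per_token = params * bytes_per_weight
print(f"{bytes_per_token / 1e9:.0f} GB moved per generated token")    # 140 GB

bandwidth = 3.3e12     # bytes/sec, assumed HBM-class figure for illustration
print(f"<= {bandwidth / bytes_per_token:.1f} tokens/sec upper bound")  # ~23.6
```

Even ignoring compute entirely, memory bandwidth alone caps single-stream generation speed.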
8:51
That is what we went to wafer scale to solve. They use a memory called HBM, a type of DRAM. It is phenomenal memory, but it's slow and high-capacity. And when they set the architecture for graphics, that's what you wanted: you didn't have to go back and forth to memory very often. SRAM, on the other hand, is unbelievably fast but has low capacity. And so we wanted to use SRAM, but if you build a normal-sized chip, you can't hold a model. And so by going to wafer scale, we were able to put down a huge amount of SRAM and get the benefits of speed and enough capacity. If you build a normal-sized chip with SRAM and you want to do a 400-billion-parameter model in inference, you might need 4,000 chips, or if you want to do DeepSeek 671B, you might need six or eight thousand chips. What an administrative nightmare. And if you can keep as much as you can on one wafer, or two wafers, or four, or ten, you get all the benefit of the SRAM, and because you've been able to use the whole wafer, you get tremendous capacity as well.
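A rough sketch of where those chip counts come from, assuming around 200 MB of SRAM per normal-sized die (a plausible placeholder, not a vendor spec) and 16-bit weights:

```python
import math

# How many SRAM-only chips does it take just to hold the weights on-chip?
def chips_needed(params, bytes_per_weight=2, sram_per_chip=200e6):
    return math.ceil(params * bytes_per_weight / sram_per_chip)

print(chips_needed(400e9))  # 4000 chips for a 400B-parameter model
print(chips_needed(671e9))  # 6710 chips for a 671B-parameter model
```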
10:01
First, I totally get you on HBM and kind of the slowness of it. Why is it, then, that bluntly so much of the market just continues to use it, and 40% of Nvidia's revenue is these chips being used for inference?
Unless you went to wafer scale, there wasn't really a credible other choice. This is the way GPUs had always been made. It's called the graphics processing unit; that's the way they were built. It was part of their advantage against a CPU and dedicated chips like ours. What used to be their advantage is now a weakness. That's a fun market to be in, when over a very short period of time, what you're good at becomes your weakness.
With a market cap like they have, and with Jensen as good as he is, which I'm sure we both agree with, they must know.
They do know this. A, they don't make memory, so they're a consumer of other people's memory. And that's SK Hynix, or Samsung, or Micron. There are only four or five companies that make huge amounts of memory. Not many choices. But it's part of a complex architectural trade-off. On the flip side, you could say it's worked really well for them, right? Look at where it's taken them. But in comparison to those of us who do wafer scale, it's a small set. It's a set of one: us. We have a real advantage against them on inference.
11:21
How do LPUs fit into this mix?
In our business, there are a lot of ways to skin a cat. Our way is different from Nvidia's way. It's different from the TPU. It's different from Trainium. They're different. Right now, and every day since August 26th, when we launched inference, our way has been the fastest way across a whole set of models tested by Artificial Analysis and others.
11:42
Can I ask, when we think about that speed: I am interested, you said that you're one of one with wafers and the architecture associated. What does that mean in terms of cost? With such efficiency, is it inherently more expensive? And what does that look like from a cost profile?
This isn't our first dance. We've been building computers for a long time. When you make a choice like wafer scale, you have to weigh the trade-offs. We use less power, because one of the most power-hungry things on a chip is the IOs, moving data off-chip. And so if you are moving data off-chip frequently, you're using more power than if you can keep it in the silicon domain, on-chip. So we knew we would use less power. We knew that if you went to wafer scale, you had to solve some problems that people said were impossible to solve, like yield. So we had to invent techniques that allowed us to yield wafers. In fact, we invented techniques that allow us to yield as well as or better than others who were building much smaller chips.
12:42
What is yield, and why is it impossible to solve?
A wafer begins as a 12-inch-diameter circular slice of silicon, and your chip is punched out of this, the way your mother might take a cookie cutter and cut out cookie dough. During the process, at some point, just like your mom might have done, she lifts up the edges and all the little bits are removed, and what's left are just the cookies. Those are your chips. Now, what happens is there are a set of naturally occurring flaws. And that's like your mother closing her eyes and throwing a handful of M&M's. Now, the bigger the cookie, the higher the probability you hit an M&M; the bigger the chip, the higher the probability that you have a flaw. And traditionally, what you did when you had a flaw was you threw away the chip, or you sold it as a less valuable part. You shut down part of the chip and sold it as a less valuable part, something called binning. So every wafer is going to have flaws. The bigger your chip, the higher the probability you hit a flaw, and the more silicon is wasted when you throw it away. This is what everybody thought was known truth. And one of the things our team realized was that there are other ways to handle flaws. What if instead you built your computer, you built your processor, out of hundreds of thousands of identical tiles? Say there was a flaw: say you just shut down that tile and worked around it. Say you had a row or a column of redundant tiles that, when you needed them, you could just pull in. Now, that had traditionally been the technique used in memory making, and the memory yields are extraordinary. And so it occurred to us that if we could build a computer, build a processor, out of hundreds of thousands of identical tiles, we could use redundancy such that when there was a flaw, we could just leave it there, shut it down, work around it, and pull in one of the redundant tiles. And that had never been done in a computer before, and that's at the heart of our architecture. That allowed us to yield and deliver whole wafers. Nobody had ever been able to do that in the 70-year history of our industry. Really, really smart people struggled. I mean, Gene Amdahl, one of the fathers of our industry, had a company called Trilogy that crashed and burned trying to do this. And we figured it out.
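A toy sketch of the redundancy idea he describes: a grid of identical tiles with a spare column, where a flawed tile is disabled and the logical tiles in that row shift over into the spare. The grid sizes and mapping scheme here are invented for illustration, not Cerebras's actual design:

```python
ROWS, COLS, SPARE_COLS = 4, 8, 1   # hypothetical tile grid with one spare column

def physical_column(row, logical_col, flaws):
    """Walk physical columns, skipping disabled (flawed) tiles, until the
    requested logical column lands on a healthy tile."""
    healthy = -1
    for col in range(COLS + SPARE_COLS):
        if (row, col) not in flaws:
            healthy += 1
            if healthy == logical_col:
                return col
    raise RuntimeError("more flaws in this row than spare capacity")

flaws = {(2, 3)}                          # one defective tile at row 2, column 3
assert physical_column(2, 3, flaws) == 4  # that row shifts into the spare column
assert physical_column(1, 3, flaws) == 3  # unaffected rows map straight through
```

The wafer still works; one flawed tile costs one spare instead of the whole part.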
14:55
When you speak about kind of being the fastest, and across all benchmarks being the fastest, what matters the most? Is it being the most efficient? Is it being the least costly?
If you go to get a cancer diagnosis on, God forbid, your mother or your wife, I think 93% accuracy is just plain not as good as 94% accuracy, and you'd pay a lot and wait another week for that extra accuracy, right? You'd pay a lot. Now, on the other hand, if you want Llama 405B to generate data to help you tune Llama 70B, maybe you can wait a few days, three days, a week more. There's no urgency there. On the other hand, if you want an answer from Perplexity, you don't want to wait 45 seconds for a search answer. You don't want to wait three minutes for R1 on GPUs to give you an answer. What we know is that in interactive mode, milliseconds matter. In interactive mode, what work at Google years ago showed was that you can destroy your user's attention with milliseconds of delay. So being the fastest matters for everything in that domain. So I think what you have to do is sort of be thoughtful and say, in some cases being the fastest doesn't matter. We'll call those batch; maybe the cheapest matters there. In other domains, there is no search if you've got to wait eight minutes to get an answer. That's not a product.
16:17
When you go fast, a whole set of new opportunities opens up. Netflix used to mail DVDs. That's what happened when the internet was slow: they mailed DVDs.
I may look young, Andrew, but I'm not that young. I remember Blockbuster.
Yeah. Well, you remember Blockbuster, right. I mean, let's look at the history of that. You're exactly right. First, we used to drive to Blockbuster to get a DVD. Then Netflix was mailing them to us, and then we got broadband, and suddenly Amazon's a studio, right? It changed everything. And speed in inference does the same thing.
16:47
When we chatted before, you gave this great equation for inference. What was the equation that you gave for inference? Because it was really helpful for me in understanding.
It begins with the following: training makes AI. That's how we make AI. And inference is how we use or consume AI. And so understanding how big the inference market is, is understanding the number of people who are going to use it, times how often they're going to use it, times how much compute each use takes. And right now we are in this rare time where the number of people using AI is growing, the frequency with which they use it is growing, and the amount of compute used in each instance of use is growing. That's why you're getting this extraordinary growth, and that's why it's off the charts right now.
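His equation, written out. Every number below is a placeholder chosen only to show the structure; none is from the conversation:

```python
# Inference demand = users x frequency of use x compute per use.
users = 1e9                 # people using AI (placeholder)
uses_per_user_per_day = 20  # how often each person uses it (placeholder)
flops_per_use = 5e12        # compute each use takes (placeholder)

daily_demand = users * uses_per_user_per_day * flops_per_use
print(f"{daily_demand:.1e} FLOPs/day of inference")

# All three factors are growing at once, so total demand compounds:
yearly_growth = 2.0 * 1.5 * 3.0   # hypothetical per-factor growth rates
print(f"{yearly_growth:.0f}x yearly growth in inference demand")
```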
17:30
When we think about the distribution of resources between training and inference, what will that look like in five years' time? Because we've seen all focus go to training. Well, not all, but a lot of focus go to training and not as much go to inference. What does that look like?
What we made in AI until the middle of 2024 was a novelty. It wasn't very useful. Late in 2024, what we made began to be useful.
What was the turning point?
If you look at the models, they became... I mean, ChatGPT was not really a technical innovation; it was a user interface invention. But it gave more people access. But we didn't really right away know what to do with it. It was cool, right? That's what I mean by novelty. It was like, whoa, this is cool. Now, if your marketing team isn't on an LLM, each person several times a day, they're not doing their job. That difference between novelty, it's cool, and this is part of everyday workflow, that's what changed, starting sometime in Q4 last year and running into this year: AI became useful, not just to a select group in Silicon Valley, but to my dad, to my brothers, to doctors, to ordinary people who aren't buried in the Silicon Valley discussion. And when you get them, then the market is ripping.
19:17
Do you not still think we are so incredibly early, though? Going back to your point about how many: in five years' time, then, where are we? Are we a hundred times bigger? Are we a thousand times bigger in terms of demand?
I think we're way over a hundred times bigger.
What does that mean in terms of what we need to equip ourselves to deliver? These are incredibly energy-utilizing.
It is incredibly difficult. Our industry consumes a lot of power.
Yeah, and a lot of water. And we're seeing that come down. But are we equipped from an energy and a data center standpoint to deliver the inference requirements for a population that is as AI-hungry as we are?
19:56
I think a couple of things. I think the first thing is to admit this is a power-intensive problem. We consume, our industry consumes, an enormous amount of power. The second thing to say is: therefore, the burden is on us to deliver exceptional value as an industry. You take both the good and the bad, right? In order to make it worthwhile from a societal perspective to expend all this power, you'd better deliver the goods. We'd better use AI to find cures for diseases. We'd better use AI to solve a bunch of different societal problems. That's the macro view. Do I think that we are equipped? I think we are in a very unusual situation in the US, where we have plenty of power, but it's in all the wrong places. We have power in Niagara. What we don't have is power where you want to build data centers, where we have good fiber. What we don't have is a national way to relax the local regulations that make getting power difficult. And so when you go to Silicon Valley, if you want to build a data center, you're dealing with local government and entrenched interests. And that is not an efficient way to decide if you want to build a power plant or put a new data center in, especially if it's large. I think those places that have ripped out some of that burden, Texas for example, are getting a huge amount of data centers built.
21:14
of data centers built. You know, when
21:16
I spoke to, you know, Johnson at
21:18
Grock before, he said there were a
21:20
huge amount of data centers being built
21:22
that were not actually really equipped properly,
21:25
and that we've seen this massive
21:27
supply of data centers that are
21:29
really come done by tourists, so
21:31
to speak, and that is a
21:33
massive problem, and that the provisioning
21:35
of these data centers isn't there.
21:37
Do you agree? A data center
21:39
is a construction project? and it's
21:41
got a design engineering component. I
21:43
think there's been a huge push
21:45
for new construction data centers. We
21:47
will see, we don't know if
21:49
they're going to be good enough. I
21:51
think many of them will be fine.
21:53
The guys who were there early were
21:55
some of the Bitcoin mining companies, Terowulf,
21:58
the guys of Crusoe, and guys in...
22:00
Europe, they were early in building buildings
22:02
near low-cost power in order to
22:04
run compute that used a lot
22:06
of power. And they are some
22:08
of the leaders now in some
22:10
of the largest projects. Now those
22:12
are certainly not tourists. Those are
22:14
extremely sophisticated data center builders. Sure,
22:16
there are some tourists, but there
22:18
are a lot of very, very
22:20
knowledgeable data center builders building huge
22:22
facilities right now. I mean gigawatt
22:24
scale facilities, both domestically and internationally.
22:26
How do you think about how the cost of inference goes down? With the surge of demand that we mentioned, you know, over 100x, does the price reduce 100x? Does it follow Moore's law continuously? How do we think about the ever-reducing price of inference?
The cost of inference is built up of several pieces, right? There's the power and space that are consumed to generate the response. That's a data center cost; that's an opex item, number one. Number two, there's the cost of the computer. We can drive down the cost of the computers with each generation by driving up their performance, etc. The other thing we can do is develop more efficient algorithms. Our AI algorithms today are not particularly efficient. There's a tremendous amount of room. A GPU, most of the time when it's doing inference, is 5 or 7% utilized. That means it's 95 or 93% wasted. Over time, I think as an industry we get better at things. We can drive the cost of compute down, we can build more efficient data centers with lower PUEs, and our algorithms will get more efficient, so that our utilizations on our now-cheaper computers are higher, so you get a higher percentage of the maximum number of FLOPs. You get more tokens per unit time for the same power.
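A small sketch of those levers combined into a cost-per-token calculation. The dollar and throughput figures are hypothetical placeholders; only the utilization percentages come from the conversation:

```python
# Cost per token = (amortized computer cost + power/space opex) / tokens produced.
capex_per_hour = 10.0        # amortized computer cost, $/hr (placeholder)
opex_per_hour = 2.0          # data-center power and space, $/hr (placeholder)
peak_tokens_per_hour = 1e9   # hardware maximum at 100% utilization (placeholder)

def cost_per_million_tokens(utilization):
    tokens = peak_tokens_per_hour * utilization
    return (capex_per_hour + opex_per_hour) / tokens * 1e6

print(cost_per_million_tokens(0.05))  # ~5% utilized, his figure for GPUs today
print(cost_per_million_tokens(0.50))  # better algorithms -> 10x cheaper tokens
```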
23:43
When you look at the inefficiency of the algorithms, as you mentioned, and what that means for the utilization of the chips, why are people suggesting that we're out of scaling laws already? That seems to suggest that there is so much room for improvement. How do you think about what you just said, in conjunction with the idea that scaling laws are hitting this asymptote point? How do you reconcile the two?
I don't think there's a lot of debate among senior ML thinkers that we have tremendous room for algorithmic improvement. I don't think there's a lot of debate there. There's even debate about whether the scaling laws are over, whether we ran out of mojo to keep making data or gathering data to fill these ever-bigger models. But OpenAI's work on o1 shows me that the scaling laws, certainly for inference, are fully functional, right? And the more compute you put on inference, the better answer you get. Many of the leading models are now MoEs. They're not presenting all of the weights to each token, and that's one way to do it: present the important stuff, not the unimportant stuff. There are other ways to do it that we will invent and learn over time. But we have human models that aren't all-to-all connected; many of our models today are all-to-all connected. That's a lot of unnecessary connections, connections that don't produce anything that we still end up doing math over.
25:08
I'm sorry, what does all-to-all connected mean?
In many of the layers in a neural network, every element is connected to every other one. That's not the way the learning actually happens. Some connections are more valuable and some are not valuable at all. Imagine you've got to read 50 books. You want to learn something; you can read all 50 books, or you could read three books that are really important, or you could read summaries of the three books that are the most important. The problem is we don't know which they are at the beginning. And there's a process that you could learn. There are things called dropout and all these other techniques that use sparsity to help solve these problems.
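A toy illustration of the MoE-style sparsity he mentions: route each token to only the top-k of several expert weight matrices, so most weights are never touched for any given token. Sizes and routing here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d, k = 8, 16, 2                      # tiny hypothetical layer

experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((d, n_experts))

def moe_forward(x):
    scores = x @ gate                           # score every expert for this token
    top_k = np.argsort(scores)[-k:]             # keep only the k best experts
    w = np.exp(scores[top_k]); w /= w.sum()     # softmax over the chosen few
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top_k))

y = moe_forward(rng.standard_normal(d))
print(y.shape)  # only 2 of 8 experts ran: 75% of expert weights untouched
```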
25:48
We are early in the evolution of AI. It plays right into this point that we'll get better at these algorithms. Transformers aren't the end of the road, right? We'll get better. Better will mean faster and more accurate and more efficient. That's what's exciting about an ever-changing industry. That's why I'm not in all these other industries that don't change quickly, that are the same nine years on as they are today.
26:12
But this show is kind of strange for me, because I speak to a lot of people, and they think about the three pillars, you know, compute, algorithms and data, and a lot of the common refrain is that actually we're very far along in all of them. That has been the refrain, and when I hear you, it's like, actually, it's very exciting.
I think they're wrong on all of its underpinnings. I think we are early in all of them.
If we just take them one by one: in five years' time, how much synthetic versus human data will be used to train models, if you were to put a percent on it?
Almost all synthetic.
And the utility value of synthetic is the same as human?
26:59
When you teach a pilot to fly in a simulator, there is a lot of potential data that isn't very useful in teaching her to fly. They spend a lot of time flying straight, doing nothing, as a pilot. Now, takeoffs and landings are where you want to spend your time, and that's why we put them in simulators; that's what we have them doing. And in simulators, we can create data where engines blow, where there are a whole set of problems where learning can take place. That's simulated data. And in the same way, as we think about creating data, whether it's for other forms of AI, what we want is the data that's hard to gather, right? Not driving straight; we've been able to do that for a decade. What we want is an unprotected left turn in the snow. It's snowing, it's hard to see, you've got an unprotected left turn: that's a difficult thing. And you want that thousands of different ways, millions of different ways. That's where the synthetic data comes along: to use it to fill in the empty parts where it's really expensive or painful to get that type of data. Think of the pilot. You want them spending a huge amount of time on things that are rare in their training. Same with a surgeon: a huge amount of time on things that are rare. Most of the time, it's carpentry, but their expertise matters only when it's rare, something happens, the unexpected occurs. That's when their mettle is shown. And we will get better at synthetic data by a great deal.
28:30
I love it. I get it from a consumer perspective and from an expectations perspective. If we move the needle on compute, algorithms and data, what does that mean for the experience of AI?
Faster and cheaper is the first answer. The second is that when things become faster and cheaper, new applications emerge. It's used everywhere, right? When computers became faster and cheaper, suddenly they were in cars, and then they were in your pocket, and then they were in your dishwasher and in your TV. The diffusion of innovation accelerates when you make things faster and cheaper.
This is Jevons paradox, and Satya's belief there now.
Yeah, I know, in the VC community you've got to cite 19th-century English economists.
I'm English, like, come on; if I'm not allowed to cite an English philosopher, what am I here for? Are you just like, oh, he's a fucking VC, he's just being like, oh, Jevons paradox and all?
That's right. It's like, make stuff cheaper and faster. There are very few examples in our industry, actually none in compute in 50 years, in which, by making things cheaper and faster, the market got smaller. The market always gets bigger. Always.
29:48
Is there a world where we move past transformers?
There is, 100%. We won't be as dependent on transformers in three years or five years as we are now. 100%. They're not the end-all, be-all.
Why is that? What will replace it, and what does that look like?
I don't know. I don't know whether they're going to be other types of models, but what I know for sure is that innovation doesn't stop. The transformer has some weaknesses that people are desperate to overcome. There's a quadratic effect in the attention head. There are all sorts of things that could be improved, but it's pretty darn good now, the best we have. And that's what you run with. You run with the best you have, and the minute it's not the best you have, you drop it in favor of the best you have.
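The "quadratic effect" he refers to: the attention-score matrix is seq_len by seq_len, so the cost of that piece grows with the square of context length. A quick illustration (the head dimension is an arbitrary example value):

```python
# FLOPs for the QK^T attention scores alone: seq_len^2 dot products,
# each a length-d_head multiply-accumulate (2 FLOPs per step).
def attention_score_flops(seq_len, d_head=64):
    return 2 * seq_len * seq_len * d_head

for n in (1_000, 2_000, 4_000):
    print(n, f"{attention_score_flops(n):.1e} FLOPs")
# 4x the context length -> 16x the attention-score FLOPs
```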
30:37
The number of innovative companies designing models is large. And what DeepSeek showed us is you don't need 5,000 people and billions of dollars of gear. You can do it with 200 smart people. More gear than DeepSeek said they had, but less gear than others had.
Were you very impressed with DeepSeek, and what impressed you most?
I think it was a result of focused engineering, and that impressed me. It was designed to be better. They weren't confused about being model intellectuals, or confused about whether it was important to break new ground; they were interested in being better. From an invention standpoint, that's a little boring. But from an engineering standpoint, that was a sweet effort. They really built a model that was just plain better at many, many things. And that's cool. I like good engineering projects. Now, that they chose to announce it right around Trump's inauguration, and the politics of it, that's all a separate matter, and we can talk about that later.
31:35
But is distillation wrong?
I'm a VC, are you kidding me? That's what we do.
Otherwise you wouldn't know anything, right?
That's exactly right. No, I don't think distillation is wrong. And if distillation is wrong, then certainly using people's copyrighted data is wrong. That's the problem. The problem is, you've got to be a little bit consistent.
Well, Sam has been, I guess, many times, and we hope he will be again. I hope he's open.
So everything that they did innovate on, OpenAI can learn from and take, too. I think there are few examples of an open-source anything having the sort of immediate impact that model had. I mean, that model had a giant impact in a technical community of really smart people. And there are very few examples of other open-source software projects that had that type of impact in that amount of time.
32:28
You know, you're in the business of betting on these guys. They ramp up and, oh look, 10,000, that's 100,000 users, it's now a million users: we'd better start a company around that, get those grad students. But this had a loud boom in the industry immediately. It was like, whoa.
The thing I have to think about as the venture investor is: where is enduring and defensible value, simply? And how do I get in early and build that over time?
In hardware.
Well, this was my question. You're just like, well, I mean, you have to be a very smart investor to do that, to be clear. But on the model side, do you think there is value, when you look at the sheer number of players with relatively comparable models?
To demonstrate enduring value, you need both immediate value and a trajectory for more.
33:18
I think the problem is, in some industries you are capable of demonstrating a leadership position for a short period of time, and then someone else, maybe the next generation, they generate the next, and the next generation the next. And I think that ends up, in the software world, being that you're competing against other people's release cadences. You're four months ahead, they're six months behind; if that's really where you are, there's not a lot of value. But if you can stay at the top over years, right, while those just behind you are changing constantly? Very large Silicon Valley companies have been built with not the most compelling technology. It might have started as the most compelling technology, and then it got to a point where it was good enough, it was easy enough to use. That's when you're at the mature market. But we're a long way from there right now. Right now we are in the early phases.
34:11
You characterize my position exactly right: data, compute, algorithms. I think we have a ton of room for improvement on all of them.
You said that compute and hardware, that's where the value is. How does that value distribution shake out? You know, we've obviously got the 800-pound gorilla that is Nvidia. How do you think about how the distribution of value shakes out in hardware and in compute over the next five years?
Historically, one of the barriers to entry was sort of the capital intensity of a project. And in the world of building chips, there are both scarce resources and expertise, and it's very expensive. Historically, it hasn't fit very comfortably in a software company, and the things that modern software companies value are not entirely conducive to chip making. So when I look down the road: who has endured in much of infrastructure tech? People who build systems have endured. There's a reason that Apple and Nvidia are among the most valuable companies on earth. What they do is hard. That's why it's worth challenging. If it weren't hard, if it wasn't enormous and difficult, why spend time being the underdog and challenging it?
A lot of people place defensibility around Nvidia's CUDA lock-in. To what extent is that real versus hype?
35:28
In inference, it's not real at
35:30
all. There's no kooda lock in
35:32
an inference. None. You can move
35:34
from open AI on an invidious,
35:36
to cerebras, to fireworks, service on
35:38
something else, to together, to perplexity
35:40
with 10 keystrokes. Anybody who actually
35:42
uses AI knows there's no kooda
35:44
lock in any inference. I think
35:46
there is, there was a fundamental
35:48
effort to disintermediate Kuda first. by
35:50
Google with tensor flow and first
35:52
by some grad students with cafe
35:54
and some of these early efforts,
35:56
but later by Google. with tensor
35:58
flow and then Facebook or matter
36:00
with pie torch. I think today,
36:02
most AI is written in pie
36:04
torch and you ought to be
36:06
able to compile it and run
36:08
it on on your hardware. Invidia
36:10
has many moats. When you are
36:12
a dominant market chair leader, that
36:14
in itself is a moat, that
36:16
you're the default solution is a
36:18
moat, that everybody learns to think
36:20
about AI in your structures. Those
36:22
are moats. The software, compilers are
36:24
hard, but they're tractable. I completely
36:26
I completely agree with you in terms of kind of being the leader being the moat in itself. It's never talked about that way. Would you put OpenAI in that same bucket?
It is the leader. Everyone's mother knows ChatGPT. Let's look at Intel, right? Until hiring Lip-Bu, Intel had made nearly a decade of catastrophic decisions. And they still own 80% of the x86 market, 75% of the market. AMD has worked up to like 25% or 30%, and after a decade of screwing up, you ask yourself: that's a moat. How big is my moat? I can make a bunch of bad decisions for a decade and only lose 20% share. That's extraordinary. The moat was just unbelievable. We'll see. I mean, I'm a huge fan of Lip-Bu's. He's an investor in our company. I wish him well, and I think if anybody can change that company, he can. But I think we rarely talk about what being the market share leader means in terms of a moat, in the right context, because as a challenger we have to think about it exactly, because it's exactly that that we need a bridge over. It's exactly these characteristics of the moat that we need to get over.
37:39
In five years' time, though, is it Uber, or is it like AWS and cloud? And what I mean by that is that cloud is an interesting market where a couple of players, several players, have relative segments, 25, 30%, and it's shared relatively evenly between them. Not exactly, but relatively. Or is it one like Uber, where Uber has 90%, Lyft is far behind, and then there are alternative providers with the other five?
I think it's going to be between those two. Five years from now, Nvidia is going to have 60, right? I think right now they have approximately all of it. I think they will come down over time.
Of Nvidia's usage, what percent will be training versus inference?
I think they will continue to have a meaningful business on both sides. I think they're exceptional at training. They're not going to roll over and play dead in inference, I think. They're a world-class company. I mean, they've had one of the great decades of any company in history, right? I mean, from 2014, they were worth, what, $10 billion, to where they are right now? It's one of the great decades in corporate history. I don't think they're going to roll over and, oh yeah, we're not going to be in the inference market. That's not going to happen. They're going to have meaningful share.
Very big companies will be made in this 100x growth. Do you think chip providers will be far larger than model providers in terms of enterprise value?
In the five-year time frame? Yes.
How does that prediction change in a different timeline?
39:06
I think in a shorter timeline... you know, when you price an option, variance and uncertainty increase the option's value. If you look at the way Black-Scholes works, or if you look at any option pricing model, uncertainty is a friend of the value of the option. And when people are paying these extraordinarily high prices for our model companies right now, I think part of that is this extraordinary uncertainty, this wild variance. And so in the shorter run, it might not be the case. But in the longer run, as markets mature, as we begin to understand the value of these models, we understand what their businesses look like, what their long-term net profitability looks like. What did Warren Buffett say about markets? In the short term they're a voting mechanism; in the long term they're a weighing mechanism. At some point, the weighing kicks in. Usually it's in the public markets. And then investors say: which is likely to give me better growth in the future?
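His option-pricing point in miniature: in Black-Scholes, holding everything else fixed, higher volatility (uncertainty) always raises the option's price. A minimal sketch with arbitrary example inputs:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def black_scholes_call(spot, strike, rate, vol, t):
    d1 = (log(spot / strike) + (rate + vol**2 / 2) * t) / (vol * sqrt(t))
    d2 = d1 - vol * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-rate * t) * norm_cdf(d2)

for vol in (0.2, 0.5, 1.0):   # same asset, rising uncertainty
    print(vol, round(black_scholes_call(100, 100, 0.03, vol, 1.0), 2))
# the call's value rises monotonically with volatility
```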
40:00
And you mentioned the word public there; I do want to hone in on your business. You're cash flow positive in a world where everyone else literally bleeds cash. Help me understand: what do you do to be cash flow positive when everyone else is bleeding or hemorrhaging cash?
Traditionally, your gross margins were a measure of your technical differentiation, right? If you're running a negative-gross-margin business, it speaks for itself: you're selling a commodity; your value creation isn't being recognized in the market. And so I think our technology is creating an opportunity for us to maintain margins where some others can't.
40:36
A lot of your revenue is concentrated in the G42 deal. To what extent is that a strength or a weakness?
It's both. The way you catch three large customers is to catch one first. The way you build three large strategic partners is to learn to be a strategic partner. That's a learned skill. We didn't arrive knowing how to be a strategic partner to G42. Now that we've worked at it and worked at it, it's a muscle we can replicate. We could be a better partner to any of a dozen different companies in the world.
What have you learned in the G42 relationship-building process that makes you a good partner in the way that you work?
We deployed tens of exaflops of compute, vastly more than anybody else that isn't AMD or Nvidia, right? I mean, a huge amount of compute. Our software has been hardened on some of the largest AI clusters in the world. We've gone through the growing pains of increasing manufacturing 2x and 5x and 2x, I mean, through unbelievable growth in manufacturing. We've worked with our supply chain partners to be sure that they're ready for this extraordinary growth. When you work with a strategic partner of this size, your organization comes out different on the other side. There are things you've learned, and there are mistakes you've made, and I hadn't done a big relationship in the Middle East. There was a huge amount to learn. I think you come out a much better company, and much better prepared to do business with a hyperscaler, to do business with another massive partner, to do business with another sovereign. It takes real work, and your team has to learn.
42:18
You said you'd come out better. Why go public when you did? When this happened, I was like, it seemed preemptive, respectfully. And my question now to companies is: why go public at all? There is so much private capital. The Collisons have shown, I think very clearly, that you can stay private for a lot longer than you planned to. Databricks has certainly shown that, right? I mean, those were historically public-market valuations; you know, the valuations that Anthropic and OpenAI and some of the others are getting are historically public-market-only valuations. And like you said, your S-1 is live; anyone can read it. I wouldn't want people reading mine.
We have nothing to hide, I think.
No, but your competitors have got asymmetric information.
Yeah, we've got asymmetric technology. To be public, you have to be ready organizationally, be ready with your processes. You need to be ready to forecast and predict, to be held accountable in a way that private companies historically haven't been. We think that there's tremendous value. We think that we will be among the first in the category. We think that some of our largest targets have a stated preference for doing business with public companies; large enterprises in the US have done that historically. Those were some of the reasons that led us to it.
43:39
How many G42 relationships, deals at that scale, will you have in the next 24 months? How fast can you ramp them?
That's a good question. Several. Those are big numbers.
So remind me, how big is the G42 deal? It's 87% of revenue, I know that.
It was big. I mean, when we announced it, some estimated it was north of a billion.
Well done. That must be a bit of a high five, wasn't it?
I think at first, yeah, there's tremendous excitement. And then there's sort of every entrepreneur's reality: I've got to make a lot more gear. I need to make it. You make a list of your top 10 vendors and you fly out to them all. You say: big orders are coming, be ready. Right? You work with all your partners to get ready, because you need to make a great deal more stuff. And that's one of the real differences between hardware and software: when we grow fast, the number of people you need to work with in your supply chain, and the amount of collaboration that needs to happen, is truly extraordinary.
44:47
Are you going to have a clusterfuck of unhappy customers who bluntly have waited so long for chips that by the time they get them, the chips are outdated, and they're going, what?
All of that's an opportunity for us and others. That's opportunity. I think being a market share leader isn't easy either, but when the bully falls, everybody wants to give him a kick. I mean, a lot of that happened at Intel. They'd been the dominant player, and when they fell, everybody was happy to jump in and kick them when they were down. I think there is a real opportunity in the potential for Nvidia customer unhappiness, for sure, for those of us who are competing with them. I mean, if you can't get your gear, you may as well test somebody else's. That's a huge opening.
Head over to Cerebras and use the promo code HARRY20 for your chips today.
Do that. There we go. I'm here for you, baby.
Influencer mode turned on. Yeah, yeah.
No, no, it's fine. I hope we could do a 20% take on the billion deal. I think that's fine. That's fine. I know this venture business has been so good to you, Harry, and you've got to get shoes for your kids and the like, and yeah, we're happy to donate to the Harry cause.
A 400-million fund and fees. That's right. And I have no kids as well.
Two and twenty is a rough way to make a living, Harry. Don't knock it, okay?
46:21
You're in hardware; you said it yourself about the complexity of hardware. On export controls being implemented properly: do you think that is a good idea? You know, everyone was going, with DeepSeek, wow, how did this happen? They must have stolen chips. How could this be? It turns out that they probably did use chips in Singapore.
I think the following. I think managing software and managing hardware compliance are extremely different things, because their vector of diffusion is different. There are different weights. If you sell a server that weighs five or six hundred pounds and arrives on a pallet, you can go visit it. You want to deploy it in Kazakhstan? You can put it in a data center, and you can have somebody from the embassy visit it, take photos of it once a month. It's not going anywhere. You can keep track of who uses it and provide logs. That's much, much harder with software. And open source is a whole other matter. That's the first observation. The second is that I got to know the leadership in Commerce in the previous administration. I didn't always agree with their policies, but it is a world of unintended consequences. You sought to limit Chinese access to EDA tools to delay the growth of a Chinese chip market, and so US venture capitalists backed tons of Chinese companies in Shenzhen to build EDA tools. Right? This is an unbelievably slippery, dynamic, challenging problem. I don't know if it's a tractable problem; to delay another nation's progress on a technical trajectory is an enormously challenging thing. I certainly came to appreciate just how difficult it was for well-meaning people to predict the impact of policy during the last two years, for sure.
48:03
Do you think this administration is better for AI than the prior administration?
I don't think there's any doubt that's the case. The past administration lined itself up against Big Tech. That was a mistake. AI is also in a different place, so it's easier to be for it. It's less scary now than it was. We sort of have a better picture of the trajectory, both the risks and the benefits. This administration sort of had the foresight to put in place an AI czar, or leader, to be a focal point for discussions. Yeah, I think it's probably a fair bit better.
48:41
You said it's very challenging to kind of hinder a nation's development, adoption, progression of a technology. Respectfully, you chose to not sell to China.
Yeah.
Why was that? And does that not go against the difficulty in hindering progression?
No. I have a very simple rule, and I encourage the team to use it. I mean, you don't need a big handbook to help you make good decisions in a company. Just ask yourself: would my mother be proud? Would you be proud if I did this? Would you be proud if I explained it to my mother? And that's a moral compass.
What do you mean? It wouldn't have been used for good?
To do facial recognition to identify minorities for persecution, to build military equipment, to things that I either couldn't see or that, when I saw them, didn't feel right. It's more important than money.
49:42
we fundamentally underestimate the Chinese's capabilities?
49:45
100%, and it is one of
49:47
the most obvious and frequent errors
49:49
in judgment: you underestimate
49:51
the other side. You have to
49:53
look carefully at what they're doing
49:55
and their investment in infrastructure has
49:57
been extraordinary. The rate at which
49:59
they generate engineering talent is exceptional.
50:02
The government's ability to have a
50:04
policy and implement it, and that's
50:06
not a democracy. They weren't designed
50:08
to have checks and balances there.
50:10
The funding that flowed into the
50:12
development of AI technology, that their
50:14
venture capitalists were backed up by
50:16
their government, they have national champion
50:19
companies, that they've developed a Belt
50:21
and Road strategy to sort of
50:23
make much of the third world
50:25
dependent on them and their technologies.
50:27
I think they absolutely should not
50:29
be underestimated. They have a lot
50:31
of people, and we see a
50:33
tiny fraction of it. They have
50:36
produced industrial policy that has moved
50:38
their nation forward. What was the
50:40
most significant, do you think? The
50:42
creation of economic zones like Shenzhen
50:44
was clearly a visionary move. They
50:46
knew that their own system was
50:48
in the way. They created zones
50:50
that relaxed their own system. Could
50:53
the US learn from them that
50:55
way? We did some of the
50:57
same things in the first Trump administration,
50:59
right? What did we do? We
51:01
relaxed our own rules in the
51:03
development of vaccines. We knew that
51:05
in this time, it would be
51:07
very difficult to go through the
51:10
steps that we always go through,
51:12
and we tried to implement some
51:14
thoughtful workarounds. I think that,
51:16
you know, why are they committed
51:18
to trains as a mode of transportation,
51:20
and we can't build a decent
51:22
train system in the US, or
51:24
in California, or why we have
51:27
three different standards for train rails,
51:29
and the rest of the world
51:31
can build extraordinary high-speed trains linking
51:33
important cities. What are we doing
51:35
wrong in the building of our
51:37
infrastructure that our bridges and our
51:39
freeways are in disarray? Those are
51:41
questions we got to ask ourselves
51:44
when we see other people doing
51:46
it differently. If you watch a
51:48
good football team and you say,
51:50
whoa, that's an interesting offense. Aren't
51:52
you thinking to yourself, how
51:54
could our team learn? What could
51:56
we do? Why did that work?
51:58
Was it the structure or something that
52:01
made that a successful series of
52:03
plays? And what can I take
52:05
away from that? How can that
52:07
inspire me to do better? I'm
52:09
always looking for inspiration in others
52:11
and competitors and partners. We have
52:13
some of our partners in G42.
52:15
I mean, the work ethic is
52:18
unbelievable. It inspires me. And the
52:20
scope of the challenge they've undertaken
52:22
inspires me. And I think I'm
52:24
always looking for that. Andrew, I
52:26
could talk to you all day.
52:28
I do want to do a
52:30
quick-fire with you, so I say
52:33
a short statement. You're ready? Yeah, sure.
52:35
What do you believe that most around
52:37
you disbelieve? I think we're closer to
52:39
peace in the Middle East than people
52:41
believe. There is a rise
52:44
of a moderate, business-focused Arab state
52:46
that wasn't there 25 or 30
52:48
years ago. If you visit the UAE
52:50
or Qatar or even KSA, what
52:53
you see is amazing transformation, a desire
52:55
to be included in the West
52:57
in their own way, but also to
52:59
enjoy the benefits of it. We
53:02
are closer than people think. What's
53:04
the most underrated threat to Nvidia's
53:06
market share dominance? The fundamental architecture
53:08
of the GPU with off-chip memory
53:11
is not great for inference. Now
53:13
they will continue to do well
53:15
in inference, but it can be
53:18
beaten, and I think they know
53:20
it. What's a crazy AI prediction
53:22
you have that most people
53:24
would call science fiction? Dario at
53:26
Anthropic's, that we'll live to
53:29
150? I don't think we're going to
53:31
live to 150. I don't think
53:33
that 90% of our code will be
53:35
written by machines this year. But
53:37
I do think that within a year
53:40
or two, most people in the US
53:42
will engage with an AI every single
53:44
day in one form or another, whether they
53:46
know it or not. That AI might
53:48
be in their mapping program that helps
53:51
them pick a better route to work.
53:53
It might be any number of different
53:55
things within a year or two. AI's
53:57
penetration will be approximately the same as
54:00
telephones. What have you changed
54:02
your mind on in the last 12
54:04
months? Many decisions I made turned out
54:06
to be wrong. What was the most
54:09
wrong decision? There are two ways you
54:11
can be wrong. You can actively be
54:13
wrong or you can fight against what
54:15
was right. In 2016, JP, one of
54:18
our co-founders and Chief System Architect, laid
54:20
out a plan that would have us
54:22
doing water cooling for our systems.
54:25
Nobody else was doing it and I
54:27
fought so hard and I was so
54:29
wrong. JP was right. About a year
54:32
or two later, Google announced that the
54:34
TPUs were going to be water-cooled. We
54:36
were first, and now Nvidia is only selling
54:38
water-cooled parts. I mean, I was dead
54:41
wrong, and JP was right. Many, many
54:43
instances when you make a lot of
54:45
decisions every day, where you're wrong. I've
54:48
been wrong about people. People I thought
54:50
were pretty good turned out to be
54:52
extraordinary. People I thought would be extraordinary
54:55
were really smart, but couldn't finish projects
54:57
and get stuff done. If you're not wrong a fair
54:59
bit, you ought not to be making
55:01
a lot of decisions, because it comes
55:04
with the territory. As a venture capitalist, I'm
55:06
never wrong, so I don't know. As
55:08
a venture capitalist, you're wrong nine times
55:11
in ten and everybody forgets as long
55:13
as you're really right. And I get
55:15
a picture of you signing the term
55:18
sheet with me and then it goes
55:20
public, and then I go... And I
55:22
think yours is a perfect industry in
55:25
which nobody cares about the average, you're
55:27
wrong all the time. And what they
55:29
care about is the occasional time you're
55:31
really right. That's what moves a fund.
55:34
That's different than being a CEO. I
55:36
think we got to be mostly right
55:38
most of the time, but if you're
55:41
making a lot of decisions, you're still
55:43
making a ton of mistakes. This is
55:45
your fifth startup. I mean, you are
55:48
a sucker for punishment, aren't you? I
55:50
mean, really, like five times. Like, Christ
55:52
Andrew, did you not get beaten alive
55:54
enough? My question to you though is
55:57
like, I believe in the value of
55:59
serial entrepreneurship. Don't! How do you think
56:01
about the inherent benefits that you have
56:04
having done it four times before? I
56:06
think if you are in a business
56:08
in which running a business... is a
56:11
benefit, then experience matters a great deal.
56:13
I think if you are in a
56:15
business in which you look like your
56:17
customer, there was a reason why social
56:20
networks were started by people right out
56:22
of college or in college, because dating
56:24
is top of their mind. And they
56:27
looked like their customers. And that was
56:29
more important than knowing anything about running
56:31
a business. In that environment, it will
56:34
certainly select for people who... are of
56:36
the demographic that their customers are. They
56:38
know that backwards and forwards. But if
56:40
you want to have a business that has
56:43
manufacturing in it, it has a supply
56:45
chain that has you managing hundreds or
56:47
thousands of engineers to a timeline, to
56:50
a schedule, I don't think anybody would
56:52
turn around your statement and say with a
56:54
straight face, you know what, what I'm
56:57
looking for is an engineering leader with
56:59
no experience. Right, no, and I don't
57:01
want somebody who's led a team of
57:04
four or five hundred who has experienced
57:06
the challenges of growth. What I'm looking
57:08
for is somebody with no experience. Naivety
57:10
is a bonus here. Yes, right. I
57:13
think the people who sell that sometimes
57:15
are consultants, right? Oh, look, my guys
57:17
have no experience in your industry. They're
57:20
not biased. Maybe a little bit of
57:22
experience in the industry would help, right?
57:24
Come on. Where are people investing today
57:27
in AI? Why is so much cash
57:29
going to that part? I'm not saying
57:31
that company I don't... I think part
57:33
of the dynamic in your industry is
57:36
sometimes money needs to find a home.
57:38
Some guys have raised really, really big
57:40
funds and they got to find a
57:43
home for their money. And some people
57:45
don't like to be left out and
57:47
they're willing to make investments for maybe
57:50
for some status purposes or other reasons
57:52
that don't seem to make sense.
57:54
There are some underappreciated places for investment. I'd
57:56
say, in the chip world, the sub-milliwatt,
57:59
really tiny little chips that live
58:01
next to sensors and do inference. These
58:03
tiny little things that will only
58:06
send back useful data are an interesting
58:08
market, and they will sell enormous volume.
58:10
It's not a part of the market
58:13
I love to play in. I like
58:15
to build bigger things and sell them
58:17
to the data center, but I think
58:20
that part is extremely interesting. I think
58:22
they'll be fundamental for robotics. That's an
58:24
area that's extremely underappreciated. Final one, if
58:26
we think about Cerebras in 10 years
58:29
time, where do you envision the business
58:31
in 10 years time? If everything goes
58:33
well, where are we as a business having
59:36
that conversation? So 10 years ago,
59:38
Nvidia was worth $10 billion. That's a
58:40
long run in our world right now.
58:43
I think in three to five years,
58:45
I would like our technology to have
58:47
been used to solve two important societal
58:49
problems. I would like it to be
58:52
used to have found a therapeutic for
58:54
an affliction that impacts more than a
58:56
million people here. I would like our
58:59
inference to be powering a collection of
59:01
apps that don't exist today. And I
59:03
would like that a meaningful portion of
59:06
the population in the US and in
59:08
Europe inadvertently uses our technology. So they use
59:10
something that we power and they
59:12
don't even know it. I think those
59:15
are things that would make me really
59:17
happy. And I've wanted to make this
59:19
show happen for a long time. As
59:22
I said, I heard so many good
59:24
things from Harry for many years. There's
59:26
been so many requests to have you
59:29
on the show. My team is just
59:31
like, just get Andrew on the show,
59:33
Harry. I'm like, okay, okay. I like
59:36
tweeted it, obviously, which is how we
59:38
got this. But thank you for joining
59:40
it. 40 people sent me a note
59:42
saying, how come you're avoiding Harry? How
59:45
come he has to go tweet it?
59:47
I was just like, all right, just
59:49
call me, it's good, send me a note,
59:52
happy to come on. Really thoughtful questions,
59:54
Harry, really thoughtful and interesting, a really
59:56
fun conversation. So I have wanted to
59:59
do that show for a while, but
1:00:01
frankly I was just blown away by
1:00:03
Andrew's humility, his no-BS approach. He
1:00:05
was incredible to work with in the process,
1:00:08
and I just so appreciate his time.
1:00:10
If you want to watch the
1:00:12
full episode, you can find it on
1:00:15
YouTube by searching for 20 VC, that's
1:00:17
20 VC on YouTube.
1:02:21
And don't forget to revolutionize how your
1:02:24
team works together. Roam. A company of
1:02:26
tomorrow runs at hyper speed with quick
1:02:28
drop-in meetings. A company of tomorrow is
1:02:31
globally distributed and fully digitized. The company
1:02:33
of tomorrow instantly connects human and AI
1:02:35
workers. A company of tomorrow is in
1:02:37
a Roam virtual office. See a visualization
1:02:40
of your whole company, the live presence,
1:02:42
the drop-in meetings, the AI summaries, the
1:02:44
chats. It's an incredible view to see.
1:02:47
Roam is a breakthrough workplace experience loved
1:02:49
by over 500 companies of tomorrow for
1:02:51
a fraction of the cost of Zoom
1:02:54
and Slack. Visit Roam, that's RO.AM, for
1:02:56
an instant demo of Roam today. Nobody
1:02:58
knows what the future holds, but I
1:03:00
do know this. It's going to be
1:03:03
built in a Roam virtual office, hopefully
1:03:05
by you. That's Roam, R-O-A-M, for an
1:03:07
instant demo. As always, I so appreciate
1:03:10
all your support and stay tuned for
1:03:12
a fantastic episode coming on Wednesday, with
1:03:14
I think one of the most under-discussed
1:03:17
firms in venture capital, Lead Edge Capital,
1:03:19
and their founder Mitchell.