Episode Transcript
0:00
Our AI algorithms today are not particularly efficient. A GPU, most of the time when it's doing inference, is 5 or 7% utilized. That means it's 95 or 93% wasted. We won't be as dependent on transformers in three years or five years as we are now. 100%. The fundamental architecture of the GPU with off-chip memory is not great for inference. Now, they will continue to do well in inference, but it can be beaten, and I think they know it.
0:29
This is 20VC with me, Harry Stebbings. Now, we did a show with Jonathan Ross at Groq and it blew all numbers out of the water. Millions of plays, everyone loved it, and everyone said that we had to get Andrew Feldman from Cerebras on the show. So I'm so excited to make this episode happen today. Joining us in the hot seat is Andrew Feldman, co-founder and CEO of Cerebras, the fastest AI inference and training platform in the world. Now, in September 2024, the company filed to go public off the back of a rumoured one-billion-dollar deal with G42 in the UAE in the inference market. Andrew is the leading expert for all things inference. This show was incredible. I have the best job in the world: I sit down with the smartest people and learn from them, and this show is exactly that. But before we dive in today...
1:18
Turning your back-of-a-napkin idea into a billion-dollar startup requires countless hours of collaboration and teamwork. It can be really difficult to build a team that's aligned on everything from values to workflow, but that's exactly what Coda was made to do. Coda is an all-in-one collaborative workspace that started as a napkin sketch. Now, just five years since launching, Coda has helped 50,000 teams all over the world get on the same page. Now, at 20VC, we've used Coda to bring structure to our content planning and episode prep, and it's made a huge difference. Instead of bouncing between different tools, we can keep everything from guest research to scheduling and notes all in one place. It saves us so much time. With Coda you get the flexibility of docs, the structure of spreadsheets, and the power of applications, all built for enterprise. And it's got the intelligence of AI, which makes it even more awesome. If you're a startup team looking to increase alignment and agility, Coda can help you move from planning to execution in record time. To try it for yourself, go to coda.io/20VC today and get six free months of the team plan for startups. That's coda.io/20VC to get started for free and get six free months of the team plan.
2:36
Now that your team is aligned and collaborating, let's talk expense reports. You know, those receipts that seem to multiply like rabbits in your wallet, the endless email chains asking, can you approve this? Don't even get me started on the month-end panic when you realize you have to reconcile it all. Well, Pleo offers smart company cards, physical, virtual and vendor-specific, so teams can buy what they need while finance stays in control. Automate your expense reports, process invoices seamlessly and manage reimbursements effortlessly, all in one platform. With integrations to tools like Xero, QuickBooks and NetSuite, Pleo fits right into your workflow, saving time and giving you full visibility over every entity, payment and subscription. Join over 37,000 companies already using Pleo to streamline their finances. Try Pleo today. It's like magic, but with fewer rabbits. Find out more at pleo.io/20VC.
3:26
And don't forget to revolutionize how your team works together. Roam. A company of tomorrow runs at hyperspeed with quick drop-in meetings. A company of tomorrow is globally distributed and fully digitized. The company of tomorrow instantly connects human and AI workers. A company of tomorrow is in a Roam virtual office. See a visualization of your whole company: the live presence, the drop-in meetings, the AI summaries, the chats. It's an incredible view to see. Roam is a breakthrough workplace experience loved by over 500 companies of tomorrow, for a fraction of the cost of Zoom and Slack. Visit Roam, that's ro.am, for an instant demo of Roam today. Nobody knows what the future holds, but I do know this: it's going to be built in a Roam virtual office. Hopefully by you. That's Roam, ro.am, for an instant demo. You have now arrived at your destination.
4:20
Andrew, it is such a pleasure to meet, man. I've wanted to do this one for a while. I've heard so many good things from Eric for a long time. So thank you so much for joining me.
Harry, thank you for having me. I appreciate it.
Not at all, this will be a fantastic conversation. I have my pen ready. I feel like this is going to be a learning experience for me.
4:45
I want to go back to 2015. What did you and the team see in the AI landscape in 2015 that led to the founding of Cerebras?
This is every computer architect's dream: we saw a new problem to solve. What that means is maybe you can build a new machine better suited to that problem. And so in 2015, and the credit goes to Gary and Sean and JP and Michael, my co-founders, they saw on the horizon the rise of AI. And what that meant was there'd be a new problem for computers: what the AI software would ask from the underlying chip, the processor, would be different. We came to believe that we could build a better machine for that problem. That's what we saw. You know, obviously we didn't see it exactly right. I underestimated it. You know, this is my fifth startup, and it's the first time I underestimated the size of the market, by a lot. But what we did get right was that this was going to be big, and that it would put a different type of pressure on a processor: that it would put pressure on the memory bandwidth, that it would put pressure on the communication structure. That's what we saw, we dove in, and it's been an extraordinary nine years.
6:02
How does the movement into an age of AI change the requirements, from a chip perspective, of what is needed from a provider, and how did that then shape how you built Cerebras?
The way to think about a chip is that it does two things: it does calculations and it moves data. This is what a chip does; sometimes, along the way, it stores data. And so what AI presented was a very unusual combination of challenges. First, the underlying calculation is trivial. It's a matrix multiplication, and an FMAC, a fused multiply-accumulate unit, can be developed by any second-year electrical engineering student. So you say to yourself, holy cow, this has a huge number of very, very simple calculations. The hard part of the AI work is that results and intermediate results have to be moved a lot. Therein is the most complicated part. They have to be moved to memory and from memory, and they have to be broken up and moved among GPUs. And what we saw was that this was going to be the hard problem, and that if we could solve that problem, we would build an AI computer that was faster and used less power.
7:12
When we think about how we're going to build and what we're building for, to me there are a couple of core elements, which is like: where are you going to focus? Are you focusing on fine-tuning? Are you focusing on training? Are you focusing on inference?
Three.
You chose all three.
Yeah.
Why? And I'm sorry for my basic questions, but I thought GPUs were specialized towards training and weren't specialized towards inference. Can you have a mono-architecture that does all three best?
7:44
The first step in computer architecture is deciding what you're not going to do. What are we not going to be good at is really the first important question. To answer your question, you say: is the computational work for training from scratch different from fine-tuning? And the answer is it's not different. It's approximately the same. Now, inference and training have some different requirements. And generative inference in particular has some very challenging requirements on exactly the communication dimension that I mentioned. In generative inference, you have to move all the weights from memory to compute to generate a single word. And you have to move them again to generate the next word. And again. So if you have a 70-billion-parameter model, not a giant model, and each weight is 16 bits, you're moving, what, 140 gigabytes of data to generate one word. This is an enormous amount of data movement across memory, and what it consumes, what it needs, is memory bandwidth. If you have an architecture like we saw in the GPU, that is your fundamental limitation.
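Here is that arithmetic as a back-of-the-envelope sketch. The 70B and 16-bit figures are his; the bandwidth number is an illustrative assumption, not a quoted spec:

```python
# Why generative inference is memory-bandwidth bound: every weight must
# stream from memory once per generated token.
params = 70e9          # 70-billion-parameter model
bytes_per_weight = 2   # 16-bit weights

bytes_per_token = params * bytes_per_weight
print(f"{bytes_per_token / 1e9:.0f} GB moved per generated token")    # 140 GB

bandwidth = 3.3e12     # bytes/sec, assumed HBM-class figure for illustration
print(f"<= {bandwidth / bytes_per_token:.1f} tokens/sec upper bound")  # ~23.6
```

Even ignoring compute entirely, memory bandwidth alone caps single-stream generation speed.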
8:51
That is what we went to wafer scale to solve. They use a memory called HBM, a type of DRAM. It is phenomenal memory, but it's slow and high-capacity. And when they set the architecture for graphics, that's what you wanted: you didn't have to go back and forth to memory very often. SRAM, on the other hand, is unbelievably fast but has low capacity. And so we wanted to use SRAM, but if you build a normal-sized chip, you can't hold a model. And so by going to wafer scale, we were able to put down a huge amount of SRAM and get the benefits of speed and enough capacity. If you build a normal-sized chip with SRAM and you want to do a 400-billion-parameter model in inference, you might need 4,000 chips, or if you want to do DeepSeek 671B, you might need six or eight thousand chips. What an administrative nightmare. And if you can keep as much as you can on one wafer, or two wafers, or four, or ten, you get all the benefit of the SRAM, and because you've been able to use the whole wafer, you get tremendous capacity as well.
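A rough sketch of where those chip counts come from, assuming around 200 MB of SRAM per normal-sized die (a plausible placeholder, not a vendor spec) and 16-bit weights:

```python
import math

# How many SRAM-only chips does it take just to hold the weights on-chip?
def chips_needed(params, bytes_per_weight=2, sram_per_chip=200e6):
    return math.ceil(params * bytes_per_weight / sram_per_chip)

print(chips_needed(400e9))  # 4000 chips for a 400B-parameter model
print(chips_needed(671e9))  # 6710 chips for a 671B-parameter model
```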
10:01
First, I totally get you on HBM and kind of the slowness of it. Why is it, then, that bluntly so much of the market just continues to use it, and 40% of Nvidia's revenue is these chips being used for inference?
Unless you went to wafer scale, there wasn't really a credible other choice. This is the way GPUs had always been made. It's called the graphics processing unit; that's the way they were built. It was part of their advantage against a CPU and dedicated chips like ours. What used to be their advantage is now a weakness. That's a fun market to be in, when over a very short period of time, what you're good at becomes your weakness.
With a market cap like they have, and with Jensen as good as he is, which I'm sure we both agree with, they must know.
They do know this. A, they don't make memory, so they're a consumer of other people's memory. And that's SK Hynix, or Samsung, or Micron. There are only four or five companies that make huge amounts of memory. Not many choices. But it's part of a complex architectural trade-off. On the flip side, you could say it's worked really well for them, right? Look at where it's taken them. But in comparison to those of us who do wafer scale, it's a small set. It's a set of one: us. We have a real advantage against them on inference.
11:21
How do LPUs fit into this mix?
In our business, there are a lot of ways to skin a cat. Our way is different from Nvidia's way. It's different from the TPU. It's different from Trainium. They're different. Right now, and every day since August 26th, when we launched inference, our way has been the fastest way across a whole set of models tested by Artificial Analysis and others.
11:42
Can I ask, when we think about that speed: I am interested, you said that you're one of one with wafers and the architecture associated. What does that mean in terms of cost? With such efficiency, is it inherently more expensive? And what does that look like from a cost profile?
This isn't our first dance. We've been building computers for a long time. When you make a choice like wafer scale, you have to weigh the trade-offs. We use less power, because one of the most power-hungry things on a chip is the IOs, moving data off-chip. And so if you are moving data off-chip frequently, you're using more power than if you can keep it in the silicon domain, on-chip. So we knew we would use less power. We knew that if you went to wafer scale, you had to solve some problems that people said were impossible to solve, like yield. So we had to invent techniques that allowed us to yield wafers. In fact, we invented techniques that allow us to yield as well as or better than others who were building much smaller chips.
12:42
What is yield, and why is it impossible to solve?
A wafer begins as a 12-inch-diameter circular slice of silicon, and your chip is punched out of this, the way your mother might take a cookie cutter and cut out cookie dough. During the process, at some point, just like your mom might have done, she lifts up the edges and all the little bits are removed, and what's left are just the cookies. Those are your chips. Now, what happens is there are a set of naturally occurring flaws. And that's like your mother closing her eyes and throwing a handful of M&M's. Now, the bigger the cookie, the higher the probability you hit an M&M; the bigger the chip, the higher the probability that you have a flaw. And traditionally, what you did when you had a flaw was you threw away the chip, or you sold it as a less valuable part. You shut down part of the chip and sold it as a less valuable part, something called binning. So every wafer is going to have flaws. The bigger your chip, the higher the probability you hit a flaw, and the more silicon is wasted when you throw it away. This is what everybody thought was known truth. And one of the things our team realized was that there are other ways to handle flaws. What if instead you built your computer, you built your processor, out of hundreds of thousands of identical tiles? Say there was a flaw: say you just shut down that tile and worked around it. Say you had a row or a column of redundant tiles that, when you needed them, you could just pull in. Now, that had traditionally been the technique used in memory making, and the memory yields are extraordinary. And so it occurred to us that if we could build a computer, build a processor, out of hundreds of thousands of identical tiles, we could use redundancy such that when there was a flaw, we could just leave it there, shut it down, work around it, and pull in one of the redundant tiles. And that had never been done in a computer before, and that's at the heart of our architecture. That allowed us to yield and deliver whole wafers. Nobody had ever been able to do that in the 70-year history of our industry. Really, really smart people struggled. I mean, Gene Amdahl, one of the fathers of our industry, had a company called Trilogy that crashed and burned trying to do this. And we figured it out.
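A toy sketch of the redundancy idea he describes: a grid of identical tiles with a spare column, where a flawed tile is disabled and the logical tiles in that row shift over into the spare. The grid sizes and mapping scheme here are invented for illustration, not Cerebras's actual design:

```python
ROWS, COLS, SPARE_COLS = 4, 8, 1   # hypothetical tile grid with one spare column

def physical_column(row, logical_col, flaws):
    """Walk physical columns, skipping disabled (flawed) tiles, until the
    requested logical column lands on a healthy tile."""
    healthy = -1
    for col in range(COLS + SPARE_COLS):
        if (row, col) not in flaws:
            healthy += 1
            if healthy == logical_col:
                return col
    raise RuntimeError("more flaws in this row than spare capacity")

flaws = {(2, 3)}                          # one defective tile at row 2, column 3
assert physical_column(2, 3, flaws) == 4  # that row shifts into the spare column
assert physical_column(1, 3, flaws) == 3  # unaffected rows map straight through
```

The wafer still works; one flawed tile costs one spare instead of the whole part.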
14:55
When you speak about kind of being the fastest, and across all benchmarks being the fastest, what matters the most? Is it being the most efficient? Is it being the least costly?
If you go to get a cancer diagnosis on, God forbid, your mother or your wife, I think 93% accuracy is just plain not as good as 94% accuracy, and you'd pay a lot and wait another week for that extra accuracy, right? You'd pay a lot. Now, on the other hand, if you want Llama 405B to generate data to help you tune Llama 70B, maybe you can wait a few days, three days, a week more. There's no urgency there. On the other hand, if you want an answer from Perplexity, you don't want to wait 45 seconds for a search answer. You don't want to wait three minutes for R1 on GPUs to give you an answer. What we know is that in interactive mode, milliseconds matter. In interactive mode, what work at Google years ago showed was that you can destroy your user's attention with milliseconds of delay. So being the fastest matters for everything in that domain. So I think what you have to do is sort of be thoughtful and say, in some cases being the fastest doesn't matter. We'll call those batch; maybe the cheapest matters there. In other domains, there is no search if you've got to wait eight minutes to get an answer. That's not a product.
16:17
When you go fast, a whole set of new opportunities opens up. Netflix used to mail DVDs. That's what happened when the internet was slow: they mailed DVDs.
I may look young, Andrew, but I'm not that young. I remember Blockbuster.
Yeah. Well, you remember Blockbuster, right. I mean, let's look at the history of that. You're exactly right. First, we used to drive to Blockbuster to get a DVD. Then Netflix was mailing them to us, and then we got broadband, and suddenly Amazon's a studio, right? It changed everything. And speed in inference does the same thing.
16:47
When we chatted before, you gave this great equation for inference. What was the equation that you gave for inference? Because it was really helpful for me in understanding.
It begins with the following: training makes AI. That's how we make AI. And inference is how we use or consume AI. And so understanding how big the inference market is, is understanding the number of people who are going to use it, times how often they're going to use it, times how much compute each use takes. And right now we are in this rare time where the number of people using AI is growing, the frequency with which they use it is growing, and the amount of compute used in each instance of use is growing. That's why you're getting this extraordinary growth, and that's why it's off the charts right now.
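His equation, written out. Every number below is a placeholder chosen only to show the structure; none is from the conversation:

```python
# Inference demand = users x frequency of use x compute per use.
users = 1e9                 # people using AI (placeholder)
uses_per_user_per_day = 20  # how often each person uses it (placeholder)
flops_per_use = 5e12        # compute each use takes (placeholder)

daily_demand = users * uses_per_user_per_day * flops_per_use
print(f"{daily_demand:.1e} FLOPs/day of inference")

# All three factors are growing at once, so total demand compounds:
yearly_growth = 2.0 * 1.5 * 3.0   # hypothetical per-factor growth rates
print(f"{yearly_growth:.0f}x yearly growth in inference demand")
```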
17:30
When we think about the distribution of resources between training and inference, what will that look like in five years' time? Because we've seen all focus go to training. Well, not all, but a lot of focus go to training and not as much go to inference. What does that look like?
What we made in AI until the middle of 2024 was a novelty. It wasn't very useful. Late in 2024, what we made began to be useful.
What was the turning point?
If you look at the models, they became... I mean, ChatGPT was not really a technical innovation; it was a user interface invention. But it gave more people access. But we didn't really right away know what to do with it. It was cool, right? That's what I mean by novelty. It was like, whoa, this is cool. Now, if your marketing team isn't on an LLM, each person several times a day, they're not doing their job. That difference between novelty, it's cool, and this is part of everyday workflow, that's what changed, starting sometime in Q4 last year and running into this year: AI became useful, not just to a select group in Silicon Valley, but to my dad, to my brothers, to doctors, to ordinary people who aren't buried in the Silicon Valley discussion. And when you get them, then the market is ripping.
19:17
Do you not still think we are so incredibly early, though? Going back to your point about how many: in five years' time, then, where are we? Are we a hundred times bigger? Are we a thousand times bigger in terms of demand?
I think we're way over a hundred times bigger.
What does that mean in terms of what we need to equip ourselves to deliver? These are incredibly energy-utilizing.
It is incredibly difficult. Our industry consumes a lot of power.
Yeah, and a lot of water. And we're seeing that come down. But are we equipped from an energy and a data center standpoint to deliver the inference requirements for a population that is as AI-hungry as we are?
19:56
I think a couple of things. I think the first thing is to admit this is a power-intensive problem. We consume, our industry consumes, an enormous amount of power. The second thing to say is: therefore, the burden is on us to deliver exceptional value as an industry. You take both the good and the bad, right? In order to make it worthwhile from a societal perspective to expend all this power, you'd better deliver the goods. We'd better use AI to find cures for diseases. We'd better use AI to solve a bunch of different societal problems. That's the macro view. Do I think that we are equipped? I think we are in a very unusual situation in the US, where we have plenty of power, but it's in all the wrong places. We have power in Niagara. What we don't have is power where you want to build data centers, where we have good fiber. What we don't have is a national way to relax the local regulations that make getting power difficult. And so when you go to Silicon Valley, if you want to build a data center, you're dealing with local government and entrenched interests. And that is not an efficient way to decide if you want to build a power plant or put a new data center in, especially if it's large. I think those places that have ripped out some of that burden, Texas for example, are getting a huge amount of data centers built.
21:14
of data centers built. You know, when
21:16
I spoke to, you know, Johnson at
21:18
Grock before, he said there were a
21:20
huge amount of data centers being built
21:22
that were not actually really equipped properly,
21:25
and that we've seen this massive
21:27
supply of data centers that are
21:29
really come done by tourists, so
21:31
to speak, and that is a
21:33
massive problem, and that the provisioning
21:35
of these data centers isn't there.
21:37
Do you agree? A data center
21:39
is a construction project? and it's
21:41
got a design engineering component. I
21:43
think there's been a huge push
21:45
for new construction data centers. We
21:47
will see, we don't know if
21:49
they're going to be good enough. I
21:51
think many of them will be fine.
21:53
The guys who were there early were
21:55
some of the Bitcoin mining companies, Terowulf,
21:58
the guys of Crusoe, and guys in...
22:00
Europe, they were early in building buildings
22:02
near low-cost power in order to
22:04
run compute that used a lot
22:06
of power. And they are some
22:08
of the leaders now in some
22:10
of the largest projects. Now those
22:12
are certainly not tourists. Those are
22:14
extremely sophisticated data center builders. Sure,
22:16
there are some tourists, but there
22:18
are a lot of very, very
22:20
knowledgeable data center builders building huge
22:22
facilities right now. I mean gigawatt
22:24
scale facilities, both domestically and internationally.
22:26
How do you think about how the cost of inference goes down? With the surge of demand that we mentioned, you know, over 100x, does the price reduce 100x? Does it follow Moore's law continuously? How do we think about the ever-reducing price of inference?
The cost of inference is built up of several pieces, right? There's the power and space that are consumed to generate the response. That's a data center cost; that's an opex item, number one. Number two, there's the cost of the computer. We can drive down the cost of the computers with each generation by driving up their performance, etc. The other thing we can do is develop more efficient algorithms. Our AI algorithms today are not particularly efficient. There's a tremendous amount of room. A GPU, most of the time when it's doing inference, is 5 or 7% utilized. That means it's 95 or 93% wasted. Over time, I think as an industry we get better at things. We can drive the cost of compute down, we can build more efficient data centers with lower PUEs, and our algorithms will get more efficient, so that our utilizations on our now-cheaper computers are higher, so you get a higher percentage of the maximum number of FLOPs. You get more tokens per unit time for the same power.
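A small sketch of those levers combined into a cost-per-token calculation. The dollar and throughput figures are hypothetical placeholders; only the utilization percentages come from the conversation:

```python
# Cost per token = (amortized computer cost + power/space opex) / tokens produced.
capex_per_hour = 10.0        # amortized computer cost, $/hr (placeholder)
opex_per_hour = 2.0          # data-center power and space, $/hr (placeholder)
peak_tokens_per_hour = 1e9   # hardware maximum at 100% utilization (placeholder)

def cost_per_million_tokens(utilization):
    tokens = peak_tokens_per_hour * utilization
    return (capex_per_hour + opex_per_hour) / tokens * 1e6

print(cost_per_million_tokens(0.05))  # ~5% utilized, his figure for GPUs today
print(cost_per_million_tokens(0.50))  # better algorithms -> 10x cheaper tokens
```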
23:43
When you look at the inefficiency of the algorithms, as you mentioned, and what that means for the utilization of the chips, why are people suggesting that we're out of scaling laws already? That seems to suggest that there is so much room for improvement. How do you think about what you just said, in conjunction with the idea that scaling laws are hitting this asymptote point? How do you reconcile the two?
I don't think there's a lot of debate among senior ML thinkers that we have tremendous room for algorithmic improvement. I don't think there's a lot of debate there. There's even debate about whether the scaling laws are over, whether we ran out of mojo to keep making data or gathering data to fill these ever-bigger models. But OpenAI's work on o1 shows me that the scaling laws, certainly for inference, are fully functional, right? And the more compute you put on inference, the better answer you get. Many of the leading models are now MoEs. They're not presenting all of the weights to each token, and that's one way to do it: present the important stuff, not the unimportant stuff. There are other ways to do it that we will invent and learn over time. But we have human models that aren't all-to-all connected; many of our models today are all-to-all connected. That's a lot of unnecessary connections, connections that don't produce anything that we still end up doing math over.
25:08
I'm sorry, what does all-to-all connected mean?
In many of the layers in a neural network, every element is connected to every other one. That's not the way the learning actually happens. Some connections are more valuable and some are not valuable at all. Imagine you've got to read 50 books. You want to learn something; you can read all 50 books, or you could read three books that are really important, or you could read summaries of the three books that are the most important. The problem is we don't know which they are at the beginning. And there's a process that you could learn. There are things called dropout and all these other techniques that use sparsity to help solve these problems.
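A toy illustration of the MoE-style sparsity he mentions: route each token to only the top-k of several expert weight matrices, so most weights are never touched for any given token. Sizes and routing here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d, k = 8, 16, 2                      # tiny hypothetical layer

experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate = rng.standard_normal((d, n_experts))

def moe_forward(x):
    scores = x @ gate                           # score every expert for this token
    top_k = np.argsort(scores)[-k:]             # keep only the k best experts
    w = np.exp(scores[top_k]); w /= w.sum()     # softmax over the chosen few
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top_k))

y = moe_forward(rng.standard_normal(d))
print(y.shape)  # only 2 of 8 experts ran: 75% of expert weights untouched
```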
25:48
We are early in the evolution of AI. It plays right into this point that we'll get better at these algorithms. Transformers aren't the end of the road, right? We'll get better. Better will mean faster and more accurate and more efficient. That's what's exciting about an ever-changing industry. That's why I'm not in all these other industries that don't change quickly, that are the same nine years on as they are today.
26:12
But this show is kind of strange for me, because I speak to a lot of people, and they think about the three pillars, you know, compute, algorithms and data, and a lot of the common refrain is that actually we're very far along in all of them. That has been the refrain, and when I hear you, it's like, actually, it's very exciting.
I think they're wrong on all of its underpinnings. I think we are early in all of them.
If we just take them one by one: in five years' time, how much synthetic versus human data will be used to train models, if you were to put a percent on it?
Almost all synthetic.
And the utility value of synthetic is the same as human?
26:59
When you teach a pilot to fly in a simulator, there is a lot of potential data that isn't very useful in teaching her to fly. They spend a lot of time flying straight, doing nothing, as a pilot. Now, takeoffs and landings are where you want to spend your time, and that's why we put them in simulators; that's what we have them doing. And in simulators, we can create data where engines blow, where there are a whole set of problems where learning can take place. That's simulated data. And in the same way, as we think about creating data, whether it's for other forms of AI, what we want is the data that's hard to gather, right? Not driving straight; we've been able to do that for a decade. What we want is an unprotected left turn in the snow. It's snowing, it's hard to see, you've got an unprotected left turn: that's a difficult thing. And you want that thousands of different ways, millions of different ways. That's where the synthetic data comes along: to use it to fill in the empty parts where it's really expensive or painful to get that type of data. Think of the pilot. You want them spending a huge amount of time on things that are rare in their training. Same with a surgeon: a huge amount of time on things that are rare. Most of the time, it's carpentry, but their expertise matters only when it's rare, something happens, the unexpected occurs. That's when their mettle is shown. And we will get better at synthetic data by a great deal.
28:30
I love it. I get it from a consumer perspective and from an expectations perspective. If we move the needle on compute, algorithms and data, what does that mean for the experience of AI?
Faster and cheaper is the first answer. The second is that when things become faster and cheaper, new applications emerge. It's used everywhere, right? When computers became faster and cheaper, suddenly they were in cars, and then they were in your pocket, and then they were in your dishwasher and in your TV. The diffusion of innovation accelerates when you make things faster and cheaper.
This is Jevons paradox, and Satya's belief there now.
Yeah, I know, in the VC community you've got to cite 19th-century English economists.
I'm English, like, come on; if I'm not allowed to cite an English philosopher, what am I here for? Are you just like, oh, he's a fucking VC, he's just being like, oh, Jevons paradox and all?
That's right. It's like, make stuff cheaper and faster. There are very few examples in our industry, actually none in compute in 50 years, in which, by making things cheaper and faster, the market got smaller. The market always gets bigger. Always.
29:48
Is there a world where we move past transformers?
There is, 100%. We won't be as dependent on transformers in three years or five years as we are now. 100%. They're not the end-all, be-all.
Why is that? What will replace it, and what does that look like?
I don't know. I don't know whether they're going to be other types of models, but what I know for sure is that innovation doesn't stop. The transformer has some weaknesses that people are desperate to overcome. There's a quadratic effect in the attention head. There are all sorts of things that could be improved, but it's pretty darn good now, the best we have. And that's what you run with. You run with the best you have, and the minute it's not the best you have, you drop it in favor of the best you have.
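The "quadratic effect" he refers to: the attention-score matrix is seq_len by seq_len, so the cost of that piece grows with the square of context length. A quick illustration (the head dimension is an arbitrary example value):

```python
# FLOPs for the QK^T attention scores alone: seq_len^2 dot products,
# each a length-d_head multiply-accumulate (2 FLOPs per step).
def attention_score_flops(seq_len, d_head=64):
    return 2 * seq_len * seq_len * d_head

for n in (1_000, 2_000, 4_000):
    print(n, f"{attention_score_flops(n):.1e} FLOPs")
# 4x the context length -> 16x the attention-score FLOPs
```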
30:37
The number of innovative companies designing models is large. And what DeepSeek showed us is you don't need 5,000 people and billions of dollars of gear. You can do it with 200 smart people. More gear than DeepSeek said they had, but less gear than others had.
Were you very impressed with DeepSeek, and what impressed you most?
I think it was a result of focused engineering, and that impressed me. It was designed to be better. They weren't confused about being model intellectuals, or confused about whether it was important to break new ground; they were interested in being better. From an invention standpoint, that's a little boring. But from an engineering standpoint, that was a sweet effort. They really built a model that was just plain better at many, many things. And that's cool. I like good engineering projects. Now, that they chose to announce it right around Trump's inauguration, and the politics of it, that's all a separate matter, and we can talk about that later.
31:35
But is distillation wrong?
I'm a VC, are you kidding me? That's what we do.
Otherwise you wouldn't know anything, right?
That's exactly right. No, I don't think distillation is wrong. And if distillation is wrong, then certainly using people's copyrighted data is wrong. That's the problem. The problem is, you've got to be a little bit consistent.
Well, Sam has been, I guess, many times, and we hope he will be again. I hope he's open.
So everything that they did innovate on, OpenAI can learn from and take, too. I think there are few examples of an open-source anything having the sort of immediate impact that model had. I mean, that model had a giant impact in a technical community of really smart people. And there are very few examples of other open-source software projects that had that type of impact in that amount of time.
32:28
You know, you're in the business of betting on these guys. They ramp up and, oh look, 10,000, that's 100,000 users, it's now a million users: we'd better start a company around that, get those grad students. But this had a loud boom in the industry immediately. It was like, whoa.
The thing I have to think about as the venture investor is: where is enduring and defensible value, simply? And how do I get in early and build that over time?
In hardware.
Well, this was my question. You're just like, well, I mean, you have to be a very smart investor to do that, to be clear. But on the model side, do you think there is value, when you look at the sheer number of players with relatively comparable models?
To demonstrate enduring value, you need both immediate value and a trajectory for more.
33:18
I think the problem is, in some industries you are capable of demonstrating a leadership position for a short period of time, and then someone else, maybe the next generation, they generate the next, and the next generation the next. And I think that ends up, in the software world, being that you're competing against other people's release cadences. You're four months ahead, they're six months behind; if that's really where you are, there's not a lot of value. But if you can stay at the top over years, right, while those just behind you are changing constantly? Very large Silicon Valley companies have been built with not the most compelling technology. It might have started as the most compelling technology, and then it got to a point where it was good enough, it was easy enough to use. That's when you're at the mature market. But we're a long way from there right now. Right now we are in the early phases.
34:11
You characterize my position exactly right: data, compute, algorithms. I think we have a ton of room for improvement on all of them.
You said that compute and hardware, that's where the value is. How does that value distribution shake out? You know, we've obviously got the 800-pound gorilla that is Nvidia. How do you think about how the distribution of value shakes out in hardware and in compute over the next five years?
Historically, one of the barriers to entry was sort of the capital intensity of a project. And in the world of building chips, there are both scarce resources and expertise, and it's very expensive. Historically, it hasn't fit very comfortably in a software company, and the things that modern software companies value are not entirely conducive to chip making. So when I look down the road: who has endured in much of infrastructure tech? People who build systems have endured. There's a reason that Apple and Nvidia are among the most valuable companies on earth. What they do is hard. That's why it's worth challenging. If it weren't hard, if it wasn't enormous and difficult, why spend time being the underdog and challenging it?
A lot of people place defensibility around Nvidia's CUDA lock-in. To what extent is that real versus hype?
35:28
In inference, it's not real at
35:30
all. There's no kooda lock in
35:32
an inference. None. You can move
35:34
from open AI on an invidious,
35:36
to cerebras, to fireworks, service on
35:38
something else, to together, to perplexity
35:40
with 10 keystrokes. Anybody who actually
35:42
uses AI knows there's no kooda
35:44
lock in any inference. I think
35:46
there is, there was a fundamental
35:48
effort to disintermediate Kuda first. by
35:50
Google with tensor flow and first
35:52
by some grad students with cafe
35:54
and some of these early efforts,
35:56
but later by Google. with tensor
35:58
flow and then Facebook or matter
36:00
with pie torch. I think today,
36:02
most AI is written in pie
36:04
torch and you ought to be
36:06
able to compile it and run
36:08
it on on your hardware. Invidia
36:10
has many moats. When you are
36:12
a dominant market chair leader, that
36:14
in itself is a moat, that
36:16
you're the default solution is a
36:18
moat, that everybody learns to think
36:20
about AI in your structures. Those
36:22
are moats. The software, compilers are
36:24
hard, but they're tractable. I completely
36:26
I completely agree with you in terms of kind of being the leader being the moat in itself. It's never talked about that way. Would you put OpenAI in that same bucket?
It is the leader. Everyone's mother knows ChatGPT. Let's look at Intel, right? Until hiring Lip-Bu, Intel had made nearly a decade of catastrophic decisions. And they still own 80% of the x86 market, 75% of the market. AMD has worked up to like 25% or 30%, and after a decade of screwing up, you ask yourself: that's a moat. How big is my moat? I can make a bunch of bad decisions for a decade and only lose 20% share. That's extraordinary. The moat was just unbelievable. We'll see. I mean, I'm a huge fan of Lip-Bu's. He's an investor in our company. I wish him well, and I think if anybody can change that company, he can. But I think we rarely talk about what being the market share leader means in terms of a moat, in the right context, because as a challenger we have to think about it exactly, because it's exactly that that we need a bridge over. It's exactly these characteristics of the moat that we need to get over.
37:39
In five years' time, though, is it Uber, or is it like AWS and cloud? And what I mean by that is that cloud is an interesting market where a couple of players, several players, have relative segments, 25, 30%, and it's shared relatively evenly between them. Not exactly, but relatively. Or is it one like Uber, where Uber has 90%, Lyft is far behind, and then there are alternative providers with the other five?
I think it's going to be between those two. Five years from now, Nvidia is going to have 60, right? I think right now they have approximately all of it. I think they will come down over time.
Of Nvidia's usage, what percent will be training versus inference?
I think they will continue to have a meaningful business on both sides. I think they're exceptional at training. They're not going to roll over and play dead in inference, I think. They're a world-class company. I mean, they've had one of the great decades of any company in history, right? I mean, from 2014, they were worth, what, $10 billion, to where they are right now? It's one of the great decades in corporate history. I don't think they're going to roll over and, oh yeah, we're not going to be in the inference market. That's not going to happen. They're going to have meaningful share.
Very big companies will be made in this 100x growth. Do you think chip providers will be far larger than model providers in terms of enterprise value?
In the five-year time frame? Yes.
How does that prediction change in a different timeline?
39:06
I think in a shorter timeline... you know, when you price an option, variance and uncertainty increase the option's value. If you look at the way Black-Scholes works, or if you look at any option pricing model, uncertainty is a friend of the value of the option. And when people are paying these extraordinarily high prices for our model companies right now, I think part of that is this extraordinary uncertainty, this wild variance. And so in the shorter run, it might not be the case. But in the longer run, as markets mature, as we begin to understand the value of these models, we understand what their businesses look like, what their long-term net profitability looks like. What did Warren Buffett say about markets? In the short term they're a voting mechanism; in the long term they're a weighing mechanism. At some point, the weighing kicks in. Usually it's in the public markets. And then investors say: which is likely to give me better growth in the future?
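His option-pricing point in miniature: in Black-Scholes, holding everything else fixed, higher volatility (uncertainty) always raises the option's price. A minimal sketch with arbitrary example inputs:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def black_scholes_call(spot, strike, rate, vol, t):
    d1 = (log(spot / strike) + (rate + vol**2 / 2) * t) / (vol * sqrt(t))
    d2 = d1 - vol * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-rate * t) * norm_cdf(d2)

for vol in (0.2, 0.5, 1.0):   # same asset, rising uncertainty
    print(vol, round(black_scholes_call(100, 100, 0.03, vol, 1.0), 2))
# the call's value rises monotonically with volatility
```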
40:00
And you mentioned the word public there; I do want to hone in on your business. You're cash flow positive in a world where everyone else literally bleeds cash. Help me understand: what do you do to be cash flow positive when everyone else is bleeding or hemorrhaging cash?
Traditionally, your gross margins were a measure of your technical differentiation, right? If you're running a negative-gross-margin business, it speaks for itself: you're selling a commodity; your value creation isn't being recognized in the market. And so I think our technology is creating an opportunity for us to maintain margins where some others can't.
40:36
A lot of your revenue is concentrated in the G42 deal. To what extent is that a strength or a weakness?
It's both. The way you catch three large customers is to catch one first. The way you build three large strategic partners is to learn to be a strategic partner. That's a learned skill. We didn't arrive knowing how to be a strategic partner to G42. Now that we've worked at it and worked at it, it's a muscle we can replicate. We could be a better partner to any of a dozen different companies in the world.
What have you learned in the G42 relationship-building process that makes you a good partner in the way that you work?
We deployed tens of exaflops of compute, vastly more than anybody else that isn't AMD or Nvidia, right? I mean, a huge amount of compute. Our software has been hardened on some of the largest AI clusters in the world. We've gone through the growing pains of increasing manufacturing 2x and 5x and 2x, I mean, through unbelievable growth in manufacturing. We've worked with our supply chain partners to be sure that they're ready for this extraordinary growth. When you work with a strategic partner of this size, your organization comes out different on the other side. There are things you've learned, and there are mistakes you've made, and I hadn't done a big relationship in the Middle East. There was a huge amount to learn. I think you come out a much better company, and much better prepared to do business with a hyperscaler, to do business with another massive partner, to do business with another sovereign. It takes real work, and your team has to learn.
42:18
You said you'd come out better. Why go public when you did? When this happened, I was like, it seemed preemptive, respectfully. And my question now to companies is: why go public at all? There is so much private capital. The Collisons have shown, I think very clearly, that you can stay private for a lot longer than you planned to. Databricks has certainly shown that, right? I mean, those were historically public-market valuations; you know, the valuations that Anthropic and OpenAI and some of the others are getting are historically public-market-only valuations. And like you said, your S-1 is live; anyone can read it. I wouldn't want people reading mine.
We have nothing to hide, I think.
No, but your competitors have got asymmetric information.
Yeah, we've got asymmetric technology. To be public, you have to be ready organizationally, be ready with your processes. You need to be ready to forecast and predict, to be held accountable in a way that private companies historically haven't been. We think that there's tremendous value. We think that we will be among the first in the category. We think that some of our largest targets have a stated preference for doing business with public companies; large enterprises in the US have done that historically. Those were some of the reasons that led us to it.
43:39
How many G42 relationships, deals at that scale, will you have in the next 24 months? How fast can you ramp them?
That's a good question. Several. Those are big numbers.
So remind me, how big is the G42 deal? It's 87% of revenue, I know that.
It was big. I mean, when we announced it, some estimated it was north of a billion.
Well done. That must be a bit of a high five, wasn't it?
I think at first, yeah, there's tremendous excitement. And then there's sort of every entrepreneur's reality: I've got to make a lot more gear. I need to make it. You make a list of your top 10 vendors and you fly out to them all. You say: big orders are coming, be ready. Right? You work with all your partners to get ready, because you need to make a great deal more stuff. And that's one of the real differences between hardware and software: when we grow fast, the number of people you need to work with in your supply chain, and the amount of collaboration that needs to happen, is truly extraordinary.
44:47
Are you going to have a clusterfuck of unhappy customers who bluntly have waited so long for chips that by the time they get them, the chips are outdated, and they're going, what?
All of that's an opportunity for us and others. That's opportunity. I think being a market share leader isn't easy either, but when the bully falls, everybody wants to give him a kick. I mean, a lot of that happened at Intel. They'd been the dominant player, and when they fell, everybody was happy to jump in and kick them when they were down. I think there is a real opportunity in the potential for Nvidia customer unhappiness, for sure, for those of us who are competing with them. I mean, if you can't get your gear, you may as well test somebody else's. That's a huge opening.
Head over to Cerebras and use the promo code HARRY20 for your chips today.
Do that. There we go. I'm here for you, baby.
Influencer mode turned on. Yeah, yeah.
No, no, it's fine. I hope we could do a 20% take on the billion deal. I think that's fine. That's fine. I know this venture business has been so good to you, Harry, and you've got to get shoes for your kids and the like, and yeah, we're happy to donate to the Harry cause.
A 400-million fund and fees. That's right. And I have no kids as well.
Two and twenty is a rough way to make a living, Harry. Don't knock it, okay?
46:21
You're in hardware; you said it yourself about the complexity of hardware. On export controls being implemented properly: do you think that is a good idea? You know, everyone was going, with DeepSeek, wow, how did this happen? They must have stolen chips. How could this be? It turns out that they probably did use chips in Singapore.
I think the following. I think managing software and managing hardware compliance are extremely different things, because their vector of diffusion is different. There are different weights. If you sell a server that weighs five or six hundred pounds and arrives on a pallet, you can go visit it. You want to deploy it in Kazakhstan? You can put it in a data center, and you can have somebody from the embassy visit it, take photos of it once a month. It's not going anywhere. You can keep track of who uses it and provide logs. That's much, much harder with software. And open source is a whole other matter. That's the first observation. The second is that I got to know the leadership in Commerce in the previous administration. I didn't always agree with their policies, but it is a world of unintended consequences. You sought to limit Chinese access to EDA tools to delay the growth of a Chinese chip market, and so US venture capitalists backed tons of Chinese companies in Shenzhen to build EDA tools. Right? This is an unbelievably slippery, dynamic, challenging problem. I don't know if it's a tractable problem; to delay another nation's progress on a technical trajectory is an enormously challenging thing. I certainly came to appreciate just how difficult it was for well-meaning people to predict the impact of policy during the last two years, for sure.
48:03
Do you think this administration is better for AI than the prior administration?
I don't think there's any doubt that's the case. The past administration lined itself up against Big Tech. That was a mistake. AI is also in a different place, so it's easier to be for it. It's less scary now than it was. We sort of have a better picture of the trajectory, both the risks and the benefits. This administration sort of had the foresight to put in place an AI czar, or leader, to be a focal point for discussions. Yeah, I think it's probably a fair bit better.
48:41
You said it's very challenging to kind of hinder a nation's development, adoption, progression of a technology. Respectfully, you chose to not sell to China.
Yeah.
Why was that? And does that not go against the difficulty in hindering progression?
No. I have a very simple rule, and I encourage the team to use it. I mean, you don't need a big handbook to help you make good decisions in a company. Just ask yourself: would my mother be proud? Would you be proud if I did this? Would you be proud if I explained it to my mother? And that's a moral compass.
What do you mean? It wouldn't have been used for good?
To do facial recognition to identify minorities for persecution, to build military equipment, to things that I either couldn't see or that, when I saw them, didn't feel right. It's more important than money.
49:42
we fundamentally underestimate the Chinese's capabilities?
49:45
100%, and it is one of
49:47
the most obvious and frequent errors
49:49
in judgment: you underestimate
49:51
the other side. You have to
49:53
look carefully at what they're doing
49:55
and their investment in infrastructure has
49:57
been extraordinary. The rate at which
49:59
they generate engineering talent is exceptional.
50:02
The government's ability to have a
50:04
policy and implement it, and that's
50:06
not a democracy. They weren't designed
50:08
to have checks and balances there.
50:10
The funding that flowed into the
50:12
development of AI technology, that their
50:14
venture capitalists were backed up by
50:16
their government, they have national champion
50:19
companies, that they've developed a Belt
50:21
and Road strategy to sort of
50:23
make much of the third world
50:25
dependent on them and their technologies.
50:27
I think they absolutely should not
50:29
be underestimated. They have a lot
50:31
of people, and we see a
50:33
tiny fraction of it. They have
50:36
produced industrial policy that has moved
50:38
their nation forward. What was the
50:40
most significant, do you think? The
50:42
creation of economic zones like Shenzhen
50:44
was clearly a visionary move. They
50:46
knew that their own system was
50:48
in the way. They created zones
50:50
that relaxed their own system. Could
50:53
the US learn from them that
50:55
way? We did some of the
50:57
same things in the first Trump administration,
50:59
right? What did we do? We
51:01
relaxed our own rules in the
51:03
development of vaccines. We knew that
51:05
in this time, it would be
51:07
very difficult to go through the
51:10
steps that we always go through,
51:12
and we tried to implement some
51:14
thoughtful workarounds. I think that,
51:16
you know, why are they committed
51:18
to trains as a mode of transportation,
51:20
and we can't build a decent
51:22
train system in the US, or
51:24
in California, or why we have
51:27
three different standards for train rails,
51:29
and the rest of the world
51:31
can build extraordinary high-speed trains linking
51:33
important cities. What are we doing
51:35
wrong in the building of our
51:37
infrastructure that our bridges and our
51:39
freeways are in disarray? Those are
51:41
questions we got to ask ourselves
51:44
when we see other people doing
51:46
it differently. If you watch a
51:48
good football team and you say,
51:50
whoa, that's an interesting offense. Aren't
51:52
you thinking to yourself, how
51:54
could our team learn? What could
51:56
we do? Why did that work?
51:58
Was it the structure or something that
52:01
made that a successful series of
52:03
plays? And what can I take
52:05
away from that? How can that
52:07
inspire me to do better? I'm
52:09
always looking for inspiration in others
52:11
and competitors and partners. We have
52:13
some of our partners in G42.
52:15
I mean, the work ethic is
52:18
unbelievable. It inspires me. And the
52:20
scope of the challenge they've undertaken
52:22
inspires me. And I think I'm
52:24
always looking for that. Andrew, I
52:26
could talk to you all day.
52:28
I do want to do a
52:30
quick-fire with you, so I say
52:33
a short statement. You're ready? Yeah, sure.
52:35
What do you believe that most around
52:37
you disbelieve? I think we're closer to
52:39
peace in the Middle East than people
52:41
believe. There is a rise
52:44
of a moderate, business-focused Arab state
52:46
that wasn't there 25 or 30
52:48
years ago. If you visit the UAE
52:50
or Qatar or even KSA, what
52:53
you see is amazing transformation, a desire
52:55
to be included in the West
52:57
in their own way, but also to
52:59
enjoy the benefits of it. We
53:02
are closer than people think. What's
53:04
the most underrated threat to Nvidia's
53:06
market share dominance? The fundamental architecture
53:08
of the GPU with off-chip memory
53:11
is not great for inference. Now
53:13
they will continue to do well
53:15
in inference, but it can be
53:18
beaten, and I think they know
53:20
it. What's a crazy AI prediction
53:22
you have that most people
53:24
would call science fiction? Dario at
53:26
Anthropic's, that we'll live to
53:29
150? I don't think we're going to
53:31
live to 150. I don't think
53:33
that 90% of our code will be
53:35
written by machines this year. But
53:37
I do think that within a year
53:40
or two, most people in the US
53:42
will engage with an AI every single
53:44
day in one form or another, whether they
53:46
know it or not. That AI might
53:48
be in their mapping program that helps
53:51
them pick a better route to work.
53:53
It might be any number of different
53:55
things within a year or two. AI's
53:57
penetration will be approximately the same as
54:00
telephones. What have you changed
54:02
your mind on in the last 12
54:04
months? Many decisions I made turned out
54:06
to be wrong. What was the most
54:09
wrong decision? There are two ways you
54:11
can be wrong. You can actively be
54:13
wrong or you can fight against what
54:15
was right. In 2016, JP, one of
54:18
our co-founders and Chief System Architect, laid
54:20
out a plan that would have us
54:22
doing water cooling for our systems.
54:25
Nobody else was doing it and I
54:27
fought so hard and I was so
54:29
wrong. JP was right. About a year
54:32
or two later, Google announced that the
54:34
TPUs were going to be water-cooled. We
54:36
were first, and now Nvidia is only selling
54:38
water-cooled parts. I mean, I was dead
54:41
wrong, and JP was right. Many, many
54:43
instances when you make a lot of
54:45
decisions every day, where you're wrong. I've
54:48
been wrong about people. People I thought
54:50
were pretty good turned out to be
54:52
extraordinary. People I thought would be extraordinary
54:55
were really smart, but couldn't finish projects
54:57
and get stuff done. If you're not wrong a fair
54:59
bit, you ought not to be making
55:01
a lot of decisions, because it comes
55:04
with the territory. As a venture capitalist, I'm
55:06
never wrong, so I don't know. As
55:08
a venture capitalist, you're wrong nine times
55:11
in ten and everybody forgets as long
55:13
as you're really right. And I get
55:15
a picture of you signing the term
55:18
sheet with me and then it goes
55:20
public, and then I go... And I
55:22
think yours is a perfect industry in
55:25
which nobody cares about the average, you're
55:27
wrong all the time. And what they
55:29
care about is the occasional time you're
55:31
really right. That's what moves a fund.
55:34
That's different than being a CEO. I
55:36
think we got to be mostly right
55:38
most of the time, but if you're
55:41
making a lot of decisions, you're still
55:43
making a ton of mistakes. This is
55:45
your fifth startup. I mean, you are
55:48
a sucker for punishment, aren't you? I
55:50
mean, really, like five times. Like, Christ
55:52
Andrew, did you not get beaten alive
55:54
enough? My question to you though is
55:57
like, I believe in the value of
55:59
serial entrepreneurship. Don't! How do you think
56:01
about the inherent benefits that you have
56:04
having done it four times before? I
56:06
think if you are in a business
56:08
in which running a business... is a
56:11
benefit, then experience matters a great deal.
56:13
I think if you are in a
56:15
business in which you look like your
56:17
customer, there was a reason why social
56:20
networks were started by people right out
56:22
of college or in college, because dating
56:24
is top of their mind. And they
56:27
looked like their customers. And that was
56:29
more important than knowing anything about running
56:31
a business. In that environment, it will
56:34
certainly select for people who... are of
56:36
the demographic that their customers are. They
56:38
know that backwards and forwards. But if
56:40
you want to have a business that has
56:43
manufacturing in it, it has a supply
56:45
chain that has you managing hundreds or
56:47
thousands of engineers to a timeline, to
56:50
a schedule, I don't think anybody would
56:52
turn around your statement and say with a
56:54
straight face, you know what, what I'm
56:57
looking for is an engineering leader with
56:59
no experience. Right, no, and I don't
57:01
want somebody who's led a team of
57:04
four or five hundred who has experienced
57:06
the challenges of growth. What I'm looking
57:08
for is somebody with no experience. Naivety
57:10
is a bonus here. Yes, right. I
57:13
think the people who sell that sometimes
57:15
are consultants, right? Oh, look, my guys
57:17
have no experience in your industry. They're
57:20
not biased. Maybe a little bit of
57:22
experience in the industry would help, right?
57:24
Come on. Where are people investing today
57:27
in AI? Why is so much cash
57:29
going to that part? I'm not saying
57:31
that company I don't... I think part
57:33
of the dynamic in your industry is
57:36
sometimes money needs to find a home.
57:38
Some guys have raised really, really big
57:40
funds and they got to find a
57:43
home for their money. And some people
57:45
don't like to be left out and
57:47
they're willing to make investments for maybe
57:50
for some status purposes or other reasons
57:52
that don't seem to make sense.
57:54
There are some underappreciated places for investment. I'd
57:56
say, in the chip world, the sub-milliwatt,
57:59
really tiny little chips that live
58:01
next to sensors and do inference. These
58:03
tiny little things that will only
58:06
send back useful data are an interesting
58:08
market, and they will sell enormous volume.
58:10
It's not a part of the market
58:13
I love to play in. I like
58:15
to build bigger things and sell them
58:17
to the data center, but I think
58:20
that part is extremely interesting. I think
58:22
they'll be fundamental for robotics. That's an
58:24
area that's extremely underappreciated. Final one, if
58:26
we think about Cerebras in 10 years
58:29
time, where do you envision the business
58:31
in 10 years time? If everything goes
58:33
well, where are we as a business having
59:36
that conversation? So 10 years ago,
59:38
Nvidia was worth $10 billion. That's a
58:40
long run in our world right now.
58:43
I think in three to five years,
58:45
I would like our technology to have
58:47
been used to solve two important societal
58:49
problems. I would like it to be
58:52
used to have found a therapeutic for
58:54
an affliction that impacts more than a
58:56
million people here. I would like our
58:59
inference to be powering a collection of
59:01
apps that don't exist today. And I
59:03
would like that a meaningful portion of
59:06
the population in the US and in
59:08
Europe inadvertently uses our technology. So they use
59:10
something that we power and they
59:12
don't even know it. I think those
59:15
are things that would make me really
59:17
happy. And I've wanted to make this
59:19
show happen for a long time. As
59:22
I said, I heard so many good
59:24
things from Harry for many years. There's
59:26
been so many requests to have you
59:29
on the show. My team is just
59:31
like, just get Andrew on the show,
59:33
Harry. I'm like, okay, okay. I like
59:36
tweeted it, obviously, which is how we
59:38
got this. But thank you for joining
59:40
it. 40 people sent me a note
59:42
saying, how come you're avoiding Harry? How
59:45
come he has to go tweet it?
59:47
I was just like, all right, just
59:49
call me, it's good, send me a note,
59:52
happy to come on. Really thoughtful questions,
59:54
Harry, really thoughtful and interesting, a really
59:56
fun conversation. So I have wanted to
59:59
do that show for a while, but
1:00:01
frankly I was just blown away by
1:00:03
Andrew's humility, his no-BS approach. He
1:00:05
was incredible to work with in the process,
1:00:08
and I just so appreciate his time.
1:00:10
If you want to watch the
1:00:12
full episode, you can find it on
1:00:15
YouTube by searching for 20 VC, that's
1:00:17
20 VC on YouTube.
1:02:21
And don't forget to revolutionize how your
1:02:24
team works together. Roam. A company of
1:02:26
tomorrow runs at hyper speed with quick
1:02:28
drop-in meetings. A company of tomorrow is
1:02:31
globally distributed and fully digitized. The company
1:02:33
of tomorrow instantly connects human and AI
1:02:35
workers. A company of tomorrow is in
1:02:37
a Roam virtual office. See a visualization
1:02:40
of your whole company, the live presence,
1:02:42
the drop-in meetings, the AI summaries, the
1:02:44
chats. It's an incredible view to see.
1:02:47
Roam is a breakthrough workplace experience loved
1:02:49
by over 500 companies of tomorrow for
1:02:51
a fraction of the cost of Zoom
1:02:54
and Slack. Visit Roam, that's RO.AM, for
1:02:56
an instant demo of Roam today. Nobody
1:02:58
knows what the future holds, but I
1:03:00
do know this. It's going to be
1:03:03
built in a Roam virtual office, hopefully
1:03:05
by you. That's Roam, R-O-A-M, for an
1:03:07
instant demo. As always, I so appreciate
1:03:10
all your support and stay tuned for
1:03:12
a fantastic episode coming on Wednesday, with
1:03:14
I think one of the most under-discussed
1:03:17
firms in venture capital, Lead Edge Capital,
1:03:19
and their founder Mitchell.