Episode Transcript
0:00
a lesson that I've seen people learn over and
0:02
over again in this field is like, you know,
0:04
we think that we can do things that are
0:06
smarter than what the models do by writing it
0:08
ourselves, but as the field progresses, the models come
0:11
up with better solutions to things than humans do.
0:13
The like probably like number one lesson on machine
0:15
learning is like, you get what you optimize for.
0:17
And so if you're able to set up the
0:20
system such that you can optimize directly
0:22
for the outcome that you're looking for,
0:24
the results are going to be much,
0:26
much better than if you sort of
0:29
try to glue together models that are
0:31
not optimized end-to-end for the tasks they
0:33
are trying to have them do.
0:35
So my long-term guidance is that
0:37
I think like reinforcement learning tuning
0:39
on top of models is probably
0:41
going to be a critical part
0:43
of how the most powerful agents get
0:46
built. We're
1:02
excited to welcome Isa Fulford and Josh Tobin
1:04
who lead the Deep Research product at OpenAI.
1:06
Deep Research launched three weeks ago
1:08
and has quickly become a hit
1:10
product used by many tech luminaries
1:12
like the Collisons for everything from
1:14
industry analysis to medical research to
1:16
birthday party planning. Deep Research was
1:18
trained using end-to-end reinforcement learning on hard
1:21
browsing and reasoning tasks and is the
1:23
second product in a series of agent
1:25
launches from OpenAI, with the first
1:28
being Operator. We talked to Isa and
1:30
Josh about everything from Deep Research's
1:32
use cases to how the technology
1:34
works under the hood to what
1:36
we should expect in future agent
1:38
launches from OpenAI. Isa and Josh,
1:40
welcome to the show. Thank you. Thank you
1:42
so much for joining us. Excited to
1:44
be here. Thank you for having us.
1:46
So maybe let's start with what is
1:48
Deep Research? Tell us about the origin
1:50
stories and what this product is doing.
1:53
So Deep Research is an agent
1:55
that is able to search
1:57
many online websites and it
1:59
can create very comprehensive reports. It
2:01
can do tasks that would take
2:03
humans many hours to complete and
2:05
it's in ChatGPT and it takes
2:07
like five to 30 minutes to
2:09
answer you and so it's able
2:11
to do much more in-depth research
2:13
and answer your questions with much
2:15
more detail and specific sources than
2:17
a regular ChatGPT response would be able to
2:19
do. It's one of the first agents
2:21
that we've released. We released Operator
2:24
pretty recently as well and so deep research
2:26
is the second agent and you know
2:28
we'll release many more in the future. What's
2:30
the origin story behind deep research?
2:33
Like when did you choose to
2:35
do this? What was the inspiration
2:37
and how many people work on
2:39
it? Like what did it take to
2:41
bring this to fruition? Good question. This
2:43
is before my time. So I
2:45
think maybe around a year ago
2:48
we were seeing a lot of
2:50
success internally with this new reasoning
2:52
paradigm and training models to think
2:54
before responding and we were
2:56
focusing a lot on math and science domains
2:58
but I think that the other thing that
3:01
this kind of new reasoning model
3:03
regime unlocks is the ability to
3:05
do longer horizon tasks that involve like
3:07
agentic kind of you know abilities and
3:09
so we thought you know a lot
3:12
of people do tasks that require a
3:14
lot of online research or a lot
3:16
of external context and that involves a
3:18
lot of reasoning and discriminating between sources
3:20
and you have to be quite creative
3:23
to do those kinds of things. And
3:25
I think we finally had models or
3:27
a way of training models that would
3:29
allow us to be able to tackle
3:31
some of those tasks. So we decided
3:34
to try and start training models
3:36
to do first browsing tasks. So
3:38
using the same methods that we
3:40
used to train reasoning models, but
3:42
on more real-world tasks. Was it
3:44
your idea? And Josh, how did
3:47
you get involved? At first, it
3:49
was like me and Josh Patel,
3:51
who's also at OpenAI, and he's working
3:53
on a similar project that will be
3:55
released at some point which we're very
3:57
excited about and we built an original
3:59
demo. And then also with Thomas Stimson,
4:01
who's one of those people who just,
4:04
is an amazing engineer, like,
4:06
will dive into anything and just,
4:08
you know, get loads of things done. So it
4:10
was very fun. Yeah, and I joined more
4:12
recently. I rejoined OpenAI
4:14
about six months ago from my startup.
4:17
I was at OpenAI in the
4:19
early days and was looking around the
4:21
projects when I rejoined and
4:23
got very interested in some of our
4:25
agentic efforts, including this
4:27
one and got involved with that.
4:30
Amazing. Well, tell us a
4:32
little about who you built it
4:34
for. Yeah, but it's really
4:36
for anyone who does knowledge
4:38
work as part of their
4:40
day-to-day job or really as part
4:43
of their life. So we're seeing
4:45
a lot of the usage come
4:47
from people using it for
4:49
work, doing things like research
4:51
as part of their jobs, for
4:54
understanding markets,
4:56
companies, real estate. A
4:58
lot of scientific research, medical, I
5:00
think we've seen a lot of
5:02
medical examples as well. And one of the
5:05
things we're really excited about as well is
5:07
that this style of, like, I just
5:09
need to go out and spend many hours
5:11
doing something that you know where I have
5:13
to do a bunch of web searches and
5:15
collate a bunch of information is not just
5:18
a work thing but it's also useful for
5:20
shopping and travel as well. So we're excited for the
5:22
plus launch so that more people will be able to try
5:24
deep research and maybe we'll see some new use cases as
5:26
well. It's definitely one of the products I've used the most
5:28
over the last couple weeks. It's been amazing. Using it for
5:31
work? For work, definitely. Also for fun. What are you using
5:33
it for? Oh, for me? So I was thinking about buying a new car.
5:54
It put together an amazing report
5:56
that told me maybe wait a couple
5:58
months, but this year, like in the next
6:01
few months it should come out. Yeah,
6:03
like one of the things that's really
6:05
cool about it is it's like, it's
6:07
not just for going broad and gathering
6:09
all of the information about a
6:11
source, but it's also really good at
6:13
finding like very obscure, like weird
6:15
facts on the internet. Like if
6:18
you have something very specific you
6:20
want to know that you might not just
6:22
turn up in the first page of search results,
6:24
it's good at that kind of thing
6:26
too. So a lot of
6:29
people are using it for coding.
6:31
Yeah. Which wasn't really a use
6:33
case I'd considered, but I've seen
6:35
a lot of people on Twitter and
6:37
in various places where we get
6:40
feedback using it for coding and
6:42
code search and also for
6:44
finding the latest documentation on
6:47
a certain package or
6:51
something and helping them write a
6:53
script or something. Yeah, I'm like
6:55
I'm kind of embarrassed that we
6:57
didn't think of that as a use
6:59
case. How do you think usage will evolve over time? Like you
7:01
mentioned the plus launch that's happening, you
7:04
know, in a year's time or two
7:06
years time. Would you guess this is
7:08
mostly a business tool or mostly a
7:10
consumer tool? I would say hopefully both.
7:12
I think it's a pretty general capability,
7:15
and I think it's something that we
7:18
do both in work and in personal
7:20
life. So I'm excited about both.
7:22
I think the magic of it is like,
7:24
um, it just saves people a lot of
7:26
time. You know, if there's... something that
7:28
might have taken you hours or in
7:31
some cases we've heard like days. People
7:33
can just put it in here and
7:35
get you know 90% of what they would have
7:37
come up with on their own. And so
7:39
yeah I tend to think there's like
7:41
there's more tasks like that in business
7:43
than there are in personal but I
7:45
mean I think for sure it's gonna
7:48
be part of people's lives in both.
7:50
It's really become the majority
7:52
of my usage for ChatGPT. I
7:54
just always pick deep research rather
7:56
than normal. So what are you seeing in terms of
7:58
consumer use cases? And what are you excited about? I
8:00
think a lot of shopping, travel
8:02
recommendations. I personally
8:05
used the model a lot.
8:07
I've been using it for months to
8:09
do these kinds of things. We were
8:11
in Japan for the launch
8:13
of deep research so it was
8:15
very helpful in finding restaurants
8:18
and finding things that
8:20
I wouldn't have like
8:22
necessarily found. Yeah and I found
8:24
it like when you have something...
8:26
It's like the kind of thing
8:29
where, you know, if you're shopping,
8:31
maybe for something expensive or you're
8:33
planning a trip that is special
8:36
or that you want to spend
8:40
a lot of time thinking about.
8:42
It's like, for me, you know, I
8:44
might go and spend hours and hours
8:47
like trying to read everything on
8:49
the internet about this one,
8:51
this product that I'm interested
8:53
in buying. It can do something
8:53
like that very quickly. And so
8:57
it's really useful for that kind of
9:00
thing. The model is also very
9:02
good at instruction following. So if you
9:04
have a query with many different parts
9:06
or many different questions, so if
9:09
you want the information about the
9:11
product, but you also want comparisons
9:13
to all other products, and you
9:15
also want information about reviews from,
9:17
you know, Reddit or something
9:19
like that. You can give loads
9:22
of different requirements and
9:24
it will do all of them for you. You can also ask
9:26
it to format it in a table. It
9:28
will usually do that anyway, but it's really
9:30
helpful to have a table with a bunch
9:33
of citations and things like that for all
9:35
the categories of things that you want to
9:37
research. Yeah, there are also some features
9:39
that hopefully will get into the product
9:41
at some point, but the underlying model
9:43
is able to embed images so it
9:45
can find images of the products. And
9:48
it's also, this is not a consumer
9:50
use case, but it's able to create
9:52
graphs as well and then embed those
9:54
in its response. Hopefully that will come
9:56
to ChatGPT soon as well. A nerdy
9:58
consumer use case. Yeah. And
10:00
speaking of nerdy consumer use
10:03
cases, also like personalized education
10:05
is a really interesting use
10:07
case. Like if there's a
10:10
topic that you've been meaning to
10:12
learn about, you know, if you need
10:14
to brush up on your biology or,
10:16
you know, you want to learn about
10:19
like some world event, it's
10:21
really good at. You know, you put
10:23
in all the information about
10:25
what you feel like you don't understand, what
10:27
aspects of it you want to go do
10:29
research on and it'll put together a nice
10:31
report for you. One of my friends is considering
10:34
starting a CPG company and he's
10:36
been using it so much to
10:38
find similar products to see if
10:40
specific names are already, you know,
10:43
the domains are already taken, market
10:45
sizing, like all of these different
10:47
things. So that's been fun to,
10:49
he'll share the reports with me
10:51
and I'll read them. So it's pretty
10:53
fun. Another use case is it's really good
10:55
at finding like a single obscure
10:58
fact on the internet. Like if
11:00
there's like a, you know, like an
11:02
obscure TV show or something that you
11:04
want to... you know, to like find
11:07
like one particular episode of or
11:09
something like that,
11:11
it'll go deep and find the
11:14
like one reference to it on the
11:16
web. Oh yeah, my brother's
11:18
friend's dad had this very specific
11:20
fact. It was about some
11:22
Austrian general who was in power during a
11:25
certain battle, the death of someone during the
11:27
battle, like a very niche question. And
11:29
Apparently ChatGPT had previously answered it
11:31
wrong and he was very sure that
11:33
it was wrong. So he went to
11:35
the public library and found a record and
11:38
found that it was wrong and so then
11:40
Deep research was able to get it right
11:42
so we sent it to him and he
11:44
was excited. What is the rough
11:46
mental model for you know what deep research
11:48
is excellent at today and you know where
11:50
should people be using the O-series of
11:53
models, where should they be using
11:55
deep research? What deep research
11:57
really excels at is if you
11:59
have a... sort of detailed description of
12:01
what you want and in order to
12:03
get the best possible answer requires reading
12:06
a lot of the internet. If
12:08
you have kind of like more of a
12:10
vague question it'll help you kind of
12:12
clarify what you want but it's I
12:14
mean it's it's really at its best
12:16
when there's like a specific set of
12:18
information that you're looking for. And I
12:21
think it's very good at synthesizing
12:23
information it encounters. It's very
12:25
good at finding specific like hard
12:27
to find information, but it's
12:29
maybe less... it can make kind
12:32
of some new insights I guess
12:34
from what
12:36
it encounters but I don't
12:38
think it's making new
12:40
scientific discoveries yet and then I
12:42
think using the O-series model for
12:45
me if I'm asking for something
12:47
to do with coding usually it
12:49
doesn't require knowledge outside of
12:52
what the model already knows from
12:54
its pre-training, so I
12:56
would usually use O1 Pro or O1
12:59
for coding, or O3-mini-high. And
13:01
so deep research is a great
13:03
example of where some of the
13:05
new product directions for open AI
13:07
are going. I'm curious, how can
13:09
the extent you can share, how does
13:11
it work? The model that powers deep
13:14
research is a fine-tuned version of
13:16
O3, which is our most advanced
13:19
reasoning model, and we specifically
13:21
trained it on hard browsing tasks
13:23
that we collected, as well as
13:26
other reasoning tasks. And so it
13:28
also has access to a browsing
13:30
tool and a Python tool. So through
13:32
training, end to end on those
13:35
tasks, it learned like strategies to
13:37
solve them. And the resulting model is
13:39
good at online search and analysis.
13:42
Yeah, like intuitively, the
13:44
way you can think about it is you
13:46
make this sort of request,
13:48
ideally a detailed request about what
13:50
you want. The model thinks
13:53
hard about that. It searches for
13:55
information. It pulls that information and it
13:57
reads it and understands how it
13:59
relates to that request and then decides
14:01
what to search for next in order
14:03
to get kind of closer to the
14:06
final answer that you want. And it's
14:08
trained to do a good job of
14:10
pulling together all that information
14:13
into a nice, tidy report with citations
14:15
that point back to the original information
14:17
that it found.
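To make the loop described above concrete, here is a rough, purely illustrative sketch in Python. The data structure, tool functions, and stopping rule are hypothetical stand-ins, not Deep Research's actual implementation (which has not been published); the point it illustrates is that the outer loop is thin and the decision-making lives in the trained model.

```python
# Purely illustrative sketch of a tool-using research loop like the one described
# above. The policy call, tool functions, and stopping rule are hypothetical
# stand-ins; the actual internals are not public.

from dataclasses import dataclass, field


@dataclass
class ResearchState:
    task: str                                          # the user's detailed request
    notes: list[str] = field(default_factory=list)     # snippets gathered so far
    sources: list[str] = field(default_factory=list)   # URLs backing those snippets


def decide_next_action(state: ResearchState) -> dict:
    """Stand-in for the fine-tuned reasoning model. Given the task and everything
    gathered so far, it returns one of:
    {"action": "search", "query": ...}, {"action": "open", "url": ...},
    or {"action": "finish"}."""
    raise NotImplementedError("placeholder for the trained policy")


def web_search(query: str) -> list[str]:
    """Stand-in browsing tool: return candidate URLs for a query."""
    raise NotImplementedError


def read_page(url: str) -> str:
    """Stand-in browsing tool: return the text of a page."""
    raise NotImplementedError


def write_report(state: ResearchState) -> str:
    """Stand-in for the final synthesis step: a cited report built from the notes."""
    raise NotImplementedError


def run_research(task: str, max_steps: int = 50) -> str:
    state = ResearchState(task=task)
    for _ in range(max_steps):
        action = decide_next_action(state)
        if action["action"] == "search":
            state.sources.extend(web_search(action["query"]))
        elif action["action"] == "open":
            state.notes.append(read_page(action["url"]))
            state.sources.append(action["url"])
        else:  # "finish": the model has decided it has enough to answer
            break
    return write_report(state)
```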
14:20
Yeah, I think what's new about deep research as an agentic
14:22
capability is that because we have the
14:24
ability to train end to end there
14:27
are a lot of things that that
14:29
you have to do in the process
14:31
of doing research that you couldn't really
14:34
predict beforehand. So I don't think it's
14:36
possible to write some kind of language
14:38
model program or script that would be
14:41
as flexible as what the model is
14:43
able to learn through training where it's
14:45
actually reacting to live web information. And
14:48
based on something it sees, it has
14:50
to change its strategy and things like
14:52
that. So we actually see it doing
14:55
pretty creative searches. You can read the
14:57
chain of thought summary and I'm sure
14:59
you can see sometimes it's very very
15:02
smart about how it comes up with
15:04
the next thing to look for. So
15:06
John Carlson had a tweet that went
15:09
somewhat viral. You know how much of
15:11
the magic of deep research is real-time
15:13
access to web content and how much
15:15
of the magic is in kind of
15:18
chain of thought? Can you maybe shed
15:20
some light on that? I think it's
15:22
definitely a combination. I think you can
15:25
see that because there are other such
15:27
products that weren't trained
15:29
end to end, so won't be as
15:32
flexible in responding to the
15:34
information they encounter, and won't be as creative
15:36
about how to solve specific problems because
15:39
they weren't specifically trained for that purpose.
15:41
So it's definitely a combination. I mean,
15:43
it's a fine-tuned version of O3.
15:46
O3 is a very smart and powerful
15:48
model. A lot of the analysis capability
15:50
is also from the underlying O3 model
15:53
training. But so I think it's definitely
15:55
a combination. Before OpenAI, I was working
15:57
at a startup and we were dabbling
16:00
in building agents kind of the way
16:02
that I see most people describe building
16:04
agents on the internet, which is essentially,
16:07
you know, you construct this graph of
16:09
operations and some of the nodes in
16:11
that graph are language models. And so
16:14
you can, the language model can decide
16:16
what to do next, but the overarching
16:18
logic of the sequence of steps that
16:21
happen is defined by a human. What
16:23
we found is that it's really, it's
16:25
like a powerful way of building things
16:28
to get quickly to a prototype, but
16:30
it falls down pretty quickly in the
16:32
real world because it's very hard to
16:35
anticipate all the scenarios that the model
16:37
might face and think about all the
16:39
different branches of the path that you
16:41
might want to take. In addition to
16:44
that, the models often are not the
16:46
best decision makers at nodes in that
16:48
graph because they weren't trained
16:51
to make those decisions. They were trained
16:53
to do things that look similar to
16:55
that. And so
16:58
I think the thing that's really powerful
17:00
about this model is that it's trained
17:02
directly end to end to solve the
17:05
kinds of tasks that users are using
17:07
it to solve. So you don't have
17:09
to set up a graph or make
17:12
those node-like decisions on the architecture on
17:14
the back end? It's all driven by
17:16
the model itself. Yeah.
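For contrast with the end-to-end approach, the hand-wired "graph of operations" pattern Josh describes might look roughly like the sketch below. The pipeline, node functions, and prompts are invented for illustration; the key property is that the ordering and branching live in human-written code rather than in a trained policy.

```python
# Toy sketch of the hand-wired "graph of operations" pattern described above.
# The control flow is fixed by a human; a language model only fills in the nodes.
# Everything here is invented for illustration.

from typing import Callable


def call_llm(prompt: str) -> str:
    """Stand-in for a call to some language model."""
    raise NotImplementedError


def plan_queries(task: str) -> list[str]:
    # Node 1: the model proposes a fixed number of search queries.
    return call_llm(f"List three web search queries for: {task}").splitlines()


def summarize(results: list[str]) -> str:
    # Node 2: the model summarizes whatever the searches returned.
    return call_llm("Summarize these search results:\n" + "\n".join(results))


def hand_wired_agent(task: str, search: Callable[[str], str]) -> str:
    # The ordering and branching are decided here, in human-written code, rather
    # than learned end to end. Situations the author didn't anticipate (a dead
    # end, contradictory sources, a page that needs a follow-up search) have no
    # path through this graph.
    queries = plan_queries(task)
    results = [search(q) for q in queries]
    return summarize(results)
```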
17:19
Can you say more about this? You know, it seems
17:21
like that's one of the very opinionated
17:23
decisions that you've made and clearly it's
17:26
worked. There's so many companies that are
17:28
building on your API, kind of prompting
17:30
to, you know, solve
17:33
specific tasks for specific users. Do you
17:35
think a lot of those applications would
17:37
be better served by kind of having,
17:40
you know, trained models end-to-end for their
17:42
specific workflows? I think if you have
17:44
a very specific workflow that is quite
17:47
predictable, it makes a lot of sense
17:49
to do something like Josh described, but
17:51
if you have something that has a
17:54
lot of edge cases or it needs
17:56
to be quite flexible, then I think
17:58
something similar to Deep Research is probably
18:01
a better approach. Yeah, I think like
18:03
the guidance I give people is the
18:05
one thing that you don't want to
18:07
bake into the model is like kind
18:10
of hard and fast rules. Like if
18:12
you have, you know, a database that
18:14
you don't want the model to touch
18:17
or something like that, it's better to
18:19
encode that in human written logic, but
18:21
I think it's kind of like a
18:24
lesson that I've seen people learn over
18:26
and over again in this field is
18:28
like, you know, we think that we
18:31
can do things that are smarter than
18:33
what the models do by writing it
18:35
ourselves. But in reality, like usually as
18:38
the field progresses, the models come up
18:40
with better solutions to things than humans
18:42
do. And also like, you know, the
18:45
like probably like number one lesson on
18:47
machine learning is like you get what
18:49
you optimize for. And so if you're
18:52
able to set up the system such
18:54
that you can optimize directly for the
18:56
outcome that you're looking for, the results
18:59
are going to be much, much better
19:01
than if you sort of try to
19:03
glue together models that are not optimized
19:06
end-to-end for the tasks they are trying
19:08
to have them do. So my long-term
19:10
guidance is that I think like reinforcement
19:13
learning tuning on top of models is
19:15
probably going to be a critical part
19:17
of how the most powerful agents get
19:20
built.
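To make "you get what you optimize for" concrete, reinforcement-learning tuning of this kind needs a reward computed from the outcome you care about rather than from any intermediate step. The sketch below is a toy grading function for a finished research report; the rubric and weights are invented for illustration and are not how OpenAI scores its training tasks.

```python
# Toy illustration of optimizing directly for the outcome: score the *finished
# report*, not any intermediate step, and use that score as the reward signal
# for reinforcement-learning fine-tuning. The rubric and weights are invented.


def outcome_reward(report: str, required_facts: list[str], cited_sources: list[str]) -> float:
    """Reward a finished report for covering the facts the task asked for and
    for backing its claims with at least one citation."""
    if not required_facts:
        return 0.0
    coverage = sum(fact.lower() in report.lower() for fact in required_facts)
    coverage_score = coverage / len(required_facts)
    citation_score = 1.0 if cited_sources else 0.0
    return 0.8 * coverage_score + 0.2 * citation_score


# During training, each rollout of the agent on a task would be scored like this,
# and the policy updated to make high-reward rollouts more likely, so the whole
# end-to-end behavior is optimized rather than each hand-designed step.
```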
19:22
What were the biggest technical challenges along the way to making this work?
19:24
Well, I mean, maybe I can say
19:26
as like an observer rather than someone
19:29
who was involved in this from the
19:31
beginning, but it seems like kind of
19:33
one of the things that ESA and
19:36
the rest of the team worked really,
19:38
really hard on and was kind of
19:40
like one of the hidden keys to
19:43
success was like making really high quality
19:45
data sets. It's another one of those
19:47
like age old lessons in machine learning
19:50
that people keep re learning, but the
19:52
quality of the data that you put
19:54
into the model is probably the biggest
19:57
determining factor in the quality of the
19:59
model that you get on the other
20:01
side. And Edward, who's another person who works on
20:04
the project, will just take any data set
20:06
and optimize it, so that's the secret to
20:08
success. Find your Edward. Great, great. With machine
20:11
learning model training, how do you make
20:13
sure that it's right? Yeah, so that's
20:15
obviously a cool part of this model
20:18
and product is that we want
20:20
users to be able to
20:22
trust the outputs. So part of that
20:25
is we have citations and so users
20:27
are able to see where the model
20:29
is citing its information from. And we,
20:32
during training, that's something that we actually
20:34
try and make sure is correct, but
20:36
it's still possible for the model to
20:39
make mistakes or hallucinate or trust a
20:41
source that maybe isn't the most trustworthy
20:43
source of information. So that's definitely an
20:46
active area where we want to continue
20:48
improving the model. How should we think
20:50
about this together with, you know, O3
20:52
and operator and other different releases? Like,
20:55
does this use operator? Do these all
20:57
build on top of each other or
20:59
are they all kind of a series
21:02
of different applications of O3? Today, these
21:04
are pretty disconnected. But you can kind
21:06
of, you can imagine kind of where
21:09
we're going with this, right, which is
21:11
like, the ultimate agent that people have
21:13
access to at some point in the
21:16
future should be able to do, you
21:18
know, not just web search or using
21:20
a computer or any of the other
21:23
types of actions that you'd want, like
21:25
kind of a human assistant to do,
21:27
but should be able to fuse all
21:30
these things in a more natural way.
21:32
Any other design decisions that, you know,
21:34
you've taken that maybe not obvious at
21:37
first glance? I think one of them
21:39
is the clarification flow. So if you've
21:41
used deep research, the model will ask
21:44
you questions before starting research. And usually
21:46
ChatGPT maybe will ask you a question
21:48
at the end of its response, but
21:51
it usually doesn't have that
21:53
kind of behavior up front. And that
21:55
was intentional because you will get the
21:58
best response from the deep research model
22:00
if... the prompt is very well specified
22:02
and detailed. And I think that it's
22:05
not the natural user behavior to give
22:07
all of the information in the first
22:09
prompt. So we wanted to make sure
22:12
that if you're going to wait five
22:14
minutes, 30 minutes, that your response is
22:16
as detailed and, you know, satisfactory. So we
22:18
added this additional step to make sure
22:21
that the user provides all the detail
22:23
that we would need. And I've actually
22:25
seen a bunch of people on Twitter
22:28
saying that they have this flow or
22:30
that they will talk to 01 or
22:32
01 Pro to help make their prompt
22:35
more detailed. And then once they're happy
22:37
with the prompt, then they'll send it
22:39
to deep research, which is interesting. So
22:42
people are finding their own workflows for
22:44
how to use this. So
22:47
there's been three different deep research products
22:49
launched in the last few months. Tell
22:51
us a little about what makes you
22:53
guys special and how we should think
22:56
about it And they're all called deep
22:58
research, right? They're all called deep research.
23:00
Yeah, not a lot of naming creativity
23:02
in this field. I think
23:05
people should trial them for themselves
23:07
and get a feel. I
23:09
think the difference in like quality I
23:12
think they all have pros and cons,
23:14
but I think the difference will be
23:16
clear. But what that comes down to
23:18
is just the way that this model
23:21
was built. And the sort of the
23:23
effort that went into constructing the data
23:25
sets, and then the engine that
23:27
we have with the O-series models, which
23:30
allows us to just optimize models to
23:32
make things that are like really smart
23:34
and really high quality. We had the
23:37
O-1 team on the podcast last year
23:39
and we were joking that O-Net is
23:41
not that good at naming things. I
23:43
will say this is your best-named product.
23:46
Deep Research. At least it describes what
23:48
it does, I guess. Yeah. So I'm
23:50
curious to hear a little about where
23:52
you want to go from here. You
23:55
have deep research today, what do you
23:57
think it looks like a year from
23:59
now, and what maybe our complementary things
24:02
you want to build along the way?
24:04
Well, we're excited to expand the data sources
24:06
that the model has access to. We've
24:08
trained the model that's generally very good
24:11
at browsing public information, but it should
24:13
also be able to search private data
24:15
as well. And then I think just
24:18
pushing the capabilities further, so it could
24:20
be better at browsing, it could be
24:22
better at analysis. And then thinking about
24:24
how this fits into our agent roadmap
24:27
more broadly. Like I think the recipe
24:29
here is something that's going to scale
24:31
to a pretty wide range of use
24:33
cases, things that are going to surprise
24:36
people how well they work. But this
24:38
idea of you take a state-of-the-art reasoning
24:40
model, you give it access to the
24:43
same tools that humans can use to
24:45
do their jobs or to go about
24:47
their daily lives, and then you optimize
24:49
directly for the kinds of outcomes that
24:52
you're looking for, that you want the agent
24:54
to be able to do. That recipe,
24:56
there's like really nothing stopping that recipe
24:58
from scaling to more and more complex
25:01
tasks. So I feel like, yeah, AGI
25:03
is like an operational problem now. And
25:05
I think, yeah, a lot of things
25:08
to come in that general formula. So
25:10
Sam had a pretty striking quote of
25:12
deep research will kind of take over
25:14
a single-digit percentage of all economically
25:17
valuable tasks in the world. How should
25:19
we think about that? Deep Research is
25:21
not capable of doing all of what
25:24
you do, but it is capable of
25:26
saving you like hours or sometimes, in
25:28
some cases, days at a time. And
25:30
so I think like, what we're hopefully
25:33
relatively close to is deep research and
25:35
the agents that we build on top
25:37
of it, giving you, you know, one,
25:39
five, ten, 25% of your time back,
25:42
depending on the type of work that
25:44
you do. I mean, I think you
25:46
really automated 80% of what I
25:49
do. So it's definitely on the higher
25:51
end for me. We just need to
25:53
start writing checks, I guess. Yeah. Are
25:55
there entire job categories that you think
25:58
are kind of more at risk is
26:00
the wrong word, but like more in
26:02
the in the strike zone for what
26:04
deep research is exceptional? So for example,
26:07
I'm thinking consulting, but like are there
26:09
specific categories that you think are more
26:11
in strike zone? Yeah, I used to
26:14
be in consulting. I don't think any jobs
26:16
are at risk at all. Like, it's,
26:18
but for these types of knowledge work
26:20
jobs where like where you are spending
26:23
a lot of your time kind of
26:25
looking through information making conclusions, I think
26:27
it's going to give people superpowers. Yeah,
26:29
I'm very excited about a lot of
26:32
the medical use cases, just the ability
26:34
to find all of the literature or
26:36
all of the recent cases for a
26:39
certain condition. I think I've already seen
26:41
a lot of doctors posting about this
26:43
or like they've reached out to us
26:45
and said oh we used it for
26:48
this thing we used it to help
26:50
find a clinical trial for this patient
26:52
or something like that so just people
26:55
who are already so busy just saving
26:57
some time or it's maybe something that
26:59
they wouldn't have had time to do
27:01
so, and now they are able
27:04
to have that information for them. Yeah
27:06
and I think the like the impact
27:08
of that is like maybe a little
27:10
bit more profound than it sounds on
27:13
the surface right it's not just like
27:15
you know getting 5% of your time
27:17
back but it's the type of thing
27:20
that might have taken you four hours
27:22
or eight hours to do, now you
27:24
can do for, you know, a ChatGPT
27:26
subscription and five minutes. And so,
27:29
like, what types of things would you
27:31
do if you had infinite time that
27:33
now maybe you can do, like, many,
27:35
many copies of? So, like, you know,
27:38
should you do research on every single
27:40
possible startup that you could invest in
27:42
instead of just the ones that you
27:45
have time to meet with, things like
27:47
that? Or on the consumer side, one
27:49
thing that I'm thinking of is, you
27:51
know, the working mom that's too busy
27:54
to plan a birthday party for her
27:56
toddler. Like, now she can.
27:58
So I agree with you, it's
28:01
way more important than 5% of your
28:03
time. It's all the things you couldn't
28:05
do before. Exactly. What does this change
28:07
about education and the way we should
28:10
learn? And you know, what will you
28:12
be teaching your kids now that we're
28:14
in the world of agents in deep
28:16
research? Education's been like one of the
28:19
top few things that people use it
28:21
for. I think, I mean, this
28:23
is true for ChatGPT
28:26
generally. It's like
28:28
learning things by talking
28:30
to an AI system that is able
28:32
to like personalize the information that gives
28:35
you based on what you tell it
28:37
or maybe in the future what it
28:39
knows about you. It feels like a
28:41
much more efficient way to learn and
28:44
a much more engaging way to learn
28:46
than like reading textbooks. We have some
28:48
lightning round questions. All right? Okay, your
28:51
favorite deep research use case. I'll say
28:53
yeah, like personalized education, just like learning
28:55
about anything I want to learn about.
28:57
I've already mentioned this, but I think
29:00
a lot of the personal stories that
29:02
people have shared about finding information about
29:04
a diagnosis that they've received or someone
29:07
in their family received have been really
29:09
great to see. Okay, we saw a
29:11
few application categories breakout last year. So
29:13
for example, coding being an obvious one.
29:16
What application categories do you think will
29:18
break out this year? I mean, clearly
29:20
agents. Agents. I was going to say
29:22
too. It's so
29:25
hard to keep up with the state
29:27
of the art in AI. What
29:29
would you recommend people read
29:32
to learn more about agents or
29:34
where the state of AI is going?
29:36
Could be an author too. Training Data.
29:38
Yeah, this podcast. I think it's
29:41
like it's so hard to keep up
29:43
with the state of the art in
29:45
AI. I think the like the general
29:47
advice I have for people is to
29:50
pick one or two subtopics that you're
29:52
really interested in and go like curate
29:54
a list of people who you
29:57
think are saying interesting things about it, and
29:59
follow those one or
30:01
two things you're interested in. Maybe
30:03
actually that's a good deep research use
30:06
case. Like, you know,
30:08
use it to, like, go deep
30:10
on things that you want to learn
30:12
more about. This is a bit old
30:15
now, but I think a few years
30:17
ago I watched what I think
30:19
was a good introduction to reinforcement
30:22
learning, so yeah, I would definitely second
30:24
any content by Pieter Abbeel, my grad
30:26
school advisor. Yeah, oh yeah. Okay, reinforcement
30:28
learning. It, you know, kind
30:31
of went through a peak and then
30:33
felt like it was in a little
30:35
bit of a lull again and
30:38
is peaking again. Is that the right
30:40
read on what's happening with RL?
30:42
It's so back. Yeah. Why now?
30:44
Because everything else is working. Like,
30:47
I think if you... maybe people who've
30:49
been following the field for a while
30:51
will remember the Yann LeCun cake analogy.
30:53
If you're building a cake, then most
30:56
of the cake is the cake. And
30:58
then there's a little bit of frosting
31:00
and then there's a few cherries on
31:03
top. And the analogy was that unsupervised
31:05
learning is the cake, supervised learning is the frosting, and reinforcement learning
31:07
is the cherries on top. When we
31:09
in the field were working on reinforcement
31:12
learning back in, you know, 2015, 2016,
31:14
it's kind of like... I think Yann
31:16
LeCun's analogy, which I think in retrospect
31:18
is probably correct, is that we were
31:21
like trying to add the cherries before
31:23
we had the cake. But now we
31:25
have language models that are pre-trained on
31:28
massive amounts of data and are incredibly
31:30
capable. We know how to,
31:32
you know, do supervised fine tuning on
31:34
those language models to make them good
31:37
at instruction following and like generally doing
31:39
the things that people want them to
31:41
do. And so now that that works
31:44
really well, it's like very ripe to
31:46
tune those models for. Any kind of
31:48
use case that you can define a
31:50
reward function for great. Okay. So from
31:53
this lightning round we got agents, you
31:55
know, the breakout
31:57
category in 2025 and reinforcement
31:59
learning is so
32:02
back. I love it. Thank you
32:04
guys so much for joining us. We
32:06
loved this conversation. Deep Research is an incredible product
32:08
and we can't wait to see what
32:10
comes of it. Thank you. Thank you.