Episode Transcript
0:00
We have 30 million software
0:02
engineers in the world, and
0:04
we have 750 million active
0:06
users of Excel. And now,
0:08
you'll ask, like, how does Wordware
0:10
compare to Excel? And what Excel
0:13
did in the 80s to data analytics
0:15
and numbers is what we're
0:17
trying to do to AI.
0:19
So in the 80s, you
0:21
either had to have a
0:23
team of data analytics people
0:25
or engineers, or you were using
0:27
a calculator. And I would
0:30
say the calculator equivalent here is
0:32
ChatGPT, where
0:34
every time you need to redo
0:37
the conversation, and every time
0:39
you need to instill your
0:41
own needs into it. What Wordware
0:43
is trying to do is
0:45
saying, hey, a lot of the
0:47
things that you do are repeatable,
0:50
similar to Excel, and you can
0:52
encode your taste into it with
0:54
Wordware. Today
1:09
we're joined by Filip Kozera, co-founder
1:12
of Wordware, who's building tools to
1:14
help bridge the gap between human
1:16
creativity and AI. Filip shares his vision for
1:18
why English is becoming the new assembly
1:20
language for LLMs, why he believes that
1:23
the future belongs not just to coders,
1:25
but to people he calls word
1:27
artisans, who can effectively communicate their
1:29
creative vision to an AI system, and what
1:31
that means for the future of programming
1:33
computers. Filip, thank you so much
1:35
for joining us today. I'm excited to
1:38
learn about you and Wordware and get
1:40
your take on how programming computers is
1:42
fundamentally going to change with large language
1:44
models. Thank you for the invite. Let's dig
1:47
right in. I want to start with Andrej
1:49
Karpathy. He had a tweet in 2023 that
1:51
went viral. The hottest new programming language is
1:53
English. What do you make of that? And
1:55
what does it mean relative to what you
1:58
are trying to build at Wordware? I think
2:00
syntax will not be as important.
2:02
You know, everyone will be somewhat
2:04
of a coder. People used to
2:06
have to know Python. Now English
2:09
is enough. However, you still need
2:11
to know what you're trying to
2:13
say. And in that way, I
2:15
would say, you know, not everyone
2:17
will be able to use it
2:19
because some people don't have that
2:21
much to say. And I would
2:23
maybe rephrase it a little bit
2:26
in a little bit more of
2:28
an exciting way for us, is
2:30
that the assembly language to LLMs
2:32
is English, but you still have
2:34
to structure it in the right
2:36
way, and you still have to
2:38
use some of the concepts even
2:41
from typical programming in order to
2:43
actually make sure that it does
2:45
what it's supposed to do. You
2:47
heard it here first, okay. English
2:49
is not the hottest new programming
2:51
language. It is the new assembly
2:53
language. And at Wordware, you're
2:55
trying to build that new
2:58
programming language to work with that
3:00
new assembly language. I think bringing
3:02
structure to human AI collaboration is
3:04
something that I've chosen to spend
3:06
the next 10 years of my
3:08
life on. And it's an exciting
3:10
problem, because right now we're trying
3:13
to mix the structure of programming,
3:15
which is very rigid and deterministic,
3:17
with something that's intrinsically fuzzy. And
3:19
that marrying of these concepts and
3:21
choosing the right affordances and the
3:23
right abstraction layers for that communication
3:23
to be incredibly easy, yet enable
3:28
people to do complex things is
3:30
hard. And this is what we
3:32
are trying to do with Wordware
3:34
as the engine to enable
3:36
the right mix of the two.
3:38
Let's talk a little bit more
3:40
about no code. I remember back
3:42
when I first got to
3:45
Sequoia in 2018, I remember no
3:47
code was the hottest thing ever
3:49
and it was like, you know,
3:51
we're not
3:53
going to program anymore. And, you
3:55
know, obviously some no code companies
3:57
have done extremely well, like Retool,
4:00
for example. But, you know, we
4:02
still have software engineers today, and
4:04
so far that no-code promise
4:06
hasn't really come to fruition. What's
4:08
different now like what has changed?
4:10
I think coming back to what
4:12
I just said as well on
4:15
the assembly language being English,
4:17
I kind of have this
4:19
small insecurity when people call Wordware
4:21
a no-code tool. Because we have
4:23
not yet reached a ceiling with
4:25
any of our clients. So you
4:27
can achieve absolutely everything. We've created
4:29
an ability to put in code
4:32
execution blocks, which act as
4:34
an escape hatch: if Wordware is
4:36
not fully capable of something, you
4:38
can still do it. And in
4:40
that way, you know, the document
4:42
format of how to structure agents,
4:44
and how to write them down
4:47
one by one is still very
4:49
similar to how code works. We
4:51
still have loops, we still have
4:53
conditional statements, and we, you know,
4:55
have flows calling other flows, which
4:57
really is function calling, you know.
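The structure described here — loops, conditionals, and flows calling other flows as function calls — can be sketched in ordinary code. This is a minimal illustration, not Wordware's actual syntax; the llm() stub and step names are hypothetical:

```python
# Hypothetical sketch of a Wordware-style agent "document".
# llm() stands in for a model generation; in the real product
# each step would be a prompt block, not a Python function.

def llm(prompt: str) -> str:
    """Stub generation: returns a canned answer for illustration."""
    return f"<answer to: {prompt}>"

def summarize(doc: str) -> str:
    # A sub-flow: one flow calling another is just a function call.
    return llm(f"Summarize this document:\n{doc}")

def agent(docs: list[str]) -> list[str]:
    results = []
    for doc in docs:                 # loop block
        summary = summarize(doc)     # flow calling another flow
        if len(doc) > 1000:          # conditional block
            summary += " [long document]"
        results.append(summary)
    return results

print(agent(["quarterly report", "x" * 2000]))
```

The point of the sketch is only the shape: inputs in, fuzzy generation steps in the middle, structured outputs back out.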
4:59
So in a way, we don't
5:02
see ourselves as a no-code tool.
5:04
And we kind of believe that
5:06
the word crafters, the word artisans
5:08
of the future are still coding.
5:10
It's just, you know, they are
5:12
structuring the English in a very
5:14
precise way to make sure that
5:16
the prompt is populated in the
5:19
right manner. Word crafters and word
5:21
asins. I like it. Wordware engineers,
5:23
you've heard it here first. How
5:25
are you using the English language
5:27
then to make, you know, since
5:29
you bristle at the no code
5:31
term, how are you making no
5:34
code more code like? And maybe
5:36
this is a good time to
5:38
just give a 30 second overview
5:40
on, you know, the word where
5:42
product and how people use it
5:44
to build things. Sure. So by
5:46
trying to bring that structure to
5:49
the intrinsically fuzzy English language, we've
5:51
created an editor where you can
5:53
use the similar concepts to software
5:55
engineering, functions, and marry it in
5:57
that editor, in that natural language
5:59
IDE, in a way that people
6:01
can construct agents. We're
6:03
not coaching; we go
6:06
straight to agents. And in my
6:08
definition, agents are almost like, they
6:10
are still software, they are still
6:12
a little bit like software taking
6:14
in inputs and outputting outputs, but
6:16
some of the stages of what
6:18
they do is fuzzy. So marrying
6:21
that structure in the editor, whatever
6:23
you build there, whatever you iterate
6:25
on there, you then have three
6:27
different ways of deploying it. You
6:29
can deploy it as an API
6:31
that will power your product or
6:33
that AI button or that AI
6:36
chatbot that is a little bit
6:38
more complex than just doing a
6:40
vanilla API call to Claude or
6:42
OpenAI. That's number one. Number
6:44
two is you can deploy it
6:46
as a workflow where the main
6:48
part of the workflow is not
6:50
like Zapier, it's more AI native.
6:53
and the brain of
6:55
the AI is a little bit
6:57
more complex than a couple of prompts
6:59
strung together. And the third thing
7:01
that we're building is the GitHub
7:03
for AI, let's say, for people
7:05
to share what they've done,
7:08
and other people are able to
7:10
then fork it and use these
7:12
things as components. Again, marrying some
7:14
concepts from software engineering: you use
7:16
other people's components and libraries and,
7:18
you know, you want to be
7:20
building on
7:22
the shoulders of giants, of
7:25
these AI thinkers. So again: Wordware is the
7:27
engine, the editor, the way to actually create
7:29
these agents and then three different
7:31
ways to deploy. I think one
7:33
of the beauties of code is
7:35
just its expressiveness and its precision.
7:37
You know exactly what
7:40
you're telling the machine to
7:42
do, and you're, you know,
7:44
expressing it in some languages in
7:46
the most precise way possible. English
7:48
is not like that. To your
7:50
point, it's fuzzy. So what about the steerability
7:52
of that? Or, you know, what
7:55
is the, I guess, abstract thing
7:57
you're doing with English to
7:59
make it more programmable? Yeah, I
8:01
think you're exactly correct. We are
8:03
trying to bring a little bit
8:05
more structure. It's not all the
8:07
way, because if you go all
8:09
the way to a programming language,
8:12
you lose the fuzziness and you
8:14
lose the power of it. But
8:16
it's hard. And right now, most
8:18
of the people, you know, we
8:20
had this wave of evaluation software
8:22
companies at some stage. And what
8:24
we've realized with a bunch of
8:27
companies that we work with is
8:29
that they don't know, like they
8:31
don't have these data sets to
8:33
use for evals. And what we
8:35
came up with is that editor
8:37
very quickly gets you to understand
8:39
and develop an intuition of what
8:42
works and what doesn't work. And
8:44
for now, this is the most
8:46
important thing. It's clicking run a
8:48
hundred times quickly and making sure
8:50
that what you've written has
8:52
enough structure in order to output
8:54
the things on the right side that
8:56
you want. And in that way,
8:59
you know as long as you
9:01
know what you're trying to achieve.
9:03
And this is very hard, you
9:05
know, we have a lot of
9:07
companies coming in and just saying,
9:09
AI, predict the weather. I literally had
9:11
a big customer say, can you
9:14
guys, like, predict the weather for me?
9:16
And you know, that's not the
9:18
case. You need a document where
9:20
you outline what you're trying to
9:22
do and, you know, even that
9:22
has enough structure: if you've
9:24
done an intro, those are the
9:26
inputs. You'll be playing around with
9:31
images and PDFs, and then you'll
9:33
manipulate it in this way, and
9:35
then you'll get some outputs. And
9:37
developing that intuition is just enough
9:39
structure for today. Soon, maybe we'll
9:41
be able to do better evals,
9:43
but for now, a lot of
9:46
people really don't know. At the
9:48
moment when they start playing around
9:50
with Wordware, they don't know what
9:52
they are trying to achieve. And
9:54
one of our customers coined this
9:56
term: the speed of creativity with
9:58
Wordware is higher. So they learn
10:00
what they are actually trying to
10:03
do as they encounter problems
10:05
with the underlying models and
10:08
they realize Gemini 2.0 pro
10:10
might be better and maybe
10:12
Gemini can take huge PDFs
10:14
and Claude can kind of
10:17
take PDFs but smaller and
10:19
then GPT-4 cannot take PDFs
10:21
and you know they develop
10:24
that understanding and that helps
10:26
them to structure their thoughts.
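The intuition described here — which model accepts which inputs, and at what size — is the kind of rule a builder ends up encoding. A hypothetical sketch; the capability limits below are illustrative placeholders, not the providers' real constraints:

```python
# Hypothetical model router based on input size, illustrating the kind
# of intuition described above. The page limits are made up for the
# example, not the providers' real constraints.

MODELS = [
    ("gemini-2.0-pro", 1_000),  # handles huge PDFs (illustrative limit)
    ("claude",         100),    # handles smaller PDFs (illustrative)
    ("gpt-4",          0),      # no PDF input at all in this sketch
]

def pick_model(pdf_pages: int) -> str:
    # Choose the least-capable model whose limit still covers the input.
    for name, max_pages in sorted(MODELS, key=lambda m: m[1]):
        if pdf_pages <= max_pages:
            return name
    raise ValueError("no model in this sketch accepts a PDF that large")

print(pick_model(50))   # a mid-sized PDF routes to the mid-tier model
```

Encoding that intuition once, instead of rediscovering it in every chat, is the repeatability argument made earlier in the episode.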
10:29
How do you instruct the machine
10:31
to go from intent to outcome?
10:33
Like let's say I'm a brilliant
10:35
filmmaker and I want, you know,
10:38
I want to use Wordware to
10:40
create the next hit. What do
10:42
I do in Wordware in order
10:44
to make that happen? Yeah, so
10:47
for now we focus on knowledge
10:49
workers where that is a little
10:51
bit easier. I think using your
10:54
example for figuring out how the
10:56
future will be, you know, if we
10:58
have... let's say, you know, George
11:00
Lucas, playing around, hypothetically
11:02
playing around with GPT-7
11:04
and trying to create
11:06
Star Wars, he might
11:08
type in just a prompt
11:10
being like the two sentences,
11:12
and he would just say,
11:14
hey, create a movie about
11:16
wars between star systems. And
11:18
that's just enough to give
11:20
a model, an ability to
11:22
run on its own. And
11:24
this is actually not what
11:26
you want. You want to
11:29
convey your creative vision, but
11:31
for now it's
11:33
just knowledge work, but soon
11:35
it will be all work where it
11:37
needs that human sprinkle, that human
11:39
taste. And this is the thing
11:41
that I really value. Like,
11:43
people say, oh, nobody will have a
11:46
job whatsoever. I don't agree with it
11:48
at all. I think the human taste and
11:50
how you do things will matter even
11:53
more. And I use this George Lucas
11:55
example because it's a little bit easier
11:57
to understand how taste is influencing that.
11:59
But... everywhere, writing a good
12:01
email is dependent on good taste.
12:03
Figuring out — I was just hiring
12:06
for an executive assistant, and, like,
12:08
everyone in our company needs to
12:10
show that taste and needs to
12:12
show a little bit more conviction
12:14
in the way that they do
12:17
things. So for the executive assistant, like,
12:19
she needed to choose the right
12:21
restaurant for our off-site, you know?
12:23
And that also has taste. I
12:25
don't want to trust an AI
12:28
with this. So yeah, that's kind
12:30
of that sprinkle of human touch
12:32
is very important. So taste as
12:34
the last bastion of humanity? I
12:36
think so. I think creativity and
12:39
taste. Do you think machines can
12:41
learn human taste? I think they
12:43
can. But that's not the point.
12:45
There is an interesting analogy here.
12:47
They put humans into an MRI
12:50
machine and they've shown them two
12:52
different pieces of art. Both of
12:54
them were done by AI. And
12:56
they told the people, hey, one
12:58
is created by a human artist
13:01
and another one is created by
13:03
AI. Our brains work completely differently:
13:05
different parts of the
13:07
brain fire when we assume human
13:09
intent behind something. So you know,
13:12
I can create a song and
13:14
just like it will be a
13:16
good song with Suno or whatever
13:18
and send it to my friend
13:20
and he'll be like, yeah, it's
13:23
a cool song. But if I
13:25
ingrain my intent and create
13:27
a song about our, you know,
13:29
skiing trip to Chamonix, and I
13:31
make it funny,
13:34
that intent will be there. I'm
13:36
pretty sure his brain will be
13:38
firing in a completely different way,
13:40
giving him like, giving him a
13:42
completely different experience in that way.
13:45
Does that make sense? It makes
13:47
a lot of sense. Yeah. I
13:49
love it. I'm going to transition
13:51
to talking about the next billion
13:53
developers, and you've alluded to this
13:56
a few times in the conversation,
13:58
and I really want to just
14:00
pull on that thread. So you
14:02
started this, you know, this conversation
14:04
by saying, you know, there's a
14:07
certain set of people in the
14:09
world that know how to code,
14:11
but there's a different set of people
14:13
in this world that have creativity
14:15
and have ideas. Do you think
14:17
that set of people is larger?
14:20
Is that how we get to
14:22
the next billion developers? There's an
14:24
interesting analogy here. We have 30
14:26
million software engineers in the world,
14:28
and we have 750 million active
14:31
users of Excel. And now... you'll
14:33
ask, like, how does Wordware
14:35
compare to Excel? And what Excel
14:37
did in the 80s to data
14:39
analytics and numbers is what we're
14:42
trying to do to AI. So
14:44
in the 80s, you either had
14:46
to have a team of data
14:48
analytics people or engineers, or you're
14:50
using a calculator. And I would
14:53
say the calculator equivalent here is
14:55
ChatGPT, where
14:57
every time you need to redo
14:59
the conversation, and every time
15:01
you need to instill your own
15:04
needs into it. What Wordware
15:06
is trying to do is saying,
15:08
hey, a lot of the things
15:10
that you do are repeatable, similar
15:12
to Excel, and you can encode
15:15
your taste into it with
15:17
Wordware. And I believe that that
15:19
taste will be important as mentioned
15:21
before and I think the next
15:23
500 million or a billion users
15:26
of AI — I don't know what
15:28
they will be called; hopefully
15:30
the term is Wordware
15:32
engineers, but, you know, word artisans
15:34
or whatever — and the really important
15:37
part here is that they need
15:39
to know what the AI is
15:41
supposed to do. Many, many times,
15:43
you know, we are a horizontal
15:45
tool and people come to us
15:48
and they say, hey, what can
15:50
I do with AI? And I
15:52
tend to explain it as if
15:54
for now it's an intern, but
15:56
an intern fresh out of university, and you need
15:59
to write out on a piece
16:01
of paper a couple of things
16:03
that you want to do with
16:05
the intern. So you need to
16:07
say, hey, this is your job.
16:10
This is the title of what
16:12
you're trying to do. Here are
16:14
some of the documents or input
16:16
that you'll be working with. Here
16:18
are the data sources, and here's
16:21
the output that I'm expecting of
16:23
you. And the important caveats here
16:25
that people don't often understand is
16:27
that the data sources have to
16:29
be something that you trust. You
16:32
can't just say, hey, go and
16:34
search the internet, because often you
16:36
end up with things that you
16:38
don't agree with, and if the
16:40
intern works on top of that,
16:43
that's a problem. And another one
16:45
is — and this is very important —
16:47
whether you're gonna trust the
16:49
intern with this. So, you know,
16:51
if you want to send a
16:53
thousand emails to every person that
16:56
needs a response in your inbox,
16:58
with some people you just wouldn't
17:00
trust an intern to do this.
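The intern brief described above — a job title, the inputs, trusted data sources, and the expected output — is essentially a prompt template. A minimal sketch; the function and field names are hypothetical, not Wordware's:

```python
# Minimal "intern brief" prompt template, assuming the four parts
# described above: job title, inputs, trusted sources, expected output.
# The structure is an illustration, not Wordware's actual format.

def intern_brief(title: str, inputs: list[str],
                 sources: list[str], expected_output: str) -> str:
    return "\n".join([
        f"Job: {title}",
        "Inputs you will work with:",
        *[f"- {i}" for i in inputs],
        "Data sources you may trust (do not search the open internet):",
        *[f"- {s}" for s in sources],
        f"Expected output: {expected_output}",
    ])

print(intern_brief(
    title="Draft replies to unanswered emails",
    inputs=["the email thread"],
    sources=["our Dropbox folder", "our Notion wiki"],
    expected_output="a short reply in our house style",
))
```

Restricting the sources line is the "trusted data" caveat from the passage: the brief names where the intern may look, rather than letting it search anywhere.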
17:02
And this is how AI works
17:04
right now. So as long as
17:07
you have a job right now
17:09
where you say, hey, if I
17:11
had an intern, I could easily
17:13
explain it to them. And a
17:15
lot of our work is like
17:18
this right now. We often read
17:20
an email, go search Dropbox,
17:22
go search Notion, and then
17:24
we create a response that is
17:26
essentially based on this database that
17:29
we curate, then you can be
17:31
using AI. And I think more
17:33
and more, this knowledge work is
17:35
gonna be automated. And I think,
17:37
no, in that way, the next
17:40
billion people are gonna be, they're
17:42
gonna need two things, intent, what
17:44
actions to happen, and taste, how
17:46
do you want to do this?
17:48
And with all of the rest, we will
17:51
feel like CEOs of the biggest
17:53
enterprises, because we will have a
17:55
thousand knowledge workers working beneath us
17:57
and trying to actually execute on
17:59
these two things. What I heard
18:02
just now was a lot of
18:04
automation about knowledge work. I mean,
18:06
the thing that I'm most intrigued
18:08
about within AI and within generative
18:10
AI is its generative capacity, including
18:13
the ability to create, you know,
18:15
you mentioned the... George Lucas example,
18:17
but also to create, you know,
18:19
new applications, new marketplaces, you know,
18:21
new products. And so do you
18:24
imagine, do you see Wordware primarily
18:26
serving, you know, making the knowledge
18:28
worker more productive? Or do you
18:30
see it also assisting in kind
18:32
of the creation of new products,
18:35
services, you know, pieces of art?
18:37
For now, it's mostly about the
18:39
productive work, I would say. It's,
18:41
you know, the AI engine,
18:43
the AI heart of your product,
18:46
is Wordware. Currently, we have not
18:48
dipped into the generative UI part
18:50
of things. We're not Lovable. We've
18:52
actually used Lovable to, you know,
18:54
wrap our AI heart for some
18:57
of our customers. And that has
18:59
worked great. I'm just so impressed
19:01
with their product. Talk more about
19:03
this. Lovable, I think, also sees
19:05
themselves as, you know, enabling the
19:08
next billion developers. You see, you
19:10
have a similar vision. You know,
19:12
how do you think your view
19:14
of how the world will go
19:16
fits with their view? And like,
19:18
why didn't you choose to make
19:21
a, you know, that style of
19:23
no code tool? Yeah, because the
19:25
way that I see the word
19:27
developer is a little bit different.
19:29
They see the word developer as
19:32
what developers do today, which is,
19:34
you know, a lot of SaaS
19:36
is a wrapper around a database
19:38
with some dashboard and ability to
19:40
manipulate that data. They are creating
19:43
a much more personalized dashboard, you
19:45
know, and a lot of people
19:47
are going to create incredible vertical
19:49
SaaS based on Lovable. And I
19:51
think that's incredible. And the one
19:54
thing that was missing through all
19:56
of this is that they not
19:58
only grab the UI part, they
20:00
also grab the database part, which
20:02
many people do not know how
20:05
to manipulate and hence they unlocked
20:07
a lot more use cases.
20:09
But what we are trying to
20:11
say is that this part of
20:13
creating a generative, creating a UI
20:16
on top of a database is
20:18
not the future. The future is
20:20
to actually utilize this reasoning engine
20:22
that an LLM is, in a
20:24
productive manner, and we focus on
20:27
that substance of AI at the
20:29
beginning. In the future, we might
20:31
want to expose that engine
20:33
in a UI, maybe it's
20:35
a chatbot, maybe it's, you know,
20:38
digesting some images, etc. But the
20:40
real important part is the AI
20:42
engine. And yeah. So if you
20:44
think about an app as, you
20:46
know, there's the UI, there's the
20:49
application logic, and there's a database,
20:51
what you're saying is, you know,
20:53
you really want to just, you
20:55
know, knock it out of the
20:57
park on the application logic, so
21:00
to speak. And right now we're
21:02
able to work on a lot
21:04
more data, which is not structured.
21:06
And this is the big, big
21:08
difference right now. A lot of
21:11
SaaS right now will still work
21:13
in a similar manner, just the
21:15
database is fuzzy. And the database
21:17
might be what you see every
21:19
day. And how the hell would
21:22
you put that in a typical
21:24
normal, you know, database? And I
21:26
think working on top of that
21:28
context is the really exciting part
21:30
for me. Let's go back to
21:33
this concept of word artisans or
21:35
Wordware engineers if everything goes
21:37
right. You know, what's your vision
21:39
of what a Wordware engineer
21:41
looks like in 10 years? Oh,
21:44
that's a tough one. I've been
21:46
thinking a lot about what does
21:48
work look like in 10 years
21:50
for human beings? And I was
21:52
struggling with this at the beginning
21:54
because it's really hard to understand
21:57
people's jobs even today. And often
21:59
I boil it down to the
22:01
software that they use. They can
22:03
talk a big game about strategy
22:05
and, you know, I set the
22:05
mission, but at the end of
22:08
the day I ask them: do
22:12
you do meetings, do you do
22:14
email, do you do PowerPoint presentations,
22:16
do you work in Excel, or
22:19
do you work in code? And
22:21
I just want to understand what
22:23
does work look like in 10
22:25
years and like what are you
22:27
really working with? You know, is
22:30
it a Her-like interface, where
22:32
you just talk to the AI
22:34
and it does a lot of
22:36
work for you. I think, to
22:38
be honest, voice is kind of
22:41
not the best modality to express
22:43
that. Hence I kind of think
22:45
that in its simplest form, Wordware
22:47
is a document where you
22:49
jot down your thoughts and you
22:52
do it in a more structured
22:54
way, with a Wordware copilot AI
22:56
helping you structure it throughout,
22:58
and at the end of the
23:00
day you behave like a CEO
23:03
who sets the strategy and intent, and
23:05
all of this on that piece
23:07
of paper, essentially on this blank
23:09
canvas, and you draw that vision.
23:11
Maybe it's even more than words,
23:14
it's just, you know, you generate
23:16
this vision of how your own
23:18
enterprise works. And, you know, I
23:20
look at different things around us,
23:22
and I see furniture or shoes
23:25
or whatever, and I think there
23:27
is taste ingrained into what kind
23:29
of shoes you like to make.
23:31
So in 10 years, if somebody
23:33
wants to become a creator of
23:36
the best brand of shoes, it
23:38
becomes about that: the shoe becomes a
23:40
luxury object, which has ingrained taste
23:42
and intent in it. And then
23:44
a bunch of things in the
23:47
end will be, will happen on
23:49
its own. The really tough parts,
23:51
even manufacturing it and so on,
23:53
will happen on its own. But
23:55
your job in the future
23:58
is talking to other CEOs. Maybe it will
24:00
not be necessary, but humans don't want
24:02
to lose that control. So you
24:04
will talk with other CEOs about
24:06
maybe doing a partnership with your
24:08
shoe brand and somebody else. You
24:10
will have to be still critical
24:12
about the intent of the other
24:14
person and you have to instill taste
24:17
and your own creative vision into that
24:19
shoe. Do you think that, you know, do
24:21
you think a billion people globally will
24:23
be capable of programming a machine in
24:25
the English language the way you describe, or
24:27
in Wordware documents the way you describe?
24:29
Because it does require — you know, it's
24:31
almost like pseudo-coding, and you
24:33
know, there is logic, there's loops
24:35
and things like that. I guess maybe
24:38
talk about today like do you need
24:40
to be technical in order to use
24:42
Wordware like who is the ICP today
24:44
and what is needed to move that
24:46
ICP so that you can reach a
24:48
billion developers yeah I think
24:50
Right now, what we've enabled
24:52
is people who are somewhat
24:54
technical, CEOs, technical PMs, high
24:57
up in the org chart
24:59
to ingrain their own — like, they
25:01
kind of think that they know
25:03
what needs to happen — and get
25:05
there quicker. You know, so Max
25:08
from InstaGuard, for example, he
25:10
is a founder and he
25:12
spent four days just, you
25:15
know, refining his idea in
25:17
Wordware, instead of hiring a
25:19
whole team, but he is
25:21
somewhat analytically minded. And for now
25:23
that's the case. We did not want
25:26
to make too much magic because the
25:28
models were not there. Right now what
25:30
we're doing is we're moving more
25:32
into that blank canvas when you
25:34
just describe the idea and we
25:36
take care of guessing the right
25:38
structure. And you still will be
25:41
able to, you know, in a
25:43
very fine-grained way, edit it. But
25:45
you will start playing a lot
25:47
more; we'll probably use o3 to
25:49
get you to the first draft
25:51
of how that flow works. And
25:53
when we kind of loop back
25:56
to the future of how it
25:58
will all look like, and really
26:00
whether we'll have one billion developers,
26:02
you know, working in Wordware,
26:04
it becomes a much bigger question
26:06
here. It becomes a question of
26:08
like, will a billion people want
26:10
to do productive work? Like, you
26:12
know, we just talked about the
26:14
shoe. How many people will have
26:16
the drive to put out something
26:18
to the world and they will
26:20
want to express that creative vision?
26:23
Maybe in a, you know, post-resource-scarcity
26:25
world, most people won't
26:27
want to work, but I think we'll
26:29
still have the equivalent of billionaires
26:31
and it will be about influence,
26:33
it will be about taste, and
26:35
it will be about how you
26:37
utilize your own resources and how
26:39
do you multiply it to have
26:41
the equivalent of future money. And
26:43
I went a little bit deep
26:45
here. For what it's
26:47
worth, I think
26:49
the innate drive to create is,
26:51
like, a deeply human drive, and
26:53
I think that exists in a
26:56
post-capitalistic world. I also have that
26:58
opinion and I really believe in
27:00
humans. Like, I want them to
27:02
succeed. Like, somebody asked me, one
27:04
of our prospective employees asked me,
27:06
what, like, Filip, in 10 years,
27:08
what do you want there to
27:10
be, like, what have you done?
27:12
And I want to save, like,
27:14
the human creative vision. I don't
27:16
want everything to be AI. I
27:18
really have the pleasure when I
27:20
go to an artisan shop on
27:22
my holiday, and I know that
27:24
somebody put in the intent and
27:26
put in the work, and I
27:29
want to interact with the story
27:31
of it. Okay, so today your
27:33
ICP is the analytical creative, which
27:35
is a little bit of a
27:37
unicorn, and over time, as you
27:39
can lower, as the models get
27:41
better, as you iterate on your
27:43
interface, you'll lower the bar, so
27:45
they'll really be just more the
27:47
creative, is your ideal user, you're
27:49
going to lower the bar of
27:51
how analytical you need to be
27:53
in order to use Wordware. Yes,
27:55
but at the same time, you
27:57
know, my use of the word
28:00
creative is not what most
28:02
people associate it with right now. I
28:04
think a good creative is also
28:06
using growth channels in the right
28:08
manner. They are creative about everything
28:10
that they do in this new...
28:12
uncertain world of AI where everything
28:14
is changing. And, you know, I'm
28:16
not thinking only about an artist
28:18
that's painting on the canvas. It's,
28:20
I think creativity is, can basically
28:22
show itself in so many different
28:24
aspects of work. Let's talk about
28:26
user interfaces and, you know, the
28:28
future GUIs, the GUIs of the
28:30
future. Right before we filmed this
28:33
podcast you made the analogy that
28:35
you know, the transformer is the new
28:37
transistor. Maybe say a little bit
28:39
more about that and what you
28:41
think the new GUI is going
28:43
to be. So I think the
28:45
analogy here is that if the
28:47
LLM is — well, if the transformer is
28:49
the new transistor, and it's
28:51
being packaged as the model, the
28:53
model is kind of the mainframe,
28:55
let's call it, you know, and
28:57
then we took our sweet time
28:59
to utilize the power of that
29:01
mainframe in a GUI that's accessible
29:03
by billions of people. You know,
29:06
there has been really two big
29:08
spikes there. The first was the
29:10
desktop, you know, and Apple came
29:12
in coming up with their GUI.
29:14
And the second one's mobile. And,
29:16
you know, right now, we are
29:18
almost like exposing the numbers and
29:20
the logic in a chat style
29:22
thing, and nobody has had a
29:24
better idea. We think that the
29:26
document style is better for doing
29:28
kind of more complex work. Because
29:30
often when you try to achieve
29:32
something you just give it two
29:34
sentences and the model just runs
29:36
on its own. And the two sentences are just
29:39
enough — our lead
29:41
investor recently said that two
29:43
sentences is just about enough for
29:45
a model to hang itself on —
29:47
and, you know, you will get
29:49
something completely different than what you
29:51
actually wanted, and this is
29:53
a problem for, like, the Lovables and
29:55
Devins of this world as well.
29:57
But I basically think that
29:59
better interfaces are coming, and, you
30:01
know whether they will be based
30:03
on AR or you know there
30:05
will be an assistant that's listening
30:07
to everything that we do.
30:09
That was actually my first company:
30:12
augmenting human memory with always-on
30:14
listening devices, using GPT-2 and
30:16
BERT. I've been in this since
30:18
the GPT-2 days. I mean,
30:20
my research was into LSTMs,
30:22
which are the precursor to the
30:24
transformer architecture, and I've been in
30:26
this for a while. I think
30:28
nobody has yet delivered on this
30:30
I want everything that I hear
30:32
to be somewhere in a searchable
30:34
database that also has the perfect
30:36
context about me, you know, the
30:38
way that I want to do
30:40
things. And I think those affordances
30:42
and those — like, we call it a
30:45
GUI, but it's really the underlying
30:47
way of interacting with intelligence, is
30:49
not going to be mainly chat.
30:51
I just don't believe it. Programmable
30:53
documents. Do you think that is
30:55
fundamentally what Wordware looks like,
30:57
UI-wise, in, call it, five
30:59
years? I think there is more
31:01
and more magic in it, and,
31:03
uh... I would believe that I
31:05
want people still to be able
31:07
to do that fine grain work.
31:09
You know, we've likened it to
31:11
George Lucas doing a movie, you
31:13
know, in a way you almost
31:15
want to firstly start with the
31:18
high level thing, the two-sentence
31:20
description, and then zoom in, and
31:22
zoom in, and zoom in, and
31:24
create, you know, modules which make
31:26
the best scene that is five
31:28
seconds, and then combine them together
31:30
in that way. So what I
31:32
would like Wordware to be is
31:34
to transcend abstraction layers and, you
31:36
know, be able to zoom it
31:38
all out, start with a sentence
31:40
and have it run, maybe see
31:42
whether it's working in the right
31:44
manner, and then, as you see
31:46
that some things are not doing
31:49
the thing that you want them
31:51
to do, to be able
31:53
to zoom in and
31:55
see, maybe, you know, four sentences
31:57
of exactly what it is doing, and
31:59
what are the inputs to this,
32:01
what it's trying to do in
32:03
the middle and what are the
32:05
outputs, you know, that's kind of
32:07
the most simple one level in,
32:09
and then you want to zoom
32:11
in more and more and more
32:13
as you redefine and reiterate on
32:15
your idea of how this should
32:17
be done. How did you arrive
32:19
at the current user interface? I
32:22
think it does feel really novel
32:24
compared to how others are, you
32:26
know, enabling AI builders today. How
32:28
did you arrive at the current
32:30
user interface? Was it more experimentation,
32:32
listening to users? Was it you
32:34
philosophizing about, you know, what it
32:36
should be? I think currently the...
32:38
the approach to creating these agents
32:40
was block-based, on the
32:42
2D canvas. And, you know,
32:44
I've been building agents for a
32:46
long time, you know, I think,
32:48
you know, March 2023, I put
32:50
out the first article about how
32:52
to build agents and me and
32:55
Robert, my co-founder, we've been in
32:57
this for a long time. And
32:59
the more, the better the models
33:01
got, the prompting became more difficult.
33:03
because you can do more complex
33:05
things with it. So at some
33:07
stage there was this movement of
33:09
like the prompt is going away,
33:11
so on. We actually really disagreed
33:13
with it. And that idea is
33:15
gone a little bit. It's like,
33:17
you know, we came back, did
33:19
a loop again, and be like
33:21
actually communicating your vision is really
33:23
important. And when we tried to
33:25
communicate our vision, which was a
33:28
little bit ahead of what the
33:30
models could have done at the
33:32
time, we started to notice that
33:34
the 2D canvas is just not
33:36
enough. Like if you do a
33:38
reflection loop inside of a reflection
33:40
loop, you run out of dimensions.
33:42
And we basically really like the
33:44
way that code is structured. Code
33:46
has an ability to express very
33:48
very complex concepts in a way
33:50
that is still like you can
33:52
still manipulate it and understand
33:54
it. Think about trying to structure
33:56
the, you know, the whole Uber
33:58
app with all of like everything
34:01
in it on the 2D canvas.
34:03
It would become so cluttered and
34:05
so messy. You know, you can
34:07
do the big picture thing, but
34:09
not really the, you know, you
34:11
don't want engineers to be interacting
34:13
in that way. You want the
34:15
engineers, or the future Wordware
34:17
engineers, to be interacting with something
34:19
that makes it easy to grasp the structure
34:21
of very complex systems. Whereas the
34:23
Uber app actually could probably be
34:25
described in pseudocode. And it seems
34:27
like you're, you know, you're getting
34:29
people closer to that vision versus
34:31
the 2D canvas. Yes. And I
34:34
think, you know, the most important
34:36
part here is that Uber has
34:38
an agent equivalent and this is
34:40
what we're trying to build, you
34:42
know, if you... want an agent
34:44
to decide where is that person
34:46
going and where they're starting their
34:48
journey and whether they will accept
34:50
that charge, or, you know, you
34:52
want to maybe make sure that
34:54
the charge is right for that
34:56
particular person. There is an agent
34:58
equivalent there, and, you know,
35:00
people can
35:02
build that agent on Wordware.
35:04
It's not like you're going to
35:07
create that whole UI
35:09
for Uber, and I think,
35:11
you know, probably Uber is the
35:13
right abstraction layer. You don't want
35:15
to be ordering an Uber through
35:17
a chatbot or through like a
35:19
voice-based thing or, you know, but
35:21
you might want an Uber to
35:23
be ordered for you if you
35:25
have a calendar invite. So, you know,
35:27
in a way that like for
35:29
your personal use Uber is nice
35:31
because you can click around and
35:33
the agent will not always know.
35:35
But I was coming here and
35:38
I wanted a Waymo, actually
35:40
Waymo doesn't quite get that
35:42
far yet. But I wanted a
35:44
Waymo to be ordered and
35:46
to be ordered perfectly when I
35:48
need it. And it's almost like
35:50
an assistant, a personal assistant, would do
35:52
this for me. And now that
35:54
capability is open to everyone. So
35:56
we'll soon have these kind of
35:58
affordances and these kinds of abstraction
36:00
layers there. I think that's a
36:02
great note to end on. Should
36:04
we end on a lightning round?
36:06
Let's go. Okay. One or two
36:08
sentence answers only. Okay, first question.
36:11
What is your most hot take
36:13
or contrarian take in AI, not related
36:15
to Wordware or everything we
36:17
just discussed? Pre-training is still gonna
36:19
matter. And DeepSeek is a
36:21
little blip that
36:23
people jumped on because people love
36:25
a good drama and it was
36:27
connected to China and actually it
36:29
doesn't matter that much. Okay, I
36:31
know I said lightning round, but
36:33
you have to say more. What
36:35
do you mean it doesn't matter
36:37
that much? I mean, they utilized
36:39
some cool techniques and the rest
36:41
of the community is going to
36:44
learn from that. However, you know...
36:46
Like, the fact that they
36:48
trained it
36:50
for a lot cheaper
36:52
does not include all the experimentation
36:54
that they did before that. And,
36:56
you know, I don't know
36:58
if I'm supposed to say it,
37:00
but I'm pretty sure they had
37:02
access to the best NVIDIA GPUs as
37:04
well for that experimentation. And it's
37:06
not that novel like people jumped
37:08
on it because they were like,
37:10
oh my God, China is taking
37:12
over the race and so on
37:14
and NVIDIA's stock price, like, plummeted.
37:17
And I just think it's another
37:19
place where some models were trained
37:21
that were open sourced and it's
37:23
not gonna, you know, we're not
37:25
gonna remember it in like a
37:27
year or like even six months
37:29
or maybe they will take over,
37:31
but the model doesn't really matter
37:33
that much. How you kind of
37:35
work with that best model out
37:37
there, that's what matters. That is
37:39
a hot take indeed. Okay. Next
37:41
question. Who's going to have the
37:43
best frontier model next year? Oof.
37:45
I think OpenAI is always
37:47
super bullish and they always promise
37:50
a lot. And I just
37:52
got to talk with Sam
37:54
Altman at the YC AI retreat, and
37:56
o3, the way that
37:58
he pictured it, sounded great. But
38:00
I think we both know that
38:03
they overpromise a little bit...
38:05
a lot. And I love Anthropic.
38:07
I think their kind of vision
38:09
and their kind of the way
38:11
that they've created this is great.
38:14
But recently Gemini 2.0 Pro, with
38:16
its ability to ingest 6,000
38:18
pages of PDF, is really blowing
38:20
my mind. So end of the
38:23
story is, I have no clue.
38:25
This is a place where it's
38:27
super fragmented
38:29
and people have zero loyalty. Pre-training
38:32
is hitting a wall. I think,
38:34
you know, famous people including Ilya
38:36
have been quoted saying something
38:39
to that extent recently.
38:41
Agree or disagree? Disagree. Right now
38:43
I think, you know, the
38:45
intelligence of a model is
38:48
linked logarithmically to the resources
38:50
that are needed to train
38:52
it. But doing a 2X of
38:54
intelligence is on its own exponential.
38:58
Like if I'm smarter 2X
39:00
than somebody else, it doesn't
39:02
mean I'll do 2X of
39:04
the work. It means that
39:06
I'll find ways that probably
39:08
mean I'm a 10X or
39:10
even more. Favorite new AI
39:12
app, not Wordware? I would say, I
39:14
started to edit content because we need
39:17
to explain and educate people a little
39:19
bit more about both Wordware and
39:21
AI, so Descript is something that
39:23
I've been loving, and I
39:26
use Granola every day. And the newest
39:28
model that I'm really impressed by is
39:30
Gemini 2.0 Pro, I really like
39:32
it. That's a hot take
39:35
as well. I haven't heard much of that
39:37
from people. I think they came out like
39:39
four days ago so people have not
39:41
been playing around with it. Their PDF
39:43
capabilities are awesome. What application
39:45
or application category do you think
39:47
will really go mainstream and hit
39:50
this year? I would love
39:52
to see I'm personally very
39:54
very involved with that whole
39:56
AI having the context of
39:59
your life, and being able
40:01
to, you know, basically make better
40:03
decisions based on the context. And,
40:05
you know, there's Rewind, which, you
40:07
know, I think they are called
40:10
Limitless right now. I've ordered their
40:12
pendant, by the way, it's been
40:14
like a year and a half
40:16
and I still don't have it.
40:19
I don't know. Send it to
40:21
me or... If you're listening, please
40:23
send it. And I had to
40:26
change the color, because, you know,
40:28
they didn't have the color, but
40:30
I would love... for there to
40:32
be a provider which has a
40:35
lot more context and can do
40:37
the personal stuff for me. Don't
40:39
you think that's Apple over time?
40:41
I was just about to say,
40:44
I think, ideally that N421 model
40:46
or whatever it's called of the
40:48
AR glasses that they are trying
40:50
to push out there, which I
40:53
think Facebook has taken over a
40:55
little bit. Maybe we'll see early
40:57
stages of that and I think
40:59
they're the only ones where, one, the
41:02
privacy, really, like, they have a
41:04
good brand around privacy, and two,
41:06
even if your new AR glasses
41:09
run out of battery, it's still
41:11
cool to be wearing a $5,000,
41:13
you know, piece of hardware,
41:15
and maybe that's the UX. But
41:18
I don't know what's that UX,
41:20
and, like, a microphone so far has
41:22
failed. Yeah. Single piece of content that an
41:24
AI professional should read or watch? I would
41:26
say all of the DeepLearning.AI
41:28
resources. Everyone, like, we have a bunch of
41:30
candidates apply for jobs. By the way, we're
41:33
hiring, we're looking very, very
41:35
aggressively, so come join Wordware. But the DeepLearning.AI
41:37
resources are awesome and they
41:39
explain everything from the bottom layer all
41:41
the way to the practical layer of how
41:43
to actually get it done. I also think,
41:45
if you don't understand the
41:47
underlying technology, go see
41:50
3Blue1Brown, an incredible channel
41:52
on YouTube, everything super
41:54
well explained, I think.
41:56
Wonderful. Your lightning round was full
41:58
of hot takes. I didn't
42:00
even have to ask
42:02
you for a specific
42:04
one. Thank
42:07
you so much for
42:09
coming on. I really
42:11
enjoyed chatting about, you know,
42:13
how you see the
42:15
world evolving from
42:17
developers to,
42:19
you know, Wordware
42:21
engineers, if everything goes right, and appreciate
42:24
you sitting down to
42:26
share your vision and
42:28
your hot takes. Thank
42:30
you for having me. Thank
42:32
you.