Episode Transcript
0:01
Welcome to Practical AI,
0:03
the podcast that makes
0:05
artificial intelligence practical, productive, and
0:07
accessible to all. If you
0:10
like this show, you will
0:12
love The Changelog. It's
0:14
news on Mondays, deep technical
0:17
interviews on Wednesdays, and on
0:19
Fridays, an awesome talk show
0:22
for your weekend enjoyment. Find
0:24
us by searching for The
0:26
Changelog, wherever you get
0:29
your podcasts. Thanks to our
0:31
partners at fly.io. Launch your
0:33
AI apps in five minutes or less.
0:36
Learn how at fly.io. Welcome
0:45
to another fully connected episode
0:47
of the Practical AI podcast.
0:49
In these fully connected episodes,
0:52
Chris and I keep you
0:54
updated with everything that's happening
0:56
in the AI world, if we
0:58
can. There's a lot. And we try
1:01
to give you some learning resources
1:03
to level up your machine learning
1:05
and AI game. I'm Daniel Whitenack. I'm
1:08
CEO at Prediction Guard, and I'm joined
1:10
as always by my co-host Chris Benson
1:12
who is a principal AI research engineer
1:14
at Lockheed Martin. How you doing Chris?
1:17
I'm doing great. I just don't know
1:19
what we're going to talk about because
1:21
nothing ever happens in AI. There's nothing
1:23
in AI. There's never anything going on.
1:26
Elon has done nothing. Oh my gosh,
1:28
Elon is throwing spitwads at
1:30
people again. He's, let's say,
1:32
you know, he's been suing
1:35
OpenAI, and now he's
1:37
put his bid out, you
1:39
know, in the last few
1:41
days for OpenAI. That's,
1:43
uh, yeah. The
1:46
article that I saw was,
1:48
"We are not for sale," ChatGPT
1:50
boss says. I know, good old
1:53
Sam. Sam said, we're not for
1:55
sale. Because, you know, the two
1:57
of them really love each other. Oh yeah
1:59
definitely. Musk and Sam Altman, they
2:01
are best friends. Best friends. That's
2:04
how we'll report it here. That's
2:06
how we're reporting here, because we
2:09
always look for the upside in
2:11
the AI world here. Yeah, who
2:13
knows the motivations behind billionaires. It's
2:16
an interesting thing to watch. It
2:18
certainly spices up conversations in the
2:21
workday and is a nice point
2:23
of discussion with friends. You know,
2:25
so yeah, but that's about I
2:28
mean, that's about how I'm taking
2:30
it. Yes, I would
2:32
agree with that. The spats
2:35
between billionaires, not being
2:37
a billionaire myself, just
2:40
don't quite make it
2:42
onto my list of concerns.
2:44
Yeah. What do you think is
2:47
kind of the trajectory with players
2:49
like OpenAI, you know?
2:52
Well,
2:54
the other thing that happened in
2:56
the US. This fits into
2:59
maybe the other thing I was
3:01
going to ask as well, although
3:03
I got sidetracked in my mind,
3:06
because I remember that there was
3:08
a Super Bowl and there was
3:11
a Super Bowl commercial that OpenAI
3:13
had, which, you know, it
3:15
was a cool commercial. I think
3:18
I didn't know what it was
3:20
going to be at first, because
3:23
it's just the artistic dots around,
3:25
you know, forming scenes. And then
3:27
I think I gradually realized that
3:30
this is: those little dots on
3:32
the OpenAI ChatGPT app
3:34
that, you know, expand. And
3:37
someone commented to me they spent
3:39
like 14 million dollars or whatever
3:42
a Super Bowl ad costs; I
3:44
forget how much it
3:46
was. But I was like, that's
3:49
really nothing compared to what OpenAI
3:51
is losing generally on
3:54
hosting models and infrastructure. So yeah,
3:56
that's what circles back to my
3:58
other question, which was, yeah, what
4:01
are your thoughts on, I mean,
4:03
if Elon doesn't buy OpenAI,
4:05
what's the future? I honestly don't
4:08
know. And I got to be
4:10
honest with you, I'm not sure
4:12
that I care a whole lot.
4:15
I was thinking about that as
4:17
we were leading into this, is
4:19
that, you know, I mean, there's
4:21
not a protagonist here from my
4:24
standpoint. There's not a side that
4:26
I'm for or against so much.
4:28
You know, you have Elon with
4:31
all the, you know, the
4:33
adventure around Elon Musk,
4:35
and I say that word
4:38
kind of tongue in cheek,
4:40
and then OpenAI and, you
4:42
know, there is a
4:44
kernel of truth to what Elon
4:47
says when he talks about it
4:49
being, you know, going from being
4:51
the nonprofit with the grand vision
4:54
that it started out with in
4:56
the early days, and then it
4:58
has increasingly gone commercial and become
5:00
for-profit, you know, so it's another
5:03
big giant AI
5:05
company, you know, like the others
5:07
and stuff. I'm watching it with
5:10
half an eye, like everybody else
5:12
in the world, but I
5:14
just don't know, and
5:17
I'm not terribly sure I care.
5:19
Yeah. Is there somebody out there
5:21
in our audience that
5:23
is deeply concerned about this? I
5:26
would love
5:28
to hear somebody who is not
5:30
Elon or Sam Altman tell me
5:33
why this is a big deal.
5:35
Yeah, maybe we'll leave it at
5:37
that. It's a good point. I
5:40
am interested in, you know, some
5:42
of the dynamics, like OpenAI
5:44
released their deep research product. So
5:46
if you kind of look at
5:49
the trajectory of what they're releasing,
5:51
what they're doing, there's this deep
5:53
research product, which is really geared
5:56
towards this, you know, multi-step online
5:58
information research type of task. Yeah.
6:00
So, you know, going and looking
6:03
at various, you know, trends across
6:05
various sites with various data, reasoning over
6:07
certain information, consolidating that, you know,
6:09
contributing to some sort of research
6:12
project. And I find it interesting
6:14
that, you know, OpenAI introduced
6:16
this. One of the dynamics I
6:19
love watching is OpenAI releases
6:21
the application-level product, so
6:23
like deep research. And then, so
6:26
I see the blog post by
6:28
Hugging Face, it was like the
6:30
day after. So they say yesterday,
6:32
OpenAI released deep research. So
6:35
this is a blog post that
6:37
I'll link in the
6:39
show notes from Hugging Face. And
6:42
basically they just decided to make
6:44
sure that they could reproduce the
6:46
functionality with open-source code, maybe
6:51
some recently released models like DeepSeek
6:53
models or others, in 24
6:53
hours. And then they wrote the
6:55
blog post and released it. I
6:58
don't know how long of a
7:00
24 hours it was, but you
7:02
know, you see that dynamic happening.
7:05
So you see that with deep
7:07
research and then you have, you
7:09
know, the open deep research thing.
7:11
You see kind of the Operator
7:14
stuff where it's operating your
7:16
screen, your browser window. Now, earlier
7:18
today I was running Hugging Face
7:21
smolagents. They have a web
7:23
agent, which is essentially that it
7:25
spins up a browser window. It
7:28
does certain tasks for you in
7:30
the browser window. Like you can
7:32
type a prompt like, hey, find
7:34
the most recent episode of practical
7:37
AI. Summarize the topic and then
7:39
find, you know, seven other articles
7:41
of a related topic, list them
7:44
out in Markdown format and,
7:46
you know, output that, you know,
7:48
something like that, where it requires
7:51
this sort of agent operating over
7:53
the internet. Super slick, super fun.
7:55
I would definitely recommend people if
7:57
they want to try that sort
8:00
of thing, try the smolagents
8:02
web agent. But yeah, you see
8:04
this kind of trend where at
8:07
the application level, some of this
8:09
is just, you know, it seems
8:11
like you can't develop a moat
8:14
generally there. Now you might be
8:16
able to develop a kind of
8:18
moat as a company in a
8:20
specific domain or a vertical or
8:23
with certain knowledge or proprietary data,
8:25
right? But it's very hard at
8:27
that kind of general application level,
8:30
I would say. I think, you
8:32
know, I keep wondering, as OpenAI
8:34
had a substantial lead
8:36
and it was taking quite a
8:39
period of time for a while
8:41
for open source options and application,
8:43
you know, level of things to
8:46
come about. And we've seen that
8:48
the, you know, that time interval
8:50
shrink tremendously here. So, you know,
8:53
and ironically at the same time
8:55
that Elon makes his 97 billion
8:57
dollar effort to buy OpenAI,
8:59
but you can't help but wonder
9:02
a little bit about what the
9:04
future business model looks like, you
9:06
know, to your point there about.
9:09
you know if it takes so
9:11
you never have time to create
9:13
a moat. You know, if
9:16
you're one of the main players
9:18
now, you know,
9:20
there's certainly business models for
9:22
other players to come in in
9:25
their industry as you just mentioned
9:27
and create capability because that's their
9:29
thing and it's not something that
9:32
the big boys are going to
9:34
go after but as we've seen
9:36
this interval between the commercial players
9:39
and open source shrink to almost
9:41
nothing, what do
9:43
you think that means for the
9:45
business models going forward for the
9:48
Googles and the open AIs and
9:50
the Anthropics of the world? I
9:52
mean, I think part of it
9:55
is maybe this sort of integration
9:57
in the kind of enterprise stack.
9:59
And what I mean by that
10:02
is the kind of bundle
10:04
effect that you get from something
10:06
like offerings from Microsoft. So, you
10:08
know, absolutely no one in the
10:11
world wants to use Teams because
10:13
it's absolutely terrible. And I will
10:15
go on record as saying that.
10:18
Sorry for those that work on
10:20
it. I have to use it.
10:22
I have no choice. I guess,
10:24
you know, you have a podcast,
10:27
you have an opinion, but that's
10:29
my opinion. But, you know, I'm
10:31
also not going to pay hundreds
10:34
of thousands of dollars to Slack
10:36
if I can just flip on
10:38
Teams in, you know, my
10:41
Microsoft tenant and they already have
10:43
all my data and all this
10:45
stuff. So the fact that they're
10:47
tying in, you know, Copilot and
10:50
those licenses around Copilot and an
10:52
ecosystem that's already so embedded in
10:54
the enterprise world, there is a
10:57
very strong bundle effect there. And
10:59
yeah, it's very real, right? And
11:01
it doesn't mean that it's necessarily
11:04
the best solution, but it is
11:06
a solution depending on what you're
11:08
looking for, right? At that kind
11:10
of generic co-pilot level in a
11:13
case where you need
11:15
kind of single tenant, meaning, in
11:17
theory, the terms of service are that
11:20
my data is not being used
11:22
in certain ways. That kind of
11:24
gets that generic case, but again,
11:27
like the real business value that
11:29
a company has, the way I
11:31
see it is you've kind of
11:33
got these generic cases where someone
11:36
random is going to want to
11:38
find a Word document or paste
11:40
in an email, then you've got
11:43
like the core business value. Right.
11:45
So a pharma company that has
11:47
their most sensitive tiers of data
11:50
that are the, you know, lifeblood
11:52
of their company or a health-care
11:54
company or a finance company that
11:56
has certain classifications or regulatory burdens
11:59
around certain tiers of data. It's
12:01
a whole nother thing to think
12:03
about integration of those tiers of
12:06
data into a generic system like
12:08
that, because they're not, you know,
12:10
they're a generic copilot system for
12:12
those kind of less sensitive tiers
12:15
of data. There's still something that
12:17
needs to be solved at those
12:19
other layers, which is where I
12:22
think, you know, vertical AI players,
12:24
but also, you know, tooling and
12:26
infrastructure players can still make,
12:29
you know, a lot of progress.
12:31
Do you think that the bundling
12:33
that you're describing that's, you know,
12:35
occurring between the vertical capabilities where
12:38
they're producing these and, you know,
12:40
and OpenAI going and doing
12:42
deep research or Google integrating Gemini
12:45
into, you know, the Google suite,
12:47
which they've been doing and trying
12:49
to drive a premium, you know,
12:52
from users for that? Is that
12:54
bundling going to be critical to
12:56
them going forward? Or do you
12:58
think that the OpenAIs of
13:01
the world and, you know,
13:03
and we've seen this historically with
13:05
Google maybe not in an AI
13:08
context always but driving into specialties
13:10
where they you know they open
13:12
up a new vertical underneath the
13:15
umbrella and stuff do you you
13:17
know, is OpenAI gonna have
13:19
to do that to survive, because
13:21
it's gonna have open source
13:24
chomping at its heels? Yeah, the
13:26
general path? Yeah, I don't know.
13:28
It could be by vertical,
13:31
it could be I mean you
13:33
look at Palantir, for example, you
13:35
know, stock price soaring, most regular
13:38
people aren't using a Palantir co-pilot,
13:40
right, in their day-to-day,
13:42
but they have a certain
13:44
market, particularly around, you know, DOD
13:47
or defense or other areas, they
13:49
have really put a lot into
13:51
serving that well with the less
13:54
generic but still fairly generic across
13:56
different use cases set of functionalities
13:58
and that, you know, has
14:03
served them well, at least
14:03
from an outsider's perspective, if
14:05
I'm looking at that. So
14:07
it may be a specialization in
14:10
terms of tools or vertical it
14:12
might also just be a segment
14:14
of the market that you choose
14:17
to focus on and is
14:19
kind of the bread and butter
14:21
it's interesting because you've got all
14:23
of these end users, direct-to-consumer
14:26
traffic on OpenAI
14:28
and these things now where a
14:30
lot of what we had talked
14:33
about before with data science and
14:35
AI and machine learning was really
14:37
enterprise focus not direct to consumer.
14:40
So allow me to throw
14:42
one other layer onto this conversation
14:44
as we circle back around to
14:46
AGI ideas with you know kind
14:49
of having artificial general intelligence being
14:51
bantered about Sam Altman was just
14:53
saying that he was expecting GPT-5
14:56
to be smarter than he was
14:58
and so as we look at
15:00
that, you know I think GPT-3
15:03
is smarter than I am. I
15:05
agree with you. But with that,
15:07
you know,
15:09
the AGI chase continuing at this
15:12
point, and you know, we've heard,
15:14
you know, with DeepSeek and
15:16
all these others going in and
15:19
talking about business models and bundling
15:21
and such and exploring new verticals,
15:23
how do you think that the
15:25
AGI race fits into that? Yeah,
15:28
maybe that's the piece that
15:30
isn't really
15:32
entering into my mind much, in
15:35
the same way that you don't
15:37
think about Elon so much, which
15:39
is probably good. Yeah, I think
15:42
it's an interesting question and there's
15:44
implications. The questions that come into
15:46
my mind at a more general
15:48
level, which you could
15:51
talk about as AGI or
15:53
not, I don't know, but the
15:55
questions that come into my mind
15:58
are more the downstream effects of
16:00
some of these things. Are we
16:02
building systems that enhance human agency
16:05
rather than replace it? Are we
16:07
building systems that allow us to
16:09
trust more in human institutions or
16:11
fear and distrust them more? Are
16:14
we, you know, are we building
16:16
systems that actually drive us more
16:18
into isolation as individuals or into
16:21
community together? I think those are
16:23
those are interesting kind of directions
16:25
that that are on my mind
16:28
as I think about the more
16:30
general side of this. Well, there's
16:32
no shortage of AI tools out
16:34
there, but I'm loving Notion and
16:37
I'm loving Notion AI. I use
16:39
Notion every day. I love Notion.
16:41
It helps me organize so much
16:44
for myself and for others. I
16:46
can make my own operating systems,
16:48
my own, you know, processes and
16:51
flows and things like that to
16:53
just make it easy to do,
16:55
checklists, flows, etc., that are very
16:57
complex and share those with my
17:00
team and others externally from our
17:02
organization. And Notion on top of
17:04
it is just, wow, it's so
17:07
cool. I can search all of
17:09
my stuff in Notion, all of
17:11
my docs, all of my things,
17:13
all of my workflows, my projects,
17:16
my workspaces, it's really astounding
17:18
what they've done with Notion AI. And
17:20
if you're new to Notion, Notion
17:23
is your one place to connect
17:25
your teams, your tools, your knowledge,
17:27
so that you're all empowered to
17:30
do your most meaningful work. And
17:32
unlike other specialized tools or legacy
17:34
suites that have you bouncing from
17:36
six different apps, Notion seamlessly integrates,
17:39
it's infinitely flexible. And it's also
17:41
very beautiful and easy to use.
17:43
Mobile, desktop, web, shareable. It's just
17:46
all there. And the fully integrated
17:48
Notion AI helps me and will
17:50
help you too, work faster, write
17:53
better, think bigger, and do tasks
17:55
that normally take you hours to
17:57
do in minutes or even seconds.
17:59
You can save time by writing
18:02
faster, by letting Notion AI handle
18:04
that first draft and give you
18:06
some ideas to jumpstart a brainstorm
18:09
or to turn your messy notes,
18:11
I know my notes are sometimes
18:13
messy, into something polished. You can
18:16
even automate tedious tasks like summarizing
18:18
meeting notes or finding your next
18:20
steps to do. Notion AI does
18:22
all this and more and it
18:25
frees you up to do the
18:27
deep work you want to do.
18:29
The work that really matters, the work
18:32
that is really profitable for you
18:34
and your company. And of course,
18:36
Notion is used by over half
18:39
of Fortune 500 companies and teams
18:41
that use Notion, send less email,
18:43
they cancel more meetings, they save
18:45
time searching for their work and
18:48
reduce spending on tools, which kind
18:50
of helps everyone be on the
18:52
same page. Try Notion today for
18:55
free when you go to notion.com/practical
18:57
AI. That's all lowercase letters. notion.com/practical
18:59
AI to try the powerful, easy
19:01
to use Notion AI today. And
19:04
when you use our link, of
19:06
course, you are supporting this show.
19:08
And we love that. notion.com/practical AI.
19:14
Well, Chris, we talked
19:16
a little bit about tools
19:18
and agents well agents generally
19:20
the web agents the deep
19:22
research things and we've kind
19:24
of talked about tool calling
19:26
and the connection to agents
19:28
at certain points on the
19:30
show but I don't think
19:33
we've really dug into you
19:35
know the detail in
19:37
a way that
19:39
maybe will make things clear
19:41
for people. I still see
19:43
a lot of confusion around
19:45
this. Even, you know, in
19:47
my day-to-day, as I'm
19:49
talking to customers, the question
19:51
of, well, how do I
19:53
make an LLM talk to
19:56
this system, right? Or how
19:58
do I, you know, that
20:00
research tool, how do I make
20:02
an LLM go and do a thing,
20:04
right? That's often how
20:06
the question comes. And
20:09
what I think I
20:11
realize when I'm hearing
20:13
those questions is there's
20:15
kind of a fundamental
20:17
misunderstanding of what the LLM
20:19
does and how it's tied into
20:22
a framework, which you might
20:24
call tool calling, you might
20:27
call agentic. The names kind of
20:29
get mushed around a lot these
20:31
days unfortunately. They do. I was
20:33
thinking that as you were saying
20:35
all that and then you got
20:37
that's literally what was in my
20:39
head in terms of the
20:41
misuse of different names of this
20:43
technology and what's doing
20:45
what. So yeah, exactly. So
20:47
in my mind, this is,
20:49
I'm feeling very opinionated today, I
20:51
don't know why. Excellent. In
20:53
my mind, how
20:55
I kind of draw the lines
20:57
here, there's, you know, of course,
20:59
models, large language models,
21:02
they predict probable text,
21:04
they generate text
21:06
or images or whatever you
21:08
want them to generate, then
21:10
there's other systems kind of
21:13
over on the other side. So you
21:15
could think of, you know, your email
21:17
or your bank account or
21:19
an external system like an
21:21
Airbnb where I might want
21:23
to make a reservation or
21:25
my company's database, right, which
21:28
contains transactional data,
21:30
or another system that I use,
21:32
like HubSpot, or all of these
21:34
types of things, or all of
21:36
these other things. And to ask
21:38
a question, well, how
21:41
could an LLM go and create
21:43
a new deal for me in
21:45
HubSpot? Right, that hurts me when
21:47
you phrase it like that, it
21:49
causes pain in my head.
21:51
Okay, but that's how people
21:54
phrase it to be clear like I
21:56
get you know these questions or the
21:58
questions that come up every day,
22:00
right? So how, the question is
22:03
often phrased, how do I make
22:05
the LLM create a new deal
22:07
for me in HubSpot? So
22:09
right in that phrasing, to your
22:12
point, I don't know, what makes
22:14
you cringe about
22:16
that? It's just, that's a fingernails
22:19
on the chalkboard kind of moment
22:21
for me, is, you know, to
22:23
answer that question: in the six
22:25
and a half years that we've
22:28
been doing the show. And we
22:30
have evolved through a number of
22:32
technologies, you know, that at each
22:35
point in time were the hot
22:37
thing, and inevitably people focus in
22:39
on just that for a while,
22:41
but right now we're at a
22:44
point where generative AI and LLMs the
22:46
last few years have been the
22:48
hot thing. And we forget that
22:51
they don't necessarily do everything out
22:53
there. It's not, you know, people
22:55
will say LLMs. In fact, they
22:57
only do one thing. That's exactly
23:00
right. And not only that, but
23:02
there might be an AI architecture
23:04
that could do the
23:07
thing that they want to talk
23:09
about, but it's not necessarily the
23:11
thing that they're talking about. And
23:13
they're misleading. It's not the model.
23:16
And so that's the fingernails on
23:18
the chalkboard.
23:20
We've kind of talked about this
23:22
over the last year the tunnel
23:25
vision of the generative AI era
23:27
you know in terms of everyone
23:29
focusing on that. But it's to
23:32
the point that there are other
23:34
technologies in the mix and there
23:36
is a technology that will do
23:38
the thing they want to do
23:41
they're just not picking the right
23:43
one in the way that they're
23:45
verbalizing it. Yeah, so
23:48
let's maybe break this down into
23:50
components. So let's say there's
23:52
the LLM. You know, we'll just
23:54
talk about text now. Certainly there's
23:57
multimodal and all that stuff, but
23:59
just think about text. There's the
24:01
LLM, which all it does is
24:04
complete probable text. So I could,
24:06
you know, ask it to auto
24:08
complete. I could ask it to
24:10
write something for me. I could
24:13
ask it to generate something for
24:15
me. That's what it does. Let's
24:17
say we'll take the HubSpot
24:20
example since I used that. HubSpot,
24:22
for those that aren't familiar,
24:24
it's a popular CRM solution for
24:26
those that maybe aren't, you know,
24:29
don't wanna mess with Salesforce and
24:31
all of that world. So HubSpot,
24:33
I can create a deal
24:36
associated with maybe a sales lead
24:38
I have, right? That is its
24:40
own software system that's hosted by
24:42
HubSpot, right? And I actually
24:45
I don't know this, but I
24:47
assume HubSpot has an API,
24:49
a REST API, meaning you could
24:52
programmatically interact with HubSpot. This
24:54
is how apps on HubSpot
24:56
work, right? An app on HubSpot
24:58
is regular good old-fashioned code
25:01
that maybe allows you to add
25:03
a field, add these fields to
25:05
these records or retrieve this data
25:07
or report on this data. That's
25:10
just good old-fashioned code. It uses
25:12
the API. So this is a
25:14
separate system. And so there's really
25:17
no connection between. There can be
25:19
no connection directly between the LLM,
25:21
which generates text, and this other
25:23
system out there that's a CRM
25:26
that does certain things. There's no
25:28
connection between the two. Except in
25:30
the middle of that, there can
25:33
be this process, which I would
25:35
generally say I would categorize as
25:37
tool calling generally or function calling,
25:39
which, let's say that you wrote
25:42
a good old-fashioned software function that
25:44
creates a deal in HubSpot
25:46
via the REST API of HubSpot,
25:49
right? That has nothing to
25:51
do with AI. It's just a
25:53
software function where you tell me
25:55
the email of the person, the
25:58
name, the company, I'm gonna go
26:00
in and create the deal in
26:02
HubSpot via the API. So
26:05
there's a function, you give me
26:07
these arguments, I'm gonna create the
26:09
deal in HubSpot. Okay, still
26:11
no connection to the LLM, but
26:14
if I then ask the LLM
26:16
to say, hey, I have this
26:18
customer information, email, name, etc., generate
26:21
the arguments for me to call
26:23
this function, which takes these specific
26:25
arguments, then the LLM could generate
26:27
the necessary arguments to call that
26:30
function. And if you create a
26:32
link between the function and the
26:34
output of the LLM, so the
26:36
LLM is still not really doing
26:39
anything other than generating text. But
26:41
in your code, you literally take
26:43
the output of the LLM and
26:46
you put it into the input
26:48
of that function. Now, you could
26:50
put something on the front end
26:52
into the LLM and have the
26:55
result be a flow of data
26:57
out of the LLM into the
26:59
function and then into the HubSpot
27:02
API. So that's sort of
27:04
how this tool calling function calling
27:06
thing works. Which makes perfect sense
27:08
and that's standard software development. You
27:11
know, that's the only thing that
27:13
is different there is the fact
27:15
that the function parameters that you're
27:18
using have been generated by the
27:20
LLM, which is a generative model.
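The chain just described, a plain function, LLM-generated arguments, and glue code in between, can be sketched in a few lines of Python. Everything here is illustrative: `create_deal` is a stand-in that makes no real HubSpot API call, and the model's reply is hard-coded where a real completion would come back from an LLM.

```python
import json

# A plain, good old-fashioned software function: nothing AI about it.
# In real code this would be an authenticated HTTP POST to HubSpot's
# REST API; here it just returns the payload so the sketch is runnable.
def create_deal(email: str, name: str, company: str) -> dict:
    payload = {"email": email, "name": name, "company": company}
    return {"status": "created", "deal": payload}

# The prompt asks the model to emit ONLY the function's arguments as JSON.
prompt = """You can call create_deal(email, name, company).
Customer info: Jane Doe, jane@example.com, Acme Corp.
Respond with a JSON object of arguments for create_deal."""

# Simulated model output. In practice this string comes back from the LLM,
# which is still just generating probable text.
llm_output = '{"email": "jane@example.com", "name": "Jane Doe", "company": "Acme Corp"}'

# The glue: your code parses the generated text and feeds it to the function.
args = json.loads(llm_output)
result = create_deal(**args)
print(result["status"])  # created
```

The LLM never touches HubSpot; the only new ingredient over ordinary software is that the function's arguments were generated rather than typed.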
27:22
Perfect. That's what it does. And
27:24
there are some special things related
27:27
to this in the sense that,
27:29
you know, if you look back
27:31
in time at LLMs. First, we
27:34
had kind of really good auto-complete
27:36
models, because that was a meta
27:38
task for people, you know, training
27:40
language models. Then people figured out,
27:43
oh, I kind of want to
27:45
use these as general instruction following
27:47
models, right? And so they developed
27:50
specific prompt formats and prompt data
27:52
sets to fine-tune LLMs specifically for
27:54
instruction following. Right. So here's your
27:56
system message. Here's, you
27:59
know, the message I'm providing you.
28:01
Give me the assistant response. And
28:03
they trained it on a bunch
28:05
of general instruction following things. Well,
28:08
they've done the same thing now
28:10
because they've realized, oh, a lot
28:12
of people want to do this
28:15
tool or function calling mechanism. So
28:17
certain people, including OpenAI in
28:19
a close sense, but others in
28:21
an open sense, like Nous Research,
28:24
who we had on the show,
28:26
they have a data set called
28:28
Hermes. This includes a set of
28:31
prompts that are related to function
28:33
calling specifically. So they've given a
28:35
huge number of examples of function
28:37
calling prompts to a model that
28:40
they would train, like a Llama
28:42
model. And now you have Hermes
28:44
Llama 3.1 70B. It's been fine-tuned to
28:47
follow that Hermes style prompt format
28:49
for function calling. Which means it
28:51
kind of has an advantage if
28:53
you like, or certain models that
28:56
have been trained with these examples
28:58
have an advantage specifically for that
29:00
function calling task, right? So there
29:03
is an AI element in the
29:05
sense that some models are better
29:07
at this than others because of
29:09
the way that they've been trained.
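To make that concrete, here is a rough sketch of what a function-calling prompt and reply can look like. The tag names and schema layout vary by dataset and model (the Hermes format is one example), so treat this shape as illustrative rather than the exact format, and note the model reply here is simulated.

```python
import json
import re

# A function schema, in the JSON-schema style these datasets typically use.
tool_schema = {
    "name": "create_deal",
    "description": "Create a CRM deal",
    "parameters": {
        "type": "object",
        "properties": {"email": {"type": "string"}, "company": {"type": "string"}},
        "required": ["email", "company"],
    },
}

# Function-calling fine-tunes see prompts that list the available tools in
# the system message and expect the reply wrapped in special tags.
system_msg = (
    "You have access to these tools:\n<tools>\n"
    + json.dumps(tool_schema)
    + "\n</tools>\nTo use one, reply with <tool_call>{json}</tool_call>."
)

# Simulated model reply, in the tagged format such fine-tunes emit.
reply = '<tool_call>{"name": "create_deal", "arguments": {"email": "a@b.co", "company": "Acme"}}</tool_call>'

# The serving layer (roughly what a model server's tool parsing does for
# you) extracts and validates the call before any real function runs.
match = re.search(r"<tool_call>(.*?)</tool_call>", reply, re.DOTALL)
call = json.loads(match.group(1))
assert set(call["arguments"]) >= set(tool_schema["parameters"]["required"])
print(call["name"])  # create_deal
```

A model trained on many examples of this prompt shape reliably produces parseable replies, which is the advantage Daniel is describing.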
29:12
And there's certain prompt formats that
29:14
are special, and you'll get better
29:16
performance if you use those prompt
29:19
formats, or if you use a
29:21
model server like vLLM that supports
29:23
or has the inbuilt translation to
29:25
those prompt formats, etc. So there
29:28
is an AI element of it,
29:30
but it's only in the sense
29:32
that you're preparing the model for
29:35
this type of use case rather
29:37
than, you know, connecting, there's some
29:39
inbuilt connection of the model to
29:41
something external. So I'm curious, can
29:44
you tie in the tool calling
29:46
into what would be, you know,
29:48
might be considered a full, you
29:50
know, agentic implementation? What's the leap
29:53
there, if any? Yeah, interesting question,
29:55
because people use the term agent
29:57
very loosely. So some people would
30:00
say what I just described, even
30:02
just that chain of processing. So
30:04
I put something in the front
30:06
end of the LLM, a deal is
30:09
created in HubSpot, that might
30:11
be considered an agent, my HubSpot
30:13
deal creation agent. I would
30:16
say that's really just a tool-calling
30:18
example of how to use an
30:20
LLM. In my mind, what separates
30:22
out the agentic side of things
30:25
is where you have some sort
30:27
of orchestration performed by the LLM.
30:29
So what I mean by that
30:32
is you have a set of
30:34
tools. So let's say I have
30:36
access to Airbnb's API and Kayak's
30:38
API and United Airlines API or
30:41
like whatever other travel things I
30:43
need to do, maybe my Gmail
30:45
for various things. And I say,
30:48
hey, I need to book a
30:50
car next week for my trip
30:52
to wherever, right? That input could
30:54
then be processed through the LLM
30:57
not to call a single tool,
30:59
but first as an objective to
31:01
determine what tools to call and
31:04
in what sequence with what dependencies.
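This orchestration idea, an objective decomposed into tool calls chosen step by step, can be sketched with stubbed tools. The planner below is a hand-written stand-in for what the LLM would actually generate at each step, and the travel tools are invented for illustration.

```python
# Stubbed tools: each one pretends to hit an external system and
# records its result in a shared state dict.
def search_email(state):
    state["flight_date"] = "2025-03-04"    # pretend Gmail lookup

def search_cars(state):
    state["options"] = ["car A", "car B"]  # pretend travel-site query

def book(state):
    state["booked"] = state["options"][0]  # pretend reservation

TOOLS = {"search_email": search_email, "search_cars": search_cars, "book": book}

def plan_next_step(state):
    # In a real agent this is an LLM call that looks at the objective and
    # the state so far, then generates the next tool name (or "done").
    if "flight_date" not in state:
        return "search_email"
    if "options" not in state:
        return "search_cars"
    if "booked" not in state:
        return "book"
    return "done"

# The agent loop: plan a step, run the tool, reevaluate, repeat.
state = {"objective": "book a car for my trip"}
trace = []
while (step := plan_next_step(state)) != "done":
    trace.append(step)
    TOOLS[step](state)

print(trace)            # ['search_email', 'search_cars', 'book']
print(state["booked"])  # car A
```

The tool-calling mechanics are identical to the single-function case; what makes it agentic is that the sequence of calls is decided at runtime rather than hard-coded.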
31:06
Try to do a first step
31:08
of that and then reevaluate and
31:10
then do the next step until
31:13
you reach the objective, right? So
31:15
first, in order to book my
31:17
thing, I need to know when
31:19
my flight is. So I go
31:22
to my Gmail and I look
31:24
for the confirmation, right? Or, you
31:26
know, second, I use that date
31:29
in the Kayak API to look
31:31
for choices. And then I evaluate
31:33
those choices and then I use
31:35
it to book the reservation. So
31:38
there's a series of steps that
31:40
might call different tools. Or systems,
31:42
you know, it could be data
31:45
sources, unstructured or structured data sources
31:47
like a database or a RAG
31:49
system. And so that thing that
31:51
I talked about, like that HubSpot
31:54
deal creation tool, might be
31:56
one of those tools in an
31:58
agentic system where an agent. could
32:01
choose to use it at certain
32:03
points. And I'm being, I'm anthropomorphizing
32:05
here, it's not choosing anything, right?
32:07
But it's useful to talk about
32:10
it sometimes in that way, so
32:12
forgive me. It's choosing to use
32:14
that tool in one case and
32:17
maybe other tools and other sequences
32:19
in other cases. In my mind,
32:21
that's what really distinguishes the
32:23
agentic side from just the tool
32:26
calling side. Well
32:38
Chris it's fun to talk about
32:40
some of the agents thing.
32:42
Normally we wait till the end
32:44
of the episode to share some
32:47
learning resources but since we've been
32:49
talking about tool calling and agents
32:51
I just wanted to mention this
32:53
new course by Hugging Face. So
32:56
they now have an agents course,
32:58
which I think was just released
33:00
and is coming out live on
33:02
YouTube if I understand correctly. And
33:05
so in the course they talk
33:07
about studying AI agents in theory,
33:09
design and practice, using established libraries
33:11
like smolagents, LangChain, LlamaIndex,
33:14
sharing your agents, evaluating your
33:16
agents, and then at the end
33:18
you earn a nice certificate. So
33:20
plug for the Hugging Face agents
33:22
course, for those of you out
33:25
there intrigued by some of the
33:27
tool calling and agent stuff it
33:29
seems like a good one. Yeah
33:31
as we record this yeah they're
33:34
actually doing it in about an
33:36
hour and 20 minutes from right
33:38
now as we record this, so
33:40
it'll be past by the time
33:43
you're listening you missed it you
33:45
missed it sorry you're gonna have
33:47
the replay yeah but you can
33:49
do the replay yeah and it's
33:52
interesting. You know, one of the
33:54
packages there that they mention is
33:56
called smolagents, which is really
33:58
great. I love using that
34:01
package. It's a lot of fun.
34:03
And you know, I've even used
34:05
it in a
34:07
couple of really interesting internal
34:09
internal use cases at
34:12
Prediction Guard. So do me a
34:14
favor here, and, so long as
34:16
there's no secret sauce
34:18
moments there for Prediction Guard.
34:21
Can you plant a couple
34:23
of seeds on things that you've done?
34:25
You know, that people could explore in
34:27
terms of what you found useful and
34:29
hey, I did this thing and just
34:31
kind of let people get a sense
34:33
of how you're looking at it and
34:35
what things they might be able to
34:37
do so that they can ideate
34:39
on their own? Yeah, yeah, definitely. So
34:42
I'll speak somewhat generically here, so I
34:44
don't reveal certain things, but you know,
34:46
customer things, but one of the cases
34:48
that we actually experience
34:50
fairly often with
34:53
customers is they want to
34:55
build, you know, maybe it's they want
34:57
to build a chatbot that
34:59
has access to some, or has
35:02
access to some special knowledge
35:04
or can access special knowledge
35:06
in one way. So on the
35:08
one hand, if you have a
35:10
bunch of unstructured text,
35:12
right? That's a typical case where
35:14
you would use a RAG workflow,
35:16
and you would put that into
35:18
a vector database. You can retrieve
35:20
it on the fly. That's a
35:22
RAG chatbot. On the other side, there are
35:24
text-to-SQL methods, for
35:27
example, or API calling methods
35:29
that could allow you to
35:31
interact with your database. So
35:33
there's those methods. Sometimes, though,
35:35
you have a source of data, and
35:37
there's been a couple times for us
35:39
where it's maybe a web app that
35:42
doesn't have a really convenient
35:44
API, but has a really
35:46
complicated and annoying user interface.
35:48
And the company has this
35:50
web app that has a bunch of
35:52
knowledge in it, right? But there's really
35:54
no good way to extract all of
35:57
that content from the web app. It
35:59
has an annoying interface, so no
36:01
one wants to use it, right?
36:03
And so something like the smolagents
36:06
web agent, like a system
36:08
like that, and what the web
36:10
agent does is it executes a
36:12
series of tool calls that leverage
36:15
helium under the hood, which is
36:17
a package that allows you to
36:19
automate interactions with a browser. And
36:21
so if it's a web app,
36:24
it can basically spin up the
36:26
application in the browser and then
36:28
interact with certain elements like search
36:30
for a certain thing or find
36:33
a certain component or an object,
36:35
summarize that output and output it
36:37
from the web agent. So one
36:39
of the interesting cases where we're
36:42
thinking about that is, is these
36:44
cases where a company has invested
36:46
a lot of money in some
36:48
system or application that's maybe a
36:51
legacy system that they have to
36:53
keep on using, right? But no
36:55
one really wants to engage
36:58
with it because the UI sucks.
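The "agent as an extra user" pattern can be sketched as a tool that drives a browser. The real smolagents web agent does this through the helium package (calls like `start_chrome`, `write`, `click`); to keep this sketch runnable without a browser, the browser actions are injected as plain functions, so everything below is a hypothetical stand-in rather than the actual smolagents implementation.

```python
# Sketch of wrapping a clunky legacy web app as an agent tool. Browser
# actions are injected as functions; in real use they would be helium
# calls (start_chrome, write, click, ...) driving an actual browser.

def make_search_tool(open_app, type_into, read_results):
    """Build a 'search the legacy app' tool from three browser actions."""
    def search(query):
        open_app()                  # spin the app up in a browser
        type_into("search", query)  # fill in the awkward search box
        return read_results()       # scrape whatever the page shows back
    return search

# Fake in-memory "page" standing in for the browser, so the sketch runs.
_page = {}

def fake_open():
    _page.clear()

def fake_type(field, text):
    _page[field] = text

def fake_read():
    return [f"result for: {_page['search']}"]

search_legacy_app = make_search_tool(fake_open, fake_type, fake_read)
print(search_legacy_app("Q3 contract renewals"))
# -> ['result for: Q3 contract renewals']
```

In the agentic setup described here, `search_legacy_app` would be one entry in the agent's tool registry, and the summarization of its raw results would be handed back to the LLM.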
37:00
But it also doesn't have a
37:02
really nice API or way to
37:04
access the data in there. So
37:07
actually using an agent as a
37:09
kind of extra user that you
37:11
can control programmatically to interact with
37:13
the application is really an intriguing
37:16
kind of prospect to tie in
37:18
that knowledge and extract things from
37:20
the app. The other one that
37:22
I think comes up a lot
37:25
for us, because we work
37:27
in a lot of regulated,
37:29
security- and privacy-conscious contexts. That's kind
37:31
of what we do at Prediction
37:34
Guard: deploying secure infrastructure for
37:36
AI in people's companies. Often people
37:38
will want to, once they now
37:41
have a private, secure system, tie
37:43
in their transactional databases to their
37:45
queries, right? That's often a text-to-SQL
37:47
type of operation where
37:50
you're querying a database, you're generating
37:52
a SQL query. That can be
37:54
error-prone, right? Like you can
37:56
generate SQL queries that don't execute,
37:59
or potentially problematic SQL queries, or
38:01
ones that are very computationally expensive.
38:03
And so you can tie in
38:05
other elements, agentic elements into that
38:08
where you kind of try to
38:10
answer the question iteratively with different
38:12
SQL queries until you reach an
38:14
objective. That's
38:17
kind of an agentic way
38:19
to go about text-to-SQL.
38:21
Or you could tie in
38:24
other tools like SQL query optimizers
38:26
and that sort of thing to
38:28
help in that process as well.
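The iterative text-to-SQL loop described above can be sketched with the standard-library `sqlite3` module. The `generate_sql` function is a hypothetical stand-in for the LLM call: a real system would prompt the model with the schema, the question, and the previous error message; here it is hard-coded to fail once and then correct itself, just to exercise the retry loop.

```python
# Agentic text-to-SQL sketch: generate a query, run it, and feed any
# execution error back to the generator for another attempt.
import sqlite3

def generate_sql(question, error=None):
    # Hypothetical stand-in for the LLM. It deliberately returns a bad
    # table name first, then a corrected query once it sees the error.
    if error is None:
        return "SELECT total FROM orderz"
    return "SELECT SUM(amount) FROM orders"

def answer_with_sql(conn, question, max_attempts=3):
    error = None
    for _ in range(max_attempts):
        query = generate_sql(question, error)
        try:
            return conn.execute(query).fetchone()[0]
        except sqlite3.Error as e:
            error = str(e)  # hand the failure back to the generator
    raise RuntimeError(f"no working query after {max_attempts} tries: {error}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (5.5,)])
print(answer_with_sql(conn, "what is total order revenue?"))
# -> 15.5
```

The same loop is where extra tools like a query optimizer or a cost estimator could slot in, as checks applied to each candidate query before it runs against the real database.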
38:30
So, more on the enterprise
38:33
kind of business side those are
38:35
a couple things that have come
38:37
up for us. No, that sounds
38:39
interesting. I'm just kind of
38:42
curious what your thinking is on
38:44
how this changes the human
38:46
side of the workflow, as you've
38:48
seen, you know, in
38:51
recognizing these are some small use
38:53
cases and everything but you know
38:55
this is the beginning of the
38:57
agentic wave as we go forward
39:00
and I guess especially prompted with
39:02
the kinds of things that we're
39:04
seeing in the news these days
39:07
you know, about evaluation of
39:09
government departments, and just that
39:11
general notion of reassessment,
39:13
for better or for worse. How do
39:16
you think that that's going to,
39:18
you know, be taken into
39:20
commercial spaces in terms of deploying
39:22
these agents? Will it change jobs
39:25
significantly? Do you think or do
39:27
you think it will just be
39:29
adding in without that kind of,
39:31
I'm kind of curious what your
39:34
lay of the landscape is? Yeah,
39:36
I mean, I think there will
39:38
be a shifting of jobs. I
39:40
think some of the things that
39:43
we've talked about specifically in those
39:45
examples are actually good examples of
39:47
expanded human agency because a lot
39:50
of times people don't do certain
39:52
tasks or can't do certain tasks
39:54
that they would like to do
39:56
as a part of their job
39:59
because of limitations of you know
40:01
really complicated UIs or that this
40:03
you know doing this and then
40:05
this and then that will take
40:08
me a ton of time and
40:10
I've got to jump to this
40:12
meeting. Right. And so I think
40:14
there's a lot of those things
40:17
where that is expanded human agency
40:19
And so it's amplifying
40:21
the effect of that worker and
40:23
helping them feel like they have
40:26
superpowers because they really didn't want
40:28
to log into that application and
40:30
use it one more time. Right.
40:33
Yeah. So I think there's an
40:35
element of that. Now you could
40:37
make the argument while maybe they've
40:39
hired three people under them because
40:42
of those inefficiencies to do some
40:44
of those tasks, which in some
40:46
ways is a shame because if
40:48
they're really just grunts, you know,
40:51
cranking through extraction of data from
40:53
horrible API or horrible user interfaces,
40:55
like you could, I mean, maybe
40:57
there's people that enjoy that all
41:00
day. I think generally that's not
41:02
a very dignified sort of way
41:04
to go about it. I'm realizing
41:06
that I'm kind of making generalities
41:09
here and there's the reality of
41:11
people's work. Not everyone kind of
41:13
gets to do the work that
41:16
they, you know, they might desire
41:18
to do or would give them
41:20
most dignity. So I want to
41:22
recognize that and I think there
41:25
will be
41:27
a negative impact for portions, but
41:29
I'm hopeful that there's also this
41:31
positive impact. And even for people
41:34
that are maybe in less skilled
41:36
professions, if there's more of a
41:38
natural language way to access skilled
41:40
knowledge and kind of these amplifying
41:43
effects of AI, it could hopefully
41:45
open up new types of opportunities
41:47
within the market as well. I
41:49
would hope so. I mean, I
41:52
think that's certainly the case. I
41:52
suspect that we'll see it all, just
41:54
as we do in life and
41:59
every other aspect, we will see
42:01
people enhancing human agency, similar
42:03
to the use cases
42:05
that you're talking about, and we'll
42:08
probably see people with that, you
42:10
know, that would rather take alternative
42:12
paths to that as well. I
42:14
think it will be a mixture
42:17
of the whole thing. So, yeah,
42:19
yeah. As we kind of close
42:21
out here, and I guess we're
42:23
talking already about new trends and
42:26
other things, one thing I wanted
42:28
to note is Deloitte just put
42:30
out, in January, the State of
42:32
Gen AI
42:35
in the Enterprise quarter four report,
42:37
which I've been going through. So
42:39
for those, maybe business leaders or
42:42
managers or other people that are
42:44
wanting to get a sense of
42:46
some of the things that are
42:48
being tracked across different industries in
42:51
the enterprise, there's a great report
42:53
there. I see, you know, for
42:55
example, they are tracking barriers to
42:57
developing and deploying Gen AI, worries
43:00
about complying with regulations, difficulty managing
43:02
risks. They're tracking certain use cases,
43:04
volume of experiments and POCs, or
43:06
proof of concepts, benefit sought versus
43:09
benefit achieved, which is an interesting
43:11
one, and also Gen AI initiatives
43:13
where they're most active within certain
43:15
job functions, all of these sorts
43:18
of things and many more. So
43:20
if you're interested in those sorts
43:22
of insights, which I do think
43:25
are interesting to track, then that's
43:27
a great learning resource that we'll
43:29
link in the show notes, and
43:31
hopefully people can find and peruse
43:34
if they're interested. Definitely. Yeah. Well
43:36
Chris, it's been a great time.
43:38
I felt like I functioned well
43:40
in my tooling as
43:43
a podcast agent. So you did
43:45
good. You did so good that
43:47
who knows Elon Musk may be
43:49
coming after Prediction Guard any day
43:52
now. So yeah or maybe what
43:54
I'm saying is just being generated
43:56
by NotebookLM. That could be
43:58
true. Yeah. Okay. Good conversation
44:01
today. All right.
44:03
Yeah. Thanks. Yeah, have a
44:05
good one. You too.
44:07
All right.
44:11
That is our show
44:14
for this week. If you
44:16
haven't checked out our Changelog
44:18
newsletter, head to changelog.com
44:20
slash news. There you'll
44:22
find 29, yes, 29
44:24
reasons why you should subscribe.
44:26
I'll tell you reason number 17:
44:29
you might actually start
44:31
looking forward to... Sounds like
44:33
you've got a case
44:35
of them.