Tool calling and agents

Released Friday, 14th February 2025

Episode Transcript

Transcripts are displayed as originally observed. Some content, including advertisements, may have changed.

0:01

Welcome to Practical AI, the podcast that makes artificial intelligence practical, productive, and accessible to all. If you like this show, you will love The Changelog. It's news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find us by searching for The Changelog, wherever you get your podcasts. Thanks to our partners at fly.io. Launch your AI apps in five minutes or less.

0:36

Learn how at fly.io.

0:45

Welcome to another fully connected episode of the Practical AI podcast. In these fully connected episodes, Chris and I keep you updated with everything that's happening in the AI world, if we can. There's a lot. And we try to give you some learning resources to level up your machine learning and AI game. I'm Daniel Whitenack. I'm CEO at Prediction Guard, and I'm joined as always by my co-host Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?

1:17

I'm doing great. I just don't know what we're going to talk about, because nothing ever happens in AI. There's nothing in AI. There's never anything going on.

1:26

Elon has done nothing. Oh my gosh, Elon is throwing spitwads at people again. He's, let's say, you know, he's been suing OpenAI, and now he's put his bid out, you know, in the last few days, for OpenAI. The article that I saw was, "We are not for sale, ChatGPT boss says."

1:50

I know, good old Sam. Sam said, we're not for sale. Because you know the two of them really love each other.

1:59

Oh yeah, definitely. Musk and Sam Altman, they are best friends.

2:01

Best friends. That's how we'll report it here.

2:04

That's how we're reporting it here, because we always look for the upside in the AI world here.

2:11

Yeah, who knows the motivations behind billionaires. It's an interesting thing to watch. It certainly spices up conversations in the workday and is a nice point of discussion with friends. You know, so yeah, but that's about, I mean, that's about how I'm taking it.

2:30

Yes, I would agree with that. The spats between billionaires, not being a billionaire myself, just don't quite make it onto my list of concerns.

2:44

Yeah. What do you think is kind of the trajectory with players like OpenAI? Well, the other thing that happened in the US, this fits into maybe the other thing I was going to ask as well, although I got sidetracked in my mind, because I remember that there was a Super Bowl, and there was a Super Bowl commercial that OpenAI had, which, you know, was a cool commercial. I didn't know what it was going to be at first, because it's just these artistic dots, you know, forming scenes. And then I think I gradually realized that this is those little dots on the OpenAI ChatGPT app that, you know, expand. And someone commented to me that they spent like 14 million dollars, or whatever a Super Bowl ad costs, I forget how much it was. But I was like, that's really nothing compared to what OpenAI is losing generally on hosting models and infrastructure. So yeah, that circles back to my other question, which was, what are your thoughts on, I mean, if Elon doesn't buy OpenAI, what's the future?

4:05

I honestly don't know. And I've got to be honest with you, I'm not sure that I care a whole lot. I was thinking about that as we were leading into this. You know, I mean, there's not a protagonist here from my standpoint. There's not a side that I'm for or against so much. You know, you have Elon, with all the, you know, the adventure around Elon Musk, and I say that word kind of tongue in cheek, and then OpenAI. And, you know, there is a kernel of truth to what Elon says when he talks about it going from being the nonprofit with the grand vision that it started out with in the early days, and then it has increasingly gone commercial and become for-profit. You know, so it's another big giant AI company, like the others. I'm watching it with half an eye, like everybody else in the world, but I just don't know, and I'm not terribly sure I care.

5:19

Yeah. Is there somebody out there in our audience that is deeply concerned about this? I would love to hear from somebody who is not Elon or Sam Altman why this is a big deal.

5:35

Yeah, maybe we'll leave it at that. It's a good point. I am interested in, you know, some of the dynamics, like OpenAI released their Deep Research product. So if you kind of look at the trajectory of what they're releasing, what they're doing, there's this Deep Research product, which is really geared towards this, you know, multi-step online information research type of task. Yeah.

6:00

So, you know, going and looking at various trends across various sites with various data, reasoning over certain information, consolidating that, you know, contributing to some sort of research project. And I find it interesting that, you know, OpenAI introduced this. One of the dynamics I love watching is OpenAI releases the application-level product, so like Deep Research. And then I see the blog post by Hugging Face, it was like the day after. So they say, yesterday OpenAI released Deep Research. So this is a blog post that I'll link in the show notes from Hugging Face. And basically they just decided to make sure that they could reproduce the functionality with open source code, maybe some recently released models like DeepSeek models or others, in 24 hours. And then they wrote the blog post and released it. I don't know how long of a 24 hours it was, but you know, you see that dynamic happening.

7:05

So you see that with Deep Research, and then you have, you know, the Open Deep Research thing. You see kind of the Operator stuff, where it's operating your screen, your browser window. Now, earlier today I was running Hugging Face's smolagents. They have a web agent, which is essentially that: it spins up a browser window, and it does certain tasks for you in the browser window. Like you can type a prompt like, hey, find the most recent episode of Practical AI, summarize the topic, and then find, you know, seven other articles on a related topic, list them out in Markdown format and, you know, output that. Something like that, where it requires this sort of agent operating over the internet. Super slick, super fun.
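The loop such a web agent runs (model picks a tool, observes the result, picks the next tool) can be sketched in plain Python. This is a toy illustration, not the actual smolagents API: the tool names, the canned page, and the scripted planner standing in for the LLM are all invented.

```python
# Toy sketch of the "web agent" idea: an agent loop where a planner
# (here a scripted stand-in for an LLM) picks tools, search then
# visit, until it can answer the prompt. Pages are canned.

CANNED_PAGES = {
    "https://example.com/practical-ai/latest": (
        "Practical AI episode: Tool calling and agents. "
        "Topics: function calling, agent orchestration."
    )
}

def search_tool(query: str) -> list[str]:
    """Pretend web search: return candidate URLs for a query."""
    return [url for url in CANNED_PAGES if "practical-ai" in url]

def visit_tool(url: str) -> str:
    """Pretend browser: fetch the text content of a page."""
    return CANNED_PAGES[url]

def scripted_planner(step: int, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM: decide which tool to call next."""
    if step == 0:
        return ("search", "most recent episode of Practical AI")
    return ("visit", observations[-1])  # visit the first search hit

def run_web_agent(max_steps: int = 3) -> str:
    observations: list[str] = []
    for step in range(max_steps):
        tool, arg = scripted_planner(step, observations)
        if tool == "search":
            observations.append(search_tool(arg)[0])
        elif tool == "visit":
            page = visit_tool(arg)
            # A real agent would have the LLM summarize here.
            return page.split(". ")[0]
    return "no answer"

print(run_web_agent())
```

A real system differs mainly in that the planner step is an LLM call and the tools drive an actual browser, but the control flow is this same observe-and-decide loop.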

7:55

I would definitely recommend, if people want to try that sort of thing, try the smolagents web agent. But yeah, you see this kind of trend where at the application level, some of this is just, you know, it seems like you can't develop a moat generally there. Now, you might be able to develop a kind of moat as a company in a specific domain or a vertical, or with certain knowledge or proprietary data, right? But it's very hard at that kind of general application level, I would say.

8:30

I think, you know, I keep wondering, as OpenAI had a substantial lead, and it was taking quite a period of time for a while for open source options and application-level things to come about. And we've seen that time interval shrink tremendously here. So, you know, and ironically at the same time that Elon makes his 97 billion dollar effort to buy OpenAI, you can't help but wonder a little bit about what the future business model looks like, you know, to your point there. You know, if you never have time to create a moat, if you're one of the main players now, there are certainly business models for other players to come into an industry, as you just mentioned, and create capability, because that's their thing and it's not something that the big boys are going to go after. But as we've seen this interval between the commercial players and open source shrink to almost nothing, what do you think that means for the business models going forward, for the Googles and the OpenAIs and the Anthropics of the world?

9:50

I mean, I think part of it is maybe this sort of integration in the kind of enterprise stack. And what I mean by that is the kind of bundle effect that you get from something like offerings from Microsoft. So, you know, absolutely no one in the world wants to use Teams, because it's absolutely terrible. And I will go on record as saying that.

10:18

Sorry for those that work on it. I have to use it. I have no choice. I guess, you know, you have a podcast, you have an opinion, but that's my opinion. But, you know, I'm also not going to pay hundreds of thousands of dollars to Slack if I can just flip on Teams in my Microsoft tenant, and they already have all my data and all this stuff. So the fact that they're tying in, you know, Copilot and those licenses around Copilot, and an ecosystem that's already so embedded in the enterprise world, there is a very strong bundle effect there.

10:59

And yeah, it's very real, right? And it doesn't mean that it's necessarily the best solution, but it is a solution, depending on what you're looking for, right? At that kind of generic Copilot level, in a case where you need kind of single tenant, meaning, in theory, the terms of service are that my data is not being used in certain ways, that kind of gets at the generic case. But again, the real business value that a company has, the way I see it, is you've kind of got these generic cases, where someone random is going to want to find a Word document or paste in an email, and then you've got like the core business value, right? So a pharma company that has their most sensitive tiers of data that are, you know, the lifeblood of their company, or a healthcare company or a finance company that has certain classifications or regulatory burdens around certain tiers of data. It's a whole other thing to think about integration of those tiers of data into a generic system like that. There's a generic copilot system for those kinds of less sensitive tiers of data, but there's still something that needs to be solved at those other layers, which is where I think, you know, vertical AI players, but also, you know, tooling and infrastructure players, can still make a lot of progress.

12:31

Do you think that the bundling you're describing, that's occurring between the vertical capabilities where they're producing these, and, you know, OpenAI going and doing Deep Research, or Google integrating Gemini into the Google suite, which they've been doing and trying to drive a premium from users for, is that bundling going to be critical to them going forward? Or do you think that the OpenAIs of the world (and we've seen this historically with Google, maybe not always in an AI context, driving into specialties where they open up a new vertical underneath the umbrella) are going to have to do that to survive, since they're going to have open source chomping at their heels on the general path?

13:26

Yeah, I don't know. It could be by vertical. I mean, you look at Palantir, for example. You know, stock price soaring, most regular people aren't using a Palantir copilot in their day-to-day, but they have a certain market, particularly around, you know, DOD or defense or other areas, and they have really put a lot into serving that well, with a less generic, but still fairly generic across different use cases, set of functionalities. And that, you know, has served them well, at least from an outsider's perspective, if I'm looking at that. So it may be a specialization in terms of tools or a vertical. It might also just be a segment of the market that you choose to focus on that is kind of the bread and butter. It's interesting, because you've got all of these really end-user, direct-to-consumer flows of traffic on OpenAI and these things now, where a lot of what we had talked about before with data science and AI and machine learning was really enterprise focused, not direct to consumer.

14:40

So that allows me to throw one other layer onto this conversation, as we circle back around to AGI ideas, with, you know, kind of having artificial general intelligence being bantered about. Sam Altman was just saying that he was expecting GPT-5 to be smarter than he is. And so as we look at that, you know...

15:03

I think GPT-3 is smarter than I am.

15:05

I agree with you. But with that, with the AGI chase continuing at this point, and, you know, we've heard with DeepSeek and all these others, going in and talking about business models and bundling and such, and exploring new verticals, how do you think that the AGI race fits into that?

15:28

Yeah, maybe that's the piece that isn't really entering into my mind much, in the same way that you don't think about Elon so much, which is probably good. Yeah, I think it's an interesting question, and there's implications. The questions that come into my mind are at a more general level, which you could talk about as AGI or not, I don't know, but the questions that come into my mind are more the downstream effects of some of these things. Are we building systems that enhance human agency rather than replace it? Are we building systems that allow us to trust more in human institutions, or to fear and distrust them more? Are we, you know, building systems that actually drive us more into isolation as individuals, or into community together? I think those are interesting kinds of directions that are on my mind as I think about the more general side of this.

16:32

Well, there's no shortage of AI tools out there, but I'm loving Notion, and I'm loving Notion AI. I use Notion every day. I love Notion.

16:41

It helps you organize so much, for myself and for others. I can make my own operating systems, my own, you know, processes and flows and things like that, to just make it easy to do checklists, flows, etc. that are very complex, and share those with my team and others externally from our organization. And Notion AI on top of it is just, wow, it's so cool. I can search all of my stuff in Notion: all of my docs, all of my things, all of my workflows, my projects, my workspaces. It's really astounding what they've done with Notion AI. And if you're new to Notion, Notion is your one place to connect your teams, your tools, your knowledge, so that you're all empowered to do your most meaningful work. And unlike other specialized tools or legacy suites that have you bouncing between six different apps, Notion seamlessly integrates and is infinitely flexible. And it's also very beautiful and easy to use. Mobile, desktop, web, shareable. It's just all there. And the fully integrated Notion AI helps me, and will help you too, work faster, write better, think bigger, and do tasks that normally take you hours in minutes or even seconds. You can save time by writing faster, by letting Notion AI handle that first draft and give you some ideas to jumpstart a brainstorm, or turn your messy notes (I know my notes are sometimes messy) into something polished. You can even automate tedious tasks like summarizing meeting notes or finding your next steps to do. Notion AI does all this and more, and it frees you up to do the deep work you want to do. The work that really matters, the work that is really profitable for you and your company. And of course, Notion is used by over half of Fortune 500 companies, and teams that use Notion send less email, they cancel more meetings, they save time searching for their work, and reduce spending on tools, which kind of helps everyone be on the same page. Try Notion today for free when you go to notion.com/practicalai. That's all lowercase letters, notion.com/practicalai, to try the powerful, easy to use Notion AI today. And when you use our link, of course, you are supporting this show. And we love that. notion.com/practicalai.

19:14

Well, Chris, we talked a little bit about tools and agents, well, agents generally: the web agents, the deep research things. And we've kind of talked about tool calling and the connection to agents at certain points on the show, but I don't think we've really dug into the detail in a way that maybe will make things clear for people. I still see a lot of confusion around this. Even, you know, in my day-to-day, as I'm talking to customers, the question of, well, how do I make an LLM talk to this system, right? Or, you know, with that research tool, how do I make an LLM go and do a thing, right? That's often how the question comes. And what I think I realize when I'm hearing those questions is there's kind of a fundamental misunderstanding of what the LLM does and how it's tied into a framework, which you might call tool calling, you might call agentic. The names kind of get mushed around a lot these days, unfortunately.

20:31

They do. I was thinking that as you were saying all that, and that's literally what was in my head, in terms of the misuse of different names for this technology, and what's doing what.

20:45

Yeah, exactly. So in my mind, and I'm feeling very opinionated today, I don't know why.

20:53

Excellent.

20:55

In my mind, how I kind of draw the lines here: there's, of course, models, large language models. They predict probable text. They generate text, or images, or whatever you want them to generate. Then there's other systems, kind of over on the other side. So you could think of, you know, your email, or your bank account, or an external system like an Airbnb where I might want to make a reservation, or my company's database, right, which contains transactional data, or another system that I use, like HubSpot, or all of these types of things. And to ask the question, well, how could an LLM go and create a new deal for me in HubSpot?

21:45

Right, that hurts me when you phrase it like that. It causes pain in my head.

21:51

Okay, but that's how people phrase it, to be clear. Like, I get these questions, or the questions come up every day, right? So the question is often phrased: how do I make the LLM create a new deal for me in HubSpot? So right in that phrasing, to your point, I don't know, what makes you cringe about that?

22:16

It's just, that's a fingernails-on-the-chalkboard kind of moment for me. You know, to answer that question: in the six and a half years that we've been doing the show, we have evolved through a number of technologies that at each point in time were the hot thing, and inevitably people focus in on just that for a while. But right now, we're at a point where generative AI and LLMs have been the hot thing the last few years, and we forget that they don't necessarily do everything out there. People will say "LLM," when in fact they only do one thing.

23:00

That's exactly right. And not only that, but there might be an AI architecture that could do the thing they want to talk about, but it's not necessarily the thing that they're talking about, and that's misleading. It's not the model.

23:16

And so that's the fingernails on the chalkboard. We've kind of talked about this over the last year: the tunnel vision of the generative AI era, you know, in terms of everyone focusing on that. But the point is that there are other technologies in the mix, and there is a technology that will do the thing they want to do. They're just not picking the right one in the way that they're verbalizing it.

23:48

Yeah, so let's maybe break this down into components. So let's say there's the LLM. You know, we'll just talk about text now. Certainly there's multimodal and all that stuff, but just think about text. There's the LLM, which all it does is complete probable text. So I could, you know, ask it to autocomplete, I could ask it to write something for me, I could ask it to generate something for me. That's what it does. Let's take the HubSpot example, since I used that. HubSpot, for those that aren't familiar, is a popular CRM solution, for those that maybe don't want to mess with Salesforce and all of that world. So in HubSpot, I can create a deal associated with maybe a sales lead I have, right? That is its own software system that's hosted by HubSpot, right? And I actually don't know this, but I assume HubSpot has an API, a REST API, meaning you could programmatically interact with HubSpot. This is how apps on HubSpot work, right? An app on HubSpot is regular, good old-fashioned code that maybe allows you to add these fields to these records, or retrieve this data, or report on this data. That's just good old-fashioned code. It uses the API. So this is a separate system. And so there's really no connection between, there can be no connection directly between, the LLM, which generates text, and this other system out there that's a CRM that does certain things. There's no connection between the two. Except in the middle of that, there can be this process, which I would generally categorize as tool calling, or function calling. Let's say that you wrote a good old-fashioned software function that creates a deal in HubSpot via the REST API of HubSpot, right? That has nothing to do with AI. It's just a software function, where you tell me the email of the person, the name, the company, and I'm going to go in and create the deal in HubSpot via the API. So there's a function: you give me these arguments, I'm going to create the deal in HubSpot. Okay, still no connection to the LLM. But if I then ask the LLM, hey, I have this customer information, email, name, etc., generate the arguments for me to call this function, which takes these specific arguments, then the LLM could generate the necessary arguments to call that function. And if you create a link between the function and the output of the LLM (the LLM is still not really doing anything other than generating text, but in your code, you literally take the output of the LLM and you put it into the input of that function), now you could put something on the front end into the LLM, and have the result be a flow of data out of the LLM, into the function, and then into the HubSpot API. So that's sort of how this tool calling, function calling thing works.

27:06

Which makes perfect sense, and that's standard software development. You know, the only thing that is different there is the fact that the function parameters that you're using have been generated by the LLM, which is a generative model.
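The flow described above (the LLM only generates text; plain code parses that text and dispatches it to the function) can be sketched roughly like this. The `create_hubspot_deal` function and the model call are stubs invented for illustration; a real version would call HubSpot's REST API and an actual LLM endpoint.

```python
import json

def create_hubspot_deal(email: str, name: str, company: str) -> str:
    """Good old-fashioned code. A real version would call HubSpot's
    REST API; this stub just reports what it would create."""
    return f"deal created for {name} <{email}> at {company}"

# A JSON-schema-style description of the function, shown to the
# model in the prompt so it knows what arguments to emit.
TOOL_SCHEMA = {
    "name": "create_hubspot_deal",
    "parameters": {"email": "string", "name": "string", "company": "string"},
}

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call. A real model, prompted with the
    schema and the customer info, would generate JSON like this."""
    return ('{"name": "create_hubspot_deal", "arguments":'
            ' {"email": "jo@acme.com", "name": "Jo", "company": "Acme"}}')

def run_tool_call(user_request: str) -> str:
    prompt = f"Tools: {json.dumps(TOOL_SCHEMA)}\nRequest: {user_request}"
    # The LLM only generates text; it never touches HubSpot itself.
    generated = call_llm(prompt)
    call = json.loads(generated)
    # Our code, not the model, links the text output to the function.
    if call["name"] == "create_hubspot_deal":
        return create_hubspot_deal(**call["arguments"])
    raise ValueError(f"unknown tool: {call['name']}")

print(run_tool_call("Create a deal for Jo at Acme, email jo@acme.com"))
```

Notice that the only AI in the pipeline is the text generation step; everything on either side of it is ordinary software plumbing.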

27:22

Perfect. That's what it does. And there are some special things related to this, in the sense that, you know, if you look back in time at LLMs: first, we had kind of really good autocomplete models, because that was a meta task for people, you know, training language models. Then people figured out, oh, I kind of want to use these as general instruction-following models, right? And so they developed specific prompt formats and prompt datasets to fine-tune LLMs specifically for instruction following. Right? So here's your system message, here's the message I'm providing you, give me the assistant response. And they trained it on a bunch of general instruction-following things. Well, they've done the same thing now, because they've realized, oh, a lot of people want to do this tool or function calling mechanism. So certain people, including OpenAI in a closed sense, but others in an open sense, like Nous Research, who we had on the show, have a dataset called Hermes. This includes a set of prompts that are related to function calling specifically. So they've given a huge number of examples of function calling prompts to a model that they would train, like a Llama model. And now you have Hermes Llama 3.1 70B. It's been fine-tuned to follow that Hermes-style prompt format for function calling. Which means it kind of has an advantage, if you like, or certain models that have been trained with these examples have an advantage, specifically for that function calling task, right? So there is an AI element, in the sense that some models are better at this than others because of the way that they've been trained.
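To make the "special prompt format" idea concrete, here is a rough sketch of a Hermes-style function calling exchange: tool schemas embedded in the system message inside tags, and the model's tool call emitted inside tags that your code parses. The wording of the system message is paraphrased, not the verbatim Nous Research template, and the tool itself is the hypothetical one from earlier.

```python
import json
import re

# Hypothetical tool signature, serialized into the system prompt.
TOOLS = [{"name": "create_hubspot_deal",
          "parameters": {"email": "string", "name": "string",
                         "company": "string"}}]

# Hermes-style system message: tool schemas wrapped in <tools> tags,
# with the model asked to reply in <tool_call> tags.
system = (
    "You may call functions. Signatures are given in <tools></tools>:\n"
    f"<tools>{json.dumps(TOOLS)}</tools>\n"
    'For each call, reply with <tool_call>{"name": ..., "arguments": ...}'
    "</tool_call>."
)

# What a fine-tuned model might generate for a matching request.
model_output = ('<tool_call>{"name": "create_hubspot_deal", "arguments": '
                '{"email": "jo@acme.com", "name": "Jo", "company": "Acme"}}'
                '</tool_call>')

def parse_tool_call(text: str) -> dict:
    """Pull the JSON payload out of the <tool_call> tags."""
    match = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    if match is None:
        raise ValueError("no tool call found")
    return json.loads(match.group(1))

call = parse_tool_call(model_output)
print(call["name"], call["arguments"]["company"])
```

A model fine-tuned on many examples in exactly this shape emits well-formed tool calls far more reliably than a generic chat model, which is the advantage being described.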

And there are certain prompt formats that are special, and you'll get better performance if you use those prompt formats, or if you use a model server like vLLM that supports them or has the built-in translation to those prompt formats, etc. So there is an AI element to it, but it's only in the sense that you're preparing the model for this type of use case, rather than there being some built-in connection of the model to something external.

So I'm curious, can you tie the tool calling into what might be considered a full agentic implementation? What's the leap there, if any?

Yeah, interesting question, because people use the term agent very loosely. Some people would say what I just described, even just that chain of processing (I put something into the front end of the LLM, a deal is created in HubSpot), might be considered an agent, my HubSpot deal creation agent. I would say that's really just a tool-calling example of how to use an LLM. In my mind, what separates out the agentic side of things is where you have some sort of orchestration performed by the LLM. What I mean by that is, you have a set of tools. Let's say I have access to Airbnb's API and Kayak's API and United Airlines' API, or whatever other travel things I need, maybe my Gmail for various things. And I say, hey, I need to book a car next week for my trip to wherever, right? That input could then be processed through the LLM not to call a single tool, but first as an objective, to determine what tools to call, in what sequence, with what dependencies. Try to do a first step of that, then reevaluate, then do the next step, until you reach the objective, right? So first, in order to book my thing, I need to know when my flight is, so I go to my Gmail and look for the confirmation. Second, I use that date in the Kayak API to look for choices. Then I evaluate those choices and use them to book the reservation. So there's a series of steps that might call different tools. Or systems: it could be data sources, unstructured or structured data sources like a database or a RAG system. And so that HubSpot deal creation tool I talked about might be one of those tools in an agentic system, where an agent could choose to use it at certain points. And I'm anthropomorphizing here, it's not choosing anything, right? But it's useful to talk about it that way sometimes, so forgive me. It's choosing to use that tool in one case, and maybe other tools in other sequences in other cases. In my mind, that's what really distinguishes the agentic side from just the tool calling side.
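That plan-act-reevaluate loop can be sketched in a few lines. Everything here is a stand-in: the planner is a hard-coded function where a real system would call the LLM with the objective, the tool descriptions, and the history so far, and the travel "tools" just return canned strings.

```python
# Hypothetical tools. Each takes the previous step's output as context.
def find_flight_date(context):
    return "2025-02-21"  # pretend we found the confirmation email

def search_cars(context):
    return "compact car available on " + context

def book(context):
    return "booked: " + context

TOOLS = {
    "find_flight_date": find_flight_date,
    "search_cars": search_cars,
    "book": book,
}

def fake_planner(history):
    """Stand-in for the LLM: given what has been done, pick the next tool."""
    done = [name for name, _ in history]
    for step in ("find_flight_date", "search_cars", "book"):
        if step not in done:
            return step
    return None  # objective reached, stop the loop

def run_agent(objective):
    history = []  # list of (tool_name, result) pairs
    while True:
        tool_name = fake_planner(history)
        if tool_name is None:
            return history
        # Reevaluate after each step: the next tool sees the latest result.
        context = history[-1][1] if history else objective
        history.append((tool_name, TOOLS[tool_name](context)))

steps = run_agent("book a car next week for my trip")
print([name for name, _ in steps])  # ['find_flight_date', 'search_cars', 'book']
```

The design point is that the sequence is not hard-wired in the application code: the planner decides the next step each time around the loop, which is what lets a real agent choose different tools in different sequences for different requests.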

Chris, it's fun to talk about some of the agents stuff. Normally we wait till the end of the episode to share some learning resources, but since we've been talking about tool calling and agents, I just wanted to mention this new course by Hugging Face. They now have an agents course, which I think was just released and is coming out live on YouTube, if I understand correctly. In the course they talk about studying AI agents in theory, design, and practice; using established libraries like smolagents, LangChain, and LlamaIndex; sharing your agents; evaluating your agents; and at the end you earn a nice certificate. So, a plug for the Hugging Face agents course: if those of you out there are intrigued by some of the tool calling and agent stuff, it seems like a good one.

As we record this, they're actually doing it in about an hour and twenty minutes from now; it'll be past by the time you're listening. You missed it, sorry, but you can do the replay. And it's interesting, you know, one of the packages they mention there is called smolagents, which is really great. I love using that package, it's a lot of fun. And I've even used it in a couple of really interesting internal use cases at Prediction Guard. So do me a favor here, so long as there are no secret sauce moments there for Prediction Guard: can you plant a couple of seeds on things that you've done that people could explore, in terms of what you found useful? "Hey, I did this thing," and just kind of let people get a sense of how you're looking at it and what things they might be able to do, so that they can ideate on their own?

Yeah, definitely. I'll speak somewhat generically here, so I don't reveal certain customer things. One of the cases that we actually experience fairly often with customers is that they want to build, say, a chatbot that has access to some special knowledge in one way or another. On the one hand, if you have a bunch of unstructured text, that's a typical case where you would use a RAG workflow: you put that text into a vector database so you can retrieve it on the fly. That's a RAG chatbot. On the other side, there are text-to-SQL methods, for example, or API calling methods, that could allow you to interact with your database. So there are those methods. Sometimes, though, you have a source of data, and there have been a couple of times for us where it's a web app that doesn't have a really convenient API, but has a really complicated and annoying user interface. The company has this web app with a bunch of knowledge in it, right? But there's really no good way to extract all of that content from the web app, and it has an annoying interface, so no one wants to use it. And so something like the smolagents web agent is a system like that: what the web agent does is execute a series of tool calls that leverage helium under the hood, which is a package that allows you to automate interactions with a browser. So if it's a web app, it can basically spin up the application in the browser, interact with certain elements (search for a certain thing, or find a certain component or object), summarize that output, and output it from the web agent. So one of the interesting cases where we're thinking about that is these cases where a company has invested a lot of money in some system or application, maybe a legacy system, that they have to keep on using, right? But no one really wants to engage with it because the UI sucks.
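For a flavor of what that browser driving looks like, here's a minimal helium sketch. The URL, field label, button text, and CSS selector are all invented for illustration, and actually running it requires `pip install helium` plus a local Chrome install; helium itself, though, really does expose high-level calls like these.

```python
def extract_from_legacy_app(query):
    """Drive a (hypothetical) legacy web app the way a human user would,
    and return the text of its search results.

    The URL and page elements below are made up for illustration.
    """
    # Imported inside the function so the sketch can be read and loaded
    # without helium installed.
    from helium import start_chrome, write, click, find_all, S, kill_browser

    start_chrome("https://legacy.example.internal/search", headless=True)
    write(query, into="Search")        # type into the field labeled "Search"
    click("Go")                        # press the (hypothetical) Go button
    rows = find_all(S(".result-row"))  # grab result elements by CSS selector
    text = "\n".join(row.web_element.text for row in rows)
    kill_browser()
    return text
```

In an agentic setup, a function like this becomes one tool among several: the agent calls it when the answer lives behind that legacy UI, then summarizes or post-processes whatever text comes back.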

But it also doesn't have a really nice API or way to access the data in there. So actually using an agent as a kind of extra user, one that you can control programmatically to interact with the application, is a really intriguing prospect for tying in that knowledge and extracting things from the app. The other one that

37:20

the app. The other one that

37:22

I think comes up a lot

37:25

for us because we work. We

37:27

work in a lot of regulated

37:29

security privacy conscious context. That's kind

37:31

of what we do is prediction

37:34

guard and deploying secure infrastructure for

37:36

AI in people's companies. Often people

37:38

will want to once they now

37:41

have a private secure system tie

37:43

in their transactional databases to their

37:45

queries, right? That's often a text

37:47

to sequel type of operation where

37:50

you're querying a database, you're generating

37:52

a sequel query. That can be

37:54

error prone, right? Like you can

37:56

generate sequel. queries that don't execute

37:59

or potentially problematic sequel queries or

38:01

ones that are very computationally expensive.

38:03

And so you can tie in

38:05

other elements, agentic elements into that

38:08

where you kind of try to

38:10

answer the question iteratively with different

38:12

sequel queries until you reach an

38:14

objective having the agent, that's, this

38:17

is kind of an agentic way

38:19

to go about the text to

38:21

sequel. Or you could tie in

38:24

other tools like sequel query optimizers

38:26

and that sort of thing to

38:28

help in that process as well.
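As a toy illustration of that iterative guarding, here's a sketch using Python's built-in sqlite3. The candidate queries are hard-coded stand-ins for what an LLM might generate; each one is first compiled with `EXPLAIN`, which catches bad table or column names cheaply, before anything is actually executed.

```python
import sqlite3

# Toy in-memory database standing in for a transactional store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 32.5)])

def candidate_queries(question):
    """Stand-in for the LLM: yield SQL guesses, worst first, so the
    loop below has something to recover from."""
    yield "SELECT total FROM orders"        # wrong column name: won't compile
    yield "SELECT SUM(amount) FROM orders"  # valid query

def answer(question):
    for sql in candidate_queries(question):
        try:
            # EXPLAIN compiles the statement without running it, so a bad
            # query fails here instead of against the real data.
            conn.execute("EXPLAIN " + sql)
        except sqlite3.Error:
            continue  # in a real system: feed the error back and reprompt
        return conn.execute(sql).fetchone()[0]
    return None

print(answer("what is the total order amount?"))  # 42.5
```

A production version would also bound the number of retries, run against a read-only connection, and could hand the SQL error message back to the model as context for the next attempt.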

So on the more enterprise, business side of things, those are a couple of things that have come up for us.

No, that sounds interesting. I'm just kind of curious what your thinking is on how this changes the human side of the workflow, as you've seen it, recognizing these are some small use cases and everything. But this is the beginning of the agentic wave as we go forward, and I guess especially prompted by the kinds of things that we're seeing in the news these days, you know, about evaluation of government departments, and just that general notion of reassessment, for better or for worse. How do you think that's going to be taken into commercial spaces in terms of deploying these agents? Will it change jobs significantly, do you think, or will it just be added in without that? I'm kind of curious what your lay of the landscape is.

Yeah, I mean, I think there will be a shifting of jobs. Some of the things that we've talked about, specifically in those examples, are actually good examples of expanded human agency, because a lot of times people don't do, or can't do, certain tasks that they would like to do as part of their job because of the limitations of really complicated UIs, or because doing this and then this and then that will take a ton of time and they've got to jump to a meeting, right? So I think there are a lot of those cases where this is expanded human agency: it's amplifying the effect of that worker and helping them feel like they have superpowers, because they really didn't want to log into that application and use it one more time, right? So I think there's an element of that. Now, you could make the argument that maybe they've hired three people under them, because of those inefficiencies, to do some of those tasks, which in some ways is a shame, because if they're really just grunts cranking through extraction of data from horrible APIs or horrible user interfaces... I mean, maybe there are people that enjoy that all day, but I think generally that's not a very dignified way to go about it. I'm realizing that I'm making generalities here, and there's the reality of people's work: not everyone gets to do the work that they might desire to do, or that would give them the most dignity. So I want to recognize that, and I think there will be a negative impact for portions, but I'm hopeful that there's also this positive impact. And even for people that are maybe in less skilled professions, if there's more of a natural-language way to access skilled knowledge and these amplifying effects of AI, it could hopefully open up new types of opportunities within the market as well.

I would hope so. I suspect that, just as we do in every other aspect of life, we will see people enhancing human agency along the lines of the use cases you're talking about, and we'll probably see people that would rather take alternative paths to that as well. I think it will be a mixture of the whole thing.

Yeah. As we kind of close out here, and I guess we're already talking about new trends and other things, one thing I wanted to note is that Deloitte just put out, in January, their State of Generative AI in the Enterprise quarter four report, which I've been going through. For business leaders or managers or other people wanting to get a sense of the things being tracked across different industries in the enterprise, there's a great report there. For example, they're tracking barriers to developing and deploying gen AI, worries about complying with regulations, difficulty managing risks; they're tracking certain use cases; volume of experiments and POCs, or proofs of concept; benefit sought versus benefit achieved, which is an interesting one; and also the job functions where gen AI initiatives are most active, all of these sorts of things and many more. So if you're interested in those sorts of insights, which I do think are interesting to track, that's a great learning resource that we'll link in the show notes, and hopefully people can find it and peruse it if they're interested.

Definitely. Yeah. Well,

Chris, it's been a great time. I felt like I functioned well in my tooling as a podcast agent.

You did good. You did so good that, who knows, Elon Musk may be coming after Prediction Guard any day now.

Yeah, or maybe what I'm saying is just being generated by NotebookLM.

That could be true. Yeah. Okay, good conversation today.

All right. Yeah, thanks. Have a good one.

You too.

All right, that is our show for this week. If you haven't checked out our Changelog newsletter, head to changelog.com/news. There you'll find 29, yes, 29 reasons why you should subscribe. I'll tell you reason number 17: you might actually start looking forward to Mondays. Sounds like somebody's got a case of the Mondays.
