Episode Transcript
0:01
Welcome to Practical AI, the
0:03
podcast that makes artificial intelligence
0:06
practical, productive, and accessible
0:08
to all. If you like
0:10
this show, you will love
0:13
the change log. It's news
0:15
on Mondays, deep technical interviews
0:18
on Wednesdays, and on Fridays,
0:20
an awesome talk show for
0:23
your weekend enjoyment. Find us
0:25
by searching for The Change
0:28
Log wherever you get your
0:30
podcasts. Thanks to our partners
0:33
at fly.io. Launch your AI
0:35
apps in five minutes
0:37
or less. Learn how
0:40
at fly.io. Well, welcome
0:42
to another episode of
0:45
the practical AI podcast. This
0:47
is Daniel Whitenack. I'm CEO
0:49
of Prediction Guard, and I'm
0:51
really excited today to dig
0:54
a little bit more into
0:56
GenAI orchestration, agents, coding
0:58
assistants, all of those things
1:00
with my guest, Pavel Veller,
1:02
who is chief technologist at
1:05
EPAM Systems. Welcome, Pavel. Great
1:07
to have you here. Thank
1:09
you. Hello, hello. Yeah, yeah.
1:11
Well, I mean, there's a
1:13
lot of topics even before we
1:16
kicked off the show, we were
1:18
chatting in the background about some
1:20
really interesting things. I'm wondering if
1:22
you could just kind of level
1:24
set us, as people may or
1:26
may not have heard of EPAM.
1:28
I think one of the things
1:30
that I saw that you all
1:32
were working on was this GenAI
1:34
orchestration platform, DIAL. Maybe before we
1:36
get into some of the specifics
1:38
about that and other things that
1:40
you're interested in, maybe just give
1:42
us a background of, you know, what
1:44
EPAM is. I know you mentioned even
1:46
in our discussions that some of what
1:49
you're doing right now maybe wouldn't have
1:51
even been possible, you know, a couple
1:53
of years ago, and so things are
1:55
developing rapidly. Just level set the kind
1:57
of background of this area where you're
2:00
you're working? Sure, yeah, so EPAM
2:02
is a professional services organization. We're
2:04
global. We're in 50-something countries. 50,000
2:06
people globally work with clients. We
2:09
have been for, I think, 32
2:11
years to date. So we do
2:13
a lot of different things, as
2:15
you can imagine. And what I
2:18
was mentioning about doing things that
2:20
would not be possible is doing
2:22
things with Gen AI today. We
2:24
do a lot of work for
2:27
our own clients. We also do
2:29
work for ourselves, applying the same
2:31
technology because EPAM historically as a
2:33
company has been running on software
2:36
that we ourselves built. The philosophy
2:38
has always been that things that
2:40
do not differentiate you like an
2:42
accounting software or like CRM, you
2:44
would go and buy off the
2:47
shelf. Things that differentiate you, how
2:49
we actually work, how we operate,
2:51
how we execute projects, how we
2:53
hire people, how we create teams,
2:56
how we deploy teams, all of
2:58
that software has always been our
3:00
own since as early as late
3:02
90s. And we keep iterating on
3:05
that software for ourselves. And that
3:07
software today is very much AI
3:09
first. And a lot of things
3:11
we do, we do with AI
3:14
and can really only do because AI in
3:16
its current form exists. Interesting. Yeah.
3:18
And how, I guess, does... you
3:20
know, I think when we initially
3:23
were prompted
3:25
to reach out to you,
3:27
part of it was around this
3:29
this orchestration platform so talk a
3:32
little bit maybe generally not necessarily
3:34
about the platform per se although
3:36
we'll get into that but just
3:38
gen AI orchestration generally so you
3:41
talked about you know some of
3:43
these things that are becoming possible
3:45
where does orchestration fit in that
3:47
and what what do you mean
3:49
by orchestration? You're probably thinking of
3:52
DIAL. You can Google it. We
3:54
do a lot of applied innovation
3:56
in general as a company. This
3:58
is one of the good examples
4:01
of applied innovation, applied to AI. The
4:03
best way to think of dial
4:05
would be, you guys all know
4:07
ChatGPT, right? ChatGPT isn't
4:10
an LLM. It's an application that
4:12
connects to an LLM and gives
4:14
you certain functionalities. It can be
4:16
as simple as just chatting and
4:19
asking questions. It can be a
4:21
little more complex, uploading documents and
4:23
speaking to them, like talk to
4:25
my documents. It can be even
4:28
more complex when you start connecting
4:30
your own tools to it. We
4:32
see our clients not only do
4:34
this, but also want something like
4:37
this for their own business processes.
4:39
And this orchestration engine becomes, how
4:41
do I make it so that
4:43
I don't have 20 different teams
4:45
doing the same similar things over
4:48
and over again in their own
4:50
silos? How do I connect my
4:52
teams and their AIs and their
4:54
thoughts and results into a consolidated
4:57
ecosystem? And likely, because of
4:59
GenAI and because of what
5:01
we can do with conversation and
5:03
text, it becomes sort of conversation-first.
5:06
You can think of conversation first
5:08
application mashups almost, right? Like you
5:10
talk, express a problem. What comes
5:12
back is not just the answer.
5:15
Maybe what comes back is UI
5:17
elements: buttons you can click, forms
5:19
you can fill out, things you
5:21
can do, as well as things
5:24
that are done for you by
5:26
agents automatically. So dial in that
5:28
sense is, well, by the way,
5:30
it is open source. You guys
5:33
can also go look, you know,
5:35
download and play with it. But
5:37
it is this ChatGPT-like
5:39
conversational application that has
5:42
many capabilities that go beyond. We
5:44
have dial apps. They predate MCP,
5:46
but the idea is that you,
5:48
so dial itself has a contract,
5:50
an API that you implement. You
5:53
basically come back with a streaming
5:55
API that can receive a user
5:57
prompt, and whatever you do, you
5:59
come back to
6:02
DIAL with not just text.
6:04
It's a much more powerful payload
6:06
with UI elements, interactive elements, and
6:08
things that Dial will display for
6:11
me, the user, to continue my
6:13
interaction. And Dial becomes this sort
6:15
of center mass of how your
6:17
company can build, implement, integrate AI into
6:20
this single point of entry.
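To picture what such a contract might look like in practice, here is a minimal sketch of a streaming handler that yields structured chunks instead of raw text. The chunk schema below is invented for illustration; the actual DIAL API contract is defined by the open source project itself.

```python
# Hypothetical sketch of a DIAL-style rich streaming response.
# The chunk schema ("type", "content", "elements") is invented for
# illustration; the real DIAL contract differs.
import json
from typing import Iterator

def handle_prompt(user_prompt: str) -> Iterator[str]:
    """Stream a reply that carries UI elements, not just text."""
    # First stream the textual part, chunk by chunk.
    for chunk in ["Here are three environments ", "you can deploy to."]:
        yield json.dumps({"type": "text", "content": chunk})
    # Then attach interactive elements the chat front end can render.
    yield json.dumps({
        "type": "ui",
        "elements": [
            {"kind": "button", "label": "Deploy to staging"},
            {"kind": "form", "fields": [{"name": "region"}]},
        ],
    })
```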
6:22
And then Dial goes, well,
6:24
from day one, Dial was
6:26
a load-balancing, model-agnostic
6:30
proxy. Right, so every model
6:32
deployment has limits, you know,
6:35
tokens per minute, tokens per
6:37
day, whatever, requests per minute.
6:39
If you're a large
6:41
organization with many
6:43
different workflows, your AI appetite
6:43
will go well beyond a
6:45
single model deployment. You'd like
6:47
to load balance across multiple,
6:49
and then you'd like to try
6:51
different models, ideally with the same
6:54
API for you, the consumer. So
6:56
DIAL started like that: a load-balancing,
6:58
model-agnostic proxy, a single point of
7:01
entry.
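Stripped down, a proxy like that is a routing layer with one completion interface that spreads calls across deployments while respecting each one's quota. A minimal sketch, assuming a uniform complete() callable per deployment; real gateways, DIAL included, also handle retries, auth, streaming, and time-windowed limits:

```python
# Minimal sketch of a load-balancing, model-agnostic proxy.
# Deployment names, quotas, and the token estimate are all illustrative.
import itertools
from typing import Callable

class Deployment:
    """One model deployment behind the proxy; quota numbers are made up."""
    def __init__(self, name: str, tokens_per_minute: int,
                 complete: Callable[[str], str]):
        self.name = name
        self.tokens_per_minute = tokens_per_minute
        self.used_this_minute = 0
        self.complete = complete  # instance attribute, any model's client

class ModelProxy:
    def __init__(self, deployments: list[Deployment]):
        self._deployments = deployments
        self._ring = itertools.cycle(deployments)

    def complete(self, prompt: str) -> str:
        # Round-robin over deployments, skipping any that is out of quota.
        for _ in range(len(self._deployments)):
            d = next(self._ring)
            estimate = max(1, len(prompt) // 4)   # rough token count
            if d.used_this_minute + estimate <= d.tokens_per_minute:
                d.used_this_minute += estimate
                return d.complete(prompt)         # same API for every model
        raise RuntimeError("all deployments are at their rate limits")
```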
7:03
We can log everything that is prompted in the organization. We
7:05
can do analysis on that separately,
7:07
because that's very helpful to know
7:10
what kind of problems your teams
7:12
are trying to solve. And then it
7:14
evolved into this application hosting ecosystem.
7:16
Now it's evolving clearly towards what
7:18
MCP can bring in because now
7:20
you can connect a lot more
7:22
things to it through MCP. So
7:24
I think it's running at like
7:26
20-something clients by now. So just
7:28
a couple of follow-up questions. It's
7:30
been in the news a lot,
7:32
but just so people understand if
7:34
maybe they haven't seen it, what
7:36
are you referring to with MCP
7:38
and kind of how that relates
7:40
to some of this API interface
7:42
that you're enabling? Well, the easiest
7:44
is to Google it. You're going
7:46
to find it. It's from Anthropic, the makers of Claude.
7:49
Let me tell you how I
7:51
think about this, because what
7:53
it actually is isn't that helpful. Yeah,
7:55
yeah, I think, in
7:57
very simple terms: MCP allows you to
7:59
connect the existing software world to
8:01
LLMs. In a way, think like,
8:03
and I'm gonna, I don't wanna
8:06
hype it too much because it's
8:08
not yet a global standard or
8:10
anything. It's very early, early, early
8:12
days. It's been months, right? But
8:14
what, let's say, HTML and browsers
8:16
and HTTP did: they enabled to connect,
8:18
well, us, people to software all
8:20
over the world. MCP does that.
8:23
but for LLMs. So today, if
8:25
I want, if I today want
8:27
to be able to prompt my
8:29
application, that is in front of
8:31
an LLM, to do things with
8:33
additional tools, let's say I wanted
8:35
to be able to search file
8:37
system based on what I prompted
8:40
and find a file and something
8:42
in that file, right? So my
8:44
application needs to be able to
8:46
do that. My option is what?
8:48
I can write that function. I
8:50
can then tell my LLM, hey,
8:52
here's this function, you can call it
8:54
if you want to; if you call it,
8:57
I'm
8:59
going to call it for you.
9:01
Great, that's one function. What if
9:03
I need to do something else?
9:05
I want to go talk to
9:07
my CRM system and get something
9:09
out of there. I'm going to
9:11
write that function. If I'm going
9:14
to write all the functions, I
9:16
can think of, it's going to
9:18
take me years, probably hundreds of
9:20
years.
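That hand-rolled approach, one bespoke function described to the model and dispatched by hand, looks roughly like this. A sketch only: call_llm is a placeholder for whatever chat-completion client you use, and the reply shape is an assumption, not any vendor's actual API.

```python
# One bespoke tool, described by hand and dispatched by hand.
# `call_llm` and the reply shape ({"tool", "arguments", "text"}) are
# placeholders, not a real vendor API.
import json
import pathlib

def search_files(pattern: str) -> list[str]:
    """The single function we wrote ourselves."""
    return [str(p) for p in pathlib.Path(".").rglob(pattern)]

TOOL_SPEC = {
    "name": "search_files",
    "description": "Find files in the project by glob pattern.",
    "parameters": {"pattern": {"type": "string"}},
}

def answer(user_prompt: str, call_llm) -> str:
    # Tell the model the tool exists; it may ask us to call it.
    reply = call_llm(user_prompt, tools=[TOOL_SPEC])
    if reply.get("tool") == "search_files":
        result = search_files(**reply["arguments"])  # we run it for the model
        reply = call_llm(user_prompt, tool_result=json.dumps(result))
    return reply["text"]
```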
9:22
But because there's a protocol called MCP, I'm going to bring
9:24
you MCP servers that other people
9:26
have built for my CRM system,
9:29
for my file system, for my
9:31
CLI. There are MCP servers for everything.
9:33
IntelliJ exposes itself as an MCP
9:35
server to do things that IDE
9:37
can do. Now you can orchestrate
9:39
those things through an LLM. So
9:41
you connect all these MCP servers
9:43
through an MCP client, this application
9:46
in front of an LLM, to
9:48
the LLM. Expose the tools to
9:50
the LLM. The LLM can now ask the
9:52
client to call a tool. And
9:54
through the same MCP protocol, the
9:56
client calls the server. The server
9:58
runs the function that has been
10:00
written in that server. And boom,
10:03
the LLM gets results. This connective tissue
10:05
that did not exist three months
10:07
ago, three months ago, everybody was
10:09
writing their own. And right now,
10:11
everybody, as far as I can
10:13
tell, is writing MCP servers, and those
10:15
who talk to all the LLMs,
10:17
they consume MCP servers.
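The client side of that loop can be sketched schematically like this. The real MCP protocol speaks JSON-RPC over stdio or HTTP and has official SDKs; here the transport is omitted, and Server and call_llm are stand-ins.

```python
# Schematic MCP-style client loop; transport and handshake omitted.
# Server and call_llm are stand-ins, not the real MCP SDK.
from typing import Any, Callable

class Server:
    """Stand-in for a connected MCP server: a named bag of tools."""
    def __init__(self, name: str, tools: dict[str, Callable[..., Any]]):
        self.name = name
        self.tools = tools

def run_turn(user_prompt: str, servers: list[Server], call_llm) -> dict:
    # 1. Expose every tool from every connected server to the model.
    catalog = {f"{s.name}.{t}": fn
               for s in servers for t, fn in s.tools.items()}
    reply = call_llm(user_prompt, tool_names=list(catalog))
    # 2. While the model keeps asking for tools, route each call to the
    #    owning server, run the function there, and feed the result back.
    while reply.get("tool"):
        result = catalog[reply["tool"]](**reply.get("arguments", {}))
        reply = call_llm(user_prompt, tool_result=result)
    return reply  # final answer once the model stops requesting tools
```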
10:20
Yeah. And maybe just give, even, so I
10:22
like the example that you gave
10:24
of sort of searching file systems.
10:26
What are just to kind of
10:28
expand people's understanding of some of
10:30
the possibilities? What are some of
10:32
the things that you've seen maybe
10:34
implemented in dial? as things that
10:37
are being orchestrated, you know, in
10:39
general terms. What are kind of
10:41
some of these things? Let me
10:43
give you a higher level and
10:45
much more sort of fruitful example,
10:47
okay? Yeah. We have our own
10:49
agentic developer. It's called AI/Run
10:52
CodeMe, because AI/Run
10:54
has multiple different agentic systems.
10:56
CodeMe specifically is coding-oriented.
11:00
We have others oriented at other
11:02
parts of SDLC workflow. By the
11:04
way, you guys can go to
11:06
SWE-bench and look at the Verified
11:09
list. I believe CodeMe, as
11:11
of now, takes fifth, it's number
11:13
five, on the list of all
11:15
the agents who compete for solving
11:17
open source defects and stuff. So
11:19
CodeMe, as an agentic system,
11:21
has many different assistants in it.
11:23
DIAL is a generic front door,
11:26
like a ChatGPT, and would like
11:28
to be able to run those
11:30
assistants for you as you talk
11:32
to DIAL. And until MCP, it
11:34
really couldn't, other than, hey, code
11:36
me, implement an API for all
11:38
of your assistants. Let me learn
11:40
to call all of your APIs.
11:43
Now the story is, hey, code
11:45
me, give me an MCP server
11:47
for you, which is what they have
11:49
done. Dial as an MCP client
11:51
can now connect to all CodeMe
11:53
features, all the assistants, expose
11:55
them as tools to an LLM,
11:57
and orchestrate them for me. So
12:00
I come into the chat, I
12:02
ask for something, that something includes
12:04
reading a code base and making
12:06
architecture sketches or proposals or evaluation,
12:08
right? And the LLM will ask the CodeMe
12:10
assistant to go and read that
12:12
code base because there is a
12:14
feature in CodeMe that does it
12:17
and DIAL only needs to orchestrate,
12:19
but doesn't need to rebuild or
12:21
build from scratch. That's the idea.
12:23
So this is an example. Yeah.
12:25
Could you talk a little bit?
12:27
I'm asking selfish questions because sometimes
12:29
I get these asked of me.
12:32
And I'm always curious how people
12:34
how people answer this. So one
12:36
of the questions that I get
12:38
asked a lot in respect to
12:40
this topic is, okay, I have
12:42
tool or function or assistant one.
12:44
And then I have assistant two.
12:46
And then I kind of have
12:49
a few, right? And it's fairly
12:51
easy to route between them because
12:53
they're very distinct, right? But now
12:55
if you imagine, okay, well, now
12:57
I could call one of a
12:59
thousand assistants or functions or something
13:01
or, you know, later on, 10,000,
13:03
right? How does the sort of
13:06
scaling and routing kind of actually,
13:08
how is that affected as you
13:10
kind of expand the, the space
13:12
of things that you can do?
13:14
So that, I think, and again,
13:16
I can't know, and I don't
13:18
know, but I think that is
13:20
still the secret sauce. In a
13:23
way, that is still why there
13:25
are all of these
13:27
coding agents on SWE-bench, all
13:29
of them work with, let's say,
13:31
Claude Sonnet 3.5 or Claude Sonnet
13:33
3.7, or GPT. The
13:35
LLM is the same. And yet
13:37
results are clearly different. Some score
13:40
10 points higher than the other.
13:42
You go to Cursor, the IDE Cursor,
13:44
you ask it something, it does
13:46
something. You switch the mode to
13:48
Max. They've introduced very recently in Cursor,
13:50
on Sonnet 3.7 and now on
13:52
Gemini 2.0, I think, a
13:55
Max mode, which is pay-per-
13:57
use versus their normal monthly plans,
13:59
because Max will do more iterations,
14:01
will spend more tokens, will be
14:03
more expensive, will likely run through
14:05
more complex orchestrations of prompts and
14:07
tools and whatnot, to give you
14:09
better results. So how do you build
14:12
the pyramid of choices for your
14:14
LLM? Because, yeah, you
14:16
will not ask the LLM, you will
14:18
not give it a thousand tools.
14:20
If you as a human look at
14:22
a thousand options, you lose
14:24
yourself, you know, even in 100 options.
14:26
I, again, I don't know.
14:29
I expect an LLM to have the
14:31
same sort of oops, overwhelmed effect.
14:33
You don't want to give it
14:35
a thousand tools. You want to
14:37
give it groups. You want to
14:39
say, you know, pick
14:41
a group. And then within that,
14:43
so you want to do this
14:46
basically like a pyramid, like a
14:48
tree, but how you build it
14:50
and how you prompt it and
14:52
how you do this,
14:54
now, that's still on you.
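One simple way to build that tree is two rounds of selection: ask the model to pick a group from a short menu first, then expose only that group's tools. A sketch with made-up group names and a placeholder call_llm:

```python
# Two-level tool routing: pick a group first, then a tool within it.
# Group and tool names are invented; call_llm is a placeholder client.
TOOL_GROUPS = {
    "filesystem": ["search_files", "read_file"],
    "crm":        ["find_account", "list_opportunities"],
    "ide":        ["run_tests", "refactor_symbol"],
}

def route(user_prompt: str, call_llm) -> str:
    # Level 1: a short menu of groups instead of a thousand tools.
    group = call_llm(
        "Pick the tool group that best fits this request, one of "
        f"{sorted(TOOL_GROUPS)}.\nRequest: {user_prompt}"
    ).strip()
    # Level 2: only the handful of tools in the chosen group are exposed.
    tools = TOOL_GROUPS.get(group, [])
    return call_llm(user_prompt, tool_names=tools)
```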
14:56
This is the application that connects the
14:58
MCPs, the tools that it itself
15:00
has, the prompt that the user
15:03
has given, the system instructions, and
15:05
some of the chain
15:07
of thought the LLM can build. And
15:09
this is going to be a
15:11
very interesting balance. What do you
15:13
ask an LLM to build?
15:15
How much of this sequencing
15:17
steps will be on you in
15:20
your hands versus how much you're
15:22
going to delegate to the LLM
15:24
and ask the LLM to come
15:26
up with a sequence of steps.
15:28
And from what I've seen over
15:30
the last year, you're better off
15:32
delegating more to the LLMs because
15:35
they get better at it. So
15:37
the more you control the sequence
15:39
yourself, the more sort
15:41
of inflexible it becomes; you're better
15:43
off delegating to the LLM. But
15:45
you don't expect it to just
15:47
figure out from one prompt. Then
15:49
I can give you that example
15:52
that I gave in the big...
15:54
if you want, about the failure.
15:56
Yeah, go for it. So I
15:58
use AI, so I built with
16:00
AI, right? But I also use
16:02
AI as a developer. So I'm
16:04
on Cursor as my primary IDE
16:06
these days. I use the AI
16:09
slash run code that I mentioned.
16:11
I play around with other things
16:13
as they come up, like
16:15
Claude Code and things. But
16:17
I also record what I do.
16:19
Little snippets, five-to-ten-minute videos
16:21
for my engineering audience at EPAM.
16:23
for the guys to just look
16:26
what it is that I'm doing,
16:28
learn from how I do it,
16:30
try to think the same way,
16:32
try to replicate, get on board
16:34
with using AI. So I started
16:36
out to do a task. I
16:38
wanted to record, I wanted to,
16:40
on record, get a productivity increase
16:43
with a timer. My plan was I'm
16:45
going to estimate how long it
16:47
would take me, announce, let's say
16:49
two hours, do it with an
16:51
agent, and I always pause my
16:53
video when the agent is thinking,
16:55
because that's a boring step. But
16:58
the timer's going to get ticking.
17:00
And at the end, I'm going
17:02
to arrive at, let's say, an
17:04
hour, maybe 40 minutes out of
17:06
two, boom, that's the productivity gains.
17:08
And 30 minutes in, I completely
17:10
failed. I had to scrap everything
17:12
that the agent wrote for
17:15
me and started from scratch. And
17:17
my problem was, I over-prompted it.
17:19
I thought I knew what I
17:21
wanted the agent to do. There were
17:23
like three steps. Like,
17:25
copy this, write this, refactor
17:27
this, and you're done. And it
17:29
did it. It iterated for
17:32
10 minutes. It was CodeMe,
17:34
the agentic developer that we have.
17:36
When I scrapped it and started
17:38
doing it myself, I did half
17:40
of it, stopped, and realized that
17:42
the other half was not needed.
17:44
It was stupid of me to
17:46
ask. So the correct approach would
17:49
have been to iterate, do the
17:51
first half, stop, rethink, and then
17:53
decide what to do next. But
17:55
the agent was given the instruction
17:57
to go all the way. So
17:59
it went all the way. And
18:01
this is the other thing with...
18:03
a thousand instructions, right? You don't want
18:06
an agent to be asked to
18:08
do something that you think you
18:10
know, but you only really will
18:12
know as you iterate through. In
18:14
these cases as well, so like
18:16
I find your experience with the,
18:18
you know, balancing how you how
18:20
you prompt it, you know, how
18:23
far the agent goes, all of
18:25
this is intuition that you're
18:27
kind of learning. One of the
18:29
things that was interesting. We just
18:31
had Kyle, the COO of GitHub,
18:33
on, we were talking about agents
18:35
and coding assistants. One of his
18:38
thoughts was also around the orchestration
18:40
after you have generated some code,
18:42
right? It's one thing to create
18:44
a project, create something new. But
18:46
most of software development kind of
18:48
happens past that point, right? And
18:50
I'm curious, as someone who is
18:52
really trialing these tools day in
18:55
and day out, kind of as
18:57
your daily driver and utilizing these
18:59
things, I think that's on people's
19:01
mind is, oh, cool, like I
19:03
can go into this tool, generate,
19:05
you know, a new project that...
19:07
maybe whatever it is, you know,
19:09
you always see the demo of
19:12
creating a new video game or
19:14
whatever the thing is, right? But
19:16
ultimately, like, I have a code
19:18
base that is very massive, right?
19:20
I'm maintaining it over time. You
19:22
know, most of the work is
19:24
more on that operational side. So
19:26
in your experience with this set
19:29
of tooling, what has been your
19:31
learning, you know, any insights there,
19:33
any any thoughts on kind of
19:35
where that side of things is
19:37
heading, especially for you know, you're
19:39
dealing with, I'm sure, real world
19:41
use cases with your customers who
19:43
have large code bases, right? So.
19:46
Well, that's great. I'm so glad
19:48
that you asked, because what I
19:50
do is actually that latter aspect,
19:52
I have a mono repo of
19:54
like 20 different things in it
19:56
that could have been separate repos
19:58
of their own. So I have
20:00
a large code base that I
20:03
work with and I actually saw
20:05
our own developer agent occasionally choke
20:07
because it attempts to read too
20:09
much and it just chokes on
20:11
like tokens and limits and things
20:13
that it can do per minute or
20:15
per hour or something. So that's one
20:18
thing. But what I find myself doing
20:20
with Cursor, for example, I actually pinpoint
20:22
it very actively, very often, because I
20:25
want it to work with these files when
20:27
it's something specific. I'll just
20:29
point it at the files, and I'm going
20:31
to prompt it in the
20:33
context of these three or four files, and
20:36
that limits how much it's going to go
20:38
out. But really back to your question to
20:40
me it's not about code bases that much
20:42
I don't think it's going to be well
20:44
maybe if I do something greenfield and funny
20:46
it's going to write it I'm going to run
20:48
it and if it works it's all I need
20:50
like, it's correct, it works, great. Today,
20:53
and it's still a mental shift, it's
20:55
still early, I'm still looking and thinking
20:57
of the code base that I write
20:59
with my agents as code base that
21:01
will be supported by other people,
21:03
likely with the agents, but people
21:06
still. So correct by itself is
21:08
not good enough. I want it to
21:10
be aesthetically the same, I want it to
21:12
follow the same patterns, I want it to
21:14
make sense for my other developers who
21:16
will come in after me. I want
21:18
it to be as if it's the
21:20
code that I have written, or at least
21:23
more or less that I have written. And
21:25
that slows me down a little bit,
21:27
clearly, I'm sure. But the other thing
21:29
is, I am the bottleneck. An agent
21:32
will take minutes, small digit,
21:34
like single digit minutes, if not
21:36
less, to spit out whatever it
21:38
spits out. And oftentimes in
21:40
code bases, it's not a single
21:43
file. It's edits in multiple
21:45
places. Then I have to come in
21:47
and read it. Here's the difference.
21:49
When I write myself, my brain
21:51
has a timeline. I was thinking
21:54
as I was typing; as I was
21:56
typing, I was thinking. I know how
21:58
I arrived at what I have arrived
22:00
at. I may decide that it's bullshit,
22:02
you know, scrap it, start over; that
22:04
happens, we're all developers, but I know
22:06
how I arrived at where I am.
22:09
When I look at what the agent
22:11
produced for me, I have no idea
22:13
how it arrived at where I am.
22:16
I need to reverse engineer, like, why?
22:18
What did it do? It takes time.
22:20
I tried recording it. And I can't,
22:23
because I can't speak as I think
22:25
at the same time. The other thing
22:27
is, when I was doing that video
22:30
with a timer, I sort of, I
22:32
expected certain outcomes, but I also knew
22:34
that if it works, I'm going to
22:37
say this at the end, I'm going
22:39
to say, guys, look, it took me
22:41
20 minutes, let's say 30 minutes out
22:44
of an hour. So it's 2X, right?
22:46
Literally 2X productivity improvement. Amazing, isn't it?
22:48
But here's the thing. Within the 30
22:51
minutes that I've spent, the percentage of
22:53
time I spent critically thinking... was much
22:55
higher than normal. The percentage of time
22:57
I spent doing boilerplate is much lower,
23:00
because the agents did this. I really
23:02
critically thought about what to ask, how
23:04
to prompt and then analyzing what it
23:07
did, thinking what to do next, do
23:09
I edit it, do I re-prompt it? Can
23:11
I sustain the same higher percent of
23:14
critical thinking for the full day to
23:16
get 2X in the day? Probably
23:18
I can't. So what's probably going to
23:21
happen, I'm going to get to X.
23:23
But I'm going to use the time
23:25
in between, as agents work, to do
23:28
something else. My day will likely get
23:30
broken down into more smaller sections. My
23:32
overall daily productivity is likely to increase.
23:35
I'm likely to do more things in
23:37
parallel. Maybe I'll do some research, maybe
23:39
I'll answer more emails, right? But it's
23:41
going to be more chaotic. Also, likely
23:44
more taxing. I don't think we've learned
23:46
yet. I don't think we've had enough
23:48
experience yet. I don't think many people
23:51
talk about this yet. People talk about
23:53
the "look what I've
23:55
built with an agent!" part. I wonder how
23:58
they're going to talk about how they've
24:00
worked for like six months with agents
24:02
and how six months that they've done
24:05
with agents is better than six months
24:07
without and how they feel at the
24:09
end of the day. And think about
24:12
being in the zone, right? We all, I hope,
24:14
like, as engineers, like to be, like,
24:16
you know, disconnected: emails off, whatever, get the
24:19
music on, IDE in front of you,
24:21
you're in it for like two hours.
24:23
With agents, you just can't. You prompt
24:26
an agent, it goes off doing something.
24:28
What do you do? Do you like
24:30
pull up your phone? And then your
24:32
productivity increases one way, your screen time
24:35
increases the other way. It's not a
24:37
good idea. What can you do? Like
24:39
what do you do? In this minute
24:42
and a half or three of, and
24:44
you don't know how long, right? Well,
24:46
you can see the outcomes coming out,
24:49
but the agents are still spinning, like,
24:51
so I'm sorry, it's a long answer
24:53
to your question. Yeah. And that's what
24:56
I don't yet have answers for. Yeah,
24:58
but I really hope to eventually through
25:00
experiments and recording and thinking arrive at
25:03
least what it means for me because
25:05
I cannot even tell you what it
25:07
means for me yet. Yeah. Yeah, I
25:10
mean, I experienced this yesterday too because
25:12
I'm preparing various things for
25:14
investors, you know, updating some competitive
25:16
analysis and that sort of thing. And,
25:19
you know, I just, when you have
25:21
whatever it is, I think it was
25:23
116 companies, and I'm like, oh, I'm
25:26
going to update all of these things
25:28
for all of these companies. Like, you
25:30
know, obviously I'm going to use an
25:33
AI agent to do this; this is not
25:35
something I want to do manually, is
25:37
put in all of these things and
25:40
search websites. So, so I did that,
25:42
but to your point, it was like, I
25:44
could, I could figure out how to
25:47
do a piece of that and get
25:49
it running. And then I see it
25:51
running and I, you know, I realize
25:54
that this will take however long it
25:56
is, right? Minutes, or whatever the time
25:58
frame is. And then you context switch
26:01
out of that to something else, which
26:03
for me I think was email or
26:05
whatever. I'm like, oh, this is going
26:07
to run. I'm going to go answer
26:10
some emails or something like that, which
26:12
in one way was productive, but then
26:14
I had to context switch back. Yeah.
26:17
Like, oh, why did it output all
26:19
these things? Or, you know, it happened
26:21
to be that I wasn't watching the
26:24
output, right? And in one case, when
26:26
I ran it, I was like, oh,
26:28
well, I really should have had
26:31
it output this column or this field,
26:33
but I didn't think of that before.
26:35
And I wasn't looking because I turned
26:38
away from the agent back to my
26:40
email, right? So yeah, I think this
26:42
is a really interesting set of problems
26:45
that is more of like a new...
26:47
Yeah, it's a new way of working
26:49
that hasn't been parsed out yet, right?
26:51
And I tried not to do it.
26:54
Like I tried, but then you sit
26:56
idle. Like you literally sit idle. It's
26:58
like, and it doesn't feel good. It
27:01
feels like, oh my God, why am
27:03
I not doing anything? Yeah, it's an
27:05
interesting dynamic. That's that's for sure. And
27:08
I've definitely seen people that show, you
27:10
know, having multiple agents working on different
27:12
projects at the same time. And that,
27:15
when I see someone with two screens
27:17
and things like popping up all the
27:19
place, I, you know, there's no way
27:22
I could in my brain sort of
27:24
monitor all of that that's going on.
27:26
It must be very taxing first and
27:29
second, half of those merge requests, pull
27:31
requests from the agents will be, let's
27:33
say, subpar, and frustration will
27:36
rise too. Like, you would think, man,
27:38
I would have done it already myself.
27:40
Emotionally, it is
27:42
a very different way of working. Yes.
27:44
And I really, I can't, well, I keep
27:47
thinking, I can't forget, I advise people also
27:49
to think, not just think about productivity gains,
27:51
not just think about delegating to agents and enjoying
27:54
the results. Think about how it changes the
27:56
dynamic of your day and how you think
27:58
about it afterwards, right? Yeah, yeah, that's interesting.
28:00
So I know we're circling kind of way
28:03
back from the interesting discussion, but I do want
28:05
to make sure people can kind of find
28:07
some of what you're doing with with dial.
28:10
You mentioned kind of the open source piece
28:12
of this. What's sort of needed from the
28:14
user perspective to kind of spin this up
28:16
and start testing it? For
28:19
those out there that
28:21
are interested in, like, trying some things
28:23
with the project. What would you kind of
28:26
tell them as a as a starting point
28:28
and like what the process is like to
28:30
kind of get a system like this up
28:32
and running? I'm actually not sure I can
28:35
tell for DIAL specifically. Nobody is running
28:37
local dials. It's not something you run locally.
28:39
It's something that you run sort of centrally
28:42
in an organization, the size can be different, but
28:44
you expose it to your people through like
28:46
a URL that they all can go to
28:48
and, like, sort of use AI through
28:51
dial and do things through dial. Interesting. One
28:53
of the apps we built as an example
28:55
earlier, it was last year, was like talk
28:58
to your data. But if you look at
29:00
analytics platforms like the Snowflakes of the world, they all
29:02
have something like this today, like semantic layer,
29:04
which you work on, and then through semantic
29:07
layer, through prompting, and through some query conversions,
29:09
and connectors to data warehouses and data lakes,
29:11
you get yourself a chat with your data,
29:13
like analytical reports, graphs, tables. So we built
29:16
that. That was built into dial. So you
29:18
go to DIAL. And then, again, imagine
29:20
ChatGPT. Imagine a ChatGPT that allows you to
29:23
choose what model you talk to, right? Not
29:25
just OpenAI models, but all the
29:27
other models that exist, as well as applications.
29:29
So you go to this
29:32
ChatGPT, which in
29:34
our case is DIAL.
29:36
You select this Data
29:39
Heart AI, we call
29:41
it, which is our
29:43
talk-to-your-data app.
29:45
You start talking to
29:48
it. And this is
29:50
still your dial experience,
29:52
but you're really talking
29:55
to an app that
29:57
then talks to semantic
29:59
layer. Then it, you
30:01
know, builds queries based
30:04
on your questions, runs
30:06
them, gets data back,
30:08
visualizes it in dial,
30:11
because dial has all
30:13
these visualization capabilities. I
30:15
explained how it's not
30:17
just text coming back,
30:20
it builds your charts, and
30:22
you can interact with
30:24
it. But again, you
30:27
don't run dial locally.
30:29
If you want to
30:31
explore what it is,
30:33
I hope, I expect
30:36
that if you go
30:38
to, I think it's
30:40
rail-epam.com. EPAM-Rail.
30:43
EPAM-Rail.com. Thank you. And you're
30:45
going to read about what it
30:47
is, and you're going to find
30:49
all the links to hopefully documentation
30:51
how to, you know, but
30:54
also most companies who we
30:56
work with, they want more than
30:58
just, hey, how do we
31:00
install it? They want, and now
31:02
we want to build with
31:04
it. And that's
31:06
where we come in with
31:08
professional services, and we can
31:10
build them things for their
31:12
dial so that they can
31:15
do the AI that matters
31:17
to them in their context,
31:19
with their data, with their
31:21
workflows, with their restrictions on
31:23
things they can and cannot
31:25
do, and yadda, yadda, yadda.
31:28
Yeah. And I'm wondering for
31:30
this kind of, if you
31:32
think about this zoo of
31:34
underlying applications or assistants, I'm
31:36
wondering, because you've obviously been
31:38
working in this area for
31:41
some time, do you have
31:43
any insights or learning around
31:45
kind of easy wins for,
31:47
you know, underlying functions or
31:49
agents that can be tied
31:51
into this sort of orchestration
31:54
layer or maybe like more
31:56
challenging ones, things that you've
31:58
learned over time and in
32:00
development. and working with these things in terms of, you
32:02
know, things that you could highlight as,
32:04
you know, easy types of wins
32:06
and things that, I mean, you
32:08
mentioned the workflow stuff around some
32:10
of what isn't yet kind of
32:12
figured out, but more on the
32:15
orchestration layer and the function calling,
32:17
you know, what are some areas
32:19
of challenge or things that might
32:21
not be figured out yet that
32:23
are that you think are interesting
32:25
to explore in the future. Let me
32:27
think, because my first thought
32:29
was... so you're asking about
32:31
connecting tools and functions, yeah, to an
32:33
LLM, and which of the functions,
32:36
so, what type of connectivity
32:38
sort of is easier? And yeah.
32:40
Is there anything that's out of
32:42
scope or more of a challenge
32:44
currently or is it fair game
32:46
for kind of you know, I
32:48
guess it's whatever you can build
32:51
in that function or in the assistant,
32:53
but yeah, what limitations are there
32:55
challenges in that kind of mode
32:57
of development of developing these underlying
32:59
functions or tools? I see.
33:01
So it's kind of a
33:04
twofold answer. If you take
33:06
the technicality aspect, like how
33:08
do I build a tool
33:10
that does X, the complexity
33:12
is really in X. Like if you
33:15
want to go and query a
33:17
database, how hard is that? Well.
33:19
Not hard, right? I mean, connectivity
33:21
to the database, if you have
33:23
a query, you run it, you
33:25
get results back. So it's not
33:27
hard to do the technicality of
33:29
querying a database. Making it useful
33:31
and making the result useful
33:33
in the context of the user's prompt
33:35
and conversation is a lot
33:37
more challenging. I had this, so I'm
33:40
running a service. It
33:42
actually has a public web
33:44
page called api.epam.com. It's our own,
33:46
so you will not really go past the
33:48
front page, but you'll understand what it is.
33:51
It's a collection of APIs that we built,
33:53
my team has built, that exposes a lot
33:55
of data. Remember I said EPAM runs on
33:57
internal software. So all of those applications, they
34:00
stream their data and their events
34:02
out into a global data hub.
34:04
Think big, big, big Kafka cluster.
34:06
But that's Kafka, so you can
34:08
read data out of it as
34:11
a Kafka consumer. But if you
34:13
want to have like more modern,
34:15
you know, API, search, look-up, this,
34:17
that, so we have an API
34:20
service over all the data.
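Reading from a hub like that is ordinary Kafka consumption. A minimal sketch with the confluent-kafka client; the broker address, group id, and topic name are all made up:

```python
# Minimal Kafka consumer against a (hypothetical) internal data hub.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "datahub.internal.example:9092",  # made-up broker
    "group.id": "my-experiment",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["project-events"])  # made-up topic name

try:
    while True:
        msg = consumer.poll(timeout=1.0)   # wait up to 1s for the next event
        if msg is None or msg.error():
            continue
        print(msg.value().decode("utf-8"))  # each event is a raw payload
finally:
    consumer.close()
```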
34:22
And somebody came to me today and said,
34:24
hey, have you heard of MCP?
34:26
I'm like, yes, of course, I
34:28
have. Why don't you guys build
34:31
MCP for api.epam.com? My answer is,
34:33
it is easy to build. Api.epam.com
34:35
speaks RSQL. I can build a
34:37
server that will take your query,
34:39
create RSQL, an LLM will be able
34:42
to do that easily, run it, and
34:44
give back the data.
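The "easy" bridge described here, where the model turns a question into an RSQL filter that gets forwarded to the API, fits in a few lines. A sketch under stated assumptions: call_llm is a placeholder, and the endpoint and field names are invented (api.epam.com itself is not publicly queryable):

```python
# Sketch: translate a natural-language question into an RSQL filter and
# forward it to a REST API. Endpoint, fields, and call_llm are made up.
import requests

def ask_dataset(question: str, call_llm) -> list[dict]:
    rsql = call_llm(
        "Translate this question into an RSQL filter over the fields "
        "(name, country, status). Return only the filter expression.\n"
        f"Question: {question}"
    )  # e.g. 'country==Spain;status==active'
    resp = requests.get(
        "https://api.example.com/v1/people",  # hypothetical endpoint
        params={"query": rsql},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```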
34:46
But I said it's not going to be
34:48
useful, because this is single data
34:50
set APIs. Your questions are likely
34:53
analytical. You likely want to ask
34:55
something that expects me to do
34:57
a summary by month, this and this and this,
34:59
and give you like a... Which,
35:01
like, that's a very different question.
35:04
So you asked me about MCP
35:06
to an API, easy to do.
35:08
Make it useful for your actual
35:10
use case, much harder to do.
35:12
I likely need to do a
35:15
lot more than just connectivity of
35:17
tool to an LLM. I need
35:19
to understand what you're asking, figure
35:21
out the orchestration that is required,
35:23
maybe custom apps, maybe something else.
35:26
And then you start hitting authentication,
35:28
legacy apps, all the other roadblocks.
35:30
And in a way, the talk
35:32
to your data is an amazing
35:34
prototype that we built. And I
35:37
have a video about this, but
35:39
we sort of stopped because we
35:41
clearly sensed how steep the curve
35:43
is to get it to like
35:45
actual, because what we wanted to
35:48
do, what we envisioned we could
35:50
do, was analytics democratized. So you
35:52
don't have to go to the
35:54
analytical team, ask them to build
35:56
you a new PowerBI report. And
35:59
them spending a week doing so,
36:01
you can just come into dial
36:03
and say, hey, you know, show
36:07
me this, this, and this. And
36:07
yes, we technically can do it.
36:10
But to be able to do
36:12
this for all kinds of questions,
36:14
you can ask about our data,
36:16
that's a much harder thing to
36:18
do. So yeah. And it also,
36:21
yeah, to your point, underlying systems
36:23
might have limitations. I think in
36:25
analytics related use cases that we've
36:27
encountered with our customers, you know,
36:29
often I'll just ask the question
36:32
around: hey, if you gave this
36:34
database schema or whatever it is
36:36
to, you know, a reasonably educated,
36:38
you know, college intern or whatever
36:40
that is, and you ask, you
36:43
know, what columns would be relevant
36:45
to query based on this, you
36:47
know, based on this natural language
36:49
query, you know, you can pretty
36:51
easily tease out. Well, I look
36:54
at all these columns, I have
36:56
field_157 and custom_new_field,
36:58
you know, there's no
37:00
way for just someone off the street,
37:03
you know, to know anything about
37:05
that and so it's not really
37:07
a limitation of what's possible in
37:09
terms of the technicality like you
37:11
said it's more of you know
37:14
you're not always set up for
37:16
success in terms of utility like
37:18
you mentioned. And for data that's
37:20
where semantic layer comes in. So
37:22
if you have descriptions of your
37:25
columns, of your tables, with business
37:27
meaning, then connecting that semantic layer
37:29
with some data samples to an LLM
37:31
will allow it to write the
37:33
query that you thought was impossible
37:36
to write, because it is impossible
37:38
without it. The semantic layer sort of
37:40
explains the data that you
37:42
have in business terms, in the
37:44
language that the questions will be
37:47
asked of your assistant. And that's
37:49
what allows us to do this
37:51
talk-to-your-data analytics.
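In code, the trick is simply to pair the opaque column names with business descriptions before asking for a query. A sketch reusing the made-up columns from the example above, with call_llm again standing in for a model client:

```python
# A tiny semantic layer: business descriptions for opaque columns,
# prepended to the prompt so the model can write meaningful SQL.
SEMANTIC_LAYER = {
    "field_157":        "Net revenue per engagement, in USD",
    "custom_new_field": "Client industry vertical",
    "month":            "Reporting month, first day of the month",
}

def question_to_sql(question: str, call_llm) -> str:
    described = "\n".join(f"- {col}: {desc}"
                          for col, desc in SEMANTIC_LAYER.items())
    return call_llm(
        "Table `engagements` has these columns:\n"
        f"{described}\n"
        f"Write a SQL query that answers: {question}"
    )
```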
37:53
Yeah. Well, I know that we've talked
37:55
about a lot of things. I
37:58
think you are probably seeing a
38:00
good number of use cases across
38:02
your clients at EPAM and also
38:04
your own experiments with dial and
38:06
other things. I'm wondering as you
38:09
as you kind of lay in
38:11
bed at night or whenever you're
38:13
thinking about the future of AI
38:15
or maybe it's all the time
38:17
or maybe it's not at night,
38:20
but yeah, as you kind of
38:22
see what is, to your point,
38:24
just bringing it all the way
38:26
back to the beginning, you see
38:28
what is possible to do now,
38:31
which even six months, a year
38:33
ago, whatever it was, you know,
38:35
was not possible. What kind of
38:37
is most exciting for you, or
38:39
most interesting for you to see
38:42
how it plays out in the
38:44
next, you know, six to 12
38:46
months. What is kind of constantly
38:48
on your mind of where things
38:50
are going? Sounds like, you know,
38:53
the how we work with these
38:55
tools is one of those things.
38:57
We already talked about that a
38:59
little bit, but what else is,
39:01
you know, exciting for you or
39:04
encouraging in terms of how you
39:06
see these things developing? My answer
39:08
may surprise you. I don't, you know, think of
39:10
or anticipate any new greatness to
39:12
come. I actually mostly worry. And
39:15
I worry because I know that
39:17
my thinking is linear. Like most
39:19
of us, even though looking back
39:21
we know that technology has been
39:23
evolving rather exponentially, our ability to
39:26
project into the future and think
39:28
what's coming next is linear. So
39:30
I am unlikely to properly anticipate
39:32
and get ready for and then
39:34
expect right and wait for what's
39:37
to come. I am sure to
39:39
be surprised and I guess as
39:41
everybody else I'll be doing my
39:43
best to hold on to not
39:46
fall off. So I worry seeing
39:48
how the entry barriers rise. It's
39:50
harder for more junior people to
39:52
get in today. When I'm asked
39:54
about skills I recommend that people
39:57
focus on as far as trying
39:59
to be better prepared for the
40:01
future, I always answered with the
40:03
same things. I always say fundamentals
40:05
and then critical system thinking. And
40:08
fundamentals you can read about a
40:10
lot, but you really master them
40:12
when you work with them yourself.
40:14
Not when someone else works with
40:16
them for you. And not having
40:19
them is likely going to constrain
40:21
you from being able to properly
40:23
curate and orchestrate all these powerful
40:25
AI agents. And when they get
40:27
so powerful they don't need you
40:30
to curate and orchestrate them, then
40:32
what does it do to you
40:34
as an engineer? And maybe that's
40:36
not the right thinking, but this
40:38
is what I think about at
40:41
night like you asked when I
40:43
think about AI and what's coming.
40:45
I am excited as an engineer.
40:47
I like using all of this.
40:49
I just don't know how it's
40:52
going to reshape the industry and
40:54
how it's going to change my
40:56
work, you know, in years to
40:58
come. Yeah, well, I think it's
41:00
something even in talking through with
41:03
you kind of some of the
41:05
work that you and I have
41:07
been doing with agents and how
41:09
that really has triggered a lot
41:11
of questions in our own mind
41:14
of what is the proper way
41:16
of working around this and I
41:18
think there is going to be
41:20
a, you know, widespread
41:22
issue that people are
41:25
going to have to navigate. So,
41:27
yeah, I think
41:29
it's very valid, and we
41:31
will be interested to see how
41:33
it develops and would love to
41:36
have you back on the show
41:38
to have your learnings again in
41:40
six or 12 months of how
41:42
it's shaking out for you. Really
41:44
appreciate you joining. It's been a
41:47
great conversation. Thank you very much.
41:49
It's been a pleasure. All right,
41:51
that is our show for this
41:53
week. If you haven't checked
41:55
out our changelog
42:06
newsletter, head to
42:09
changelog.com slash news.
42:11
There you'll find
42:13
29 reasons, yes,
42:15
29 reasons why
42:17
you should subscribe. I'll
42:20
tell you reason number 17, you
42:22
might actually start looking forward
42:24
to Mondays. Sounds like somebody's got
42:26
a case of the Mondays. 28
42:28
more reasons are waiting for you
42:31
at changelog.com/news. Thanks again to our
42:33
partners at fly.io, to Breakmaster Cylinder for
42:35
the beats and to you
42:37
for listening. That is all for
42:39
now, but we'll talk to
42:41
you again next time.