Episode Transcript
Transcripts are displayed as originally observed. Some content, including advertisements may have changed.
Use Ctrl + F to search
0:06
Krishna Gade: Welcome and thank you everyone for joining us today's AI Explained.
0:11
Today's topic is AI Security and Observability for Agentic Workflows.
0:17
Everyone touts that this year is going to be the year of AI agents.
0:23
Let's see how we need to address these issues.
0:27
I am your host today. I'm Krishna Gade.
0:30
I'm one of the Founders and CEO of Fiddler AI.
0:34
Again, please put your questions in your Q&A box at any time
0:37
during the fireside chat. Today's session will also be recorded and sent to all the
0:41
attendees after the session. Okay.
0:44
So with, without further ado, um, I want to welcome Karthik Bharti, um,
0:50
General Manager for AI Ops and Governance for Amazon SageMaker AI at AWS.
0:57
Karthik, um, if you could turn on your camera, um,
1:02
Karthik Bharathy: Hey Krishna Krishna Gade: to a thank you.
1:04
Welcome to AI Explain. So it's just a brief bio of Karthik.
1:07
Karthik is a leader with over 20 years of experience driving
1:10
innovation in AI and ML. Um, as a General Manager for AI Ops and Governance for SageMaker AI, Karthik leads
1:16
the development of cutting edge generative AI capabilities in Amazon SageMaker AI.
1:23
Karthik, uh, you know, thank you so much for joining us.
1:25
Uh, maybe let's start you know, with your background.
1:28
You know, how has your role in AI Ops and Governance at AWS shaped your
1:33
perspective on you know, monitoring and securing AI workflows in the enterprises.
1:39
Karthik Bharathy: Yeah, that's a great question, Krishna. And, um, if I think about how AI Ops and governance has evolved over the years,
1:49
um, in fact, a lot of the changes has been in tandem with innovations that we've
1:56
seen in AI ML over the last few years, you know, starting with traditional ML
2:02
systems to more recently GenAI and Uh, agentic workflows as as you aptly put,
2:09
um, and throughout these years from what I've seen, um, there are three things
2:14
that that stand out really one is, um, security and governance are built
2:21
into ML workflows from the ground up.
2:24
It's, it's not an afterthought anymore.
2:27
Um, what essentially that means is, uh, enterprises are thinking about,
2:32
uh, robust data governance techniques, access controls, and, and how they, uh,
2:38
incorporate audit trails from day one.
2:42
Um, and, and effective security isn't about just, uh, you
2:46
know, protecting your models. It's about creating a comprehensive system that includes, uh, uh, looking at your,
2:53
uh, monitoring in an automated manner.
2:55
Uh, doing version control and, and also having audit trails.
3:00
The, um, second one I'll call out is, uh, the need for 窶各nd-to-end observability,
3:08
um, and this is across both the data and the ML workflows, um, right from,
3:14
you know, how data is ingested, how you can have lineage starting from all the
3:18
way to data to ML, um, and all the way to, uh, observability during, um, model
3:24
deployment to look for drift and so on.
3:27
Um, and, and finally, I would call out the third thing is, uh, while all these
3:32
sophisticated tooling is in place, um, you want to have the necessary, um,
3:38
human element, um, to sort of oversee the process, uh, while it's automated,
3:43
there are critical junctures where human, uh, oversight is needed, and that
3:48
helps in the decision making process. Krishna Gade: Awesome.
3:52
So, uh, you know, being at, you know, at the helm of SageMaker, you're probably
3:56
seeing the current state of AI in the enterprise, you know, it's adoption.
4:00
Um, you know, how would you describe it? You know, did you shed some light for our audience?
4:05
Karthik Bharathy: Yeah, yeah. Um, I, I think if you look at it again, the last four or five years, right, the,
4:11
the enterprise landscape is evolving.
4:14
Uh. pretty rapidly, right? And you can notice, um, several distinct patterns, right?
4:20
Um, and for what's worth, like, we are in the third year
4:24
of generative AI was, right? Uh, I think the first year was more around, hey, there's this cool thing,
4:29
like, what can GenAI do, right?
4:31
But last year, Based on customer conversations, we saw that, um, customer
4:37
conversation moving from, "Hey, what is GenAI" to, "Hey, is this right for me?"
4:43
And how can I adapt this, um, into having a real impact for my business?
4:49
Um, and this year, um, we are hearing customers want
4:53
to go big with generative AI.
4:55
You know, both in terms of going wide and going deep and, you know,
4:59
deploying these systems at scale. and also leverage the promise of agentic AI that can create
5:06
tangible business value, right? And as, as we see more of these systems being developed, like AI systems,
5:16
we, there is a need to integrate these different AI systems so you can
5:23
orchestrate more complex workflows and while at the same time you want to keep in
5:29
mind aspects of security and reliability.
5:32
So that's definitely one trend and the other one I would call out is as
5:37
you bring in these systems and want to make complex decision making, you
5:40
want to do so in an automated manner.
5:44
While keeping in mind, hey, there is transparency and accountability, right?
5:48
So there is increasing, uh, increasingly customers are looking for ways
5:54
to have human oversight and they want to scale their AI operations.
5:58
That's right. Yeah, especially in the. Regulated industry, which we play in, uh, there is some, um, cautious approach
6:05
behind, you know, you know, with respect to the usage of generative AI or AI
6:09
agents, like the whole human in the loop. Krishna Gade: Um, so I guess that's that begs the question, right?
6:14
What potential are you seeing for these agentic AI systems?
6:17
You know, how are they going to transform the business operations?
6:20
Any real life examples that would be amazing.
6:23
Karthik Bharathy: Yeah, yeah, I think there are quite a few, right?
6:26
And, um, let me first break it down into sort of the different patterns we see,
6:31
um, based on the customer conversations in AWS, and then sort of look at, um,
6:37
examples for each of those, right? Um, so with agentic AI, um, And the business value that provides it fall
6:46
largely in three different categories.
6:49
Um, the first one, um, would be using agentic AI to accelerate,
6:56
um, workplace productivity, right?
6:58
So think of these as, um, day to day repetitive tasks that employees
7:04
are doing, and they want to automate this and, and gain the advantage,
7:09
uh, of using such an agentic system.
7:12
Right. A good example is NFL media.
7:16
They use business agents today to help their producers and editors to
7:21
accelerate their content production. They have a research tool that allows them to gather insights from
7:29
video footage from a specific place.
7:33
And, um, what essentially that provides is, uh, when, when you're
7:38
onboarding a new hire, it reduces the training time, um, by up to 67%.
7:43
And, um, um, and when employees, uh, their employees ask questions, um, about what,
7:49
what's going on, that can be surfaced in less than 10 minutes, uh, of what
7:54
used to take, um, close to 24 hours.
7:57
So that's one such example.
7:59
And, uh, more closer to the to the software world.
8:02
We're all familiar with coding assistance.
8:06
And many of you may have already used coding assistance
8:08
in one shape or the other. Um, and and largely, well, they help with, um, building better code or providing.
8:17
Um, documentation or explaining, um, existing code, it's, it's not just
8:21
about the code itself, but it's, it's more about automating the entire
8:25
software development lifecycle, um, including, you know, upgrading software.
8:30
Um, or, um, modernizing a legacy application for
8:34
Krishna Gade: Migrating to new languages. Karthik Bharathy: Absolutely.
8:36
Absolutely. So case in point, um, within Amazon, we had, um, these agents for
8:43
transforming our code base from an older version of Java to a newer version.
8:48
And, uh, there was savings of, you know, a mammoth, like 4, 4,500
8:53
developer years worth of effort, right? Roughly translate to, um, you know, 260 million annual CapEx savings.
9:01
Um, so that's that's the first trend I would think in terms of using it
9:06
to accelerate workplace productivity. The second one would be in transforming business workflows
9:14
and uncovering new insights, right?
9:17
What I mean by that is, uh, as enterprises are adopting agents, they
9:22
want to streamline their operations and gain insights on their data, right?
9:27
Um, and the example that comes to mind is, is.
9:31
Of cognizant. They're using business agents to automate mortgage compliance workflows, and they've
9:38
seen improvements of more than 50 percent that reduces, um, uh, errors and rework.
9:44
Um, similarly. Um, Moody's is another great example.
9:48
They've used a multi agentic system, um, that looks at
9:53
generating credit risk reports. Um, and again, the benefit is, uh, what used to, uh, take humans about one
10:02
week to generate a specific report is now cut down to just one hour, right?
10:06
So that's the magnitude of, um, impact that, that customers are seeing.
10:11
Finally, the third one I would call out is more in the research area
10:15
that's sort of, uh, fueling, you know, industry transformation and innovation.
10:20
Um, a good example there is from Genentech.
10:24
Um, they've deployed a genetic solution running on AWS, um,
10:29
and, and they're improving their, uh, drug research process.
10:32
So what they've done is, uh, Their solution roughly automates, um, you
10:38
know, huge, about five years worth of research, right, across different
10:42
therapeutic areas, uh, right, and then it, what it does, it helps them speed
10:46
up the, uh, drug target identification and also improve their research
10:52
efficiency, um, ultimately leading to, you know, faster drug development.
10:55
So, um, net net, we're seeing systems, agentic systems deployed
11:00
broadly in these three categories. Krishna Gade: Absolutely.
11:03
So it's like workplace productivity, business transformations, and
11:06
then, you know, new, you know, new, new product innovations.
11:09
Um, so one thing that you mentioned and business transformations, you
11:13
know, you mentioned a few examples, especially like generating credit
11:17
reports and claims processing, right? These are, you know, high stakes AI use cases.
11:21
So there is a need for, you know, security, transparency into
11:25
how, you know, AI is working. You know, what are some of the challenges that we're trying to address.
11:29
Yeah, you think our organizations are facing when they're implementing
11:33
agentic workflows for these, you know, for these use cases or in
11:35
general, other use cases too? Karthik Bharathy: Yeah, yeah, I think that's a, that's a great call out, right?
11:41
So, um, while you're looking at these, uh, systems and I think
11:45
they're definitely Um, you know, security and visibility challenges
11:50
that organization need to look into.
11:53
Um, I'll, I'll call out a few that, that we have seen and, uh, by no means
11:57
this is comprehensive, but it, it sort of comes down to, um, the stage of
12:02
the ML workflow, if you will, right?
12:04
And, uh, If you think about it at the very beginning, when you're trying to
12:08
use a specific model, um, it's, it's quite possible that the data that's
12:14
you being used, either to train a model or, you know, fine tune model, use rag,
12:19
whatever technique that you use, um, to use a data that's, that's not authentic.
12:23
And this might just compromise the performance of the model.
12:26
That's definitely. Um, you know, it's concerning at the same time, harder to detect, but
12:32
until the model is being used and the interactions that are going on.
12:35
So that's one category. Um, the second would be when, um, you know, the model is being used,
12:43
and again, it depends on the model. And in the case of a proprietary model, where the model weights are not exposed.
12:50
Um, it might be an actor trying to attempt to reverse engineer saying, what is the
12:55
specific, um, uh, weights that were used at what level and so on and so forth.
12:59
And that essentially, um, um, you know, exposes.
13:05
The how of the model, if you will, um, and the third one, I would think
13:09
is when the model is actually being used, um, and there can, you know,
13:14
actors can attempt to, uh, extract information, which otherwise the model
13:19
would not emit, uh, it might be sensitive information about the training data,
13:23
it might be information that may not be what the model is intended for, or
13:27
the use case that's being deployed. Um, so.
13:30
Um, net net, I think, um, uh, organizations would need to protect
13:35
against these model weights.
13:37
Um, you know, how necessary controls around, um, uh, access,
13:42
um, you know, ensure that there's data privacy and so on.
13:45
And more importantly, uh, ensure that there's this
13:48
observability that's 窶各nd-to-end. So you can, you are having necessary checks to see how the model is performing.
13:54
Um, and more often than not, you probably have a sandbox environment
13:58
where you're testing it, have tooling, you know, there are a few tools like
14:02
Bedrock Guardrails is an excellent tool. So you sort of incorporate that, you know, Fiddler has an observability tool as well.
14:08
So these provide sufficient insights into what is going on in the system, being
14:13
it agentic or an automated workflow, and you sort of take actions based on
14:17
Krishna Gade: Absolutely. So I think you touched upon a few things like, you know, uh,
14:21
adversarial attacks on models. And now there's this whole, um, field of AI security and model security coming up.
14:28
Um, you know, I remember like, I think a few, few weeks ago when DeepSeek
14:31
launched, everyone was producing benchmarks about how accurate it is
14:35
or how close it is to accuracy when it comes to closed source models.
14:39
But it was pretty vulnerable for security attacks.
14:41
People were able to easily, you know, make it, uh, leak PII content
14:45
and whatnot in RAG workflows. So how, how, how do you think about, you know, uh, what, what are some of
14:51
the, uh, you know, best practices that organizations should think about AI
14:55
security and what, what, what are some of the, you know, how do you think about
14:59
that versus application level security in general that has been around for a while?
15:04
Karthik Bharathy: Yeah, I think, um, at the end of the day, you need a
15:09
comprehensive security approach, right?
15:12
You want to operate at the different levels.
15:15
Um, you mentioned about model level security, right?
15:19
So let's start from there. Um, so when you're thinking about the model, um, like I mentioned, you want
15:25
to protect the model weights, right? Um, and in addition to model weights, you want to protect
15:32
the, um, access to the data.
15:34
Um, you know, ensuring that the data is, is, is authentic and so on.
15:39
Um, and to address these, you would, you would use, like, you would encrypt where
15:43
the model is being stored, the actual file um, or, uh, to your point on adversarial
15:48
examples, you would, you would have a test environment where you would exercise
15:52
the model, monitor its output for, for some of these adversarial examples.
15:57
And um, at the end of the day, you need continuous monitoring, right?
16:01
Um, just to look at the input and output patterns, but also look for
16:04
drifts, drifts in the model, drifts in the data, and have necessarily, uh,
16:09
necessary, um, uh, alerts, so you can trigger, like, a retraining, for example.
16:14
So that's at the model level. Um, at, at the application level, I think, um, there, there are the well
16:21
known security practices, like, you know, you enforce access controls, you have
16:25
encryption in place, um, You have logging off the interaction patterns and so on.
16:31
Um, but in addition to that, tooling is often needed.
16:34
Uh, like I mentioned, the Bedrock example, uh, Bedrock Guardrails example earlier.
16:39
You, you want to think about how you audit certain topics, um, be it at
16:44
an input level or the output level. What is relevant to your use case?
16:48
What should not be emitted? Or if there's certain information that's being emitted like a
16:54
PII data, how do you redact the information and so on and so forth.
16:57
So I think net net, the two layers of model security and application level
17:03
security need to integrate seamlessly.
17:06
So in many ways, uh, uh, these are complementary than treating
17:11
it as separate constants. Krishna Gade: Awesome.
17:14
That's great. So I guess, uh, you know, we talked about, uh, a little bit about, uh,
17:19
some high stakes use cases, right? So when it comes to, uh, you know, transparency of AI decisions for these,
17:25
you know, for regulators or business stakeholders, how do you think, you
17:28
know, this is going to change when, you know, agents come about and, um,
17:33
you know, and organizations employ agentic workflows and what happens
17:37
to the transparency behind AI?
17:42
Karthik Bharathy: Yeah, I think fundamentally, um, enterprise would
17:48
benefit from having a governance model.
17:53
That's, that's more federated, right?
17:55
Meaning you have standards, policies in place, that sort of dictate
18:02
how these systems need to be developed across the organization.
18:08
But at the same time, you want to provide enough flexibility, uh, where
18:14
each team or business unit can adapt these standards in a way that they can
18:20
implement for their specific use cases.
18:22
So that's sort of the trade off. And it's, it's a good one, uh, in the sense that you want
18:28
to provide the flexibility. Um, of developing these different systems across different units.
18:33
Um, and there are, again, tools, like, for example, uh, just purely taking
18:38
the example of SageMaker here, you have SageMaker projects where you can
18:42
automate, um, the ML workflow, say, how should it be standardized, what
18:46
pipelines do you need to use, what models, and what quality, and so on.
18:51
Krishna Gade: So the governance is like both a tools problem as
18:54
well as a people's problem, right? Like, you know, essentially, you know, many companies do not have,
18:58
The governance structures today, you know, to sort of ensure that
19:01
you know, AI is tested, monitored and secured and securely operated.
19:05
You know, what, what are some of the best practices that you have seen
19:08
in terms of, you know, customers employing AI governance today across,
19:12
you know, different business units? Karthik Bharathy: I think fundamentally, um, at the highest level of abstraction,
19:19
you have, um, you know, business stakeholders, like the so called risk
19:23
officers, if you will, who understand the domain of what is being developed,
19:29
and they would enforce certain standards on what needs to be Um, at her too.
19:36
And it's important that they work in tandem with the technical team
19:39
who are well versed with what's being done with the model, right?
19:43
For example, a model may have a toxic score of like 0.1.
19:48
But what does that mean is from a use case perspective, whether this
19:53
model can be approved and deployed an organization is very specific
19:57
to the domain they're operating. Um, I think successful organizations have a good mix of both where, um, you have the
20:07
necessary tooling, um, where these, uh, different levels, for example, toxicity
20:12
is being, uh, uh, monitored for, they're being documented, uh, either through a
20:17
model card or you have enough properties, maybe in a model registry, for example.
20:22
And this translates into visibility from the risk officer who can effectively
20:28
say whether this model or the system is approved for deployment or not.
20:32
So the two systems working together, I think, definitely is a recipe for success.
20:37
Krishna Gade: Got it. So are there any specific metrics that you recommend that organizations need to
20:42
track, like whether it's about security or, you know, governance of AI, you
20:47
know, when they're testing it or, you know, when deploying to production?
20:51
Karthik Bharathy: Yeah, so if you look at the metrics again at the technical
20:57
level, you have a set of metrics right at the most foundation level.
21:02
Um, you know, if you have to document it, document what the
21:05
model is doing as a model card. You would look at, um, the purpose of the model, what data it's trained on,
21:12
uh, what is the validation rules, what is the quality of the model, and so on.
21:16
Um, and going a little bit beyond that, um, you may want to document
21:21
how the model is, um, uh, emitting or predicting a response, right?
21:26
So, for example, with, with, with, you may want to look at explainability.
21:31
Approaches like you may look at a SHAP score, for example, or you may look at
21:34
a LIME score, for example, and these may be documented with the model that
21:38
those are good metrics to look at. And again, with GenAI, you can look at additional metrics around
21:44
toxicity, fairness, and so on. You can test these models.
21:49
You can have periodic evaluations on the level.
21:52
of these metrics and test it against, um, standardized data
21:56
sets that are available today. Or you can use custom data sets that are very specific to your, um, use case.
22:03
Um, and then again, at the business level, you want to interpret these
22:06
as saying, uh, with a combination of these, uh, objective metrics.
22:12
How does the subjective standards and policies play in and what does
22:16
that mean from a risk perspective? Krishna Gade: So there is always this tension, uh, within the
22:22
organizations to adopt AI faster.
22:24
Versus doing it right, right? So there's this like, you know, how do you make sure you do it properly
22:29
so that you don't get into trouble? Like what, how should like, you know, organizations think about
22:35
like, you know, this balance? Karthik Bharathy: Yeah, I think um, that, that's a key one, right?
22:41
I think there's no one easy answer, if you will, right?
22:43
And the, and the key to balance, uh, robustness in, in having those
22:48
security controls with the operational, uh, efficiency lies in having some,
22:54
having the right guardrails, right? Instead of, uh, creating or looking at the problem as saying, "Hey, here's
22:59
one way to do it," or one set of, "Hey, this is risky versus non-risky."
23:04
You're probably looking at, uh, a, uh, a set of, um.
23:09
A range of values, if you will, right in terms of how to look at risk.
23:13
Um, a good example would be, uh, let's say you have the model or the system deployed.
23:18
Um, and you notice that certain changes introduce a higher risk.
23:23
Um, it's better to trigger additional approval workflows, um, rather than,
23:30
um, you know, just waiting on it and saying, here's a single way to do it.
23:34
In contrast, if the same set of changes result in a relatively lower risk,
23:40
Um, you may want to proceed through standardized approvals instead of, you
23:44
know, requiring additional approvals. Um, a good example again would be, let's say, there's a drift in the model,
23:50
right, which is fairly common and you have an observability solution in place.
23:55
If the drift is, um, not significant from the current state of the model,
24:00
you may be okay with treating that as an alert and being in the know how of
24:05
what is being happened and you may just trigger a retraining of the workflow.
24:09
But on the other hand, if the drift is significant and it exceeds what
24:13
is the threshold that you've defined, um, you may trigger additional
24:17
approvals or in, in, in some extreme cases, you might even consider
24:21
rolling back to the previous version.
24:24
Uh, so those are, uh, different options that you can consider and, and, and the
24:28
key is to maintain that configurable. So you can trade off between the, the rigor and the robustness of security
24:34
control with the efficiency that it Krishna Gade: Right.
24:37
So when it comes to evaluation of AI, right? So in the past for classical machine learning, you could do things
24:42
like ROC curve, AUC scores, you know, precision recall, and maybe
24:46
even do like SHAP, SHAP plots and understand the feature importance.
24:50
But now with in a generative AI and agentic workflows, evaluating the
24:54
performance is not straightforward, right? You know, there's no ground truth.
24:58
So how do you, you know, can you shed some light on like, you know,
25:01
how, uh, customers are going about this, you know, in, in, in sort
25:05
of in the, in, in, in sectors that you have been exposed to so far and
25:08
what are some of the best practices? Karthik Bharathy: I think the ones that I've seen are the areas where
25:15
customers are exploring are, um, evaluating the system 窶各nd-to-end, right?
25:21
There's no one unique metric like going back to the example
25:24
that mentioned earlier. Um, concretely, you can think off having a pipeline, um, that that triggers, um,
25:33
on either manually or on a periodic way.
25:36
And that evaluates the model on certain dimensions, right?
25:40
Um, and, and evaluation is sort of a broad topic.
25:43
But, um, if, if there are certain aspects of the model that you want,
25:48
let, let's say, be it fairness. Or, um, uh, robust, um, toxicity, for example, like you can look at
25:55
evaluating, for example, a model against a toxigen model and seeing, hey, if
26:00
these inputs were send to the model.
26:02
What is the output? And once you know the expected output and the actual output, you
26:07
can actually see the difference. Okay, the model is working on expected lines.
26:12
Therefore, this is the score that you want to assign for
26:14
that particular category, right? So developing that comprehensive pipelining workflow and making sure you
26:21
have observability and each of the places and saying as a system, you do it first
26:26
at the model level, and then you do it at the system level when there are multiple
26:29
models interacting with each other. And then saying, given the behavior of the system, what what is the sort
26:36
of the score that you want to, in some cases, you know, you can be creative
26:41
and creating a composite score. It purely depends on how how much weight that you assign to each of
26:47
these individuals score to create this composite score and how you gauge that
26:51
composite score with respect to use case. Krishna Gade: Especially for agentic workflows when in some cases when they
26:58
are automating the decision process right in the enterprise space, there
27:02
is a need to measure like whether the decisions are optimal or not.
27:05
You know, uh, it's a pretty hard problem.
27:08
Uh, any, any, any thoughts on that?
27:10
Like, you know, for example, you know, this.
27:13
So take, take the example that you mentioned, like the claims
27:15
processing workflow, right? Like which, which was probably like much more manual in the past.
27:20
Now it's like, you know, automated.
27:22
How, how can, how can, you know, customers measure like, you know, if it's
27:26
working properly and if it's actually working optimally for the business?
27:30
Karthik Bharathy: Yeah, while while you can have, um, you know, objective
27:35
metrics at the end of the day, it's the business use case, right?
27:38
And I think, um, it would involve human, um, in, um, human processes or seeing
27:46
the sort of outputs from the system.
27:49
Um, and I think that the key is to have the necessary hooks in place, right?
27:56
For example, while on one end you want to enforce controls on like
28:01
what data is being accessed or what output is being generated or like
28:05
what toxicity is the scoring or the evaluation model being done, you want
28:10
to make sure there's human insight. Um, and every decision, especially in the early phases of when the system is
28:17
deployed, you want to have these human evaluation on on the system output.
28:21
Um, more importantly, you also want to have some sort of a pause switch, if you
28:26
will, to say that if the model deviates from the known patterns, what is the
28:32
way to quickly have the humans come in and have this pause switch or even a
28:37
kill switch for that matter to make sure that corrective actions can be taken.
28:43
Krishna Gade: Yeah. And so, so I think basically, you know, this might change from
28:46
industry to industry, right? So, you know, like for example, you know, what do you want to measure or
28:51
what do you want to control around AI?
28:55
Can can be different for different domains, you know, have you seen any,
28:59
any sort of, uh, insights, like, for example, finance versus healthcare
29:03
versus like, you know, you know, some other industries, like what do what they
29:07
make care about in terms of, uh, the measuring and putting security controls.
29:13
Karthik Bharathy: Yeah, it's, it's, um, more than the industry.
29:17
I think, um, like you call out, it also depends on what set of policies
29:22
and standards they're hearing to. And then, yes, it goes by also the regions in which they are like the EU
29:29
AI Act or of ISO 42001, the different regulations that that come in.
29:35
So there's no one size fits all, but the more effective use cases that
29:41
I've seen, or the ones that have been deployed successfully, factor in both
29:45
the subjectiveness of The standards that require you, uh, to adhere to
29:51
certain things like, hey, where the data are stored, um, and sort of answer
29:55
the different questions related to the standard along with the objectiveness
29:59
of the metrics that's being tracked. Um, so the more successful use cases, um, uh, and they do vary across like
30:06
the healthcare and financial services. Um, and, you know, even in the case of retail, there are examples where
30:12
a combination of the two is needed. Krishna Gade: So what are some of the warning signs that, you know, one
30:19
can, one can like actually see that an agentic system may have security
30:25
vulnerabilities or monitoring gaps? Like how can an organization be aware of that?
30:30
Karthik Bharathy: Yeah, I think the first one to look for, um, um,
30:37
is, is the data quality, right? You want to make sure, um, you know, the, the model, um, data input and
30:45
the model, what, what it's trained on is, uh, secure and robust.
30:48
That's, that's, uh, that's important.
30:51
And once you have those in place, um, I think you want to have an effective
30:57
testing strategy, um, to ensure that you, you defend against adversarial attacks.
31:03
Um, so there's even, even if, for example, there's a manipulation in the input, you
31:08
want to make sure that the security of the model and the system is taken care of.
31:12
Um, and then the one that we talked about about on model drift and looking
31:19
for any degradations and performance.
31:21
So continuously monitoring and looking for those key parameters is important.
31:26
Um, and from, uh. The system application standpoint, um, you want to ensure that,
31:33
uh, the API endpoints are. Are, uh, secured, um, again, data transmission is secure
31:39
and so on and so forth. And you have robust, um, controls for both the authentication
31:44
and authorization piece. Um, at the end of the day, I would think of it as, uh, as an employee, right?
31:50
An employee badges in and the employee in many organization
31:55
badges out of the building as well. And the next time you come in, you badge in again.
31:58
So you sort of re authenticate and make sure.
32:01
That, you know, you are aware of, like, this person who is authorized
32:05
to do this particular job. It's very similar to an agentic system.
32:09
Um, so you want to ensure that, um, another one that comes to mind is, uh, um,
32:15
the principle of least privilege, right? You, you provide access only when that's needed, right?
32:21
And very similar, again, to the employee example that I called out.
32:25
Um, an employee may not have access to all data, but when it's needed, You sort of
32:29
ensure that, hey, the person who really needs that information has access to it.
32:33
So those would be some signs to look for when you're designing situations.
32:37
Krishna Gade: Got it. So there's an audience question here. So any specific frameworks, tools you're using for agentic workflows
32:42
to evaluate robustness and accuracy? This is probably a good time to talk about our partnership
32:46
between SageMaker and Fiddler. You know, can you share your thoughts on that?
32:51
Karthik Bharathy: Yeah, no, absolutely. Like, we're thrilled to be working with Fiddler.
32:57
And at the outset, you know, partnership is something that's absolutely critical
33:02
for AWS and SageMaker specifically.
33:05
Um, as we look at extending the core AI/ML capabilities, um, and
33:11
to provide specialized solutions for different industry needs.
33:14
I think, uh, partnering with a company like Fiddler is absolutely paramount.
33:19
Um, and what the intent is really simple, right?
33:22
We want to make sure the best of class solutions are available
33:25
to our customers, right? So with Fiddler, we've combined the power of SageMaker AI, where you can train
33:31
and deploy your models with Fiddler AI, Fiddler AI, which brings in observability
33:36
to monitor and improve the ML models. So, so net net customers have a one click way to do observability with SageMaker AI.
33:45
Um, and this experience is available in SageMaker Unified Studio.
33:49
It provides a seamless experience and, and I'm pretty excited about
33:54
how customers can use these two capabilities in a seamless manner.
33:58
Krishna Gade: Absolutely. Yeah, we share the same excitement.
34:00
And for those of you who are on AWS SageMaker today on this call, feel
34:04
free to, you know, use the one click experience and one click integration
34:08
that we built together with working with AWS with Fiddler for monitoring
34:13
and evaluation of your AI models. So let's actually, you know, maybe take a few more audience questions here.
34:18
There are some questions around different industries.
34:21
There's a question actually about code migration.
34:23
We touched, touched upon it early in the call.
34:26
What are some of the best practices for verifying large code changes or
34:30
migrating from one language to another? This is, I think, using AI based code migration.
34:35
Karthik Bharathy: Yeah. Um, I think the specifics actually depend on the language itself, right?
34:41
Depending on whether you're looking at a more modern language or like a traditional
34:46
language, like COBOL, for example, right? Um, so I think given that
34:53
While the migration is being assisted, you want to look for, um, patterns of
34:59
like translation between the two systems.
35:02
Sometimes the logic may be inherently complex, so there's human in the loop,
35:07
there's assisted AI that comes into play.
35:09
Um, you should definitely try out some of the tooling that's already available.
35:13
Um, with Amazon queue, um, we recently launched a train went the ability to
35:19
look at the system workflow 窶各nd-to-end.
35:21
Um, and there are obviously pieces around security that that's very
35:26
specific to the organization as well.
35:29
Um, so in terms of best practices, I believe there's also a,
35:33
um, detailed documentation. Um, we can find a way to share that with you.
35:38
Um, on, on what does need to be looked at as, as you do
35:45
Krishna Gade: And so, uh, there's another question on like specific industry.
35:48
Could you shed some lights on, on business use cases within financial services
35:52
or FinOps and plus, uh, you know, uh, where AI observability makes sense.
35:57
Karthik Bharathy: Yeah, I think there are quite a few.
36:00
Um, you know, the top two or three that come to mind are the automated
36:05
financial reporting that I called out. Um, you know, I mentioned about, uh, uh, Moody's use case about generating
36:13
credit reports or cognizance use case about mortgage compliance workflows.
36:18
Um, demand forecasting, uh, is another one, um, that's, it's sort
36:23
of relevant in the context of, uh, financial services as well.
36:27
Um, and more generally, I would say incident management that applies
36:31
across different industries is, is also relevant as you look at more data.
36:36
And you want to uncover insights from that data.
36:40
Krishna Gade: And then another question from the insurance industry, you know,
36:43
beyond models, what recommendation of metrics would you have, for
36:46
instance, for claims processing? Can you explain specific measures you suggested to clients and share what
36:52
your assessment is on the quality improvements of business outcomes.
36:56
Karthik Bharathy: To be honest, I'm not from the insurance industry,
36:59
so I'm commenting on that.
37:01
Um, that said, I'm happy to take that question back and come back
37:05
if we have the contact information. I don't, I don't represent the insurance industry, so I just don't
37:09
want to give out the wrong answer. Krishna Gade: So Priya, feel free to reach out to Uh, Karthik on further information.
37:17
Awesome. So I guess, uh, you know, finally, as we sort of get into the the last
37:21
few minutes of the podcast, right? So, you know, what, what are some of the things like, you know, maybe like,
37:28
uh, sort of a life cycle workflow when, you know, You know, organizations that
37:33
thinking about, because life is moving very fast in the last few years, you
37:37
know, you were talking about ML and all of a sudden there's generative AI.
37:40
Now there's AI agents, like, you know, when an organization thinking
37:44
about it, how do they go about it? You know, you know, implementing these things, what should be the priority?
37:49
What should they be the best practices? Karthik Bharathy: I think the playbook, if you will, right.
37:55
It's certainly, there are a few common things across these different systems,
38:00
and I'm sure there will be a lot more coming in the next few years.
38:03
Um, but fundamentally, I think what has not changed is starting with data, right?
38:07
Um, I, I can't emphasize this enough that the, the better your data is,
38:12
you know, the better pretty much your AI model, the genetic system, all of
38:16
the goodness that's, that's out there. Um, so have a robust data infrastructure, quality data, um, that feeds into
38:23
your machine learning processes. Um, and if you're starting off with, um, you know, GenAI and agentic system, uh,
38:32
I would start with one high value use case, uh, prototype it to your business
38:37
problem, demonstrate the value quickly, um, and taking it to the next level,
38:43
you want to establish the, um, the necessary MLOps foundations saying, how
38:49
does monitoring play into the system? What does versioning mean?
38:52
Um, how can I go from one version to the other?
38:55
These are fundamental as you think about taking a system from just a
38:58
POC to a production, these play in and just building on that and very
39:03
relevant to the topic of today is, uh, looking at the governance frameworks.
39:07
Um, what does it mean to have a simple.
39:10
Approval workflow that needs to be set up as you're scaling through the system.
39:15
And a lot of this, um, also requires that you invest in
39:18
your own team and train them. So they are aware of the different elements of taking or going live with
39:24
these different systems in place. Um, just with AWS, there are enough training and certification.
39:30
I'm sure Fiddler has their training and certification available.
39:33
So those help you build your internal expertise.
39:36
And, and finally. Um, plan for scale, right?
39:39
What worked for you when you started off with a small system may not be
39:44
applicable when you go for like a 10x or 100x of what you intend to build.
39:49
But there are, the goodness is there are enough enterprise features at AWS,
39:54
SageMaker, in Fiddler, that help you scale as you go through this journey.
39:59
In converse, what you want to avoid is rushing through a system quickly to
40:05
demonstrate value, not having a good data or like a data quality approach, um,
40:12
not in engaging a lot of stakeholders.
40:15
Um, and, and then you, you have very little insight into how you would do
40:19
maintenance or upgrade or deployment.
40:21
So the, that, that is a recipe, recipe for failure.
40:25
So as long as you avoid that so on the fundamentals.
40:28
Krishna Gade: No more colloquially don't do vibe checking, you know, vibe checking and vibe testing of your models, you know, actually,
40:33
and actually know what you're doing. That's a great point.
40:37
So I guess actually it's a very related question. Someone is asking, you know, things are moving really fast, even for
40:41
us in the technology area, right? Like, you know, what type of problems in two to three years within AI agents
40:46
that you, that will keep you up at night?
40:48
You know, what do you foresee? Karthik Bharathy: Yeah, so that's, that's, um, there's so much I could
40:54
predict, but see, that's a question, you know, I ask myself every day, right?
40:58
Fundamentally, you know, I go back to, um, you know, back when I
41:03
joined AWS many, many years ago, um, there was this interesting quote by
41:08
Jeff that still resonates with me.
41:10
It's something around, "hey, what, what will not change" as opposed to saying,
41:14
"hey, what will change that part?" The second part, what will change is sort of, you know, each of us can
41:19
debate like for hours or days together.
41:22
Um, but what does not change is fundamentally customers asking for better
41:27
value and what it translates to something that's more performance, something
41:31
that's robust, something that's secure. Those fundamentals are not going to change or something that's cheaper, right?
41:36
Uh, uh, like the way Bezos put it, no one's going to come to you and say,
41:41
hey, give me something that's more expensive or slower to perform, right?
41:44
So fundamentally looking at the system, um, and seeing what value it
41:49
adds to your business use case, what does it translate to your customers?
41:53
I think those would be paramount as you look at these, uh, innovations
41:56
that are happening in GenAI industries. Krishna Gade: Yeah, and there's an innovation happening across
42:00
like small and big players, right? So there's, you know, there's a question around how, you know, small, you know,
42:05
there's a lot of, you know, new AI agentic applications that are coming up.
42:09
And, you know, how do you think like they're playing within the,
42:12
you know, within the, you know, the big, you know, big players in
42:15
ecosystem, you know, you know, building also building agentic workflows.
42:19
Any thoughts on that? How AWS might be encouraging on the ecosystem side as well?
42:24
Karthik Bharathy: Yeah, absolutely. And I think, um, one is definitely through the partners.
42:28
We work closely with, um, companies like Fiddler.
42:32
I think the second dimension to that question is um, AWS providing
42:37
the choice to the customers, right? So there is not a single model that we say, hey, this is what you need to do.
42:42
That's something that um, you as a customer can decide,
42:45
um, right from DeepSeek. to, um, the latest Llama models to our own in house Amazon Nova models.
42:53
You have all of those available to experiment and try for your use case.
42:57
I'm sure a lot of it will be applicable even in the world tomorrow, where
43:01
you have the choice in choosing the best of what's applicable for you.
43:05
Krishna Gade: Awesome. Great. I think, uh, with that, we are coming to the end of the podcast.
43:10
You know, thank you so much, Karthik for spending time with us today.
43:13
Um, you know, I think one of the things that I took away is that quote
43:16
that you mentioned that Jeff said, like, what is not going to change?
43:20
And I, I believe what is not going to change with AI is going
43:23
to be, whether it's your simple statistical model or deep learning
43:26
model or generative AI or AI agents.
43:30
You need to test it properly. You need to monitor it properly and you need to make sure it's, you
43:34
know, it's secure and, and, and it's working for your, for your business.
43:38
So, and so I think that's, that's not going to change.
43:40
Um, I think I, and so I think, you know, that's, that's kind of where
43:44
our partnership with Amazon comes in. And so, you know, thank you so much for being on the show today and, um,
43:50
you know, look forward to, you know, further more conversations in the future.
43:54
Karthik Bharathy: Thank you for having me, Krishna. This was great chatting with you.
43:56
Krishna Gade: Awesome. Thank you. Thanks everyone.
Podchaser is the ultimate destination for podcast data, search, and discovery. Learn More