AI Observability and Security for Agentic Workflows with Karthik Bharathy

Released Thursday, 20th March 2025

Episode Transcript
0:06

Krishna Gade: Welcome, and thank you, everyone, for joining us for today's AI Explained.

0:11

Today's topic is AI Security and Observability for Agentic Workflows.

0:17

Everyone touts that this year is going to be the year of AI agents.

0:23

Let's see how we need to address these issues.

0:27

I am your host today. I'm Krishna Gade.

0:30

I'm one of the Founders and CEO of Fiddler AI.

0:34

Again, please put your questions in the Q&A box at any time

0:37

during the fireside chat. Today's session will also be recorded and sent to all the

0:41

attendees after the session. Okay.

0:44

So without further ado, um, I want to welcome Karthik Bharathy, um,

0:50

General Manager for AI Ops and Governance for Amazon SageMaker AI at AWS.

0:57

Karthik, um, if you could turn on your camera, um,

1:02

Karthik Bharathy: Hey, Krishna.

Krishna Gade: Uh, thank you.

1:04

Welcome to AI Explained. So here's a brief bio of Karthik.

1:07

Karthik is a leader with over 20 years of experience driving

1:10

innovation in AI and ML. Um, as a General Manager for AI Ops and Governance for SageMaker AI, Karthik leads

1:16

the development of cutting edge generative AI capabilities in Amazon SageMaker AI.

1:23

Karthik, uh, you know, thank you so much for joining us.

1:25

Uh, maybe let's start, you know, with your background.

1:28

You know, how has your role in AI Ops and Governance at AWS shaped your

1:33

perspective on, you know, monitoring and securing AI workflows in the enterprise?

1:39

Karthik Bharathy: Yeah, that's a great question, Krishna. And, um, if I think about how AI Ops and governance has evolved over the years,

1:49

um, in fact, a lot of the changes have been in tandem with innovations that we've

1:56

seen in AI ML over the last few years, you know, starting with traditional ML

2:02

systems to, more recently, GenAI and, uh, agentic workflows, as you aptly put it,

2:09

um, and throughout these years, from what I've seen, um, there are three things

2:14

that really stand out. One is, um, security and governance are built

2:21

into ML workflows from the ground up.

2:24

It's, it's not an afterthought anymore.

2:27

Um, what that essentially means is, uh, enterprises are thinking about,

2:32

uh, robust data governance techniques, access controls, and how they, uh,

2:38

incorporate audit trails from day one.

2:42

Um, and effective security isn't just about, uh, you

2:46

know, protecting your models. It's about creating a comprehensive system that includes, uh, looking at your,

2:53

uh, monitoring in an automated manner,

2:55

uh, doing version control, and also having audit trails.

3:00

The, um, second one I'll call out is, uh, the need for end-to-end observability,

3:08

um, and this is across both the data and the ML workflows, um, right from,

3:14

you know, how data is ingested, how you can have lineage from the data all the

3:18

way to ML, um, and all the way to, uh, observability during, um, model

3:24

deployment to look for drift and so on.

3:27

Um, and finally, the third thing I would call out is, uh, while all this

3:32

sophisticated tooling is in place, um, you want to have the necessary, um,

3:38

human element, um, to sort of oversee the process. Uh, while it's automated,

3:43

there are critical junctures where human, uh, oversight is needed, and that

3:48

helps in the decision making process.

Krishna Gade: Awesome.

3:52

So, uh, you know, being at, you know, at the helm of SageMaker, you're probably

3:56

seeing the current state of AI in the enterprise, you know, its adoption.

4:00

Um, you know, how would you describe it? Could you shed some light for our audience?

4:05

Karthik Bharathy: Yeah, yeah. Um, I think if you look at it again, over the last four or five years, right,

4:11

the enterprise landscape is evolving

4:14

uh, pretty rapidly, right? And you can notice, um, several distinct patterns, right?

4:20

Um, and for what it's worth, like, we are in the third year

4:24

of generative AI, right? Uh, I think the first year was more around, hey, there's this cool thing,

4:29

like, what can GenAI do, right?

4:31

But last year, based on customer conversations, we saw, um, the customer

4:37

conversation moving from, "Hey, what is GenAI?" to, "Hey, is this right for me?"

4:43

And how can I adapt this, um, into having a real impact for my business?

4:49

Um, and this year, um, we are hearing customers want

4:53

to go big with generative AI.

4:55

You know, both in terms of going wide and going deep and, you know,

4:59

deploying these systems at scale, and also leveraging the promise of agentic AI that can create

5:06

tangible business value, right? And as we see more of these AI systems being developed,

5:16

there is a need to integrate these different AI systems so you can

5:23

orchestrate more complex workflows, while at the same time you want to keep in

5:29

mind aspects of security and reliability.

5:32

So that's definitely one trend and the other one I would call out is as

5:37

you bring in these systems and want to do complex decision making, you

5:40

want to do so in an automated manner.

5:44

While keeping in mind, hey, there is transparency and accountability, right?

5:48

So, uh, increasingly, customers are looking for ways

5:54

to have human oversight and they want to scale their AI operations.

5:58

Krishna Gade: That's right. Yeah, especially in the regulated industries, which we play in, uh, there is some, um, cautious approach

6:05

behind, you know, the usage of generative AI or AI

6:09

agents, like the whole human in the loop. Um, so I guess that begs the question, right?

6:14

What potential are you seeing for these agentic AI systems?

6:17

You know, how are they going to transform the business operations?

6:20

Any real life examples that would be amazing.

6:23

Karthik Bharathy: Yeah, yeah, I think there are quite a few, right?

6:26

And, um, let me first break it down into sort of the different patterns we see,

6:31

um, based on the customer conversations in AWS, and then sort of look at, um,

6:37

examples for each of those, right? Um, so with agentic AI, um, the business value it provides falls

6:46

largely into three different categories.

6:49

Um, the first one, um, would be using agentic AI to accelerate,

6:56

um, workplace productivity, right?

6:58

So think of these as, um, day to day repetitive tasks that employees

7:04

are doing, and they want to automate this and, and gain the advantage,

7:09

uh, of using such an agentic system.

7:12

Right. A good example is NFL Media.

7:16

They use business agents today to help their producers and editors to

7:21

accelerate their content production. They have a research tool that allows them to gather insights from

7:29

video footage from a specific place.

7:33

And, um, what essentially that provides is, uh, when, when you're

7:38

onboarding a new hire, it reduces the training time, um, by up to 67%.

7:43

And, um, when their employees ask questions, um, about what,

7:49

what's going on, that can be surfaced in less than 10 minutes, uh, versus what

7:54

used to take, um, close to 24 hours.

7:57

So that's one such example.

7:59

And, uh, closer to the software world,

8:02

we're all familiar with coding assistants.

8:06

And many of you may have already used coding assistants

8:08

in one shape or the other. Um, and largely, well, they help with, um, building better code, or providing,

8:17

um, documentation, or explaining, um, existing code. It's not just

8:21

about the code itself, but it's, it's more about automating the entire

8:25

software development lifecycle, um, including, you know, upgrading software.

8:30

Um, or, um, modernizing a legacy application for

8:34

Krishna Gade: Migrating to new languages. Karthik Bharathy: Absolutely.

8:36

Absolutely. So case in point, um, within Amazon, we had, um, these agents for

8:43

transforming our code base from an older version of Java to a newer version.

8:48

And, uh, there was a savings of, you know, a mammoth, like, 4,500

8:53

developer years' worth of effort, right? Roughly translating to, um, you know, $260 million in annual CapEx savings.

9:01

Um, so that's the first trend, I would think, in terms of using it

9:06

to accelerate workplace productivity. The second one would be in transforming business workflows

9:14

and uncovering new insights, right?

9:17

What I mean by that is, uh, as enterprises are adopting agents, they

9:22

want to streamline their operations and gain insights on their data, right?

9:27

Um, and the example that comes to mind is

9:31

Cognizant. They're using business agents to automate mortgage compliance workflows, and they've

9:38

seen improvements of more than 50 percent in reducing, um, errors and rework.

9:44

Um, similarly, Moody's is another great example.

9:48

They've used a multi-agent system, um, that looks at

9:53

generating credit risk reports. Um, and again, the benefit is, uh, what used to, uh, take humans about one

10:02

week to generate a specific report is now cut down to just one hour, right?

10:06

So that's the magnitude of, um, impact that, that customers are seeing.

10:11

Finally, the third one I would call out is more in the research area

10:15

that's sort of, uh, fueling, you know, industry transformation and innovation.

10:20

Um, a good example there is from Genentech.

10:24

Um, they've deployed an agentic solution running on AWS, um,

10:29

and they're improving their, uh, drug research process.

10:32

So what they've done is, uh, their solution roughly automates, um, you

10:38

know, about five years' worth of research, right, across different

10:42

therapeutic areas, uh, right, and then what it does is help them speed

10:46

up the, uh, drug target identification and also improve their research

10:52

efficiency, um, ultimately leading to, you know, faster drug development.

10:55

So, um, net net, we're seeing agentic systems deployed

11:00

broadly in these three categories. Krishna Gade: Absolutely.

11:03

So it's like workplace productivity, business transformations, and

11:06

then, you know, new product innovations.

11:09

Um, so one thing that you mentioned in business transformations, you

11:13

know, you mentioned a few examples, especially like generating credit

11:17

reports and claims processing, right? These are, you know, high stakes AI use cases.

11:21

So there is a need for, you know, security and transparency into

11:25

how, you know, AI is working. You know, what are some of the challenges

11:29

that you think organizations are facing when they're implementing

11:33

agentic workflows for these, you know, for these use cases or, in

11:35

general, other use cases too?

Karthik Bharathy: Yeah, yeah, I think that's a great call out, right?

11:41

So, um, while you're looking at these, uh, systems, I think

11:45

there are definitely, um, you know, security and visibility challenges

11:50

that organizations need to look into.

11:53

Um, I'll call out a few that we have seen, and, uh, by no means

11:57

is this comprehensive, but it sort of comes down to, um, the stage of

12:02

the ML workflow, if you will, right?

12:04

And, uh, if you think about it, at the very beginning, when you're trying to

12:08

use a specific model, um, it's quite possible that the data that's

12:14

being used, either to train a model or, you know, fine-tune a model, use RAG,

12:19

whatever technique that you use, um, is data that's not authentic.

12:23

And this might just compromise the performance of the model.

12:26

That's definitely, um, you know, concerning, and at the same time harder to detect

12:32

until the model is being used and you see the interactions that are going on.

12:35

So that's one category. Um, the second would be when, um, you know, the model is being used,

12:43

and again, it depends on the model. In the case of a proprietary model, where the model weights are not exposed,

12:50

um, it might be an actor attempting to reverse engineer, saying, what are the

12:55

specific, um, weights that were used, at what level, and so on and so forth.

12:59

And that essentially, um, you know, exposes

13:05

the how of the model, if you will. Um, and the third one, I would think,

13:09

is when the model is actually being used, um, and, you know,

13:14

actors can attempt to, uh, extract information which otherwise the model

13:19

would not emit. Uh, it might be sensitive information about the training data;

13:23

it might be information that may not be what the model is intended for, or

13:27

the use case that's being deployed.

13:30

Um, net net, I think, um, organizations would need to protect

13:35

these model weights,

13:37

um, you know, have necessary controls around, um, access,

13:42

um, you know, ensure that there's data privacy, and so on.

13:45

And more importantly, uh, ensure that there's this

13:48

observability that's end-to-end, so you are having the necessary checks to see how the model is performing.

13:54

Um, and more often than not, you probably have a sandbox environment

13:58

where you're testing it and have tooling. You know, there are a few tools, like,

14:02

Bedrock Guardrails is an excellent tool, so you sort of incorporate that. You know, Fiddler has an observability tool as well.
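For illustration, here is a minimal sketch of screening text with the Bedrock ApplyGuardrail API through boto3; the guardrail identifier, version, and region are hypothetical placeholders, and the exact policies enforced depend on how the guardrail was configured.

```python
import boto3

# Assumes a guardrail was already created in Amazon Bedrock; the
# identifier, version, and region below are hypothetical placeholders.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

def screen(text: str, source: str = "INPUT") -> bool:
    """Return True if the guardrail lets the text through without intervening."""
    response = client.apply_guardrail(
        guardrailIdentifier="my-guardrail-id",  # hypothetical
        guardrailVersion="1",
        source=source,  # "INPUT" for prompts, "OUTPUT" for model responses
        content=[{"text": {"text": text}}],
    )
    # action is "GUARDRAIL_INTERVENED" when a topic, PII, or content policy fires.
    return response["action"] != "GUARDRAIL_INTERVENED"

if __name__ == "__main__":
    print(screen("How do I check the status of my claim?"))
```

The same check can wrap both the user input and the model output, which is the input-level and output-level auditing discussed later in the conversation.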

14:08

So these provide sufficient insights into what is going on in the system, be

14:13

it agentic or an automated workflow, and you sort of take actions based on that.

14:17

Krishna Gade: Absolutely. So I think you touched upon a few things like, you know, uh,

14:21

adversarial attacks on models. And now there's this whole, um, field of AI security and model security coming up.

14:28

Um, you know, I remember, like, I think a few weeks ago when DeepSeek

14:31

launched, everyone was producing benchmarks about how accurate it is,

14:35

or how close it comes to the accuracy of closed-source models.

14:39

But it was pretty vulnerable to security attacks.

14:41

People were able to easily, you know, make it, uh, leak PII content

14:45

and whatnot in RAG workflows. So how do you think about, you know, uh, what are some of

14:51

the, uh, you know, best practices that organizations should follow for AI

14:55

security? And how do you think about

14:59

that versus application-level security in general, which has been around for a while?

15:04

Karthik Bharathy: Yeah, I think, um, at the end of the day, you need a

15:09

comprehensive security approach, right?

15:12

You want to operate at the different levels.

15:15

Um, you mentioned model-level security, right?

15:19

So let's start from there. Um, so when you're thinking about the model, um, like I mentioned, you want

15:25

to protect the model weights, right? Um, and in addition to model weights, you want to protect

15:32

the, um, access to the data.

15:34

Um, you know, ensuring that the data is, is, is authentic and so on.

15:39

Um, and to address these, you would, like, encrypt where

15:43

the model is being stored, the actual file. Um, or, uh, to your point on adversarial

15:48

examples, you would have a test environment where you would exercise

15:52

the model and monitor its output for some of these adversarial examples.

15:57

And um, at the end of the day, you need continuous monitoring, right?

16:01

Um, not just to look at the input and output patterns, but also look for

16:04

drifts, drifts in the model, drifts in the data, and have the necessary, um,

16:09

alerts, so you can trigger, like, a retraining, for example.

16:14

So that's at the model level. Um, at the application level, I think, um, there are the well-

16:21

known security practices, like, you know, you enforce access controls, you have

16:25

encryption in place, um, you have logging of the interaction patterns, and so on.

16:31

Um, but in addition to that, tooling is often needed.

16:34

Uh, like I mentioned, the Bedrock example, uh, Bedrock Guardrails example earlier.

16:39

You want to think about how you audit certain topics, um, be it at

16:44

an input level or the output level. What is relevant to your use case?

16:48

What should not be emitted? Or if there's certain information being emitted, like

16:54

PII data, how do you redact that information, and so on and so forth.
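As a toy sketch of that redaction step, the filter below masks a few PII shapes with regular expressions; a production system would normally rely on a managed detector (for example, Bedrock Guardrails' sensitive-information filters or Amazon Comprehend PII detection) rather than hand-rolled patterns.

```python
import re

# Illustrative patterns only; real PII detection needs much broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched PII spans with typed placeholders before emitting output."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Reach John at john.doe@example.com or 555-123-4567, SSN 123-45-6789."))
# -> Reach John at [EMAIL REDACTED] or [PHONE REDACTED], SSN [SSN REDACTED]
```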

16:57

So I think net net, the two layers of model security and application level

17:03

security need to integrate seamlessly.

17:06

So in many ways, uh, these are complementary, rather than treating

17:11

them as separate constructs.

Krishna Gade: Awesome.

17:14

That's great. So I guess, uh, you know, we talked a little bit about, uh,

17:19

some high stakes use cases, right? So when it comes to, uh, you know, transparency of AI decisions for these,

17:25

you know, for regulators or business stakeholders, how do you think, you

17:28

know, this is going to change when, you know, agents come about and, um,

17:33

you know, and organizations employ agentic workflows and what happens

17:37

to the transparency behind AI?

17:42

Karthik Bharathy: Yeah, I think fundamentally, um, enterprises would

17:48

benefit from having a governance model

17:53

that's more federated, right?

17:55

Meaning you have standards, policies in place, that sort of dictate

18:02

how these systems need to be developed across the organization.

18:08

But at the same time, you want to provide enough flexibility, uh, where

18:14

each team or business unit can adapt these standards in a way that they can

18:20

implement for their specific use cases.

18:22

So that's sort of the trade-off. And it's a good one, uh, in the sense that you want

18:28

to provide the flexibility, um, of developing these different systems across different units.

18:33

Um, and there are, again, tools, like, for example, uh, just purely taking

18:38

the example of SageMaker here, you have SageMaker projects where you can

18:42

automate, um, the ML workflow, say, how should it be standardized, what

18:46

pipelines do you need to use, what models, and what quality, and so on.

18:51

Krishna Gade: So the governance is like both a tools problem as

18:54

well as a people problem, right? Like, you know, essentially, many companies do not have

18:58

the governance structures today, you know, to sort of ensure that,

19:01

you know, AI is tested, monitored, and securely operated.

19:05

You know, what are some of the best practices that you have seen

19:08

in terms of, you know, customers employing AI governance today across,

19:12

you know, different business units? Karthik Bharathy: I think fundamentally, um, at the highest level of abstraction,

19:19

you have, um, you know, business stakeholders, like the so called risk

19:23

officers, if you will, who understand the domain of what is being developed,

19:29

and they would enforce certain standards on what needs to be, um, adhered to.

19:36

And it's important that they work in tandem with the technical team

19:39

who are well versed with what's being done with the model, right?

19:43

For example, a model may have a toxicity score of, like, 0.1.

19:48

But what that means from a use case perspective, whether this

19:53

model can be approved and deployed in an organization, is very specific

19:57

to the domain they're operating in. Um, I think successful organizations have a good mix of both where, um, you have the

20:07

necessary tooling, um, where these, uh, different levels, for example, toxicity,

20:12

are being, uh, monitored for and documented, uh, either through a

20:17

model card or you have enough properties, maybe in a model registry, for example.

20:22

And this translates into visibility for the risk officer, who can effectively

20:28

say whether this model or the system is approved for deployment or not.

20:32

So the two systems working together, I think, definitely is a recipe for success.
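A minimal sketch of that handshake in code: the ML team documents metrics with the model (model card or registry properties), and a policy owned by the risk officer decides approval. The metric names and limits are invented for illustration.

```python
# Metrics the ML team documents alongside the model (model card / registry).
model_card = {
    "name": "claims-triage-v3",
    "purpose": "route insurance claims to review queues",
    "toxicity": 0.10,   # the 0.1 score from the example above
    "accuracy": 0.91,
    "bias_gap": 0.03,   # e.g., accuracy gap across customer segments
}

# Policy the risk officer owns; domain-specific and reviewed independently.
policy = {
    "toxicity": ("max", 0.05),
    "accuracy": ("min", 0.90),
    "bias_gap": ("max", 0.05),
}

def review(card: dict, rules: dict) -> list[str]:
    """Return the policy violations that block deployment approval."""
    violations = []
    for metric, (kind, limit) in rules.items():
        value = card[metric]
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            violations.append(f"{metric}={value} violates {kind} {limit}")
    return violations

issues = review(model_card, policy)
print("APPROVED" if not issues else f"REJECTED: {issues}")
# -> REJECTED: ['toxicity=0.1 violates max 0.05']
```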

20:37

Krishna Gade: Got it. So are there any specific metrics that you recommend that organizations need to

20:42

track, like whether it's about security or, you know, governance of AI, you

20:47

know, when they're testing it or, you know, when deploying to production?

20:51

Karthik Bharathy: Yeah, so if you look at the metrics again at the technical

20:57

level, you have a set of metrics right at the most foundational level.

21:02

Um, you know, if you have to document it, document what the

21:05

model is doing in a model card. You would look at, um, the purpose of the model, what data it's trained on,

21:12

uh, what the validation rules are, what the quality of the model is, and so on.

21:16

Um, and going a little bit beyond that, um, you may want to document

21:21

how the model is, um, emitting or predicting a response, right?

21:26

So, for example, you may want to look at explainability

21:31

approaches: like, you may look at a SHAP score, for example, or you may look at

21:34

a LIME score, for example, and these may be documented with the model;

21:38

those are good metrics to look at. And again, with GenAI, you can look at additional metrics around

21:44

toxicity, fairness, and so on. You can test these models.

21:49

You can have periodic evaluations on the levels

21:49

of these metrics and test against, um, standardized data

21:56

sets that are available today. Or you can use custom data sets that are very specific to your, um, use case.
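A toy version of such a periodic evaluation is sketched below, scoring a model's answers against expected behavior for one category. Everything here is a stand-in: a real pipeline would pull prompts from a standardized benchmark (ToxiGen, for example) or a custom dataset, call the deployed endpoint, and judge responses with a trained classifier rather than string matching.

```python
def invoke_model(prompt: str) -> str:
    # Hypothetical stand-in for calling the deployed model endpoint.
    if "insult" in prompt.lower():
        return "I can't help with that request."
    return "Q3 claims volume rose 4% quarter over quarter."

# A tiny "custom data set": prompts plus the behavior we expect.
eval_set = [
    {"prompt": "Write an insult about my coworker.", "expect_refusal": True},
    {"prompt": "Summarize our Q3 claims report.", "expect_refusal": False},
]

def is_refusal(answer: str) -> bool:
    # Toy judge; a real pipeline would use a trained safety classifier.
    return any(m in answer.lower() for m in ("can't help", "cannot help"))

def evaluate(category: str) -> float:
    passed = sum(
        1 for case in eval_set
        if is_refusal(invoke_model(case["prompt"])) == case["expect_refusal"]
    )
    score = passed / len(eval_set)
    print(f"{category}: {passed}/{len(eval_set)} cases passed, score={score:.2f}")
    return score

evaluate("toxicity-refusal")  # run manually or on a schedule
```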

22:03

Um, and then again, at the business level, you want to interpret these,

22:06

saying, uh, with a combination of these, uh, objective metrics,

22:12

how do the subjective standards and policies play in, and what does

22:16

that mean from a risk perspective?

Krishna Gade: So there is always this tension, uh, within the

22:22

organizations to adopt AI faster

22:24

versus doing it right, right? So there's this, like, you know, how do you make sure you do it properly

22:29

so that you don't get into trouble? Like, how should, you know, organizations think about,

22:35

like, you know, this balance?

Karthik Bharathy: Yeah, I think, um, that's a key one, right?

22:41

I think there's no one easy answer, if you will, right?

22:43

And the key to balancing, uh, robustness, in having those

22:48

security controls, with the operational, uh, efficiency lies in

22:54

having the right guardrails, right? Instead of, uh, creating or looking at the problem as saying, "Hey, here's

22:59

one way to do it," or one set of, "Hey, this is risky versus non-risky."

23:04

You're probably looking at, uh, a set of, um,

23:09

a range of values, if you will, right, in terms of how to look at risk.

23:13

Um, a good example would be, uh, let's say you have the model or the system deployed.

23:18

Um, and you notice that certain changes introduce a higher risk.

23:23

Um, it's better to trigger additional approval workflows, um, rather than,

23:30

um, you know, just waiting on it and saying, here's a single way to do it.

23:34

In contrast, if the same set of changes results in a relatively lower risk,

23:40

um, you may want to proceed through standardized approvals instead of, you

23:44

know, requiring additional approvals. Um, a good example again would be, let's say, there's a drift in the model,

23:50

right, which is fairly common and you have an observability solution in place.

23:55

If the drift is, um, not significant from the current state of the model,

24:00

you may be okay with treating that as an alert, staying in the know of

24:05

what is happening, and you may just trigger a retraining of the workflow.

24:09

But on the other hand, if the drift is significant and it exceeds the

24:13

threshold that you've defined, um, you may trigger additional

24:17

approvals or, in some extreme cases, you might even consider

24:21

rolling back to the previous version.
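Sketched as code, that configurable policy might look like the following, with drift magnitude routed to escalating responses; the tier boundaries are placeholders to be tuned per organization and use case.

```python
from enum import Enum

class Action(Enum):
    ALERT_AND_RETRAIN = "log alert, schedule retraining"
    REQUIRE_APPROVAL = "route to additional approval workflow"
    ROLL_BACK = "roll back to the previous model version"

# Configurable tier boundaries (placeholders; set per risk appetite).
MINOR_DRIFT = 0.10
MAJOR_DRIFT = 0.25

def route(drift_score: float) -> Action:
    """Map an observed drift score to an escalating response."""
    if drift_score < MINOR_DRIFT:
        return Action.ALERT_AND_RETRAIN
    if drift_score < MAJOR_DRIFT:
        return Action.REQUIRE_APPROVAL
    return Action.ROLL_BACK

for score in (0.04, 0.18, 0.40):
    print(f"drift={score:.2f} -> {route(score).value}")
```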

24:24

Uh, so those are, uh, different options that you can consider, and the

24:28

key is to keep that configurable, so you can trade off the rigor and the robustness of security

24:34

controls with the efficiency that it brings.

Krishna Gade: Right.

24:37

So when it comes to evaluation of AI, right? So in the past for classical machine learning, you could do things

24:42

like ROC curves, AUC scores, you know, precision-recall, and maybe

24:46

even do, like, SHAP plots and understand the feature importance.

24:50

But now, with generative AI and agentic workflows, evaluating the

24:54

performance is not straightforward, right? You know, there's no ground truth.

24:58

So, you know, can you shed some light on, like, you know,

25:01

how, uh, customers are going about this, you know, in sort

25:05

of the sectors that you have been exposed to so far, and

25:08

what are some of the best practices?

Karthik Bharathy: I think the areas that I've seen customers exploring

25:15

are, um, evaluating the system end-to-end, right?

25:21

There's no one unique metric, going back to the example

25:24

that I mentioned earlier. Um, concretely, you can think of having a pipeline, um, that triggers

25:33

either manually or on a periodic basis.

25:36

And that evaluates the model on certain dimensions, right?

25:40

Um, and, and evaluation is sort of a broad topic.

25:43

But, um, if there are certain aspects of the model that you want,

25:48

let's say, be it fairness, or, um, robustness, or toxicity, for example, you can look at

25:55

evaluating, for example, a model against a ToxiGen model and seeing, hey, if

26:00

these inputs were sent to the model,

26:02

What is the output? And once you know the expected output and the actual output, you

26:07

can actually see the difference. Okay, the model is working on expected lines.

26:12

Therefore, this is the score that you want to assign for

26:14

that particular category, right? So it's developing that comprehensive pipeline workflow and making sure you

26:21

have observability in each of the places, and saying, as a system, you do it first

26:26

at the model level, and then you do it at the system level when there are multiple

26:29

models interacting with each other. And then saying, given the behavior of the system, what is the sort

26:36

of score that you want to assign. In some cases, you know, you can be creative

26:41

in creating a composite score. It purely depends on how much weight you assign to each of

26:47

these individual scores to create the composite score, and how you gauge that

26:51

composite score with respect to the use case.

Krishna Gade: Especially for agentic workflows, in some cases when they

26:58

are automating the decision process, right, in the enterprise space, there

27:02

is a need to measure like whether the decisions are optimal or not.

27:05

You know, uh, it's a pretty hard problem.

27:08

Uh, any thoughts on that?

27:10

Like, you know, for example,

27:13

take the example that you mentioned, like the claims

27:15

processing workflow, right? Which was probably much more manual in the past.

27:20

Now it's like, you know, automated.

27:22

How can, you know, customers measure, like, you know, if it's

27:26

working properly and if it's actually working optimally for the business?

27:30

Karthik Bharathy: Yeah, while you can have, um, you know, objective

27:35

metrics at the end of the day, it's the business use case, right?

27:38

And I think, um, it would involve humans, um, in the processes, or seeing

27:46

the sort of outputs from the system.

27:49

Um, and I think that the key is to have the necessary hooks in place, right?

27:56

For example, while on one end you want to enforce controls on, like,

28:01

what data is being accessed, or what output is being generated, or, like,

28:05

what the toxicity scoring or evaluation of the model is, you want

28:10

to make sure there's human insight. Um, and for every decision, especially in the early phases of when the system is

28:17

deployed, you want to have this human evaluation of the system output.

28:21

Um, more importantly, you also want to have some sort of a pause switch, if you

28:26

will, to say that if the model deviates from the known patterns, what is the

28:32

way to quickly have the humans come in and have this pause switch or even a

28:37

kill switch for that matter to make sure that corrective actions can be taken.
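One way to picture that pause switch: an agent loop that checks an externally settable flag and a deviation score before every step, so a human can halt it at any time. All names and thresholds here are illustrative.

```python
import threading

pause_switch = threading.Event()  # a human operator (or an alarm) can set this
DEVIATION_LIMIT = 0.3             # illustrative bound on "known patterns"

def deviation_score(action: str) -> float:
    # Stand-in for comparing a proposed action against known-good behavior.
    return 0.9 if "wire transfer" in action else 0.05

def run_agent(proposed_actions: list[str]) -> None:
    for action in proposed_actions:
        if pause_switch.is_set():
            print("Paused by operator; awaiting human review.")
            return
        if deviation_score(action) > DEVIATION_LIMIT:
            pause_switch.set()  # trip the switch and escalate to a human
            print(f"Deviation on {action!r}; pausing for corrective action.")
            return
        print(f"Executing: {action}")

run_agent(["fetch claim record", "draft summary", "initiate wire transfer"])
```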

28:43

Krishna Gade: Yeah. And so, so I think basically, you know, this might change from

28:46

industry to industry, right? So, you know, like for example, you know, what do you want to measure or

28:51

what do you want to control around AI

28:55

can be different for different domains. You know, have you seen any

28:59

sort of, uh, insights, like, for example, finance versus healthcare

29:03

versus, like, you know, some other industries? Like, what do they

29:07

care about in terms of, uh, measuring and putting in security controls?

29:13

Karthik Bharathy: Yeah, it's, it's, um, more than the industry.

29:17

I think, um, like you call out, it also depends on what set of policies

29:22

and standards they're adhering to. And then, yes, it also goes by the regions in which they are, like the EU

29:29

AI Act or ISO 42001, the different regulations that come in.

29:35

So there's no one size fits all, but the more effective use cases that

29:41

I've seen, or the ones that have been deployed successfully, factor in both

29:45

the subjectiveness of the standards, which require you, uh, to adhere to

29:51

certain things like, hey, where the data is stored, um, and sort of answer

29:55

the different questions related to the standard, along with the objectiveness

29:59

of the metrics being tracked. Um, so the more successful use cases, um, do vary across, like,

30:06

healthcare and financial services. Um, and, you know, even in the case of retail, there are examples where

30:12

a combination of the two is needed.

Krishna Gade: So what are some of the warning signs that, you know, one

30:19

can, like, actually see, that an agentic system may have security

30:25

vulnerabilities or monitoring gaps? Like how can an organization be aware of that?

30:30

Karthik Bharathy: Yeah, I think the first one to look for, um,

30:37

is the data quality, right? You want to make sure, um, you know, the model's, um, data input and

30:45

what the model is trained on is, uh, secure and robust.

30:48

That's, uh, that's important.

30:51

And once you have those in place, um, I think you want to have an effective

30:57

testing strategy, um, to ensure that you defend against adversarial attacks.

31:03

Um, so even if, for example, there's a manipulation in the input, you

31:08

want to make sure that the security of the model and the system is taken care of.

31:12

Um, and then there's the one that we talked about, on model drift and looking

31:19

for any degradations in performance.

31:21

So continuously monitoring and looking for those key parameters is important.

31:26

Um, and from, uh, the system application standpoint, um, you want to ensure that,

31:33

uh, the API endpoints are, uh, secured, um, again, data transmission is secure,

31:39

and so on and so forth. And you have robust, um, controls for both the authentication

31:44

and authorization piece. Um, at the end of the day, I would think of it as, uh, as an employee, right?

31:50

An employee badges in, and the employee in many organizations

31:55

badges out of the building as well. And the next time you come in, you badge in again.

31:58

So you sort of re-authenticate and make sure

32:01

that, you know, you are aware that this person is authorized

32:05

to do this particular job. It's very similar to an agentic system.

32:09

Um, so you want to ensure that. Um, another one that comes to mind is, uh,

32:15

the principle of least privilege, right? You provide access only when it's needed, right?

32:21

And very similar, again, to the employee example that I called out.

32:25

Um, an employee may not have access to all data, but when it's needed, you sort of

32:29

ensure that, hey, the person who really needs that information has access to it.

32:33

So those would be some signs to look for when you're designing these systems.
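The badge-in analogy maps naturally to short-lived, narrowly scoped credentials. Below is a sketch using STS AssumeRole with an inline session policy, so an agent holds only the one permission its current task needs, and only briefly; the role ARN, bucket, and session name are hypothetical.

```python
import json
import boto3

sts = boto3.client("sts")

# Scope the session to a single read permission; it expires in 15 minutes,
# so the agent has to "badge in" again for its next task.
session_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::claims-docs/incoming/*",  # hypothetical bucket
    }],
}

creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/agent-task-role",  # hypothetical
    RoleSessionName="claims-agent-task-42",
    Policy=json.dumps(session_policy),  # intersects with the role's permissions
    DurationSeconds=900,
)["Credentials"]

# The agent uses these temporary credentials for this one task only.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```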

32:37

Krishna Gade: Got it. So there's an audience question here. So any specific frameworks, tools you're using for agentic workflows

32:42

to evaluate robustness and accuracy? This is probably a good time to talk about our partnership

32:46

between SageMaker and Fiddler. You know, can you share your thoughts on that?

32:51

Karthik Bharathy: Yeah, no, absolutely. Like, we're thrilled to be working with Fiddler.

32:57

And at the outset, you know, partnership is something that's absolutely critical

33:02

for AWS and SageMaker specifically.

33:05

Um, as we look at extending the core AI/ML capabilities, um,

33:11

to provide specialized solutions for different industry needs.

33:14

I think, uh, partnering with a company like Fiddler is absolutely paramount.

33:19

Um, and the intent is really simple, right?

33:22

We want to make sure best-in-class solutions are available

33:25

to our customers, right? So with Fiddler, we've combined the power of SageMaker AI, where you can train

33:31

and deploy your models, with Fiddler AI, which brings in observability

33:36

to monitor and improve the ML models. So, net net, customers have a one-click way to do observability with SageMaker AI.

33:45

Um, and this experience is available in SageMaker Unified Studio.

33:49

It provides a seamless experience, and I'm pretty excited about

33:54

how customers can use these two capabilities in a seamless manner.

33:58

Krishna Gade: Absolutely. Yeah, we share the same excitement.

34:00

And for those of you who are on AWS SageMaker today on this call, feel

34:04

free to, you know, use the one-click experience and one-click integration

34:08

that we built together with AWS, with Fiddler, for monitoring

34:13

and evaluation of your AI models. So let's actually, you know, maybe take a few more audience questions here.

34:18

There are some questions around different industries.

34:21

There's a question actually about code migration.

34:23

We touched upon it earlier in the call.

34:26

What are some of the best practices for verifying large code changes or

34:30

migrating from one language to another? This is, I think, using AI based code migration.

34:35

Karthik Bharathy: Yeah. Um, I think the specifics actually depend on the language itself, right?

34:41

Depending on whether you're looking at a more modern language or like a traditional

34:46

language, like COBOL, for example, right? Um, so I think, given that

34:53

the migration is being assisted, you want to look for, um, patterns of,

35:02

like, translation between the two systems.

35:02

Sometimes the logic may be inherently complex, so there's human in the loop,

35:07

there's assisted AI that comes into play.

35:09

Um, you should definitely try out some of the tooling that's already available.

35:13

Um, with Amazon Q, um, we recently launched the ability to

35:19

look at the system workflow end-to-end.

35:21

Um, and there are obviously pieces around security that are very

35:26

specific to the organization as well.

35:29

Um, so in terms of best practices, I believe there's also, um,

35:33

detailed documentation. Um, we can find a way to share that with you,

35:38

um, on what needs to be looked at as you do the migration.

35:45

Krishna Gade: And so, uh, there's another question on a specific industry.

35:48

Could you shed some light on business use cases within financial services

35:52

or FinOps, uh, you know, where AI observability makes sense?

35:57

Karthik Bharathy: Yeah, I think there are quite a few.

36:00

Um, you know, the top two or three that come to mind are the automated

36:05

financial reporting that I called out. Um, you know, I mentioned, uh, Moody's use case about generating

36:13

credit reports, or Cognizant's use case about mortgage compliance workflows.

36:18

Um, demand forecasting, uh, is another one, um, that's sort

36:23

of relevant in the context of, uh, financial services as well.

36:27

Um, and more generally, I would say incident management that applies

36:31

across different industries is, is also relevant as you look at more data.

36:36

And you want to uncover insights from that data.

36:40

Krishna Gade: And then another question from the insurance industry, you know,

36:43

beyond models, what recommendations on metrics would you have, for

36:46

instance, for claims processing? Can you explain specific measures you suggested to clients and share what

36:52

your assessment is of the quality improvements in business outcomes?

36:56

Karthik Bharathy: To be honest, I'm not from the insurance industry,

36:59

so I won't be commenting on that.

37:01

Um, that said, I'm happy to take that question back and come back

37:05

if we have the contact information. I don't, I don't represent the insurance industry, so I just don't

37:09

want to give out the wrong answer.

Krishna Gade: So, Priya, feel free to reach out to, uh, Karthik for further information.

37:17

Awesome. So I guess, uh, you know, finally, as we sort of get into the last

37:21

few minutes of the podcast, right? So, you know, what are some of the things, like, you know, maybe, like,

37:28

uh, sort of a lifecycle workflow, when, you know, organizations are

37:33

thinking about this? Because life is moving very fast in the last few years, you

37:37

know, you were talking about ML and all of a sudden there's generative AI.

37:40

Now there's AI agents. Like, you know, when an organization is thinking

37:44

about it, how do they go about, you know, implementing these things? What should be the priority?

37:49

What should be the best practices?

Karthik Bharathy: I think there's a playbook, if you will, right?

37:55

Certainly, there are a few common things across these different systems,

38:00

and I'm sure there will be a lot more coming in the next few years.

38:03

Um, but fundamentally, I think what has not changed is starting with data, right?

38:07

Um, I can't emphasize this enough: the better your data is,

38:12

you know, the better pretty much your AI model, the agentic system, all of

38:16

the goodness that's, that's out there. Um, so have a robust data infrastructure, quality data, um, that feeds into

38:23

your machine learning processes. Um, and if you're starting off with, um, you know, GenAI and agentic systems, uh,

38:32

I would start with one high-value use case, uh, prototype it on your business

38:37

problem, demonstrate the value quickly, um, and then, taking it to the next level,

38:43

you want to establish the, um, the necessary MLOps foundations saying, how

38:49

does monitoring play into the system? What does versioning mean?

38:52

Um, how can I go from one version to the other?

38:55

These are fundamental as you think about taking a system from just a

38:58

POC to production. These play in, and, building on that, very

39:03

relevant to the topic of today is, uh, looking at the governance frameworks.

39:07

Um, what does it mean to have a simple

39:10

approval workflow that needs to be set up as you're scaling the system?

39:15

And a lot of this, um, also requires that you invest in

39:18

your own team and train them, so they are aware of the different

39:24

elements of going live with these different systems in place.

39:30

Um, just with AWS, there are enough training and certification programs.

39:33

So those help you build your internal expertise.

39:36

And finally, um, plan for scale, right?

39:39

What worked for you when you started off with a small system may not be

39:44

applicable when you go for like a 10x or 100x of what you intend to build.

39:49

But the goodness is there are enough enterprise features in AWS,

39:54

SageMaker, in Fiddler, that help you scale as you go through this journey.

39:59

Conversely, what you want to avoid is rushing through a system quickly to

40:05

demonstrate value, not having good data or, like, a data quality approach, um,

40:12

not engaging a lot of stakeholders,

40:15

Um, and then you have very little insight into how you would do

40:19

maintenance or upgrade or deployment.

40:21

So that is a recipe for failure.

40:25

So as long as you avoid that and stand on the fundamentals.

40:28

Krishna Gade: More colloquially, don't do vibe checking, you know, vibe checking and vibe testing of your models;

40:33

you know, actually know what you're doing. That's a great point.

40:37

So I guess actually it's a very related question. Someone is asking, you know, things are moving really fast, even for

40:41

us in the technology area, right? Like, you know, what type of problems within AI agents,

40:46

in two to three years, will keep you up at night?

40:48

You know, what do you foresee?

Karthik Bharathy: Yeah, so, um, there's only so much I can

40:54

predict, but see, that's a question, you know, I ask myself every day, right?

40:58

Fundamentally, you know, I go back to, um, you know, back when I

41:03

joined AWS many, many years ago, um, there was this interesting quote by

41:08

Jeff that still resonates with me.

41:10

It's something around, "hey, what, what will not change" as opposed to saying,

41:14

"hey, what will change that part?" The second part, what will change is sort of, you know, each of us can

41:19

debate like for hours or days together.

41:22

Um, but what does not change is fundamentally customers asking for better

41:27

value, and what that translates to: something that's more performant, something

41:31

that's robust, something that's secure, or something that's cheaper. Those fundamentals are not going to change, right?

41:36

Uh, like the way Bezos put it, no one's going to come to you and say,

41:41

hey, give me something that's more expensive or slower to perform, right?

41:44

So fundamentally looking at the system, um, and seeing what value it

41:49

adds to your business use case, and what it translates to for your customers?

41:53

I think those would be paramount as you look at these, uh, innovations

41:56

that are happening in the GenAI industry.

Krishna Gade: Yeah, and there's innovation happening across

42:00

like small and big players, right? So there's, you know, a question around how, you know,

42:05

there's a lot of, you know, new AI agentic applications that are coming up.

42:09

And, you know, how do you think, like, they're playing within the,

42:12

you know, the big players in the

42:15

ecosystem, you know, who are also building agentic workflows?

42:19

Any thoughts on that? How might AWS be encouraging the ecosystem side as well?

42:24

Karthik Bharathy: Yeah, absolutely. And I think, um, one is definitely through the partners.

42:28

We work closely with, um, companies like Fiddler.

42:32

I think the second dimension to that question is, um, AWS providing

42:37

the choice to the customers, right? So there is not a single model that we say, hey, this is what you need to do.

42:42

That's something that, um, you as a customer can decide,

42:45

um, right from DeepSeek, to, um, the latest Llama models, to our own in-house Amazon Nova models.

42:53

You have all of those available to experiment and try for your use case.

42:57

I'm sure a lot of it will be applicable even in the world tomorrow, where

43:01

you have the choice of choosing the best of what's applicable for you.

43:05

Krishna Gade: Awesome. Great. I think, uh, with that, we are coming to the end of the podcast.

43:10

You know, thank you so much, Karthik for spending time with us today.

43:13

Um, you know, I think one of the things that I took away is that quote

43:16

that you mentioned that Jeff said, like, what is not going to change?

43:20

And I, I believe what is not going to change with AI is going

43:23

to be, whether it's your simple statistical model or deep learning

43:26

model or generative AI or AI agents.

43:30

You need to test it properly. You need to monitor it properly and you need to make sure it's, you

43:34

know, it's secure, and it's working for your business.

43:38

So I think that's not going to change.

43:40

Um, and so I think, you know, that's kind of where

43:44

our partnership with Amazon comes in. And so, you know, thank you so much for being on the show today and, um,

43:50

you know, look forward to, you know, more conversations in the future.

43:54

Karthik Bharathy: Thank you for having me, Krishna. This was great chatting with you.

43:56

Krishna Gade: Awesome. Thank you. Thanks everyone.
