AI Agent: From Prototype to Production

Using Human in the Loop

Scott Moss

Scott Moss

Superfilter AI
AI Agent: From Prototype to Production

Check out a free preview of the full AI Agent: From Prototype to Production course

The "Using Human in the Loop" Lesson is part of the full, AI Agent: From Prototype to Production course featured in this preview video. Here's what you'd learn in this lesson:

Scott explains why Human in the Loop (HITL) is a critical safety pattern in AI systems where specific actions require explicit human approval before execution. Approvals can be synchronous, asynchronous, or tiered. Several factors for designing the approval flow are also discussed.

Preview
Close

Transcript from the "Using Human in the Loop" Lesson

[00:00:00]
>> Scott Moss: Let's keep it moving. So no code here, cuz we're gonna implement that here. And Human in the Loop, this is where it gets kind of fun. So you'll see this a lot, HITL, that means Human in the Loop. You will see that a lot, understanding Human in the Loop, basically, it's basically a safety mechanism, right?

[00:00:18]
This is how we prevent AIs from pressing the nuke button, we have to say yes before they press the nuke button. That's essentially what this is. By having a Human in the Loop, we prevent the AIs from doing destructive things. So what are destructive things? Anything that's not a query.

[00:00:34]
If like if you think about CRUD, the only thing that's non-destructive in CRUD is R, read, everything else is destructive, creating, updating and deleting. We don't want the AI creating anything, updating anything or deleting anything without our approval. And I think from the bubble that I live in San Francisco, that is what everybody's moving towards too, of like, well, at first we thought AI was gonna automate everything and be really good.

[00:01:00]
Now we're like, we don't wanna take the human element out of it. We don't wanna get rid of humans. We really want humans to be in here and have full control. We just want the AI to augment them. So it's like pro-Human in the Loop now. It's like, that's what everyone's doing.

[00:01:13]
I wouldn't be surprised if it's gonna be startups that only do this. So this is a thing. And the way that that services from a UX pattern is mostly like an approval. Do you approve the AI of doing this thing? Yes or No, essentially. So it's something that we built into our system.

[00:01:29]
I don't know, well, I'll take that back. I do know of other ways that people do it. I haven't seen people do in other ways. I just know that they do it that way, and I'm not sure how many people do it the way that I'm gonna be showing you today.

[00:01:43]
But Human in the Loop is basically an approval system. So it's really cool because it protects you from a legal standpoint. If you're creating AI, what can come to you like, your AI like just spent all my money. And it's like, well, you would have to have approved that, because it doesn't do it unless you have an approval, just like a person wouldn't, right?

[00:02:02]
So you have that, but also, yeah, it can save you money as well if the things are just like a runaway AI, but also, yeah, safety mechanism to make sure, imagine if you had an AI that had a tool that had delete access to your SQL production database.

[00:02:19]
For some reason, you made that, I don't know why. And then there was no approval, and then it hallucinated and somebody asked it to get ice cream, but it did that instead. Okay, you just dropped your whole database. So having that approval is pretty necessary in my opinion.

[00:02:37]
Cool, so several ways to implement a human in a loop. So there's the synchronous approval, which is what we're gonna be doing. So this is basically, there is a direct pause in the execution. Nothing can go forward. The very next thing you say is gonna be considered a response to this approval.

[00:02:56]
There's nothing else you can do. You're completely blocked. That's what we're gonna be doing. I think that's the simplest way. But typically, the way that looks is, agent decides to use a protected tool or not. We're actually not gonna do that. So we're not even gonna teach the agent what tool needs to be protected or not.

[00:03:14]
We're just gonna intervene ourselves deterministically, because it's gonna happen 100% of time if we do it that way. If we try to teach the agent which tool needs to be protected through system prompts and tool descriptions, you've got to eval that. It may or may not decide that this tool that you're trying to run needs an approval, and then it will still do things on accident, and you still might need to manually intervene.

[00:03:40]
>> Audience 1: It's the protections of the approval if it has to decide if it should-
>> Scott Moss: Exactly, if the AI has to decide if he's an approval, what happens if it doesn't decide that? I guess there's a benefit of the AI knowing that there is this concept of approvals.

[00:03:57]
So you might be able to do that like, hey, there's this concept of approvals. You don't have to decide the approvals are, just letting you know if you see a message saying not approved from a tool call response, just know that, like the user didn't approve that. So that way it knows and it can respond and in a way where it's not confused, because if you don't tell it that, and then you respond back from a tool call and you're saying it's not approved, they might be like, oops, couldn't do this, when, in reality, it had no concept of an approval.

[00:04:22]
If you told it there was an approval, it might be like, sorry, I see you didn't approve that, my bad, right?
>> Audience 1: So just maintaining the continuity of the conversation [CROSSTALK].
>> Scott Moss: Exactly, it's like, yeah, you got to loop this person in. Otherwise they know what the hell's going on.

[00:04:34]
It's literally like talking to a person. System pauses, the human reviews are approved. The best thing about this system is that your AI could send you an approval and then you don't interact with it for a year and you come back and it's still waiting on that approval, the very next thing you say is still gonna be that approval.

[00:04:52]
So there is no expiration time on this. And that can be good or bad, depends on your system, but I think that's what's cool about it. And then the system either executes the action if it's approved, or it does not do the action if it's not approved. And then the agent just continues the conversation.

[00:05:10]
It's like, cool, wasn't approved, here's your answer. Or, it was approved, did it, got the data, continued to loop, here's your answer, right? So you have that, asynchronous queue, a little harder to do, but this is for longer running processes or systems with multiple approvers. You have multiple people that have to approve stuff.

[00:05:30]
Agent submits an action to an approval queue, right? The system continues, and you still can talk to your agent. You can still do whatever you want because there's no approval queue over here. This work that needs to be approved to be done is irrelevant of the conversation you're having with your agent.

[00:05:46]
They're two different things. It's just like, like for instance, a document signing request. Maybe if you had a tool that linked with Docusign or something like that, you needed someone to sign the document. So your agent called that tool, but it needed to get an approval from three other people on your team.

[00:06:00]
You're gonna put that on the approval requests and you can still talk to your agent. It's not gonna block you from doing anything like the synchronous one would, because you have nothing to do with this document, or maybe you already signed, or whatever. So, the approvers process that queue at their own convenience.

[00:06:18]
So whatever system you have for letting people know that they need an approval, whether you email them, whether they also have agents that your agent talks to, whether they get a text message, whether it's in your app. However, they're gonna approve it when they get to it. It has nothing to do with you.

[00:06:32]
And then once it is approved, it happens automatically. It gets completed, and then the results get fed back into the context of the original agent that called the message. So basically it gets fed back into the system prompt, this approval message that I created a while ago, has since then been approved.

[00:06:54]
And I see that now. So it might respond back in its next message. I saw that you approved the thing. Or so and so signed the thing, just letting you know here's an update, right? So it can get pretty powerful, if you think about it, how that works.

[00:07:09]
So you got that, this tiered approval system. I put this in here because I thought it was kind of crazy, and we tried to do something like this, but we didn't really need to. But this is kind of like the same thing, except for there's different tiers. So low-risk stuff happen automatically because it's like yes, it's a destructive action, but I don't really care.

[00:07:29]
For instance, I wouldn't care if I had a tool where my agent wrote to my notion document. I wouldn't care, you can do it, you don't need my approval for that. Just please do it, stop asking me for that, but please do not send that email off without approving that.

[00:07:43]
So that's a little more medium to high risk. So please don't do that, right? So you can have these different tiers. So a high risk one might be like writing to the database. Two people have to approve that, like a code review. I send it off, it goes to approval or queue.

[00:07:58]
It has a list of people for when this tool is created. It looks up like, here's a list of people who are authorized to approve this. Grab their information and send it however you wanna send it. Email, text message, Slack. However you send your approvals, once they approve it, then it gets executed, right?

[00:08:14]
So there's a way to do that. You can even do like roles, departments. This thing got created, go look up. What role or department's responsible for this? Go look that up. Message everybody in that role or department, there's literally nothing you can't do, which is kind of crazy.

[00:08:31]
How do we design approval flows? It's basically only two things. What action do we want an approval for? How can we detect when that action is being ran? In our case, stop it. And then the next thing is making sure the agent doesn't break. Because if a tool call was made, then if we save that in the database, the very next message has to be a tool response.

[00:08:55]
But if we're not gonna save that message because we need to get an approval, then we discard it. So then what do we say to the user now? So we gotta figure out, do we make another message? Do we ask them a question? How does that work? And then when you come back, you need to check to make sure that they didn't leave off an approval state.

[00:09:14]
If they're left off in an approval state, a pending approval, you have to satisfy that, you also need to check, the user can respond in any way. Like in our example, it's a chat app, so there's no button that says yes or no. They can say, NVMD, nevermind.

[00:09:29]
The hell does that mean? We don't know what that means. So you have to run that through an LLM to see what the intent of their approval was and then figure out if they left off in an approval state or not. And if they did, write the appropriate message type to your database, in this case, would be a tool response.

[00:09:47]
And then the loop continues. So there's there's a lot of work that you have to do to make sure that it's not broken and it's a good user experience, at least in our use case.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now