Lesson Description

The "Spring AI" Lesson is part of the full, Enterprise Java with Spring Boot course featured in this preview video. Here's what you'd learn in this lesson:

Josh introduces Spring AI and discusses how Java developers can leverage their knowledge of building robust APIs and backend applications as they integrate with AI systems. A Spring application is scaffolded with Spring AI, support for MCPs, and a Postgres vector database.


Transcript from the "Spring AI" Lesson

[00:00:00]
>> Josh Long: I wanted to save the best for last, right? This is a Frontend Masters course, it's awesome. And right now the front end is changing a lot, right? It's a little hyperbolic today, but I don't think we can argue with the idea that a lot of things are going to be done through a model, an AI model of some sort, in the near future, right?

[00:00:23]
Maybe not tomorrow, maybe not next year, but for some use cases today and for a lot of use cases, maybe next year, maybe 10 years from now, whatever. The model is becoming the new user interface, right? The client, the chat client is becoming the new user interface and we're seeing this, you know, this sort of utility balloon, right?

[00:00:41]
It's not great. AI as we know it today is not great for everything, obviously, but there's no doubt it has some good use cases. And so I think we in the Java community are well positioned to take advantage of this moment in time. Right now most of what people talk about when they talk about AI engineering is just integrating, it's calling a REST API, right?

[00:01:01]
This is not hard. Most of us are not going to be building an AI model. That's not what most people are doing when they talk about AI engineering. They're not building their own models, in the same way that most people aren't building their own SQL databases either.

[00:01:15]
They're using one that's been pre-built and stabilized and expanded and grown and all that. So if that's the name of the game, and if the integration protocol du jour is HTTP for most of these models, then there's no reason to be afraid. This is a really easy problem to solve, right?

[00:01:33]
AI is here, it's easy to use, it's easy to integrate, and arguably there is no better place to integrate it than your Java code, right? Because where's most of the business logic in most organizations of any size written these days? I'll give you a hint. It's not Python, it's Java, right?

[00:01:49]
Look at the enterprise, that's Java. The data scientists, sure, they're using Python. That's not where your business logic is, that's not what's driving the scale on your services and your systems. So what we wanna do is to make your business logic and the data that business logic guards available to the models to see if that can help.

[00:02:07]
And so, we created this project called Spring AI. So what we're gonna do is we're gonna build, we're gonna revisit our old friend Prancer and answer the question, how did we find him, right? How did we learn about Prancer? How did we, like, I want to actually go through the motions of adopting that dog.

[00:02:23]
We know how to do the adoption. We know actually how to update the adoption table, right, with that field. But we only knew about Prancer because I saw this thing on the news. This dog was so hilariously described in that advert that somebody put it out there and it went on the news.

[00:02:40]
That's how I found out about Prancer. That's not how one imagines most people discover the dogs of their dreams, or in my case, nightmares, right? They go to the shelter, they have a conversation. There's a whole process. There's paperwork, there's formality, all that kind of stuff.

[00:02:54]
So what I want to do is to build an assistant to help people adopt dogs from our fictitious dog adoption agency called Pooch Palace. Okay, so that's what we're going to do today. That's our purview, that's our mandate. So we're going to go here: desktop/talk. Okay. Init database.

[00:03:11]
Okay, I'm gonna make sure I have the dog database there. And we're going to go back here and we're going to build a brand new project: GraalVM. We're gonna use the web support. We're going to use OpenAI. Now, I'm using OpenAI. You can use whatever you want. You can use Ollama.

[00:03:26]
Ollama runs locally. It's like Docker for models, okay? It's a container. Actually, Facebook started this all, right? OpenAI is a proprietary, hosted, software-as-a-service AI model. But Facebook released an open source AI model called Llama that was so popular that, because of that, the C bindings required to run it got extracted out and standardized, and that became a container-like program called Ollama, right?

[00:03:53]
And you can use Ollama to then run any other model. There are thousands of models that you can run via Ollama, okay? And I think Docker, sort of feeling a little left out, just recently announced a feature here called Docker Model Runner. And we just announced this.

[00:04:11]
We just announced support for this in Spring AI M7 in, I don't know, what is it now? 14 days ago, 13 days ago, not even two weeks ago. Okay, so this just came out a few weeks ago in Docker and we have support for it already in Spring AI.

[00:04:26]
So you can use Docker to run models now as well. That's why I didn't ask you to install Ollama, right? You can use Docker if you want. There's infinite variety there. And this is something that's kind of different: I've used the metaphor that a model is sort of like a data source, but it's actually a little different, in that it's not unreasonable to have more than one model in the same application.

[00:04:46]
Whereas if you have, like, two Postgres databases in your application, that might be a smell, right? If you saw a program that had five different databases, I might wonder if it's maybe a little overwhelmed, a little overstuffed, you know. But in the model scenario, it's not uncommon to have more than one model.

[00:05:01]
Okay. And you can think of, just imagine the efficiencies, imagine the size differences. Some models are very sophisticated, they can do anything. Some, like Llama, I'm telling you, with Llama you could do darn near anything. It is quite good, right? Google, they have one called Gemma, which is like a lightweight one that you can run locally.

[00:05:18]
These are small models that are really, really fast and really, really good. And they can write poetry in five different languages in the same request. They can tell you about history, they can do all sorts of things, right? Just like you'd ask of ChatGPT or OpenAI.

[00:05:33]
Some models, however, are much more terse. They're lightning fast, they don't hesitate, not at all. But also, maybe they only speak a subset of English, or one language, and they don't have a full grasp of history or whatever. You know, they can't write code because they don't know anything about code.

[00:05:49]
They're optimized for certain use cases. So it's not unreasonable to have an application that uses maybe a fast model to do quick hot path analysis and then forward the request onwards down to a more full featured, full fledged model as well for final analysis. So I'm not gonna do that.
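He doesn't build that two-model setup in this lesson, but a sketch of the pattern might look like the following. Everything here is assumed for illustration: the service name, the triage prompt, and the wiring of the two ChatClient instances (in a real app you'd qualify the beans).

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

// Hypothetical sketch: triage on a small, fast model; escalate to a bigger one.
@Service
class TriageService {

    private final ChatClient fast; // e.g., a small local model (assumed wiring)
    private final ChatClient full; // e.g., a full-featured hosted model (assumed wiring)

    TriageService(ChatClient fast, ChatClient full) {
        this.fast = fast;
        this.full = full;
    }

    String answer(String question) {
        // cheap hot-path check on the fast model first
        var verdict = this.fast.prompt()
                .user("Answer only YES or NO: is this a simple question? " + question)
                .call()
                .content();
        // forward to the full-fledged model when the fast one says it's not simple
        var client = "YES".equalsIgnoreCase(verdict == null ? "" : verdict.strip())
                ? this.fast : this.full;
        return client.prompt().user(question).call().content();
    }
}
```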

[00:06:07]
But my point is you have a lot of options when you use Spring AI. We support literally dozens of different models. And by the way, some models aren't even OpenAI. Kind of like with S3: you know how there are like 500 different services that are S3-compatible, so it's not Amazon that hosts them, but they speak the same protocol?

[00:06:25]
The same has kind of become the case for OpenAI. There are dozens of different models from all over the world in different countries and all that stuff that are OpenAI compatible. That is to say, if you speak the OpenAI wire protocol, the REST API, then you can talk to their model even though it's not OpenAI.

[00:06:43]
Okay, so we're going to use OpenAI. I've got that on the classpath here. We're gonna bring in Postgres, right? We got a Postgres driver. We're going to bring in... not Postgres, we're going to bring in pgvector, actually, okay? Pgvector is Postgres plus some extra stuff, okay?

[00:07:01]
And I think that's pretty good. Does that seem right? That seems like it's okay. Okay, good. So we got pgvector. What else? We want the MCP client support, okay? We'll talk about what that is in a second as well. And I think I'm pretty happy there. We're gonna call this adoptions.

[00:07:21]
Okay, hit Enter, go here. There's my database full of doggos. Unzip adoptions.zip. Now we're going to build an application that talks to that SQL database. So same as always, same setup here: password secret, username myuser, URL jdbc:postgresql://localhost/mydatabase, application name adoptions. And we're going to build a simple controller here.
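The datasource setup he dictates there, written out as an application.properties sketch; the exact username, password, and database name are guesses from the narration:

```properties
spring.application.name=adoptions
spring.datasource.username=myuser
spring.datasource.password=secret
spring.datasource.url=jdbc:postgresql://localhost/mydatabase
```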

[00:07:59]
Just an assistant to help people adopt their dogs, to answer questions about dogs, okay? For this fictitious dog adoption agency called Pooch Palace. So we can imagine having a username, maybe. Maybe if I had set up OAuth or some sort of security context, I could just infer that from the current authenticated user, the principal.

[00:08:20]
But I did not, because I'm bad as a person, okay? And inquire, right. And then this will be a request parameter: @RequestParam, there you go. Good. And I'm going to now inject a private final ChatClient. Okay. The ChatClient is my interface, just like the RestClient, the JdbcClient, the WebClient.

[00:08:48]
You've seen a bunch of clients so far. The ChatClient is my interface to talk to a model. What model? Well, like I said, I'm going to use OpenAI. How did I connect to OpenAI? I have an API key, right? I've specified that. And there are a lot of other properties.

[00:09:01]
Like, if you're talking to actual OpenAI, that's usually enough. You specify the API key. I already did this, though, before I got on stage, my friends: I exported an environment variable and that's running in my shell, right? So just trust me that it's done, and forgive me for not leaking my API credential, okay?

[00:09:17]
You just need to remember to do that when you log in, okay? And for most, if you're using actual OpenAI, that's enough. But a lot of times you might be using something else. So notice that we have a base-url property for these different OpenAI modules, right? Which brings us to another good point.
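As a sketch, those two Spring AI settings: the key is pulled from the environment variable he exported, and the base-url line (commented out, with a placeholder host) only matters when you point at an OpenAI-compatible provider rather than OpenAI itself:

```properties
# read the key from the environment so it never lands in source control
spring.ai.openai.api-key=${OPENAI_API_KEY}
# optional: aim at an OpenAI-compatible endpoint (placeholder URL)
# spring.ai.openai.base-url=https://models.example.com
```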

[00:09:31]
What does a model provide, right? I'm assuming we're talking about chat, right? So there is the chat client, right? The chat client is an easy way to talk to a chat model. Okay, here's your chat model. But there are also transcription models, right? There are also embedding models. There's all sorts of stuff.

[00:09:58]
There's all sorts of different models that you can use. And so we're going to use the chat model. And not all models provide all those different interfaces, right? Like, embedding is orthogonal to chats. You don't need to do both to be an effective model, right? You can do just chat and you can delegate embedding to something else.

[00:10:15]
So one of the things you have to remember when you use Ollama is you might also use different Ollama models for different purposes. One for chat, one for embeddings. We'll talk about what embeddings are in a bit. But OpenAI just happens to have everything. It has the ability to do all those things.
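A sketch of that split with the Ollama starter; these property keys are Spring AI's, but the specific model names are assumptions, not from the lesson:

```properties
# one Ollama model for chat, a different one for embeddings (model names assumed)
spring.ai.ollama.chat.options.model=llama3
spring.ai.ollama.embedding.options.model=nomic-embed-text
```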

[00:10:30]
When you use the OpenAI starter, it auto-configures the embedding model, the transcription model, the image model, the chat model, et cetera, okay? So I'm going to build a chat client. This ChatClient.Builder already has a pointer to the already auto-configured chat model.

[00:10:45]
So I'm gonna use that chat client. By the way, this chat client is really good code. I think the people that wrote this code are the best, the best in the business, okay? Surely they are, and I think you will benefit from using it as well. Now, moving on, we're going to make a request here, okay?
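Pulling the wiring together so far, a minimal sketch of the controller; the class name and field name are guesses at what's on screen:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.*;

@RestController
class AdoptionsController {

    private final ChatClient ai;

    // Spring AI auto-configures a ChatClient.Builder bound to the configured chat model
    AdoptionsController(ChatClient.Builder builder) {
        this.ai = builder.build();
    }
}
```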

[00:11:00]
So I'm going to say, return this.ai.prompt(). When you make a prompt, you're asking... again, just please remember this: everything I'm showing you here is just a fluent, type-safe DSL on the almighty string concat operator. All interoperability, all integration, that's just sending strings to an AI model's HTTP endpoint.

[00:11:26]
This is not special. Don't get it twisted. This is pretty straightforward stuff. So what are you gonna send? Well, in the body of that request, you can send a user prompt, which is just a regular question. And I happen to have a question. Great. I can also send a system prompt.

[00:11:40]
We're going to do that in a bit, but not right now. I'm going to ask for the content that comes back as a string. There's the string right there. Let's go ahead and restart this now. http :8080/jlong/inquire question==... And I'm going to do a POST, question equals my name.
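The endpoint he just typed, continuing the controller sketch from above (handler and parameter names assumed):

```java
// inside the AdoptionsController sketched earlier
@PostMapping("/{user}/inquire")
String inquire(@PathVariable String user, @RequestParam String question) {
    return this.ai
            .prompt()
            .user(question) // the user prompt: just a string shipped to the model
            .call()
            .content();     // the model's reply, as a plain String
}
```

Invoked with HTTPie, that call looks something like: http POST :8080/jlong/inquire question==Josh.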

[00:12:05]
This is not actually a question, I guess, but whatever. It is Josh. Let's see if that works. Okay: Nice to meet you, Josh. How can I assist you today? Faboo, it knows me, right? We're best buddies now already. Let's just put that to the test though, right?

[00:12:21]
What's my name? In shambles, it's horrible. I have that effect on people, though. It's not. It's not the first time. It just still hurts. It's already forgotten me, okay? I just talked to it a second ago and it's already forgotten who I am. It says, I'm sorry, but I don't know your name.

[00:12:41]
If you'd like to share it, feel free. Curse you, OpenAI. It doesn't know. And this is very different from the experience that a lot of people are accustomed to when they go to ChatGPT, right? The browser ChatGPT is a chat client that talks to a model, a model that has an API.

[00:12:55]
That chat client keeps memory. It has session, right? It has state. It's durable. But the API itself doesn't. It's like Dory from Finding Nemo, right?
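Spring AI can bolt memory onto that stateless API with an advisor on the ChatClient. A minimal sketch, assuming the M7-era class names (MessageChatMemoryAdvisor, InMemoryChatMemory) and the auto-configured builder from earlier; these names have shifted in later releases, so treat the exact API as an assumption:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;

// give the otherwise stateless model calls a conversation memory
ChatClient ai = builder // the auto-configured ChatClient.Builder from earlier
        .defaultAdvisors(new MessageChatMemoryAdvisor(new InMemoryChatMemory()))
        .build();
```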
