Open Source AI with Python & Hugging Face

Attention Mechanism to Focus Model

Steve Kinney
Temporal

Lesson Description

The "Attention Mechanism to Focus Model" Lesson is part of the full, Open Source AI with Python & Hugging Face course featured in this preview video. Here's what you'd learn in this lesson:

Steve discusses using vector databases to augment prompts for models like ChatGPT or Claude, covering the process of tokenization, embedding, and finding relevant content to enhance queries. He also demonstrates visualizing semantic relationships between words using BertViz.


Transcript from the "Attention Mechanism to Focus Model" Lesson

[00:00:00]
>> Steve Kinney: With stuff like retrieval-augmented generation of prompts, effectively, that's where you take your own data, you turn it into a bunch of vectors, and you store it in a vector database, a database of numbers. And then if you were to say, hey, I wanna augment this prompt I'm sending to ChatGPT, you would use the same tokenization and embedding.

[00:00:23]
You turn your own prompt into numbers and find your own content that you've also turned into numbers. The vector database does effectively what the transformer is doing: it finds the chunks of your own text that are relevant to your prompt and then just puts that text onto the end of your prompt.
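A minimal sketch of that retrieval step, assuming the sentence-transformers library; the embedding model, the sample chunks, and the top-k value here are illustrative stand-ins, not from the lesson:

```python
# Embed your own content, embed the prompt the same way, find the
# closest chunks, and append them to the prompt as extra context.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model

# Your own data, pre-chunked; a real vector database would index
# these vectors instead of holding them in memory.
chunks = [
    "Refunds are available within 30 days of purchase.",
    "The cat sat on the mat.",
    "Support hours are 9am to 5pm Eastern.",
]
chunk_vectors = embedder.encode(chunks, convert_to_tensor=True)

query = "When can I get my money back?"
query_vector = embedder.encode(query, convert_to_tensor=True)

# Cosine similarity stands in for the vector database's lookup.
top = util.cos_sim(query_vector, chunk_vectors)[0].topk(2)
context = "\n".join(chunks[int(i)] for i in top.indices)

augmented_prompt = f"Context:\n{context}\n\nQuestion: {query}"
print(augmented_prompt)
```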

[00:00:39]
And it can automatically take in whole swaths of data. In fact, if you open up something like Claude and you hook it up to Google Drive, that's effectively what it's doing, right? It's taking your data, vectorizing it, figuring out what stuff matches, and appending it onto the query to give it more context.

[00:01:00]
So even if you're not building your own transformers, which most of you are not, you probably shouldn't be building your own vector database either. I did the other week; I can't recommend it. It runs on IndexedDB. That was pretty cool. But these are the same mechanics that vector databases use, basically creating a search engine for your own data that you then use on top of these prompts.

[00:01:23]
So that idea of turning stuff into tokens, turning tokens into IDs, and figuring out the pieces of your content that match is the same concept you can use to build your own knowledge-query systems that augment ChatGPT or Claude or Gemini. And that is somewhat of a segue to actually seeing how this attention stuff works in action.

[00:01:45]
So I am going to install some dependencies. They're nothing particularly crazy, except for this BertViz, which will visualize some of these tokens, or the embeddings, and show you how they relate. Cool, cool, cool. Here are some of the examples from previously and some of the notes that I later turned into slides, so we can move past some of that.
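In the notebook, that install cell looks something like this; bertviz is the one dependency called out in the lesson, and the rest of the list is an assumption (Colab usually ships with torch and transformers preinstalled):

```python
# Colab install cell; bertviz is the dependency named in the lesson.
!pip install bertviz transformers
```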

[00:02:13]
And so here we have "The cat sat on the mat" and "The dog played in the park." And what this library will do is basically just show you the semantic relationships between the words. And so I'll go ahead and hit play, even though it's already loaded.

[00:02:32]
But just to make my point: you can change any of these sentences in this playground. So if you don't have it open, come on over, open it up, and play around with some of these sentences, and we can see how two different sentences maybe relate or don't relate.

[00:02:48]
Cool, so right now it's going to install some dependencies for me, even though I had a chart that I could have just looked at. It's okay, it'll be here in one second. But what this will do is take two of our sentences, right? Two different strings of text that, I guess, technically aren't different lengths.

[00:03:05]
Even though I'm not sure how the tokens will work out on that. We're gonna tokenize both of them, right? We'll get the outputs, and then we're going to visualize how they relate to each other, and we'll try it out with a different one in a second.
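A sketch of what that cell is doing, following the usual BertViz head_view pattern; the checkpoint name is an assumption, since the lesson doesn't spell out which BERT variant the notebook loads:

```python
# Tokenize a sentence pair, run it through BERT with attention outputs
# enabled, and hand the per-layer attention weights to BertViz.
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

model_name = "bert-base-uncased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)

sentence_a = "The cat sat on the mat."
sentence_b = "The dog played in the park."

inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")
outputs = model(**inputs)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
# outputs.attentions is a tuple with one tensor per layer; the "Layer"
# dropdown in the rendered view selects among them (12 in bert-base).
head_view(outputs.attentions, tokens)
```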

[00:03:21]
Getting impatient; I could have just not re-hit the button. But here we are. So we're pip installing. Hold on, who doesn't like a good npm install or pip install at the beginning of trying to get anything done? So we are downloading some dependencies.
>> Speaker 2: Does it pull everything down by default every time, or is it smart enough to cache stuff if it already has it?
>> Speaker 2: Does it pull everything down by default every time or is it smart enough to cache stuff if it already has it?

[00:03:43]
>> Steve Kinney: So on the Google Colab side, if you are running the instance and you don't disconnect, it'll have this stuff. But if you start it up cold and you get a brand new VM, you will download the dependencies again. The thing that I am not doing, on purpose, is that you can actually have all the outputs saved.

[00:04:06]
So you could theoretically run all your data, produce all your graphs, all your outputs, and send it to your buddy. They will see the outputs without having to rerun the code. But if you want to run the code, you will most likely get a very fresh VM. Right now, if the VM is running and you're still connected to it, great.

[00:04:27]
But getting a fresh VM is like getting a blank slate. You will install everything all over again. Now, some libraries, like Torch and, I think, the Hugging Face Transformers library, are just baked in already. There's a Docker container where you have certain things set up already.

[00:04:41]
Some of them you will have to pull in yourself.
>> Speaker 2: What about persistent disk? Could you write dependencies there and then pull them in from the disk? Or is it...
>> Steve Kinney: My sense, from my own experience, is no, right? I've watched this. It's tricky 'cause there is a difference: if you do restart session and run all, yes.

[00:05:02]
If you go into Runtime and you hit "Disconnect and delete runtime," no. And then the question is, if I send it to you, I don't think you get my runtime, right? So, ish, right? Cool. So now we can run that and we can see. At any given point, you can hover over and see the relationship between two words.

[00:05:30]
So here we've got the two different sentences. There's not a lot of overlap between the stuff we have in "the cat sat on the mat" and "the dog played in the park." There are not a lot of lines getting drawn there at all, other than to the beginning-of-sentence and end-of-sentence separators.

[00:05:52]
However, if we change the strings a little bit, to something like thieves robbing a bank, and we run it again, obviously there's a lot more interrelationship between these tokens. This is visualizing what BERT is doing under the hood. Obviously, "the" could be related to "thieves" and "robbed."

[00:06:27]
A pretty tight relationship. But "bank," as you can see, has got a line down to "river." I understand anyone looking at this giant screen is not seeing it nearly as well as I am on mine. There is a little bit of a contrast issue here. On that screen, it becomes a little harder to see if you're in the room with me.

[00:06:48]
But if you pull up the notebook yourself and take it for a spin, you will see them with a little more clarity than shows on the screen. For those watching the livestream or later, we can see that in given cases there is a line between "bank" and "robbed."

[00:07:03]
If we look at "bank," we get a line to "robbed"; apparently "robbed" is showing up more than "river" in a lot of contexts. And you can see, basically, this is visualizing, for BERT, some of the relationships between all of the words and how they relate to each other.
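The transcript garbles the exact strings, but sentences along these lines reproduce the effect; only the inputs change, and the rest of the pipeline from the earlier sketch stays the same:

```python
# Same pipeline as the earlier sketch, with an ambiguous word ("bank")
# in both sentences; these exact strings are a guess at the lesson's.
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The thieves robbed the bank.",
                   "The boat drifted to the river bank.",
                   return_tensors="pt")
outputs = model(**inputs)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
head_view(outputs.attentions, tokens)
```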

[00:07:21]
So you can see how some of those have been correlated through the millions of knob turns over time as it tries to figure out some of that meaning. And like I said, what we're building towards is eventually we're gonna take GPT-2 Medium and we're gonna feed it a large dataset of, I think, 16,000 quotes.

[00:07:44]
And those quotes are all gonna be in a certain format. It's gonna be like, quote from Oscar Wilde, and then a colon, and then the quote, right? And as we kinda train it, we can just feed it a ton of them with the expected outputs, and we're gonna see that we can strengthen those relationships.
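As a rough sketch, a single training example in the format he's describing might look like this; the exact template and punctuation are assumptions, since the lesson only gives the general shape:

```python
# One fine-tuning example in the "Quote from <author>: <quote>" shape
# described above; the exact template is an assumption.
def format_quote(author: str, quote: str) -> str:
    return f'Quote from {author}: "{quote}"'

print(format_quote("Oscar Wilde", "Be yourself; everyone else is already taken."))
# Quote from Oscar Wilde: "Be yourself; everyone else is already taken."
```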

[00:08:03]
And all of a sudden we'll go from who-knows-what comes out when we start a sentence like that, to: even if it's a made-up quote, it will be in the format that we expect. It will have quotation marks around it. Because of the act of sending all of those things and training it to be like, this is what I want.

[00:08:21]
Nope, slap it on the wrist. This is what I want. And you can, again, tune those parameters to actually build those relationships up. Cool, so it's just a way to visualize. Try it out with some different strings: things that are related or not, words that have different meanings, so on and so forth.

[00:08:42]
And you can get a sense of it and play around with it as well. Other words with dual meanings are worth trying too. Right, I don't know if this one will work. I'm already hitting the end of my sports knowledge.

[00:09:02]
Like, I guess stealing first base and stealing diamonds are the same thing. I don't know why my mind is going to crime.
>> Speaker 2: First base.
>> Steve Kinney: What's that?
>> Speaker 2: You can't steal first base.
>> Steve Kinney: Yeah, that's true. See, this is why I can't do sports metaphors in these workshops.

[00:09:20]
Because already.
>> Speaker 2: If they don't tag you, isn't that a thing? But that's not stealing first base.
>> Steve Kinney: Okay, all right, we're abandoning the sports metaphors. I clearly can't even make one work. Like, I thought that was the safe part of this. That was my second one. Yeah.
>> Speaker 2: Steal second, third, and home.

[00:09:36]
>> Steve Kinney: All right, all right. No more sports metaphors. Play around with this. See how some of the words relate. The neat part is you can get a sense of how this model is tuned.
>> Speaker 2: What do the levels mean? Where it's at, or the layer?

[00:09:50]
The layer, 'cause I was playing around with that.
>> Steve Kinney: Yeah, I'm not sure, let's see.
>> Speaker 2: Go to, like, layer two or another layer. Yeah. And then go take a look.
>> Steve Kinney: I'm not totally sure off the top of my head.
>> Speaker 2: Okay.
>> Steve Kinney: Yep, but I do know that the visuals are a lot better, right?

[00:10:06]
It's probably as it's going through the various layers. Yeah, 'cause look at that, that's interesting. I'm not totally sure on that one. I'm guessing it's as it goes through each layer. I'm curious, if I grabbed a different model, whether we would get more layers and stuff like that.

[00:10:22]
That'd be fascinating.
