Training a Pixel Data Model

Codesmith

Lesson Description

The "Training a Pixel Data Model" Lesson is part of the full, Hard Parts of AI: Neural Networks course featured in this preview video. Here's what you'd learn in this lesson:

Will builds a pixel detection model based on the sample data. The model is trained on two samples, one smile and one non-smile. A matrix of multipliers is built to convert the pixel data into the correct output value. This system represents the foundations of a neural network.

Transcript from the "Training a Pixel Data Model" Lesson

[00:00:00]
>> Will Sentance: Now our job, can we possibly build, just as before, can we build a converter? Can we create our converter? Everyone, what's the real name of the converter that's converting these pixels into a value? What do we call converters? Not converters, we call them?
>> Speaker 2: Models.
>> Will Sentance: Models, well done, people, that was a risk, wasn't it?

[00:00:27]
Good job, models. And we're gonna use our sample for our training stage. You see, it's exactly the same. Okay, let's get our sample in place. So we have 1 0 1, 0 0 0, 1 1 1, 1 1 0. And that's our
>> Will Sentance: Smile 1, beautiful. Then we have another element of our sample. It's a tiny sample, but it allows us to visualize it on the blackboard.

[00:01:00]
And unfortunately I've never used a computer, so I don't know how to do that. [LAUGH] There you go, Smile -1. So, we have 12 numbers, that's all it is. That's how an image is represented. That's what it is under the hood. I know to us it looks like a face.

[00:01:18]
And this of course, looks exactly like a face. But under the hood it's just pixels, dark and light, or you could say bright and dim, whichever way you wanna think about it. And then a value attached to it, Smile 1 or Smile -1, just like Refund 1, Refund -1.

[00:01:34]
Okay, and then we have to develop some way to convert these 12 numbers and their positions into 1, and these 12 numbers into -1. If we can work out that converter, then we can apply it to this image and identify 1 or -1, smile or not. Pretty cool.
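As a rough sketch of what "12 numbers plus a label" can look like in code: the pixel grids below are an assumption, reconstructed so that they agree with the walkthrough later in the lesson (the blackboard itself is not visible in the transcript). The names `smile`, `not_smile`, and `samples` are hypothetical; only the labels 1 and -1 come from the lesson.

```python
# Hypothetical 4x3 images, flattened row by row into 12 numbers
# (1 = bright pixel, 0 = dim pixel). The exact values are an assumption.
smile = [1, 0, 1,
         0, 0, 0,
         1, 0, 1,
         1, 1, 1]

not_smile = [1, 0, 1,
             0, 0, 0,
             1, 1, 1,
             1, 0, 1]

# Each sample pairs the 12 pixel values with its label: Smile 1 or Smile -1.
samples = [(smile, 1), (not_smile, -1)]

for pixels, label in samples:
    print(len(pixels), "pixels ->", label)
```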

[00:02:02]
How do we do it? Well, neural networks. For images, and really for much other labeled data, that is, data with an associated label, the game was changed by neural networks. Introduced decades and decades ago, but took off in the last twenty years. And really transformed in the last five, a little pun there, as we'll see under large language models, but also many different models where transformers are a vital part.

[00:02:33]
They can interpret images and much data better than really, anything else. To the point that data sets that have previously been used to test out how good your converter is, where you go and download a data set off the Internet to check how good your converter is, like one called MNIST, Modified National Institute of Standards and Technology.

[00:02:55]
Developed in the 80s with USPS to help better classify, or sort, their mail parcels by having a big data set of hand drawn, messy digits. And then the label for them of 7, 7, 7, 8, 8, 8, to be able to interpret zip codes potentially automatically, recognize, look at an image of a hand-drawn 7, and then spot, is that a 7?

[00:03:29]
A hand-drawn 8, is that an 8? Hand-drawn 2, is that a 2? That MNIST dataset, I mean, I'm not saying it's completely redundant now. But neural networks, modern neural networks are so good at capturing and classifying those images that you need harder benchmarking data sets to work with. And that only really was achieved in the last 10, 12, 15 at a push, years.

[00:03:51]
So there's a huge leveling up in the toolkit we have. As again, I was sort of commenting there about OpenAI, what everyone thinks, as Sam Altman said in an essay he wrote recently. We got lucky to have discovered the algorithm, the model, the converter, that somehow, better than we ever perhaps imagined possible, captures vastly complex, underlying distributions of data.

[00:04:22]
That's patterns of data. The pattern here might be what numbers and where are typically associated with not smile, what numbers and where are typically associated with smile. The underlying patterns in data, we found an algorithm, a model, a set of instructions that is better than we could ever have imagined at identifying those patterns and distributions.

[00:04:49]
And that is the neural networks, and particularly, transformers. And that is quite extraordinary. So, how does it work? Here it is. It's probably gonna be best for us to just give it a go. Convert pixels, 12, how many pixels are here, Kayla?
>> Kayla: 12.
>> Will Sentance: Yep, into output label, what are our options here, Ben?

[00:05:15]
>> Ben: 1 or a -1.
>> Will Sentance: Spot on. Multiply each pixel number by a different number.
>> Will Sentance: By a different number. Multiply each pixel by a different number. So we will go, this one by whatever this number is. This one by whatever this number is. This one by whatever this number is.

[00:05:38]
This one by whatever this number is. But every number, and then add up the total. So multiply by a grid of numbers and add up the total. Compare the total to the actual target output label value, Smile 1 or -1. Adjust the numbers we multiply by until the group of pixels labeled Smile 1, the multiplied pixels add up to number 1, and the ones labeled -1, the multiplied pixels add up to -1.
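The "multiply each pixel by its own number, then add up the total" step can be sketched as a small function. This is a minimal illustration, not the course's code: the name `convert` and the pixel values are assumptions, and the multipliers start at 0 purely as a placeholder.

```python
def convert(pixels, multipliers):
    """Multiply each pixel by its own multiplier and add up the total."""
    return sum(p * m for p, m in zip(pixels, multipliers))

# Hypothetical 12-pixel image (assumed values) labeled Smile 1.
pixels = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
target = 1

# Placeholder multipliers: one per pixel, all 0 to begin with.
multipliers = [0] * 12

total = convert(pixels, multipliers)
print("total:", total, "matches target:", total == target)
```

With every multiplier at 0 the total is 0, so it does not match the target yet. That is the signal to adjust the multipliers.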

[00:06:09]
And that gives us a grid of multipliers.
>> Will Sentance: That's gonna be it. [LAUGH] That is it, we're gonna see. That is it. So we have our, as before, target output. There we go. Now we have our converted output. All right, Joe, target output for our first set of 12 pixels, 1s and 0s, the target output is what?

[00:06:41]
>> Joe: 1.
>> Will Sentance: 1, spot on, Joe, target output, the other one?
>> Joe: -1.
>> Will Sentance: -1, I'm still getting used to the negative 1 versus minus one. -1, exactly. We're gonna multiply each pixel by a different number, add up the total, and then we're gonna compare the total to the actual output label that was given by, I don't know, someone working at Scale AI or just us looking at it.

[00:07:03]
And then adjust the numbers we multiplied until the pixels labeled Smile 1, when multiplied, add up to give 1, and the pixels labeled -1, when multiplied, add up to give -1. I've got a little clue here, we can ignore the pixels where the numbers are the same in both of the images, both the one that's labeled Smile 1 and the one that's labeled Smile -1.

[00:07:23]
Because they're not gonna have any effect, I wanna try and have the multiplied numbers add up to 1 and -1. So if the numbers are the same in these, it's not gonna have any effect what we multiply it by. So we might as well just have those multiplied by 0.
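The shortcut above, putting 0 wherever the two images have the same pixel, can be sketched by comparing the images position by position. The pixel grids are assumed values chosen to agree with the walkthrough, and `differing` is a hypothetical name.

```python
# Assumed pixel grids, flattened row by row (4 rows of 3).
smile     = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
not_smile = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1]

# Positions where the two images disagree. Only these can pull the two
# totals apart, because a shared pixel adds the same amount to both.
differing = [i for i, (a, b) in enumerate(zip(smile, not_smile)) if a != b]

# Put 0 wherever the pixels match; leave the rest unknown (None) for now.
multipliers = [None if i in differing else 0 for i in range(12)]

print("positions still to work out:", differing)
print(multipliers)
```

For these assumed grids only two positions differ, which is why the puzzle reduces to finding two multipliers.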

[00:07:42]
So let's do it. We've got our multipliers, here they are. If we know that for 1 0 1, is that the same in the top row of the both images? Ben, are they the same? Yeah, next row?
>> Ben: Yes.
>> Will Sentance: Yes, next row? Not quite. We got the, these ones are the same.

[00:08:02]
So let's put zeros wherever it's the same. So 0, 0, Ben, what's next?
>> Ben: 0
>> Will Sentance: 0
>> Ben: 0
>> Will Sentance: 0
>> Ben: 0.
>> Will Sentance: Everyone together.
>> Speaker 2: 0, 0, 0-
>> Will Sentance: I wanna hear everybody.
>> Ben: 0, 1-
>> Will Sentance: No, [LAUGH] that's also not gonna be, so we don't know this one yet, do we?

[00:08:24]
We don't know this one yet because we need to work out what these multipliers are gonna be. We know confidently that if the pixels are exactly the same, if there's no difference between the pixels for the image labeled Smile 1 and the one labeled -1, then we can just put 0.

[00:08:45]
So 0 here, what about this one here, everybody?
>> Ben: 0.
>> Will Sentance: 0, what about this one here, Kayla?
>> Kayla: 0.
>> Will Sentance: 0, but these are different. Okay, so we won't put 0 on those. Our goal, multiply all these numbers by all these, multiply all these numbers by all these.

[00:09:04]
So we're gonna multiply all these numbers by my little grid of numbers, and then all of these by my same little grid. I'm just, [LAUGH] these numbers, there they are. And then I'm going to
>> Will Sentance: Add them up. So let's go. Maria, 1 multiplied by 0 equals what?

[00:09:33]
>> Maria: 0.
>> Will Sentance: Excellent, Maria, you got us started, you keep us going, 0 by 0?
>> Maria: 0.
>> Will Sentance: 1 by 0?
>> Maria: 0.
>> Will Sentance: 0 by 0?
>> Maria: 0, 0, 0.
>> Will Sentance: 1 by 0?
>> Maria: 0.
>> Will Sentance: We don't know this one yet, we'll hold on it. 1 by 0, Maria?

[00:09:52]
>> Maria: 0.
>> Will Sentance: 1 by 0?
>> Maria: 0.
>> Will Sentance: We don't know the next one, 1 by 0?
>> Maria: 0.
>> Will Sentance: And it all sums up to what?
>> Maria: 0.
>> Will Sentance: 0, okay, let's make sure we're tracking, as before, how accurate our set of multipliers is at converting our pixels to our output, Smile 1 or Smile -1. Is 0 equal to 1?

[00:10:21]
>> Maria: No.
>> Will Sentance: No, no, so out of two so far, not accurate. Let's give Maria a hand for saying zero so many times without stopping me. All right, let's get converting our next image's pixels with our multipliers. Who's gonna be the lucky person? Jamie, you're up, 1 by 0?

[00:10:47]
>> Jamie: 0 0 0, 0 0 0.
>> Will Sentance: Yeah.
>> Jamie: 0, unknown, 0. 0, unknown, 0.
>> Will Sentance: Beautiful, comes to?
>> Jamie: 0.
>> Will Sentance: Is 0 equal to -1? We want this converter, this model, to successfully convert our pixels into this label so that we know that if the population mirrors the sample of what makes a smile 1 and what makes a smile -1, not.

[00:11:17]
Then we can apply that same converter to convert these pixels into 1 or -1.
>> Will Sentance: How's it doing so far, Jamie? Is 0 equal to -1? It's definitely not. So what's our accuracy?
>> Jamie: 0.
>> Will Sentance: 0 out of 2, okay. Raise your hand if you can already spot what our multipliers need to be here to get it to fully convert?

[00:11:43]
Well, good, I will say this, it's not easy, because you gotta kinda, it's a little puzzle. Well, let's try and do it. We know that this one here, we want it to be what, Ben? We want this to sum to what?
>> Ben: 1.
>> Will Sentance: 1, cuz if it does, then we've successfully converted these pixels into number 1.

[00:12:00]
And we want this one, Ben, to sum to what?
>> Ben: -1.
>> Will Sentance: -1. If we want this one to sum to 1, Ben, we know that we've got a 0 here, so what's this only ever gonna be, this value, whatever the multiplier?
>> Ben: It will only ever be 0.

[00:12:13]
>> Will Sentance: Yeah, whatever we have as a multiplier for it, it's multiplied by 0, Ben, so it's 0. If we want this to sum to 1, what do we need our multiplied pixel to be here?
>> Ben: 1.
>> Will Sentance: 1, beautiful. Which means our multiplier is going to need to be what?

[00:12:33]
>> Ben: 1.
>> Will Sentance: 1, and we might be like, is that gonna break the other one? Why is it not gonna break the other one, Ben?
>> Ben: Because the value is 0.
>> Will Sentance: Multiplied by 0, it's not gonna matter either. Sum of all this now is what, Ben?
>> Ben: 1.

[00:12:46]
>> Will Sentance: 1, does our total equal our target output? It does, yay, yay! What's our accuracy now, Ben?
>> Ben: Half.
>> Will Sentance: Half, fine, 1, exactly, 1 out of 2, beautiful. Okay, the converter's getting better, people. Okay fine, yay! Well done Ben! Well done Ben!
>> [APPLAUSE].
>> Will Sentance: Converter's getting better.

[00:13:12]
Now, a bit harder, let's have Paul. Paul, we want this one to sum to -1, what are we gonna need the value here to be? The multiplied pixel up here on this set of pixels.
>> Paul: -1.
>> Will Sentance: -1, spot on. What does that imply for our multiplier there, Paul?

[00:13:32]
>> Paul: The same.
>> Will Sentance: The same, it's gonna need to be -1, 1 multiplied by -1 equals -1. So, Kayla, what's our sum now for this set of multiplied pixels, for the second image?
>> Kayla: -1.
>> Will Sentance: -1, amazing. Is -1, Kayla, equal to -1? You bet it is. Have we successfully converted this set of 12 pixels into our target of -1, Kayla?

[00:13:59]
>> Kayla: Yep.
>> Will Sentance: You bet, we have. What's our accuracy, Kayla?
>> Kayla: 2 out of 2.
>> Will Sentance: How do I maintain this enthusiasm about this basic math? [LAUGH] Do we get 2 out of 2? You bet we did, exactly, a woot is in order.
>> [APPLAUSE]
>> Will Sentance: Well done, well done, well done, well done.
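The progression from 0 out of 2 to 2 out of 2 can be replayed in a few lines. This is a sketch under assumptions: the pixel grids are reconstructed to agree with the walkthrough, and `convert` and `accuracy` are hypothetical helper names.

```python
# Assumed pixel grids, flattened row by row (4 rows of 3).
smile     = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
not_smile = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
samples = [(smile, 1), (not_smile, -1)]

def convert(pixels, multipliers):
    """Multiply each pixel by its multiplier and add up the total."""
    return sum(p * m for p, m in zip(pixels, multipliers))

def accuracy(samples, multipliers):
    """Count how many samples are converted exactly to their label."""
    return sum(1 for pixels, label in samples
               if convert(pixels, multipliers) == label)

multipliers = [0] * 12
print(accuracy(samples, multipliers), "out of", len(samples))  # 0 out of 2

# Position 10 is bright only in the smile image: a multiplier of 1 makes
# the smile total 1 and leaves the other total untouched.
multipliers[10] = 1
print(accuracy(samples, multipliers), "out of", len(samples))  # 1 out of 2

# Position 7 is bright only in the non-smile image: a multiplier of -1
# makes that total -1 and leaves the smile total untouched.
multipliers[7] = -1
print(accuracy(samples, multipliers), "out of", len(samples))  # 2 out of 2
```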

[00:14:19]
We have successfully produced a set of multipliers now. Ben, you were saying you probably spotted it was -1 and 1 immediately, right?
>> Ben: Yeah.
>> Will Sentance: Yeah, there's a little puzzle, but here's the thing, we'll see in a moment. Hooray, well done Ben, well done all of us, but I'm pretty sure that if we have more multipliers and anything more than two sample images, it's gonna be pretty damn impossible.

[00:14:44]
Even harder than a decision tree. Good, well, we'll see what these numbers might mean in a second, but it's a lot more obscure, isn't it? So we multiplied each pixel by a different number. We then added up the total, I should have written, sum. What's our favorite symbol for all?

[00:15:00]
>> Speaker 2: Upside down A.
>> Will Sentance: Upside down A, sum all. And then compare the total to the actual output label value, 1 or -1. The actual being labeled by a person in the sample, by Scale AI or someone like that. Adjusting the numbers we multiply each pixel by, we kind of implicitly had 0 and 0 here, didn't we?

[00:15:17]
If it was adding to 0, we had kind of implicitly, 0 and 0. We then adjusted to 1 and -1 to ensure that the pixels labeled Smile 1, those 12, when multiplied by those multipliers, added up, gave a total of 1, matching Smile 1. And the pixels labeled -1, when multiplied, these pixels, and added up, gave a total of -1.
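In the lesson the multipliers are adjusted by hand. Purely as an illustration of "adjust until the totals match", here is one simple automatic rule, which is an assumption and not the method used in the class: nudge each multiplier in proportion to the gap between target and total. The pixel grids, the step size of 0.1, and the 500 passes are all assumed values.

```python
# Assumed pixel grids, flattened row by row (4 rows of 3).
smile     = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
not_smile = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
samples = [(smile, 1), (not_smile, -1)]

def convert(pixels, multipliers):
    """Multiply each pixel by its multiplier and add up the total."""
    return sum(p * m for p, m in zip(pixels, multipliers))

multipliers = [0.0] * 12
step = 0.1  # assumed step size; small enough for this tiny example to settle

for _ in range(500):
    for pixels, label in samples:
        # How far the current total is from the target label.
        error = label - convert(pixels, multipliers)
        # Nudge only the multipliers of bright pixels (a dim pixel is 0,
        # so its multiplier cannot change the total).
        multipliers = [m + step * error * p
                       for m, p in zip(multipliers, pixels)]

for pixels, label in samples:
    print("target:", label, "total:", round(convert(pixels, multipliers), 3))
print([round(m, 3) for m in multipliers])
```

For these assumed pixels the loop settles very close to the hand-picked values: roughly 0 on the shared pixels, and roughly -1 and 1 on the two positions that differ.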

[00:15:43]
Meaning we now have a grid of multipliers. What's a grid of numbers called, people, in math?
>> Ben: Matrix.
>> Will Sentance: Matrix, exactly. Matrix, what's it called when we multiply matrices together? We were multiplying a grid of numbers by another, what's that type of math called?
>> Paul: Linear algebra.

[00:16:02]
>> Will Sentance: Linear algebra, that's the one from Paul. Linear algebra. It's the math of the rules associated with multiplying grids of numbers by other grids of numbers. Things like, they better be the same dimension, I can't multiply a 4 by 4 grid by a 3 by 4 grid. And there are lots of effective ways to multiply matrices.

[00:16:25]
It's at the core of neural networks. It's why, actually, some chips are designed specifically for, as we'll see in a moment, working out the best multipliers, perfectly optimized for handling matrix multiplications. Because at the core of training, as we'll see in a second, whatever this is, we'll say this model, is matrix multiplication, or linear algebra.
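To connect the multiply-and-add step to matrix multiplication, here is a small plain-Python sketch. The helper `matmul` and the pixel values are assumptions for illustration. For standard matrix multiplication the rule is that the number of columns in the first grid must equal the number of rows in the second, which is why a 4 by 4 grid cannot be multiplied by a 3 by 4 grid.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must equal rows of b")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# The 12 pixels as a 1 x 12 row (assumed values for the smile image).
pixels_row = [[1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1]]

# The 12 multipliers as a 12 x 1 column.
multipliers = [0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 1, 0]
multipliers_col = [[m] for m in multipliers]

# (1 x 12) times (12 x 1) gives a 1 x 1 result: the same total as
# multiplying each pixel by its multiplier and adding everything up.
print(matmul(pixels_row, multipliers_col))

# A 4 x 4 grid times a 3 x 4 grid is rejected: 4 columns versus 3 rows.
try:
    matmul([[0] * 4 for _ in range(4)], [[0] * 4 for _ in range(3)])
except ValueError as err:
    print("cannot multiply:", err)
```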
