
Lesson Description
The "Validating Weight Accuracy" Lesson is part of the full, Hard Parts of AI: Neural Networks course featured in this preview video. Here's what you'd learn in this lesson:
Will updates the model weights to reflect 100 iterations of gradient descent. A new image is added to the sample and used to validate the current state of the model. The weights achieve a 93% success rate and a 7% error rate.
Transcript from the "Validating Weight Accuracy" Lesson
[00:00:00]
>> Will Sentance: Let's just check we trust that. Let's check that it's actually working by updating our weights. So this is after a hundred rounds of tweaking. We did one round, and it got a lot better. Well, not quite so much better with the sigmoid adjustment, but people, whereas before we could never find a perfect set of weights to match 1 to 1, -1 to -1, -1 to -1, 1 to 1.
[00:00:30]
Now we're able to do so because we're dealing with something closer to a smooth curve towards 100% and 0%, through the sigmoid function. And it's also nicely giving us something much more familiar than 1 and -1: 100% for smile, and 0% for not. And it mirrors, roughly, a sense of the confidence we have in that prediction.
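For concreteness, here's a minimal sketch of that sigmoid conversion in TypeScript, assuming the standard logistic function 1 / (1 + e^-x), which matches the percentages quoted throughout this lesson:

```typescript
// Sigmoid squashes any weighted total into 0..1, read as 0%..100%.
const sigmoid = (total: number): number => 1 / (1 + Math.exp(-total));

console.log(sigmoid(5.2));  // ~0.99 -> very confident it's a smile
console.log(sigmoid(-2.3)); // ~0.09 -> very confident it's not
```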
[00:00:54]
So, let's get validating. Let's use these multipliers as updated after a hundred rounds of tweaking. And you can see in the very first round of tweaking, we jumped all the way from 0s to 1s. After 100, we've taken a lot of smaller steps to get there, but we got there. So help me out here, people, what are our weights?
[00:01:16]
Just help me out so I can read them off there, and you can start practicing the math. Top left?
>> Speaker 2: -0.5.
>> Will Sentance: Yeah, -0.5, beautiful, top right.
>> Speaker 2: Same.
>> Will Sentance: -0.5, brilliant. Second row up from the bottom: left is 2.2, and the right one is 2.2 as well, and then the middle one is?
[00:01:39]
>> Speaker 3: -2.5.
>> Will Sentance: -2.5, everyone getting their math ready. Bottom left, -1.6, then 1.9,
>> Speaker 3: And -1.6.
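For reference while following the arithmetic below, here's that board of weights as data, a minimal sketch. Four of the twelve multipliers are never read aloud in the transcript; they're entered as 0 placeholders here (an assumption of this sketch), which doesn't affect the two calculations worked through below, because they sit under unlit pixels:

```typescript
// The 12 multipliers after 100 rounds of tweaking, laid out as the
// 4x3 pixel grid. Weights marked "not read out" are 0 placeholders.
const w = [
  -0.5,  0.0, -0.5, // top row (middle not read out)
   0.0,  0.0,  0.0, // second row (not read out)
   2.2, -2.5,  2.2, // second row from the bottom
  -1.6,  1.9, -1.6, // bottom row
];
```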
>> Will Sentance: Okay, it's time for some validation, and I've not used my board space well enough. So firstly, before we get validating with another sample, let's just look at what our outputs are from multiplying here.
[00:02:15]
I guess we could prove it to ourselves. Let's do a rough back-of-the-envelope calculation. Well, let's do it for the easiest one, because it's only got five numbers you want to multiply. So, we run our flat face, which should have an output of 0%, through the final multipliers after 100 iterations.
[00:02:34]
1 by -0.5, everyone together?
>> Speaker 3: -0.5.
>> Will Sentance: Yeah, plus 1 by -0.5, so -1 in total. Let's keep the running total: -1 plus 1 by -1.6, now it's -2.6. Then -2.6 plus 1.9 multiplied by 1, that gives us negative, everyone's arithmetic, -0.7, right? Yes, plus -1.6. -0.7 plus -1.6 comes to what, everybody?
[00:03:11]
>> Speaker 3: -2.3.
>> Will Sentance: Is that right? Yes, did you see the answer? Good, -2.3, exactly. So for this one, our total from adding them all up, which we then put into sigmoid, is -2.3. What's -2.3's output?
>> Speaker 3: Less than 10%?
>> Will Sentance: It's less than 10%, exactly, and what is it, roughly?
[00:03:34]
>> Speaker 3: 9
>> Will Sentance: 9%, beautiful. Let's do it in green because it's sigmoid: it's 9%. Is 9% equal to 0%? No, but it's close; that's the symbol for roughly, I'll take it. And then we'll just do the other ones. What's the output of running our new multipliers on this set of 12 pixels? Anyone see what the total output is?
[00:04:03]
>> Kayla: 5.2
>> Will Sentance: 5.2, excellent, Kayla. 5.2, which is roughly what through sigmoid, Kayla?
>> Kayla: 99%
>> Will Sentance: 99%, which is very much roughly 100%. Okay, 2 out of 4. Let's do it for this sad face here, it comes to roughly what, can we see on there? Negative?
>> Kayla: 2.4.
[00:04:31]
>> Will Sentance: -2.4, through sigmoid we get?
>> Kayla: 8.
>> Will Sentance: 8%, which is good enough, roughly 0%, we'll take it. And then the top one, we have what, Kayla?
>> Kayla: Two.
>> Will Sentance: Two, which comes to?
>> Kayla: 89.
>> Will Sentance: 89%, just below 90% as I said, just below the 90%, 89%, which is good enough [LAUGH], roughly 100% for our model.
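Here's that back-of-the-envelope check as a runnable sketch, reusing the weights `w` from above. The flat face's lit pixels, the two top corners plus the whole bottom row, are inferred from the five multiplications just walked through, and `predict` is an illustrative name, not one from the course:

```typescript
// Weighted sum of the 12 pixels, then sigmoid: the whole model.
function predict(pixels: number[], weights: number[]): number {
  let total = 0;
  for (let i = 0; i < pixels.length; i++) {
    total += pixels[i] * weights[i]; // pixel times its multiplier
  }
  return 1 / (1 + Math.exp(-total)); // sigmoid squash into 0..1
}

// Flat face: top corners lit plus the whole bottom row lit.
const flatFace = [1, 0, 1,  0, 0, 0,  0, 0, 0,  1, 1, 1];
console.log(predict(flatFace, w)); // total -2.3 -> ~0.091, roughly 9%
```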
[00:05:05]
Well, let's switch to using a percentage now: we have accuracy of 93% on average, and an error, how far off are we, of 7%. So that's how far off we are in each case, as opposed to, is it right or wrong, because we're no longer just right or wrong.
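Those four readings are where the 93% and 7% come from: average how far each sigmoid output lands from its 0% or 100% target. A quick sketch using the rounded percentages read off above:

```typescript
// Per-image results: sigmoid output vs. the 0%/100% target.
const results = [
  { output:  9, target:   0 }, // flat face
  { output: 99, target: 100 }, // smile
  { output:  8, target:   0 }, // sad face
  { output: 89, target: 100 }, // smile
];

// Error is how far off we are on each case, averaged over the sample.
const avgError =
  results.reduce((sum, r) => sum + Math.abs(r.output - r.target), 0) /
  results.length;

console.log(avgError);       // 7.25 -> roughly 7% error
console.log(100 - avgError); // roughly 93% accuracy
```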
[00:05:27]
So we're very, very confident that the top and the bottom are smiles, and we're very confident that the middle two are not. Our model has done a really good job of converting our samples, such that we can now use it on another image, or even check it with our validation data.
[00:05:48]
But look at this, people, look how different the values are. It's 5.2 on this lower image, and it's 2 on this upper image before we run it through sigmoid, where they become 99% and 89%, both quite confident. Now we have a measure of our confidence. This is very much a smile, very, very much a smile; this is quite a smile, which sorta makes sense.
[00:06:11]
It's got stuff going on that looks a bit like the non-smile, whereas this one has none of the stuff going on that looks a bit like the non-smile. So now we're almost there, people, let's validate. So, we're doing well enough that we can validate on a new image.
[00:06:27]
I'm sorry, I'm gonna squeeze in here, it's very, very messy of me to do validation here. Okay, new sample: we've got 1 0 1, 0 0 0, 1 1 0, 1 1 1. There it is. Is that a smile? That is not a smile, so we label it not-smile. But we're gonna run our multipliers. Everybody, it's arithmetic time, run our multipliers.
[00:06:59]
[SOUND].
>> Will Sentance: Run our multipliers.
>> Will Sentance: There it is, run our multipliers through, and what do we get, everybody ready, Maria, it's your moment.
>> Maria: The first one?
>> Will Sentance: Yeah, give me the go ahead, sorry, Maria.
>> Maria: -0.5.
>> Will Sentance: Yeah, beautiful.
>> Speaker 2: 0.
>> Will Sentance: 0, -0.5, yep, then we got a lot of 0s and then?
[00:07:25]
>> Maria: -2.2.
>> Will Sentance: +2.2, plus, no, negative-
>> Maria: 2.5
>> Will Sentance: Yeah, -2.5 + 0.
>> Maria: And then -1.6.
>> Will Sentance: Well done.
>> Maria: +1.9, -1.6.
>> Will Sentance: Okay, math time [LAUGH].
>> Speaker 3: -2.6, I have a calculator.
>> Kayla: [LAUGH]
>> Will Sentance: Outrageous, yeah, -2.6. Honestly, you saved us all a lot of time.
[00:07:56]
>> Speaker 3: None of us need to do that, we went through grade school.
>> Will Sentance: This is very important material.
>> Speaker 3: Sorry [LAUGH].
>> Will Sentance: There we go, run it through, and what's the output of that?
>> Speaker 3: Less than 10%?
>> Will Sentance: Yeah, maybe even less, yeah, I think it's probably about 7%.
[00:08:14]
Look at that, we know it's not a smile, we gave it 0%, and our multipliers through sigmoid have given us 7%. I would say that's the error there for our validation data, albeit we only have one image in our validation set. The target is 0%, we're seven off that, so our error is therefore 7%.
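Reusing `predict` and the weights `w` from the earlier sketches, the same check for this validation image:

```typescript
// Validation image 101 / 000 / 110 / 111, labeled not-smile (target 0%).
const validationImage = [1, 0, 1,  0, 0, 0,  1, 1, 0,  1, 1, 1];
console.log(predict(validationImage, w)); // total -2.6 -> ~0.07, roughly 7%
```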
[00:08:36]
And that's why an error being a bit higher is not a bad thing; that's really good confidence. We're very happy with that, we're very happy with that. So, how's it doing on our validation data? It does really nicely. So, there we go, people, this is really challenging stuff. We built out here a set of weights through tweaking.
[00:09:00]
We didn't see all 100 rounds of tweaking, but we did one, where we did each tweak independently: which way should I tweak, up or down? And we saw up did better on one, and then we mirrored, as Jamie did, better to go down on that one.
[00:09:21]
And we checked each of them independently, keeping all the others the same each time. But then we were allowed to mathematically do them all at once: in one go, increase each one that said go up when you tweaked just that weight, decrease each one that said go down when you changed just that weight. We did it all at once, and we got whatever it was.
[00:09:43]
It was like 1, -1, 1, -1, 1, -1. And that got us closer to our targets of 1, smile; -1, not; -1, not; 1, smile. The problem is, there was no perfect conversion set of multipliers if we wanted to exactly match up 1s, or -1s, out of our prediction, out of our converter.
[00:10:09]
And that would prevent us using it on our larger population. We'd get a bunch of random numbers that were neither 1 nor -1, so we don't know: is it a smile or not? Even though our sample only labeled 1s and -1s. So we said, if there's no perfect conversion, maybe the size of how positive or negative it is could be a useful indicator of how confident we are it's a smile.
[00:10:33]
Well, yeah, if we're gonna do that, let's run it through a sigmoid function, which squashes our big positive numbers, very confident, down to about 97, 96, 95%. And it squashes our big negative numbers, very confident it's not a smile. I know we only had -1s, but you can imagine a different sample with big negative numbers, definitely not a smile. Remember, all negative numbers come out below 50%.
[00:11:00]
And all negative numbers lower than about -2 come out under 10%; sigmoid squashes those into under 10%. And then, rather than have our labels be 1 and -1, we just say 100% and 0%. And now we're trying to convert our 12 numbers, through our multipliers and through our sigmoid conversion, to get to something close to 100%, something close to 0%, something close to 0%, something close to 100%.
[00:11:31]
And our accuracy is now measured in percentage terms: how far off are we? But we now have the possibility of a bunch of multipliers that actually work. Firstly, they don't need to match exactly to be correct, so that enables it to even be possible. But also, it's rather nice that they also mirror kinda a percentage confidence.
[00:11:57]
This means we're really quite confident now; this means we're quite confident it's a smile; this means this is definitely the shape of a smile. Now, this is for our existing historic data, our sample data. We then used it on a new image, in this case validation data, so we did actually know it wasn't a smile; we used it on this image and got out 7%.
[00:12:20]
A pretty high degree of confidence: no smile in this image. Using our model, we knew it wasn't a smile, but we were able to use our multipliers on it and get out as the output -2.6, quite a negative number, and convert it through sigmoid, cuz it squashes all numbers into between 0% and 100%.
[00:12:43]
And so negative numbers, even ones not that big but below -2, as Joe was asking, come out around 7%, and now we can feel pretty confident this is not a smile. And this, people, is a full neural network implementation. It would be a single-layer neural network, a single set of weights, where we use gradient descent to figure out how to get closer and closer to more of that sample converting accurately.
[00:13:13]
Side note: typically we would refer not to the accuracy of conversion but to the average error. The average error is known as the loss, so you might have heard of a machine learning model minimizing loss. In fact, you might have heard that training a neural network is about minimizing the loss function.
[00:13:36]
That is to say, as I change these multipliers, does the loss go up or down? And that's exactly what's happening. And we wanted the loss to go down steadily, to get more and more converting correctly. A single-layer neural network, 12 parameters: we reached a loss of 7%. We reached that error of 7% after 100 iterations.
[00:13:59]
We only saw one of them, but after 100 we got to these weights. We had four images in our sample; this technique could be applied to 100,000 in our sample. There's no eyeballing here. It's purely, per multiplier: try a tweak up or down independently, tweak up, leave all the rest the same, run through.
[00:14:25]
How's it do? Better, worse, better, better, better, better, worse. Okay, well then keep that tweak up, or tweak down, keep all the rest the same, run through all the sample with everything the same besides the little tweak. And we only have to do 12 checks then, albeit on every sample element.
[00:14:41]
12 checks on every sample element, and then we just do all of those changes at once, and then we do it all over again. A new training iteration, and then we do it all over again, getting our multipliers closer and closer to converting our sample correctly.
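That whole procedure, twelve independent up-or-down checks per iteration with all the winning tweaks applied at once, can be sketched as below, reusing `predict` from the earlier sketch. The nudge size, the helper names, and the use of average absolute error as the loss are assumptions of this sketch, not values from the course:

```typescript
type Example = { pixels: number[]; target: number }; // target: 1 = smile, 0 = not

// Average error (the loss) across the whole sample for given weights.
function loss(weights: number[], sample: Example[]): number {
  let total = 0;
  for (const ex of sample) {
    total += Math.abs(predict(ex.pixels, weights) - ex.target);
  }
  return total / sample.length;
}

// One training iteration: per multiplier, tweak up and tweak down
// with everything else the same, see which lowers the loss, then
// apply all 12 winning tweaks at once.
function trainingIteration(weights: number[], sample: Example[], nudge = 0.1): number[] {
  const next = [...weights];
  for (let i = 0; i < weights.length; i++) {
    const up = [...weights];
    up[i] += nudge;   // tweak this one up, leave all the rest the same
    const down = [...weights];
    down[i] -= nudge; // tweak this one down, leave all the rest the same
    next[i] += loss(up, sample) < loss(down, sample) ? nudge : -nudge;
  }
  return next; // run again for the next training iteration
}
```

Because every check runs over the whole sample, each iteration costs a couple of loss evaluations per multiplier, each across every sample element, which is why the same loop scales unchanged from 4 images to 100,000.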