Transcript from the "Neurons & Model Training" Lesson
>> I was telling about the deep learning, about the model, and the next logical stuff used to explain how the model is actually trained, how is learning to do the predictions? And there was a comment about [LAUGH] connectivity of the neural network. So usually, in fully connected neural networks, and that's what people played with originally before AlexNet actually.
[00:00:30] Well, as the name suggests, fully connected neural network means that all the neurons of the previous layer are connected with all neurons the next layer. So for example with three neurons right here, it means that this neuron will be connected. This neuron will be connected and this one will be connected.
[00:00:50] And it means that in the layer, we will have W1, W2, and W3 associated with this particular neuron. And there will be W1, W2, wait, so W3. But basically, associated with this neuron, right? So each neuron will have its own set of Ws, weights and biases. So there will be additional bias here, additional bias here.
[00:01:26] So that's why people usually put weights 1112131 and 1222232, sorry, [LAUGH] just to kind of differentiate that we have different weights and different biases. In convolutional neural network, that's what AlexNet used, not all the neurons connected, so if it's not fully connected layers, it's a convolutional operations. I think I will get to it when we will be discussing image processing.
[00:02:00] So it will be basically in segments. Now let's talk about training of the model. So how do we know what weights and biases to use to actually do the prediction if we're looking at the hot dog or not hot dog? And that's what we need training data set for.
[00:02:20] So what is training data sets? For instance, if we are providing the images to our neural network to train on, we need tons of images. And when I say tons, it means we need about, well, ideally several thousands for each class you want to try train for. And those images should be annotated.
[00:02:44] So for instance, we need thousand images with hot dog and thousand images with not hot dog. And what we will do, we'll just provide the image, let's say with the hot dog, I don't know how to draw a hot dog. Yeah I dont wanna mess with my [LAUGH] lines and boxes here, but imagine that that's a hot dog, right?
[00:03:09] And at the end, we just seen our model. Just because our training data set is annotated, I was saying that you're looking right now at the hot dog, actually. And all those weights and biases, originally, when the model is not trained, it's activated randomly. Meaning that we just simply using those Ws and Bs randomly somewhere, let's say ideally around zero, right?
[00:03:40] And maybe in the normal distribution between something like minus one and one. So all of them are randomly activated. And our model first do the forwards and that's the words we will be hearing quite a bit, forward propagation. So what is forward propagatio? It means that it's just taking this image of the hot dog and do those calculations, right, with this random weights and biases originally.
[00:04:10] So some of the neurons will be for instance activated and will propagate the signal further, right? And depending on those random weights in the second layer, some of the neurons here will be activated and so on and so forth until we get to the last one. And it can be activated or it can be not activated.
[00:04:32] So basically, this neuron will say if it's looking at the hot dog or if it's not looking at the hot dog. All of that at this point is random, right? We just did forward propagation. But our model is randomly activated so we just get some random answer. The trick is coming next.
[00:04:55] Based on the prediction our model did, if it was correct answer, right, if we actually correctly said that you know what, most probably, that's the hot dog. In this case, we'll just go back and reinforce all of the previously made correct decision. So basically if this neuron was activate and this neuron was activated and we'll simply reinforce this connection between those neurons.
[00:05:26] And how we do that? By simply increasing weight, and go to the next level and next level and basically gonna go to all the previous levels and slightly modified those weights. And what is happening right now, this whole process of going back is called backward propagation. So in machine learning you probably if you heard just a little bit about machine learning backprop or backward propagation is this process when we training the machine learning.
[00:05:58] But we're changing those weights just a little bit and the way how, what should be the difference in those weights, it's actually calculated through the derivative. And I don't wanna go too deep into the math, but just trust me, what we do here is if we're seeing those decisions we made, if the prediction was correct, right?
[00:06:21] And then if a kinda similar image for instance will be provided, there is a chance that the same neurons will be activated and we will get the same result. But if for instance the prediction by our random activated model was incorrect, for instance, although we're looking at the hot dog but predicted that it's not a hot dog.
[00:06:42] We will penalize our neural network by doing exactly the same thing, just going through those activated channels and simply shrinking down the weights a little bit. So the idea is that you reinforce the correct predictions and penalize the incorrectly predicted results. And you're just doing it over and over again by showing multiple images, right?
[00:07:07] So that's how you do it. That's how you train neural network.