Transcript from the "Deep Learning" Lesson
>> You're gonna ask kinda okay, that's the basic concepts, but why, why are we doing all of this? Well, the thing is you can now stack those neurons into layers. Or you can have several neurons right next to each other processing simultaneously. So for instance, you can just combine neurons into one layer, right?
[00:00:23] And then have multiple inputs or maybe even the same inputs provided to different neurons, right? And they can just somehow do something with the information we input into our, in this case layers of neurons. On top of that, we can have multiple layers of the neurons interconnected partially or fully with a previous layer.
[00:00:53] And you can have multiple layers like this. And remember, I was saying that we'll be talking about deep learning. And have this connection between machine learning and deep learning. So deep is actually coming from this architecture. With layers of neurons you can simply gonna take the input signal.
[00:01:21] And input signal can be for instance visual information, right? Some picture or sound or even text as we will see during this course. So you can just provide some numbers, right? And neurons will do those transformation and use the signal. And somehow for instance scan, the last neuron can be a classifier, which can, say, for instance, let's say we provided the input text, just our emails.
[00:01:53] And this model, let's call it just model, What it did, at the end, it just said that we are looking at spam, Or, not spam. It can be anything, we can provide, for instance, image, picture we took with our phone, right? And our model with those tons of neurons connected with each other at the end scan, say if it's looking at hot dog or not hot dog.
[00:02:34] It is possible, you can do that easily with this system. So the question is now, once again, where deep is coming from, deep learning came from all of those layers. So in 60s, in 50s, people were playing with just single neuron, or maybe two neurons in one layer.
[00:02:55] But actually, we come up with the idea that you can have multiple layers stuck with each other. And kind of jump, I'm sorry, that's gonna be mathematical jargon here. But having multiple layers like that help you to jump from the linearity to more complex polynomials, I'm done with math.
[00:03:16] It means basically that we can better fit into distributions or whatever information transformation we're trying to get if we have multiple layers. And this deep learning means that we basically can put more and more, deeper and deeper layers into our model. And we started with something like, seven, eight layers, for instance in AlexNet in 2012.
[00:03:42] That's the model which won ImageNet competition. And I will, well, I can probably tell you right now about it. It's basically the contest and datasets created by Stanford University and a couple of other collaborators. Where they have 1.4, or at least in 2012, they had 1.4 million images with different resolutions.
[00:04:10] And the competition task was to find objects, not objects, but basically say what you see on the image. And there were 1,000 different objects, so we can say 1,000 different classes. And the competition was just to process all of those 1.4 million pictures, right? And do your predictions and then, basically, different models or different algorithms were competing, right?
[00:04:40] And they were trying to figure out which model is the best. And in 2012, the competition was significant better result, was won by convolution neural network now called AlexNet. Krasinski, Alex, professor, the author of the model and his collaborative group. I think the difference in the performance, and by the way, when we talk about performance in machine learning, it means the accuracy mostly, right?
[00:05:10] They won by something like 20% better compared to support vector machines, which were the winners of that competition the year prior. So, interesting, so yeah, deep learning is coming from just those layers. But at the same time you see that those models in deep learning it's usually almost have this triangle architecture, right?
[00:05:35] We're starting with a lot of different inputs, for instance, intensity of the pixels. And then we're reducing something to relatively small, one neuron, or in case of AlexNet, it will be just 1,000 neurons, right? And the one will be activated which have the corresponding label, for instance. If I'm providing the image of the dog, then the neuron which had dog will be activated.
[00:06:04] But I haven't told you how those weights and biases are actually selected. And the thing is, no one is selecting them, it is actually trained. And that's the process of in machine learning how we get all of this magic of deep learning.