
Lesson Description
The "Preprocessing Sample Data" Lesson is part of the full, Hard Parts of AI: Neural Networks course featured in this preview video. Here's what you'd learn in this lesson:
Will explains the need to preprocess images before model training. This allows the model to focus on its intended goal. Additional models may be required for the reprocessing stage to adjust the rotation of images or determine if there's a face.
Transcript from the "Preprocessing Sample Data" Lesson
[00:00:00]
>> Will Sentance: Okay, any questions?
>> Maria: Yeah, I had a question. What if we flipped the original smile upside down to make it an upside-down smile? Would it still-
>> Will Sentance: And just be clear, Maria's not asking here. What if we switch the smile upside down and made it a frown, like switch that frown upside.
[00:00:19]
I get that Maria's note making a refrigerator magnet, but instead is referring to, hold on. What if we are training image sample has an image, which could totally happen. Someone's uploaded it when they wanna identify a smile or not, the wrong way up. As in, when we use it, when we want to infer for a new image, someone's provided an image that's upside down, and we wanna be able to spot it so smart it's just an upside down.
[00:00:46]
The heads, the whole way upside down, almost the photos upside down. We're gonna see in a moment, we need to pre-process our data. We do not want to have multipliers be learned for the rotation of the head. We want to learn the right multipliers that capture for the same orientation, the features of the smile, not the rotation of the head.
[00:01:17]
So we would preliminary, perhaps, have a totally separate model that actually works out the right rotation of the head and says, that head is off by this side. Before you input it into this model creation, rotate it to be facing, we'll see this in a second. One doesn't wanna learn or figure out multipliers for a bunch of different orientations ahead, because then you're not actually capturing the characteristics of the smile.
[00:01:45]
You're capturing the characteristics of the head rotation. And that's not as useful, or if it is, do that bit first and then capture the characteristics of the smile, have multiple models working on it. Great question from Maria, you had another one as well.
>> Maria: I think I just was looking at a different type of smiles, so it would be kind of like an open mouth smile.
[00:02:09]
>> Will Sentance: And I think in that sense, we could always add many more here. And the labels you give them will determine what the prediction is for the population, there's no truth to the small characteristics. The only truth is in the sample, did this set of pixels get labeled 0, this set 0, this set a 100 percent, this set a 100 percent?
[00:02:30]
Okay, Ben, question?
>> Ben: I don't know if this is really appropriate, but in your training data, would you put in things that you don't have 100 or 0% confidence on?
>> Will Sentance: You could do, I think in this case, the classification is binary, it's just, and I'd almost say it's really 1, 0, 1, 0.
[00:02:51]
And we're building confidence here of the 1, 0 sort of binary.
>> Ben: I can't really think of an example.
>> Will Sentance: There's a little bit of a smile [LAUGH], you're trusting the model to capture the subtleties in between. With a large enough sample, you're finding those in-betweens via the in-betweens in the sample, as in, that is labeled a smile, that's not that is.
[00:03:27]
But actually the difference between them is only subtle, and you can't even quite pin down what it is that meant that that one got labeled not. But that is capturing the in-between via if you had ten pictures you label a smile. And they have a range of characteristics and ten that you label not and they have a range of characteristics.
[00:03:49]
You're capturing in that the diversity and also the boundaries, at the point at which that one didn't get classified as a smile. We didn't get labeled as a smile, and the in-between will be captured quite well by the multipliers. So when it's applied to a new smile, the 89% represents some of that kind of in-betweenness.
[00:04:15]
So it's quite nice that it's not a binary output because we don't want to have that. And that's another thing about the sigmoid function, is it can never reach 100% or 0%. We don't want to have absolute certainty, smile or not.
Learn Straight from the Experts Who Shape the Modern Web
- In-depth Courses
- Industry Leading Experts
- Learning Paths
- Live Interactive Workshops