
Lesson Description
The "Neural Network Data" Lesson is part of the full, Hard Parts of AI: Neural Networks course featured in this preview video. Here's what you'd learn in this lesson:
Will introduces the next prediction challenge: a model that can detect a smile in an image. The scope is simplified by constraining the size of the images to a three-by-four-pixel grid. Ones and zeros are used to reflect active pixels, and sample smiles/non-smiles are provided.
Transcript from the "Neural Network Data" Lesson
[00:00:00]
>> Will Sentance: All right, so, we were able to predict fraud or not because we had background information on the refund request, the "I want my money back" request on DoorDash. Then we used that on the population as a whole, that means all refunds, and we were able to say, hey, this new refund request, looking at the background data on it, because I learned the pattern of the background data of the refund requests.
[00:00:29]
And whether or not to give a refund, for the sample, I can apply that same pattern to the population. If we have labeled data, some information, in that case it was details of the refund, represented as numbers, plus labels of what that info is. And remember, whether or not a refund was given was essentially a label: Refund 1, Refund -1.
[00:00:54]
This data associated with Refund 1, this data associated with Refund -1, for the sample. If we have labeled data, some numbers plus a label for it, and we then created a converter for that sample, converting the input values into the label, so the input numbers into the label, 1 or
>> Will Sentance: -1, then, assuming the sample reflects the population, that is to say, what the data was like in the sample was reflective of the patterns in the distribution.
[00:01:27]
Which is a fancy way of saying the patterns of the data in the broader set of refund requests aren't completely different. Then we could use that converter on the rest of the population that isn't labeled to predict those labels, Refund 1 or Refund -1. What other info can be represented as numbers plus labels of what that info is?
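(To make that recap concrete before moving on, here is a minimal sketch of labeled sample data plus a converter in Python. The feature names, the values, and the threshold rule are invented purely for illustration; they are not the actual refund data or model from the course.)

```python
# A hypothetical labeled sample: each refund request is some numbers (features)
# plus a label, 1 for "refund given", -1 for "no refund".
sample = [
    ([120.0, 2, 0], 1),    # made-up features: order value, past refunds, flagged account
    ([15.0, 9, 1], -1),
    ([45.0, 0, 0], 1),
]

def converter(features):
    """Stand-in for whatever pattern we learn from the sample.
    This is a hard-coded placeholder rule, not a real learned model."""
    order_value, past_refunds, flagged = features
    return 1 if past_refunds < 5 and not flagged else -1

# Applying the same pattern to an unlabeled request from the wider population.
new_request = [80.0, 1, 0]
print(converter(new_request))  # -> 1 with this made-up rule
```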
[00:01:47]
Can you think of, people, what other stuff, that you might not think of as actually being a collection of numbers, might actually be a collection of numbers? Maria, what other things can we capture as data?
>> Maria: Pixels.
>> Will Sentance: Pixels, images. Images can be collections of numbers. Let's say, I don't know, higher numbers for brighter and lower numbers for dimmer, or higher numbers for darker and lower numbers for lighter.
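(A quick sketch of the "images are collections of numbers" point. The 0-to-255 grayscale convention below, with higher numbers for brighter pixels, is one common choice, not something prescribed in the lesson.)

```python
# A tiny grayscale image as a grid of numbers.
# Here: 0 = black (darkest), 255 = white (brightest).
image = [
    [  0, 255,   0],
    [255, 255, 255],
    [  0, 255,   0],
]

# The "image" is nothing more than these numbers; any pattern-finding we do
# is arithmetic on this grid.
print(image[1][2])  # brightness of the pixel in row 1, column 2 -> 255
```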
[00:02:16]
But we can also do it with words. Let's just take a quick look at words. Just to get a rough sense of it, I am gonna show this for a second here. So, if we could build a sample of human language, let's say, I don't know, some text strings.
[00:02:37]
Let's say, I don't know, I [LAUGH] always throw down these. "I love my", and then our label could be how likely the next word is "dog". Let's say very often in, I don't know, text off the Internet, it's "dog". Let's say for "cat" it's, I don't know... "I walk my", there we go, that's a bit more likely.
[00:03:06]
"Walk my dog", very likely in the history of text on the Internet. "Cat", very likely? Not impossible, but pretty unlikely. "Refrigerator", [LAUGH] never been used, completely unused in the history of the Internet. And then we can have "I love", now this works better, doesn't it? "Dogs", maybe that's being used quite often.
[00:03:35]
"Cats" also, I don't know, pretty often, but less, right, because, who likes cats? No, sorry. [LAUGH] Why pick a fight? If we can, though, work out the pattern to convert "I walk my" into the distribution, the number of "dog", "cat", "refrigerator" in the history of text on the Internet, and "I love" into "dogs", "cats".
[00:03:57]
Then we could create a converter for "I walk my" into these strings, "I love" into these, and then use it on the larger population of, and this is, wow, where it gets really wild, of all human language, or let's think, all text strings. And perhaps we could make predictions on how likely "I love my", so not "I walk my" or "I love", which have these, but "I love my", is going to be followed by "dog", we don't know, or "cat", or "refrigerator".
[00:04:45]
We can apply the pattern we found for converting "I walk my" into the next word in the string, "I love" into the next word in that string. We can apply that same conversion, or set of rules, to "I love my", because we can give each word a numeric value, or a set of numeric values.
[00:05:07]
And match up those numeric values with the numeric values of "dog", "cat", "refrigerator", and their associated frequency in the history of language use, and from that generalize to the underlying population of all possible human language use. The term, the equivalent of pixels in language, anyone know the term for the numeric values that we give to bits of words or sentences in human language?
[00:05:35]
The term is embeddings. These are the, sort of, numbers that we give to text. But we are, though, going to focus on pixels, because we want to build a smile identifier. We want to be able to identify, not patterns of numbers that represent text, but patterns of numbers that represent images.
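(The "distribution of the next word" idea can be sketched by counting what follows a phrase in a toy corpus, as below. The corpus is invented for illustration, and real systems would also represent each word as a numeric vector, an embedding, rather than as a raw string.)

```python
from collections import Counter

# A toy stand-in for "the history of text on the Internet".
corpus = [
    "i walk my dog", "i walk my dog", "i walk my cat",
    "i love dogs", "i love cats", "i love my dog",
]

def next_word_counts(prefix):
    """Count which words follow the given prefix anywhere in the corpus."""
    target = prefix.split()
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i in range(len(words) - len(target)):
            if words[i:i + len(target)] == target:
                counts[words[i + len(target)]] += 1
    return counts

print(next_word_counts("i walk my"))  # Counter({'dog': 2, 'cat': 1})
print(next_word_counts("i love my"))  # Counter({'dog': 1}); sparse, hence the need to generalize
```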
[00:06:06]
And again, remind us, Maria, what numeric values, or what's the, I don't know, what do the images comprise, that we can then give numbers to? What's the word you said earlier? Just say it again.
>> Maria: Pixels.
>> Will Sentance: Pixels, well done, Maria, thank you.
[00:06:25]
Exactly, pixels. So we're gonna do this not for words, but for images. Not with embeddings, but with?
>> Maria: Pixels.
>> Will Sentance: Pixels, it's gonna be the right answer forever. Excellent, thank you. So, we can see an image here of a happy face being reduced in its dimensionality, that's another term we'll see again and again, gone from 1,000 by 1,000, maybe, something like that, down to 3 by 4.
[00:06:54]
All right, we're gonna have a sample of 3 by 4 images, 3 by 4 images of 0s and 1s only, keeping it super, super simple, with the 1s being the dark pixels. Raise your hand if you buy that that is a smiley face in the bottom right-hand corner there.
[00:07:17]
Raise your hand high and proud if you buy it. That's three people. Maria, do you buy that that's a smiley face? No, okay, good. No, that's a very obvious, that's a smiley face. That's an obvious smiley face, reduced to 3 by 4 dimensions. We are going to build a sample of faces, 3 by 4 images, in order to hopefully, if we label them, have an associated label of Smile: 1 for a smile and -1, I guess, like the refund, for not.
[00:07:52]
So that we might be able to look at new 3 by 4 images, and if we can identify how to, well, let's see. Let's get our little sample, and you're gonna all be very convinced in a moment that this looks exactly like a smile. Here it is.
>> Will Sentance: Indisputably, and that is labelled Smile 1.
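(Written down as data, that first labelled example might look like the sketch below. Laying the 12 pixels out as a grid three wide and four tall is an assumption about the on-screen image; the pixel values themselves match the ones read out later in the lesson.)

```python
# One labelled example from the sample: a 3-by-4 grid of 0s and 1s,
# where 1 marks an active (dark) pixel, plus a human-supplied label.
smile_image = [
    [1, 0, 1],  # eyes
    [0, 0, 0],
    [1, 0, 1],  # corners of the mouth
    [1, 1, 1],  # bottom of the smile
]
smile_label = 1  # 1 = smile, -1 = no smile, just like Refund 1 / Refund -1
```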
[00:08:20]
Just like our refund, someone has come and looked at this. And by the way, there's a lot of business opportunity here and a lot of people doing it. I think his name is Alexandr Wang, I think I'm getting it wrong, probably I'm getting it wrong, you don't want to get this wrong in an actual public talk, but the founder of Scale AI.
[00:08:39]
Their company's job is to build out sample data sets, labeled images, for example, of, I don't know, physical space, for autonomous vehicles that need to be able to recognize, is that a person or a shadow in that image, in that rendered space being seen by the autonomous vehicle.
[00:09:03]
If you haven't been in a Waymo yet, it's very fun. It's capturing and identifying things all the time based on, we'll see, some sample data. Who does the labeling of that sample data? People, where do they work, or who do they work for? Companies like Scale AI, which is worth $10 to $20 billion as a result.
[00:09:24]
There are also public data sets available of labeled images, like ImageNet, one of the most famous, and a historic one that was super popular called MNIST, of hand-drawn digits, we'll talk about it in a second. So there you go, there's one of the images in our sample. Let's make another one.
[00:09:41]
It's going to be an absolutely, obviously sad face. No one's gonna debate that. And that is Smile -1, no smile in this picture. If we can work out, people, it is really cool. If we can work out how to convert these 12 pixels, 1 0 1 0 0 0 1 0 1 1 1 1, into an output of 1, and these 12 pixels, 1 0 1 0 0 0 1 1 1 1 0 1, into an output of Smile -1, if we can work out how to do that.
[00:10:23]
Then maybe, assuming that the 3 by 4 images that we have in our sample are similar in their nature to the population as a whole, we can take, I don't know, let's say, 1 0 0 0 1 0 0 1 0 1 0 1. Look at that expression.
[00:10:54]
That looks like the sorting art, and nothing else. And that, we don't know, is that a smile or not? We don't know. We don't know. But if we can apply whatever rules do the conversion for our sample to our broader population of images, if we can apply that conversion, can we get a 1 or a -1 out?
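(Putting the sample and the new, unlabelled image together as data, a minimal sketch follows. The sad-face pixels use the 12-pixel reading above, and the `predict` rule is a deliberately hard-coded placeholder, just to show the shape of the idea, pixels in, 1 or -1 out; the real converter is what the rest of the course sets out to learn from the sample.)

```python
# The labelled sample: each image flattened to its 12 pixels, plus a label.
sample = [
    ([1, 0, 1,  0, 0, 0,  1, 0, 1,  1, 1, 1],  1),   # smile
    ([1, 0, 1,  0, 0, 0,  1, 1, 1,  1, 0, 1], -1),   # not a smile
]

# A new image from the wider population, with no label attached.
new_image = [1, 0, 0,  0, 1, 0,  0, 1, 0,  1, 0, 1]

def predict(pixels):
    """Placeholder converter: 12 pixels in, 1 or -1 out.
    This made-up rule happens to fit both sample images, but it is
    hard-coded here rather than learned."""
    return 1 if pixels[9:12] == [1, 1, 1] else -1  # "bottom row fully lit"

print(predict(new_image))  # -> -1 with this placeholder rule
```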
[00:11:26]
Based on our sample, based on the fundamental notion that a smile is not something you can, per se, define, except by: these are some smiles and these are not, now use that to tell me, is this a smile or not? We're taking information that we have, where we know from a human that it's labeled Smile, no different to the information we had where we knew it was a fraud, or sorry, we knew it got a refund or not.
[00:11:55]
And generalizing that to the population of all 3 by 4 images of 0s and 1s. There it is. Okay, quick question: how many possible 3 by 4 images are there, with 12 pixels that can each be either 0 or 1?
>> Will Sentance: [LAUGH] It's a lot. I tell you, people, it's very quickly a lot.
[00:12:26]
For each pixel there are two options, either a value of 1 or a 0, and there are 12 pixels. So it's 2 to the power of 12, what? So it's 4,096 possible 3 by 4 images. Don't underestimate how many images there are. This, by the way, can quickly tell you how vast the possible data sets are for any serious size of image. I know for a fact Ben is looking at this, thinking it can't be true.
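(The count itself is just two choices per pixel, multiplied across all 12 pixels. A quick check:)

```python
from itertools import product

# Two possible values (0 or 1) for each of the 12 pixels.
all_images = list(product([0, 1], repeat=12))
print(len(all_images))  # 4096
print(2 ** 12)          # 4096, the same count without enumerating
```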