# Plotting Data with Noise

Transcript from the "Plotting Data with Noise" Lesson

[00:00:00]

>> So let's visualize this distribution of the numbers. So a couple of things. First, we need to import matplotlib. So, import matplotlib as mpl. Or if you don't specify as, if I just do import matplotlib like this, it means that to run any function from matplotlib, I need to type matplotlib first.

[00:00:32] And then the corresponding commands. I'm actually not gonna be using matplotlib directly, but I will be using pyplot, a submodule under matplotlib, which gives, once again, a kinda higher level API for utilizing matplotlib functionality. So import pyplot as mpl. You commonly will see this alias used for pyplot. And with this command, what I can do, I can just use mpl.
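
For reference, the two import styles described here can be sketched as follows. The alias `mpl` follows this lesson; elsewhere `plt` is the alias more often seen for pyplot. The `Agg` backend line is an assumption added only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed

import matplotlib.pyplot as mpl  # the lesson's alias for pyplot

# Without an alias, a pyplot function is reached through the full module path.
full_name = matplotlib.pyplot.hist
# With the alias, the same function is reached through the short name.
short_name = mpl.hist

# Both names point at the same function object.
same_function = full_name is short_name
```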

[00:01:04] Let's use histogram, so histogram X, and mpl.show. I think the last command can be skipped. What's happening? I know, sorry, I need to convert my X, because right now, X is in tensor form, right? So it's the representation TensorFlow is using. I want to just simply transfer it to regular array.

[00:01:38] And by just calling the numpy method on my X, I will do exactly that. And what has happened? Object has no attribute numpy. Interesting.
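
A minimal sketch of the histogram step. In eager TensorFlow 2.x a tensor `X` would typically be converted with `X.numpy()` before plotting; to stay self-contained, the sketch uses a NumPy array as a stand-in for that converted tensor. The seed, the variable names, and the `Agg` backend are assumptions for illustration, not the lesson's code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as mpl
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for repeatability
x = rng.uniform(size=100)        # stand-in for X.numpy(): 100 values in [0, 1)

# hist returns the per-bin counts, the bin edges, and the drawn patches.
counts, bin_edges, _ = mpl.hist(x)  # 10 bins by default
```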

>> [INAUDIBLE]

>> Yes, thank you, sir. [LAUGH] That is exactly what's happening here. So I was just bringing, because I haven't finished my code writing yet.

[00:02:13] All right, so, numpy, yeah, it's pretty noisy data. I think if I just create more, actually, we should put it like this. If we just, let's say, generate 10,000 points, right? So every time I modify something, I need to execute this cell. And yeah, you can see what uniform distribution is actually doing, right?

[00:02:46] It's basically giving you random numbers between 0 and 1. And just because I've asked to generate 10,000 of them, now we can see that distribution is, more or less, kinda equal. So that's what uniform distribution is actually doing. All right, so let's return back to our 100 numbers. With 100 numbers, we just have, so to speak, noisy data.
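
The effect of the sample size can be checked without a plot. The sketch below uses Python's standard `random` module as a stand-in for the lesson's TensorFlow call, and the helper names `bin_counts` and `relative_spread` are hypothetical. It counts samples per bin and compares how uneven the bins are for 100 versus 10,000 points.

```python
import random

def bin_counts(n, bins=10, seed=0):
    """Count how many of n uniform samples in [0, 1) fall into each equal-width bin."""
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(n):
        counts[int(rng.random() * bins)] += 1
    return counts

def relative_spread(counts):
    """(max - min) / mean of the bin counts; smaller means a flatter histogram."""
    mean = sum(counts) / len(counts)
    return (max(counts) - min(counts)) / mean

spread_100 = relative_spread(bin_counts(100))        # few points: uneven bins
spread_10000 = relative_spread(bin_counts(10_000))   # many points: flatter bins
```

With more samples the relative spread shrinks, which is the "more or less equal" histogram described above.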

[00:03:16] What I'm trying to do next is add the noise. So with the noise, let's say we can use tf.random as well. But let's use slightly different distributions. So normal distribution will give you instead of kind of uniform distribution, it will just be a bell curve, right? And with normal distribution, I need to specify, once again, shape.

[00:03:54] And it will be similar to our X input data. So we're gonna just say n, comma, and do we need anything else? We can specify standard deviation. By the way, if you have any kinda questions, what additional parameters can be specified? You can always go to empty space and press Tab.

[00:04:19] It will show you the list of additional parameters you can specify. So mean, by default, it is equal to 0, so we could have skipped it. And standard deviation, sorry, it basically tells you the distribution, how far. It's basically the width of the half height of the bell curve.

[00:04:48] So for standard deviation, let's specify something small, an example is like 0.01. All right, and now, the most interesting part. So in my function, Y, I want just to take all of those input variables, and multiply them by corresponding weight, and add bias. And on top of that, I want to add noise, and return X, Y.
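
The data-generating function described here might look roughly like the sketch below. It uses NumPy as a stand-in for `tf.random.uniform` and `tf.random.normal`; the function name `make_noisy_data`, the seed, and the defaults `w=0.1` and `b=0.5` are assumptions (the weight and bias are the values named later in the lesson, reading the weight as 0.1).

```python
import numpy as np

def make_noisy_data(n=100, w=0.1, b=0.5, noise_std=0.01, seed=0):
    """Return x in [0, 1) and y = w * x + b + Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=n)                                # uniform inputs
    noise = rng.normal(loc=0.0, scale=noise_std, size=n)   # mean 0, small std
    y = w * x + b + noise                                  # weight, bias, then noise
    return x, y

x, y = make_noisy_data()
# Subtracting the noiseless line leaves only the added noise.
residual = y - (0.1 * x + 0.5)
```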

[00:05:31] So let's just do that. Can be executed. We can print it out. But the thing is, let's plot this function. Let's plot all those Xs and Ys. So we already have a plt, pyplot available to us, so we can just say plt.plot. Mpl, sorry it's not plt, it's mpl.plot.

[00:06:10] And we can say plot my X's and Y's. But also, plot, there’s a neat trick. You can specify color by just using, for instance, r for red, g for green, b for blue. And then also, what kind of markers do you wanna use? For instance, I want just to plot circles, we can specify capital O.

[00:06:34] And if I run this command, yes, and I totally forgot that I wanted numpy. You know what? Actually, I will do it for all X's and Y's by just returning. Yeah, like this. We're running this function. So now, we have new function, right? We can now avoid typing numpy everywhere, and it should work.

[00:07:19] No it's not. [LAUGH] Let's see.

>> Lowercase o.

>> Probably lowercase o, thank you. Yes, all right, so that's how our data is. Yes, okay, so lowercase o, that's the big circle, and I think for small one, I can just use dots. Is this true? Yes, that is true.
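
The format strings discussed here combine a color letter with a marker character. A small sketch, with illustrative points rather than the lesson's data and an `Agg` backend assumed so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as mpl

x = [0.0, 0.25, 0.5, 0.75, 1.0]      # illustrative points, not the lesson's data
y = [0.5, 0.525, 0.55, 0.575, 0.6]

# plot returns a list of line objects; each call here draws exactly one.
(circles,) = mpl.plot(x, y, "ro")    # r = red, lowercase o = circle marker
(dots,) = mpl.plot(x, y, "b.")       # b = blue, . = small point marker
```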

[00:07:44] Okay, so that's our input data. What I'm trying to do is somehow, so let's say I don't know my w and b at this point, right? It's only somehow used inside my function. But I want to learn those w's and b's to, well, fit this kind of distribution of my data.

[00:08:15] So if we do something like this, mpl.plot, and for instance, let's say our X will be between 0 and 1. And for our Y, I will just say there's gonna be 2 points also, but I want my 0, I'm explicitly saying, multiplied by w, plus b. And then 1 multiplied by w, plus b, and let's say I want to have a green dashed line.

[00:08:55] No, this should be good. And then I can just specify that w, I've used w=0.1 and b=0.5. So that's gonna be my ideal values. If I just plot this. Can't assign to literal, that is interesting. Mm-hm, okay.
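
The two-point reference line described here could be sketched as follows, assuming w = 0.1 and b = 0.5 as the ideal values. The variable names and the `Agg` backend are illustrative assumptions, not the lesson's exact code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as mpl

w, b = 0.1, 0.5                    # assumed reading of the ideal values
line_x = [0, 1]                    # the two ends of the input range
line_y = [0 * w + b, 1 * w + b]    # y = w * x + b evaluated at x = 0 and x = 1

(ideal,) = mpl.plot(line_x, line_y, "g--")  # g = green, -- = dashed line
```

Two points are enough here because the noiseless model is a straight line.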