Transcript from the "Dataset Exercise" Lesson
>> Shirley Wu: So the first data type is categorical. In terms of the movie data set we have, categorical could be the movie genres, it could be the actor names, etc. Ordinal, I think of it as categorical, but with a specific order. And the most common example for that is t-shirt sizes: small, medium, large.
[00:00:27] And then there is quantitative. For movies, for example, I think of those as ratings or metascores, which could be from 0 to 100, or yesterday we talked about temperatures. So temperatures, degrees from 0 to 100, etc. Temporal, so dates, years, months, etc. And spatial, so cities, countries, regions.
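As a rough illustration of the five data types described above, here is a minimal Python sketch. The attribute names (`genre`, `metascore`, and so on) are hypothetical and are not taken from the actual movie data set; the t-shirt sizes are the ordinal example from the talk.

```python
from enum import Enum


class DataType(Enum):
    CATEGORICAL = "categorical"    # unordered labels, e.g. genres, actor names
    ORDINAL = "ordinal"            # labels with a specific order, e.g. t-shirt sizes
    QUANTITATIVE = "quantitative"  # numbers, e.g. ratings, metascores, temperatures
    TEMPORAL = "temporal"          # dates, years, months
    SPATIAL = "spatial"            # cities, countries, regions


# Hypothetical attribute names, chosen only to illustrate each type.
ATTRIBUTE_TYPES = {
    "genre": DataType.CATEGORICAL,
    "actor": DataType.CATEGORICAL,
    "shirt_size": DataType.ORDINAL,
    "metascore": DataType.QUANTITATIVE,
    "release_year": DataType.TEMPORAL,
    "country": DataType.SPATIAL,
}

# What separates ordinal from categorical is a meaningful order.
SIZE_ORDER = ["small", "medium", "large"]


def sort_sizes(sizes):
    """Sort t-shirt sizes by their rank in SIZE_ORDER, not alphabetically."""
    return sorted(sizes, key=SIZE_ORDER.index)
```

Sorting the sizes alphabetically would give large, medium, small, which is why an ordinal attribute needs its order spelled out explicitly.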
[00:00:57] And I wanted to talk about that because when we go to this exercise to do the data exploration, I want us to be able to open up the data and take a look at each of the attributes the data gives you. There are a lot of attributes in here. I don't think I cleaned it, I just wrote some scripts to pull things off of IMDb.
[00:01:25] But this is not clean, so it's very similar to a raw data set that you might get from a client. Or actually, this is too nice to be a raw data set you get from a client, but maybe one step clean. [LAUGH] But it still looks a little bit hairy.
[00:01:42] And so what I want us to do is list out attributes in here, and what type of data they are, and maybe bold them if they seem like they might be relevant or important for analysis later on. And then we'll come back and we'll talk about them after the ten minutes.
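One possible way to get started on the exercise above is to list every attribute and make a rough first guess at its type. The sketch below is only an assumption about how that might look: the records in `SAMPLE` are made up, the field names are hypothetical, and the date format is assumed to be `YYYY-MM-DD`. It cannot tell ordinal or spatial attributes apart from categorical ones, so those still need a person to review them.

```python
from datetime import datetime


def _is_date(text):
    """Return True if text parses as a YYYY-MM-DD date (assumed format)."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def guess_type(values):
    """Return a rough first guess at a data type; review the result by hand."""
    non_empty = [v for v in values if v not in (None, "")]
    if not non_empty:
        return "unknown"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_empty):
        return "quantitative"
    if all(isinstance(v, str) and _is_date(v) for v in non_empty):
        return "temporal"
    # Ordinal and spatial attributes also land here and need manual relabeling.
    return "categorical"


def list_attributes(records):
    """Map each attribute name found in the records to a guessed type."""
    names = []
    for record in records:
        for key in record:
            if key not in names:
                names.append(key)
    return {name: guess_type([r.get(name) for r in records]) for name in names}


# Made-up records with hypothetical field names, for illustration only.
SAMPLE = [
    {"title": "Example A", "genre": "Drama", "metascore": 71, "released": "2001-05-04"},
    {"title": "Example B", "genre": "Comedy", "metascore": 64, "released": "1999-11-19"},
]

if __name__ == "__main__":
    for name, kind in list_attributes(SAMPLE).items():
        print(f"{name}: {kind}")
```

Deciding which attributes to bold as relevant for later analysis remains a judgment call that the sketch does not attempt.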