Data Visualization First Steps

Analytics Project Data Setup

Anjana Vakil

Anjana Vakil

Software Engineer & Educator
Data Visualization First Steps

Check out a free preview of the full Data Visualization First Steps course

The "Analytics Project Data Setup" Lesson is part of the full, Data Visualization First Steps course featured in this preview video. Here's what you'd learn in this lesson:

Anjana walks through the content of the "What size devices are users browsing with?" project. Students are then instructed to transform the deviceAnalytics array into a new devices array with the numeric values and additional columns and create a new array based on the devices data that adds "width" and height" columns.


Transcript from the "Analytics Project Data Setup" Lesson

>> So we are going to jump back in with another project. This one is going to be the project what size devices our users browsing with. You should be able to navigate to it right from that sidebar there, the data visualization first steps collection you should see there.

And we can drop the link in the chat as well. So that is what we're gonna to be working through next. And so in this project we're going to be looking at a different dataset. This time, we have another file attached to our notebook. This one is called device analytics.

And we'll notice it as a CSV comma separated values file. Another really common type of file that you might get data in, again, if it's a small data set. And this is some mock data about usage to a hypothetical site that is the kind of data one might be able to export from an analytics platform like for example, Google Analytics.

So this data set has some data about visitors to a site and which device types and screen resolutions those visitors were using. So let's take a look at this data set. What we're doing is very similar to our last Json file. We're reading in this file that's attached to the notebook again if you were doing this in vanilla JavaScript you would be parsing this in with whatever library of choice or maybe you'd be fetching it from elsewhere on the web.

In this case, an observable we're loading it as a file attachment and then parsing it as a CSV file. Which once again is giving us our nice little JavaScript array of an object for each row in the CSV data. And if we throw all of that into another little table to make it a bit more human readable, we can see what we've got here.

So we have a column called device, where we have a kind of size category desktop, mobile or tablet. We might have a few tablets in there if we scroll down. Then we have the resolution column which shows us the resolution that folks on this device, this resolution we're browsing.

And then we have a column for the number of users that we saw with that combination of device and resolution and the number of sessions that those users browsed in on that device. So, what we've got here is some data that has the values that we want, but doesn't necessarily have the exact shape that we want.

So we're gonna do a little tiny bit of lightweight data wrangling, which as we talked about before is a crucial first step to setting ourselves up for a nice visualization. So let's take a look at one of these objects here. So if I'm just looking at the first thing in this dataset, I see I have these different properties corresponding to those column names and then I have string value for each of these observations here.

So our first task is to get this into better shape so that we can more easily visualize it. So we have a couple of problems her. First of all, we have this resolution value hear which is a string that combines both the width and the height of the user's screen or resolution settings.

So what would be more helpful than that, is to also have additional columns that separate out the width and the height of the resolution. So instead of one string where we have this kind of 1920 by 1080, we also want a column that says, width that's 1920, and then height that's 1080.

And it would be great if those were numeric values instead of strings, because that way we can actually treat them as a continuous data space. Similarly, we notice we have strings for these users insertions counts, and again, these are actually numbers, so it'd be useful to have them as numeric values.

So our task, our mission should we choose to accept it is to do a little bit of lightweight data wrangling using our good old friend array dot map. So we have this device analytics array with our original data, we're gonna map over it with a function that we're gonna write.

So we're gonna replace this little identity function to make sure that we first of all add width and height columns with numeric values corresponding to the values broken down from the resolution column. And convert the values in the users and sessions columns from strings to numbers. So there are some hints about ways that we can do those conversions in the hints button.

But this is just basically going to be a little thought exercise for us to get ourselves into the habit of wrangling our data to make our lives easier later.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now