Data Visualization First Steps Facets for Comparing Group Data Solution
Transcript from the "Facets for Comparing Group Data Solution" Lesson
>> Let's see how this faceting operation went, shall we? Insert, I don't know, bling joke here about diamonds having facets or something, gonna have to work on my stand-up material. All right, so we said that in order to add facets to our chart, it's a matter of adding this facet property with a configuration object into the config options of the plot.
[00:00:29] Which when we're using this syntax where we do plot.mark, in this case dot, and then .plot, it'll come there. And in the previous syntax that we looked at where we called plot.plot and then passed in a mark's array can just come in that general call to plot. So, all right, what we're gonna wanna do here is, again, remember to add commas, very difficult for me.
[00:00:55] [LAUGH] We'll specify that we want to facet this plot. And then we are going to always tell the facet operation what data we want to base our splitting or faceting operation on. And in this case, it is the same devices data that we've been using for our marks, so the same dataset here.
[00:01:22] And then we pass in the mapping of the feature of the dataset that we want to split on, which in this case is the Device category, right? We tell then the faceting operation which channel we want it to split on. So in this case, we could try the x-axis.
[00:01:45] So once I've done that, we go from having this single dot plot to, if I press my little Play button here to evaluate my new facet code, we'll see three different dot plots that have each of them just one single color. Well, more or less, we've got some overlapping data points here based on those data points that have higher values than we allowed for in our axes.
[00:02:15] But that's okay, we'll breeze on by that. So here we've got these three side by side plots, desktop, mobile, and tablet. And now, since we don't have so much overlap between the different device categories, we can see some of the points, especially on the less frequently used devices, like mobile and tablet.
[00:02:37] We can see some of the points that were a little bit obscured by the other data, by the desktop data, in the original plot, right? So for example, some of these red dots here that are in this landscape orientation where the width is higher than the height, so the width is bigger than the height.
[00:02:58] Those are more easily identifiable now that we've split the plots up by device. And did anybody else tries splitting across the y-axis instead? So if we wanna play around with that, we can try that as well. And your mileage varies. Depending on what your data looks like and on what your output format looks like, how much real estate you have in your dashboard or what have you.
[00:03:28] You might wanna choose one or the other to give folks a better visual experience based on the specifics of your dataset and where you're putting these graphics. But in this case, I don't know, which one do folks prefer? I think it's a little bit more readable in the x direction, I'm seeing some nods.
[00:03:48] But maybe we have different perceptions, that is entirely valid. And that's why having the possibility to customize these things is so important, because every person is different and every dataset is different. But for now, let's say x-axis is fine. Awesome, so please interrupt me if there are any questions.
[00:04:08] But otherwise, we have successfully finished our second project. So we wrangled some data using a basic map operation, but still, we made ourselves think about exactly what we needed to be visualizing. Like, for example, the width and height each taking their own channel, made sense for us to wrangle that out.
[00:04:28] Okay, so just to continue our recap of this project. So we did some data wrangling to make our lives easier in the visualization step. Although, again, we can do some transforms in the visualization step as well. So your comfort with, let's say, using functions to transform strings to numbers in the plot is totally valid, versus doing it beforehand with the map operation.
[00:04:55] Your mileage may vary, so whatever feels best to you, go for it. Then we took a look at aggregation by means of group transforms in plot. Which we can use with categorical variables to create aggregation operations, like, for example, summing up all of the users observed for a particular device category.
[00:05:15] And then we saw a new type of mark, the .mark, which we used to create an informative plot, a scattered plot or dot plot of different device categories using all these different channels, fill, opacity, radius, R, to convey different features of the data. And finally, we took a look at faceting as a means of revealing some of those details that might get obscured if we're smushing all of the data together into one visualization.
[00:05:47] And highlighting the perhaps harder to notice features from, especially some of the smaller subsets of the data, by putting each category of device in its own smaller plot.