Check out a free preview of the full Web App Testing & Tools course
The "Clustering Application" Lesson is part of the full, Web App Testing & Tools course featured in this preview video. Here's what you'd learn in this lesson:
Miško spends a few minutes walking through the clustering application included in the course repository. The user interface allows users to change the dataset's size, distance, and minimum cluster amount.
Transcript from the "Clustering Application" Lesson
[00:00:00]
>> So I found this data set online somewhere, and it shows all of the accidents that occurred in Chicago. And what I find interesting about it is that recently I did some work with machine learning and I learned to use a clustering algorithm. And so I just figured, hey, it'd be fun if I could just cluster the data from all the accidents and you can You can see maybe something interesting happens in here.
[00:00:22]
In this particular case, you don't really have anything interesting because it turns out that if you're in a center of the city, then it's more likely to cause an accident. So if you move the buttons a little bit, I did, I think if you just change the value here, it just takes a while because it's a large data set.
[00:00:40]
Click on the button. Yeah, it did click on it. It's chewing through it. What I find is interesting about it, if you tweak the parameters of clustering and how much data you have, it essentially shows you all the intersections in a city, right? So guess what? Intersections are dangerous.
[00:00:56]
But I think it's kind of interesting that you can kind of play with the data. Anyways, we're not here to talk about the data, but I thought this would be an interesting kind of a background for the kind of test that we built. What is a cluster? Good question.
[00:01:07]
So cluster, so let's go back to the previous view right here. And each dot represents an accident, right? And so x and y are the position, right? So you can see the map of Chicago kinda over here, and you can kinda see the lines which represent where people drive so this is the avenues they have inside of the city.
[00:01:28]
And each that represents an accident. And so well how do you decide if something is part of a cluster or not? So like for example if you look at these three dots in here in the middle, hopefully you can see them, maybe I can zoom in, right. Notice there is big space around them, right.
[00:01:46]
So when you set up a clustering algorithm, you ask it to tell you how much distance does something have to be to be still considered part of the same cluster, right? And for whatever reason, no accidents happened over here and this is why this particular cluster ended up on its own, because none of the points that are nearby qualify the distance thing in there.
[00:02:12]
Anyway, so that just gives you the clustering. This is the background, this is what we're gonna be using for testing. So what we want is we wanna be able to do a unit test Test, we wanna be able to test this particular component, right? This is a component in here that we wanna unit test.
[00:02:27]
We probably also wanna have an end-to-end test because as a application developer I can mess with these numbers here and different things can happen. And so we wanna look at different kinds of testing that exists now. So there's a sub-component folder called clustering. And there's also a component, sorry, there's a clustering component in here.
[00:02:48]
So the routes clustering index TSX, this is basically just a UI that you see. So that file represents this UI right here.And then if you go to the. Clustering, the source clustering, there is a clustering TS and there's a cluster TS, TSX. The cluster TSX is a component that creates, basically that's this component right here, this clustering component right here that you see here.
[00:03:18]
So that's a component that we're using to build up the UI. And then the clustering is actually the logic that does the clustering, that knows how to do all the parts. So we have a piece of code that knows how to do clusters. We have a visual component that knows how to render the clusters.
[00:03:38]
And we have a page That not only shows the clusters, but also has some UI associated with it. So you can see how testing all of these things together might be tricky or things to explore. So let's start from the lowest level. The lowest level is we would like to test the UI of the cluster that the clustering algorithm works.
[00:04:03]
So that's this particular file right here, it's called clustering.ts. Okay, so clustering, this is the piece of code that takes a dataset and figures out how to build clusters, right? I am storing the dataset in just a JSON file, database..json. If you open it, you'll see that there is thousands and thousands of accidents.
[00:04:28]
I'm actually kinda shocked just how many accidents there are. I guess it's over last 50 years, but still. So this loads the data set. Then we have some parameters and some default. So my default The radius is how many meters before the size of the cluster, how big the radius is.
[00:04:47]
So the default I have is 300 meters, which I think is about 1,000 feet or something like that. And because the actual x and y positions of where the accident happens is in longitude and latitude. There's some math here to convert it into longitude and longitude, and each cluster basically will give me the minimum and maximum in latitude, minimum and maximum in longitude.
[00:05:13]
And a collection of data points that represent the cluster, right? And so Location is, if you go in here, a location is just a number that represents the longitude and latitude, right? So pretty straightforward. And clusters is going to be plural. It's just a collection of clusters. So array of cluster is clusters.
[00:05:43]
So there is an existing implementation of clustering algorithm that we're using. So it's called a density clustering that I've already installed. And so the hard part is really just getting the data in the correct format for this algorithm to run. So typically, we're not testing whether the algorithm is correct.
[00:06:07]
We're just testing if the data is being fed into the algorithm in the correct way. And so you can see that we create the algorithm. And we then have to map our data set into a format that the algorithm wants. We do some performance numbers in here and then we let the algorithm do its magic right here.
[00:06:28]
Then once we have that, we have to go and convert it into format that we can use. So this piece of code over here just converts it in the format that's usable to us, right? And so the test we really would be interested in is testing this particular code to see if it works.
[00:06:48]
So let's get to it.
Learn Straight from the Experts Who Shape the Modern Web
- In-depth Courses
- Industry Leading Experts
- Learning Paths
- Live Interactive Workshops