Transcript from the "Running Benchmarks" Lesson
>> All right, so how do we start? What do you guys think is slow? Can anyone give me like a guess?
>> Creating objects.
>> Okay, creating objects I love that by the way. That's like my favorite. That's the thing I love to harp on creating objects. What objects?
[00:00:40] We want to only control what we can control right now. So we'll just keep that. We're not like thinking about edge deploying and how we get servers closest to groups of people are organizing web sockets with relation to where they're coming in. So we're just not gonna think about that level of speed.
>> Promises, okay, beautiful, beautiful. It's like you guys know me all right, yes, objects and problems is both great answers. So usually the best way to find out what is slow is to measure. So I highly recommend you don't guess but instead we measure, right? That will give you the best answer.
[00:01:17] So a couple things, don't forget to change your U limit because at some point you will exceed 1024, which is the default on Pop OS or whatever your OS is. And then you'll get a bunch of errors trying to create a bunch of test clients. So just remember that, adjust your U limit so you can have lots of files open.
[00:01:33] First thing we're gonna do is actually look at a little utility I've made to help kind of visualize what's happening on our server. And so to be able to launch this, it requires three separate terminal tabs. [INAUDIBLE] there we go, I'll open that up next, okay? So CD shooter and then we need to launch the program that actually runs the server.
[00:01:55] Then we're gonna do it again, where we actually are gonna launch something to launch the client. And we're gonna do it one more time to something that actually launches the kind of perf server that looks at the logs and tells you what's happening inside of the server. Okay, so the first one, we're just gonna build node with TSC.
[00:02:14] Yes, I didn't make a watch command, okay? Cuz I don't wanna keep on running just wanna running some of the times. So Okay, and so TSC and then node dist server.js and we're gonna set a log path, this is where it puts all the logs. It's gonna log things that are happening on your server into this file.
[00:02:30] So I'm just gonna copy this command and this will launch the server. So I'm gonna go here and if it doesn't launch for you, I'm sorry, it worked on my machine. All right, awesome. Next, we're gonna want to launch this. This is a simple rust program that does the following a we run in release mode.
[00:02:48] Second, we're gonna have 500 connections created to the server. Third, we're gonna play a whole bunch of games. I set it out to play a whole bunch, so it can just keep on running. And then this T right here is time between connections. So a connection will be created to the server every two milliseconds.
[00:03:03] So that way we have a better distribution of connections as opposed to make 500 at once, lose 500, make 500 at once, lose 500. So at least these 500 will be created over the course of a second, as opposed to all at once. So copy that and here, I don't like it.
[00:03:20] Actually right there. Let's do the original one. I'm gonna make sure I have the exact same command as you guys grab that, put that there, and then we'll start the client. Yeah, the client there we go, the little testing client will run rust. So it's gonna just start running.
[00:03:39] It's just gonna start Start running against the server my computer will start going making sounds all that fantastic. And then we'll have this whole visual server which this one's written in. Go with HTML, by the way. It was like the first time I was playing around with go and HTML, so it's kind of a largely ugly.
[00:03:55] So enjoy all you want. And so I'm gonna go over here I'm just gonna run this command that'll. Create a nice little echo server for 2068. And so if you go there, you'll be able to see what's happening on the server. So I'm gonna jump over here, I'm gonna go to localhost 420, sorry, this tiny font, 420.68.
[00:04:14] Pretty simple. You're gonna simply pass in a file that we specified as our log path. So our log path right here was temp slash testing. So I'm gonna go Temp testing and that's gonna tell you all about it. Now typically I actually like to have this thing kind of split right here just makes it a little bit easier to see on this page as you can see right now we've played about 500 games.
[00:04:39] We have this ratio find that here in a second. We have about a million on time ticks meaning 1516 or 17 on time ticks. We have about 400,000 ticks that went over that went 18 or more milliseconds, and we have about 67,000 ticks that were under. So if I refresh this again, it continues this way the ratio is simply under plus over on top of the sum of all possible ticks.
[00:05:05] So what percentage of ticks are over plus under? And so 31% of the time we are not behaving correctly. So this isn't a good place to be. I'm not happy about it. Your boss is probably not happy about it. Whatever number you're getting is probably different than the number I'm getting getting, and that's okay.
[00:05:24] Then also you can look at like say the distribution of the interval as you can see, we have a lot of 16 and 17s, we also have a lot of 18s. We also have a lot of 15s. We have some 19s. We have some 14s, and you see that it kind of goes out a little bit.
[00:05:37] There's some unusual ones way out here. There's 1767 milliseconds to take ticks that happened. And so things are happening obviously On here so, there you go. So this is one way in which we can look at performance and we can watch this happen over time. And so what I'm gonna do is I'm actually gonna cut this and I'm gonna cut that and I'm gonna cut my, both my client and my server for a second.
[00:05:58] And I want to run a few of them. And so, what we're gonna do is I want you to name it in such a way you're not gonna lose it, so you could do it like temp, master 500, 1,000, 1,500. However you want to do it, but let's try running on your own your numbers may be a little bit different than mine.
[00:06:15] Just choose three numbers that are progressively larger and we're gonna kind of look at each performance as it goes and maybe see how does this the server react? And you're gonna keep running with the same number of connections as we make these improvements and see do we make it have less or more?
[00:06:29] And we'll be able to look at this data in the past. So I'm gonna just start right here and I will simply run this thing as I don't like to type. So I'm just gonna do master 500, there we go. And we'll run this thing [INAUDIBLE] wrong window I like to run it in this window.
[00:06:45] I have OCD about which windows I like to run things? And so this will run the server, then I'll run the client. I'll make sure I'm doing 500 connections, and we're gonna run this one. So this one's gonna collect information. We'll let this one go for a little bit.
[00:06:58] So I'm gonna replace this right here with a [INAUDIBLE] 500, I let this go until we get like maybe 1 to 2 million tics, and then I'll repeat it again with 1,000, then I'll repeat it again with 2000. We'll see how it goes. Probably go to about how about 1000 games played, two rounds of it.
[00:07:20] There you go, we're up to after that. So I'm just gonna kill that. I'm gonna kill both the client or the server and the client and I'm gonna do it again, except for this time I'm gonna do 1000. Be careful if you run the same log file twice, you're gonna get mixed results.
[00:07:34] So careful with that. So I'm gonna run it at a 1000. There we go. I'm gonna go over here. I'm gonna rerun this one, but I'm gonna make sure I put my 1 value as 1000. All right, so now, when we run the server, we're running it with 1000 connections.
[00:07:47] One big thing about running a lot of perf up, look at that. I didn't even listen to myself. I'd even set my own you limits to be proper here. Look at that I got had by my own problem that I warned you about. It says right there, too many files open.
[00:08:00] Just feels so bad. All right, I'm gonna remove my previous log file, and I'll run it again, start that up, start a nice, fresh, clean run here, because I don't want that to be a part of things, and then run this. And so now we're gonna let that thing run, and I'm gonna open up a new tab, and I'm gonna change this out to 1000.
[00:08:24] And so we can kind of start seeing this thing run nicely. As you can see, it's looking a lot worse. And my laptop 's a-spinning. Of that ratio, we're sitting at 38% as opposed to 31%. So again, just let this go for a little bit I'll try to get two rounds.
[00:08:54] Again, doing a lot of the perf stuff, you find yourself doing a lot of waiting and running. It's just part of the game because you want to try to get some level of, You want to be able to at least have some confidence moving forward in the numbers you're looking at.
[00:09:11] And I just wanted to have these for historical reasons. I think it's really fun to look back on numbers to see how good or bad you're doing. So I'm gonna kill that and we're gonna do it one more time, except for I'm gonna put 2000 here. So I'm gonna run with 2000.
[00:09:22] 1000 on the server I'm gonna run with 2000 on the client. I am making some assumptions that you guys can follow along you know how to use your terminal well you know how to use the browser localhost make sense. I just wanna make sure it cuz it's it's difficult to always try to prepare for every situation at 2000.
[00:09:42] Look at that. This is not looking good at 2000, yes Mark.
>> Someone's asking if you could clarify the scale.
>> The scale,
>> Yeah, just like the chart.
>> Yeah so I'll clarify the chart one more time. So the chart is very, very simple. This first one right here is ticks on time that is 15, 16, or 17 millisecond ticks.
[00:10:04] This right here, take overruns are 18 or more milliseconds meaning a game loop that took longer than 17 milliseconds is considered an overrun. A game loop that took less than 15 milliseconds is considered an underrun. The ratio is simply the addition of I should have not called it ratio, I should have called it percent late.
[00:10:24] And so that would be 1.185 million on time plus 466,000 over plus 81,000 under on the bottom, on the denominator. I'm just normalizing it according, I'm creating a nice little eigenvector, if you will, of this guy. Sky right here and normalize the value nicely, so it's 31%. And so when we increased to a 1000, it went to 36%.
[00:10:49] Now that we're at 2000, which I've probably played a lot of games. Well, it's not looking fantastic, right?
>> We are looking at about 62% of the time. We have values that are over. And that makes sense cuz it almost looks like only 1/3 of the frames are actually on time.
[00:11:05] In fact, it looks like we even have we actually have more frames that are over running that are actually on time. If we look at our distribution, it looks terrible, right? We have a whole bunch of numbers pegged at 0, meaning that it took so long on one side that they just had to try to run frames as quickly as It's possible through the system to try to catch back up.
[00:11:26] Which my ticker could have been probably more smart and not done anything, but we didn't do that. We're just, it's a dumb ticker. It's not well-written. So there we go. So I have some nice numbers now. I have, what does it look like at 500? What does it look like at a 1000?
[00:11:40] What does it look like at 2000? So exciting, right? All right, so we have these nice log files, hopefully you guys have done. This is kind of like a very crude stepladder test. If you don't know what a stepladder is, all you do is you create a series of tests to see how does this thing perform as we increase input.
[00:11:59] And that's gonna give you a good idea as to what your server or what your program's gonna do as it gets more complex. Measuring something at a singular point is really difficult because you don't really know how fast it grows. You don't know what the behavior of the program is cuz a lot of times you can run nice for a while and once you cross some sort of threshold, then it just like takes off and becomes really, really bad.
[00:12:20] So it's very good to make sure you do multiple measurements