CPU Basics & Speed Limits
Check out a free preview of the full Bare Metal JavaScript: The JavaScript Virtual Machine course
The "CPU Basics & Speed Limits" Lesson is part of the full, Bare Metal JavaScript: The JavaScript Virtual Machine course featured in this preview video. Here's what you'd learn in this lesson:
Miško walks through the basics of a CPU, including its registers and memory operations. The limitations of CPU speed and the techniques used to make CPUs faster, such as increasing clock speed, parallelism, and adding more cores.
Transcript from the "CPU Basics & Speed Limits" Lesson
[00:00:00]
>> I have a model of a CPU, right? And we are gonna import memory from a separate file, I'll show you the file in a second. Actually let's go into the file. This is a super simple memory. All it says is, if I hover over, you can see that this happens to be 100.
[00:00:18]
I just randomly assign numbers to these instructions, but these are just numbers, right? And so basically, it says a 100, 5, 111, 3, 4, 1, and then 0, right? And so to CPU that 100 means load a constant into register0, which means, read 5. Again, the numbers are made up, right, like just just kind of prove the point.
[00:00:44]
Different architectures basically have different meanings for these numbers. So this one loads 5, and then it says Register1, load 3, and then it says add Register 1 to Register0. So now you should be ending up with 8 inside of Registers0, right? So if I go and run this code, so what you can see is that in memory we have these numbers, right?
[00:01:06]
And these are the registers that exist in the CPU. I've kind of split them out for you here. So you can see program counter starts at 0, the R1 and R1 is, everything's zeroed out at the initial state. And then it reads an instruction that says load 5 into R0, so after this, there is a 5 in Register1, load 3 into R1.
[00:01:25]
Now we have a 3 in the Register 1. Then we'd say instruction to add perform the odd operations of 3 and 5 becomes 8, and you can see that there is now 8 inside of Register 0. And then I just have a halt instruction completely stop doing anything you're finished, right?
[00:01:42]
Fundamentally, that's what a CPU is. Now, it's kind of important to understand the nuances of it all. So let's build something more complicated. The first thing that I wanted to show you is, this isn't the real thing because 3 and 5 are constants that are encoded within the program, right?
[00:02:03]
And what this code really hasn't done is it hasn't only shown you how to read or write data into memory, right? And so we're gonna build more complicated versions of it in a second, but I just wanted to show you the basics of CPU, right? So the CPU has these registers, these are, you can think of them as local variables.
[00:02:24]
Flags, we'll talk about later. Here's our program counter, which is a special register that automatically increments to read the next instructions and stack pointer which we're going to talk about later. I print out the memory state, right, you can see I log out some information so that you can kind of see it and it really is just a loop that runs forever.
[00:02:45]
And what the loop does is it says, grab me the next instruction from a memory, right? So take a look at the address of the program counter, go fetch the data, and increment the program counter, right? And now we have a decode step, so now we have this we go and look like what instruction do we have?
[00:03:03]
Is it a HALT? So I believe the first instruction was load R3. So it was load constant to R0, right, was this one right here. And what this instruction really does is just says fine, just take the program counter location again and load it into Register 0, right?
[00:03:20]
And that's how we loaded the information in. And of course, I have some logging information here to kind of see what's going on. But fundamentally, this is the operation that was performed by the CPU, right? You can see the same thing for load constant R1. It's the same exact thing, but this time we're doing it R1.
[00:03:38]
There is a difference R0 is special, and it has these things called flags. Flags will make more sense when we talk about jumps. So just ignore them for a second. And you can just see that all the CPU is basically this huge decode thing that kind of figures out like, okay, so the instruction was whatever, that means I have to perform one of these many operations.
[00:04:01]
And these operations, basically say how to fetch data how to add data how to manipulate data etc. The speed of the CPU, we measure things in kind of gigahertz, right, the frequency by which we can process it. The way to think about it is, at the end of the day, we kind of reached the limit in terms of megahertz.
[00:04:25]
If you look at it through the history, I remember when I was a kid my first computer was at 86xD, it was running at a whopping 5 megahertz not gigahertz megahertz, okay? And over time, I could watch like it goes from 5 megahertz to 40 megahertz, 25 to 50, 100, and every time it went up I was like this is amazing.
[00:04:47]
And every year the megahertz just would go up, and up, and up and then it finally hit about 2.5 to 4 gigahertz, and we've been stuck there for last, I don't know, 10 to 15 years, we haven't really moved, right? So fundamentally, we kind of reached the limit of how fast we can move the electrons around.
[00:05:08]
And one way to think about it is, I'm gonna get the math wrong, so I apologize. One way to think about speed is how fast can electron travel in a given time? Actually, let me rephrase that just to kind of blow your mind a little bit. How fast do electrons move?
[00:05:26]
Anybody?
>> Speed of light?
>> Not even close. [LAUGH] This is a trick question, I'm so sorry. Electrons have weight, and therefore, sorry for going into physics here. [LAUGH] Electrons have weight, and because they have weight, they can never approach the speed of light. Photons have no weight and they can go speed of light, but electrons cannot.
[00:05:46]
Turns out electrons in your devices go extremely slow on the order of centimeters per second. Let that sink in, centimeters per second. So how is electricity instant? Okay, the reason electricity is instant is because electrons don't like each other. And so when you push on this electron here, it pushes on the next one, and it pushes on the next one, it pushes on the next one, right?
[00:06:17]
And the propagation of the pushing is at the speed of light. The electrons themselves are super slow. So there's something called an electron drift. And that's what measures the speed of the electrons. And if you get the electrons going any reasonable speed, running speed, like for humans, how we would walk etc, electrons tend to melt wires.
[00:06:40]
So the speed of electrons is extremely slow, it's the propagation of the wave that is speed of light, not the electrons themselves. Sorry for the digression, let's go back to it. So one way to think about speed is, how far can light or this propagation of travel in a second.
[00:06:59]
And so a nanosecond is this long, right, in terms of how fast you can go places. So nanoseconds, we're talking about, if I remember my math correctly, and somebody should double check me. I believe nanoseconds equals gigahertz. Is that right? So gigahertz means a million flips back and forth per second, right?
[00:07:26]
So a nanosecond would be one-millionth of a second if I get this right. Can you double check me?
>> Gigahertz is a billion.
>> Is it a billion?
>> Yeah.
>> In Gigahertz is a billion?
>> Yeah.
>> Okay, but I believe a nanosecond is about this long in terms of what light travels, right?
[00:07:42]
And I think a gigahertz is a billion of hertz, right? So that's what equals, okay? So here's the problem. That's a gigahertz, if you move it down to 4 gigahertz, so that's one quarter that's about this much, that's about the size of a CPU, okay? This is why we're not gonna go much faster than what we currently are, chips are about this big, right?
[00:08:08]
You can't go much faster, you can't get into a situation where you flip something on one side of the chip, and there isn't enough time for speed of light to get to the other side of the chip to do something, right? So this is the reason why we're hitting this limit.
[00:08:24]
And the reason I'm telling you all this, it's cuz I wanna convince you that, in the future, we're not gonna see 100 gigahertz CPUs. It's just not gonna happen because there's physical limitations that are preventing this from happening. Anyway, so getting back to this. So what we've seen is that the CPU speed has gone up.
[00:08:45]
The other thing we can do to make the computer's faster or CPUs faster is, we can be really clever about doing multiple things at once. So in the past, it took several clock cycles to process a operation. So like an operation like load constantly to Register 0 it might have taken multiple clock cycles.
[00:09:05]
So one clock cycles for the CPU to decode what instruction says, another clock cycle to go to the memory, and so on and so forth, right? And so CPUs got really good at parallelizing things together. So now what you can have is that on any given one cycle, you can actually have multiple instructions getting processed by the CPU.
[00:09:24]
Now you can imagine that that creates problems because instructions have codependencies between themselves, right? You can say, if you have Instruction A, it might need data from Instruction B. And so now you get to the world of speculative execution. And what speculative execution basically says is that, let's say I come across an if statement, I need to make a decision, do I go this way or that way?
[00:09:45]
One way to do it is to say, well, let me just stall the pipeline, meaning that, let me just wait until the previous instruction completes so that I have the data to make a decision about how to go next, right? And you can imagine that, if statements are pretty common, inside of CPUs, and so, as a result, you would stall the pipeline quite often.
[00:10:05]
And so what modern CPUs do is they say, well, how about we guess whether we go A or B. Let's just make an educated guess. So what they do is they keep statistics, and they see like last time I was at this branch I went that way, so let's just go the same way as I did last time, okay?
[00:10:24]
And so that's called speculative execution, and so the CPU starts executing something that may not be correct, right? And so the CPU executes that, and eventually the previous instruction completes and says, yes, the value is this, and the guess you have made is correct. And usually the answer is yes, it's correct.
[00:10:41]
And the CPU can then take the data that it computed and commit it into its official registers and say, yep, that was correct, and let's keep executing. Sometimes the CPU gets this wrong, and basically all the work that it has done has to be thrown away, it cannot be committed, it's the wrong thing.
[00:10:55]
And sometimes we get bizarre bugs like this. What is it called? Scepter? Where you speculatively execute and read memory that doesn't belong to you, and then you get to read memory that might contain a private key or something like that few years ago.
>> That bug blew up the Internet-
[00:11:15]
>> Yeah.
>> Four years ago.
>> Yeah, that by the globe and they're few years ago, right?
>> Yeah
>> The hilarious thing about the bug is that it's working as intended. Modern CPUs all have speculative execution and speculative execution means that at times the CPU can end up executing stuff it's not supposed to.
[00:11:32]
The CPU is good about like not committing it and so you don't get to see it, but you can see a side effect of it. And a side effect of it is that you can see what got loaded into an L0 and L1 cache. And through that indirection it is possible to figure out what data was read in memory that doesn't belong to you.
[00:11:56]
And that was a huge problem that kind of blew up. Anyways, I digress.
>> As someone said, it was called zero-day specter.
>> Zero-day specter, yeah, yeah. So that came on it. So what I'm showing you here is all the different tricks we have for making CPUs go faster, right?
[00:12:13]
Trick 1 is make the clock go faster and we have reached the limit. Trick 2 is do more operations per clock. In other words, parallelism. And I'm gonna argue that we've also reached the limit. Modern CPUs can probably keep somewhere on the order of five to ten instructions concurrently, in memory going through the pipeline of decode stage, arithmetic stage, commit stage, and all the other branching that happens.
[00:12:39]
So again, we probably reached the limit, I don't think we can do much more complicated. And so the only thing we have to us today is adding more cores. And so we've seen that in the recent years, right? CPUs keep adding more and more cores. And so now it's not uncommon for your phone to have 16 cores.
[00:12:59]
I forget the exact number, but I think it's on the order of 16. And what's cool about it is, some of the cores are high performance, meaning they eat a lot of battery, but they're super fast. And some of the cores are slow, but they sip on a battery, right?
[00:13:15]
And so with the CPU, what it actually does, or the applications, usually they run on the slow cores. But then when they need to do something complicated, they switch over to the fast core and do the complicated stuff and go back down. Now, why does all this stuff matter for JavaScript?
[00:13:29]
It matters because, JavaScript has a design flaw, and that is that it's single-threaded. Which means, we can only utilize one CPU at a time. So you might be figuring, hey, let's build, our app is slow, we don't care. In the future the CPUs will be faster, not really.
[00:13:51]
Yes, we have been making amazing progress in terms of when you look at a modern device and you say how many teraflops of data it can process and these amazing numbers. But it's bit of a cheating because those are numbers that are sum of all the cores when all the cores are pushing all at once.
[00:14:10]
But our single threaded JavaScript can't take advantage of that. Now there's ways to make a multi threaded through service workers or other tricks. But fundamentally, we're in the single threaded world and so we get to have one CPU. So even if you have modern computer with lots of cores and whatever you get to use one, okay?
[00:14:33]
So, I kind of digress. Have any questions at this point? Yes.
>> JavaScript being single threaded, is that a limitation of the JavaScript engine that's being used or is that like a more fundamental
>> It's a limitation of the design really, right?
>> Of the design of the language itself?
[00:14:56]
>> I would say that all languages in their nature are fundamentally single threaded, right? But many languages like for example, Go have lots of primitives for taking work and pushing it on the other thread. Now JavaScript does have a primitive, which is a service worker, but that primitive I wouldn't call it a light weight primitive.
[00:15:15]
So in Go this go sub routines are extremely cheap and and easy to work with. In JavaScript, if you wanted to have some work done on a service worker, that's not cheap. That's a lot of work in terms of you as a developer to code, but also a lot of work in terms of the computer to execute, but also a lot of work in terms of the fact that the service worker doesn't share memory with the main thread.
[00:15:45]
And so passing data between, requires a serialization and deserialization step, right? You basically have to convert everything to JSON, send it over as a string, and the other side has to convert everything back. Now, there are new things in there, like shared array buffers and other things that are coming to JavaScript, but as of right now, we don't really have good primitives.
[00:16:02]
Now, in the future, it's possible that the language will evolve and we'll get some primitives that are useful for multiple CPUs. But as of today, the way we are today, JavaScript is single-threaded. And again, I'm gonna say all languages fundamentally are single-threaded, it's just some languages have primitives then make it easy to basically offload work to other things.
[00:16:27]
>> Service workers are native to the browser.
>> Yes.
>> So for like NodeJS, there you have no really option for another thread then, right?
>> Correct. So NodeJS, you have to spawn up separate isolates. What are they call them?
>> Process.
>> Processes
>> Child.
>> Child's or whatever?
[00:16:45]
Yeah, those are separate threads, right? So if you wanna do more work, that's basically how it goes. And the problem becomes, I don't know, can you easily share streams between child and parent? Probably, it's tricky. Okay, so let's go further.
Learn Straight from the Experts Who Shape the Modern Web
- In-depth Courses
- Industry Leading Experts
- Learning Paths
- Live Interactive Workshops