Transcript from the "A Bug Story" Lesson
>> Douglas Crockford: I made a bug once and I need to confess. I need to come clean and share this with you. It was in 2001.
>> Douglas Crockford: And I was writing in Java, I wrote one of the first JSON parsers and it was part of a reference implementation to show people how easy it was to write a JSON parser.
[00:00:27] And it included this statement, which created an index variable, which counted characters into the string or a stream that was being parsed. So if there were a syntax error, we could tell you at what character position that error occurred. And an int is big enough to cover what about 2 billion characters, which is pretty big.
[00:00:52] That was 2 gigabytes was a big disk drive at that time and the way I was using chase and they never had a message that was bigger than a couple of K, so I thought that's a lot of headroom. That's being pretty generous. I'm future proof, this is good.
[00:01:11] So a couple years ago, I got a bug report from somebody who had passed a JSON text to the parser that was several gigabytes in size that contained a syntax error past 2 gigabytes. And the error message that my parser provided was wildly wrong and they looked at the code and they very quickly figured out why it was wildly wrong, and was because I said, int.
[00:01:39] So I hate int. Int's terrible. Int, I think is one of the worst ideas in the history of programming and the reason I hate int is because what it does on overflow. So, let's talk philosophically. So, what should happen if you have a value that's too big to fit in a cell of memory?
[00:02:07] What should happen? There's one school of thought that says, well, the CPU should haul it. That's kind of old school. You're not going get 99s that way, but at least you're guaranteed you're not going to create any bad results. Death before confusion. There's another school that says, we'll have an interrupt.
[00:02:29] We'll raise an exception. We'll go to some other path, which says something. [SOUND] Do some about that, give some warning and prevent the original operation from falling through. That's reasonable. Another approach would say, we'll provide some sentinel, some non-value and put that in memory instead. So when you go back and look for it, it'll say, don't know what you're looking for, but it's not here.
[00:02:54] Another approach might say, take the largest thing that does fit and put that in there instead. It's called saturation. That's a reasonable thing to do in computer graphics and signal processing, but you don't wanna do that in financial applications. No, no, no. But suppose what you want to do is maximize the likelihood in effect of errors, I think what you wanna do is throw away the most significant bits and don't tell anybody and that's what we do and I don't think it makes sense.
[00:03:27] So, why do we do that? It's because it made sense in the 50s. So in the 50s, we made computers out of vacuum tubes and vacuum tubes are big and they consume a lot of power and they get hot and they burn out quickly. So the more tubes you have in your ALU, the more it costs to build, the more it costs operate, the more it costs to maintain, the shorter the mean time to failure.
[00:03:56] So if you can figure out a way to take some tubes out, that's a huge win and some genius and I do believe he was a genius. Figured out that if we use complement arithmetic instead of signed magnitude, we don't have to implement the subtract circuit. All we have to do is put a compliment, which is almost free in front of the add circuit and ignore the overflow.
[00:04:20] That was brilliant. That was a great idea at the time. But since then, Moore's Law has gone crazy with the number of devices that are possible on a chip and we have not reconsidered that decision that was made in the 50s and I think we are way, way overdue on that.
[00:04:37] To make this even worse, memory used to be extremely scarce and expensive. Machine might only have a couple of K in it. The Atari 2600, if you remember, only had 128 bytes of RAM. That was all of the ram in the entire machine. So, you were really careful in allocating memory to numbers.
[00:04:58] And if you could figure out a way to get two numbers into 1 byte, you would do it, because you didn't have very many of them and our programming languages are still in that mindset. So for example, in Java, you've got byte, char, short, in, long, float and double just as the main built-in types.
[00:05:15] So every time you create a property or a parameter or a variable, you have to ask, which one of those? You've got to pick one. And if you get it right, then okay. But if you get it wrong, the program is gonna fail and it's not gonna fail immediately necessary.
[00:05:37] In fact, it might fail some time in the future and tests cannot find that, because your tests all assume a particular data model and they are not sufficient for finding these sorts of errors. Tests fail on this stuff and there's no value in having picked the right type, because I saved 16 bits on this variable.
[00:06:27] That means it's impossible to make an error by picking the wrong number type and that's a huge win. I'm totally in favor of that. The only problem is, it's the wrong type and you know it's the wrong type, it's because 0.1+0.2 is not equal to 0.3. I've heard from some people who say that doesn't matter and they're absolutely wrong, that matters a lot.
[00:06:52] So, how did this happen? How did we get to this place? Well, again, we go back in time. So in the 40s, when the first Von Neumann machine start coming online, they are integer only machines, but most of the programmers are mathematicians and they're trying to figure out how to do real computation and it's hard.
[00:07:16] They're trying to do stuff with scaled integers and it's a lot of work, and it's error prone, and someone who is really smart figures out floating point that we will have two numbers per number. One is the number itself and the other is a scale factor, which tells us how many positions to move the decimal point.
[00:07:34] And then we can just give it to a subroutine and the subroutine will figure out how to add these things, and it worked, and it made programming much easier to do. Unfortunately, those libraries were really slow. So when we get to the 50s, there is now interest in putting floating point into hardware, but we're making stuff out of tubes and it's hard to do.
[00:07:57] And some genius and I do believe he was a genius figured out that if we use binary floating point instead of decimal floating point, we don't have to implement a divide by 10 in order to do scaling. We can just shift one bit, which is free. And so, they did that.
[00:08:13] Now, that worked great for scientific computing. Because in scientific computing, your lower order digits are probably wrong, anyway. So the fact that you can't exactly represent the decimal digits is not all that important, but it doesn't work for business processing, because they're adding up money and they need to be exact.
[00:08:35] They have to give the sense exact. So there is this division, there is scientific computing and there is business computing and they use different hardware. They use different programming languages. So when you order your mainframe, you either get the floating point package or you get the BCD package for doing your COBOL processing.
[00:08:55] And that's sort of the way the world worked until we kind of came into the modern age and COBOL dies off and Java is the successor to COBOL, but Java comes from the Fortran school not from the COBOL school. It doesn't do a good job of dealing with the the business types, but that's kind of the tragedy that we're in now.
>> Douglas Crockford: So, I propose to fix it and this is my solution. I called DEC64 and it's a 64-bit quantity containing two numbers modeled very much after what they were doing in the 40s. So I've got one number, which is a coefficient. It's 56 bits long and I've got another number, which is the exponent, which is 8 bits long and it's really nice and it works, because of this 10.
[00:09:51] So, it's really easy to see what something is going to do. It's just this formula, really really simple. No complicated packing or unpacking. I put the lower 8 bits in the bottom, because I'm Intel architecture, we can unpack that for free. So a software implementation can be very efficient and this gets us 16 or 17 decimal places, very accurate.
[00:10:14] We can do exact business processing. It's not data doing scientific processing, either. So my proposal is that this be the only data format in future application languages, that we just have one number type and it be this one. So to prove it, I have a software implementation available in DEC64 assembly language, that's on GitHub.
[00:10:38] If you're curious about this, you can check it out and and it performs pretty well. So if you're adding integers in a software implementation, it can add two integers in five instructions, which is pretty good. It's maybe four more than you'd like, except that with those five instructions you also get overflow protection and you get nouns, which are nice things to have.
[00:11:06] I think essential things to have. In a hardware implementation, adding integers should happen in one cycle, which means we don't need to have ints as separate type in order to get performance. We can get performance and we can get the range of values that we need in one number type.
[00:11:26] So, my goal is to convince everybody in the world that this is the one number type to use in the next system. Now it turns out, I don't actually have to convince everybody in the world. I only have to convince one person, that's the man or the woman who designs the next language.
[00:11:46] If I can convince that person that they want a language with one number type that works well with humans that does arithmetic the way we were taught to do arithmetic in school, so that we won't get confused, then we'll win.