Transcript from the "Hash Tables" Lesson
>> Bianca Gandolfo: Hash tables are made up of two things, some things store your stuff in, right, so data structure, and then hashing function. If you've done any sort of authentication layer, whatever, for your company you're probably familiar with certain kinds of hashing functions. Has anyone done that?
>> Speaker 2: Yeah.
[00:00:35] So we say, I don't know, like name, actually you don't use name. User name is, what's your user name? Some people have a user name they use that they really like and they wanna share. No?
>> Bianca Gandolfo: Like a user name. What's a famous user name? I don't know, anonymous.
[00:01:01] Anonymous user name, colon, anonymous age, colon, question mark, question mark, key value pairs. So we're familiar with that. Under the hood, there's a hashing function that maps the key to an index. And this is the core of what makes the lookup so fast for an object. Cuz we were talking about time complexity, remember that?
[00:01:30] When I was like, accessing or saving into an object is constant time, right? But if we look at for example a linked list or a tree and these other data structures that we've used that are a little more complex. Maybe you're wondering, how is that even possible to be constant time?
[00:01:52] For it to be constant time, it has to know immediately where to go and get that data. Where is it stored? It needs to go immediately to that storage location somewhere, probably in memory, in your computer. How does it know to get there instantaneously without having to like look, cuz we've been searching and sorting through things, how does it get there, right?
[00:02:13] And that's where our hashing function comes in. Our hashing function is going to map the keys which are usually a string to some location, right, some memory address. And so, once we have that memory addressed, it's gonna do something mathematical to know a location, like a distance. And then we can imagine a needle and this is an oversimplification.
[00:02:36] But we have an input that's cat, and then it does some math that says the index for cat is 4,300. And then it'll say, so the location is gonna be 4,300 plus 8 plus 6 plus 0.23. Then it'll just move a needle to that location in memory.
>> Bianca Gandolfo: So this is an oversimplification but this is how we can sort of conceptualize how an object has instant look ups and also instant saving.
[00:03:09] It'll just go directly there instead of having to say is this the location? Is this the location? Is this location I'm going in circle cuz usually we draw it out as like a disc. Those old record players. Yeah, exactly. So you can think of it as a record player, and then if you unspin the record player into a long strip, you can say each memory slot is a centimeter or something, you can find the location like that.
[00:03:45] But it's in a circle. So, other math stuff. But, that's the basic idea here. And same with an array. An array is easier to make that leap, right? Cuz their indices are numerical and so that kind of makes a little more sense. But under the hood, it's actually also a hash table and an array.
[00:04:05] Because arrays are actually just objects, secretly.
>> Bianca Gandolfo: Or maybe not so maybe we all know that now, I don't know, but. I just remember that I was dealing with the DOM API and I had to iterate through an array of inputs. And it was really confusing because it looked like an array and then I tried to for each through it and it wouldn't let me.
[00:04:27] And then I realized that it was actually an object with indices that were the keys. Mm, yeah, and there is also pseudo arrays out there like the arguments object. Have you guys used arguments in a function? That's a pseudo array. You can't do a lot, you can't slice it, for example.
[00:04:48] So there's all these weird fake array things, and they're all objects. Even a real array. A real array is an object. Really the thing that sets it apart is the dot length property and the methods that come with it. Anyway, the idea here is that we're using hash tables under the hood.
[00:05:07] And we're gonna explore and create our own just for fun. Cool, so the first thing we want to start with is just talking about a hashing function. So a hashing function, like I said, it takes an input of any size and it returns some identifier that you can map backwards to it of usually some fixed size, right?
[00:05:34] If we're thinking about that number coming back as like a memory address or a location, right? It's gonna be within some range, right? It can't return something that's outside of our range because then you'll be saving, not on your hard drive, you'll be saving it to, I don't know, somewhere else on your computer that's not useful.
[00:05:57] Or, what else? Yeah, so it's within a range if you're hashing a password or you're hashing, like GitHub will hash all your files. Like a whole file of whatever length will hash into some key. So they could be very, very complex or they could be really simple like my hash here, which is gonna return an integer between 1 and 20 based on a string.