Introduction to Data Structures for Interviews Hash Table: Usage, Constructor & Insert
Transcript from the "Hash Table: Usage, Constructor & Insert" Lesson
>> Bianca Gandolfo: All right, so the next thing we're gonna do is implement a hash table. So just to remember, the interface of a hash table is that you're going to insert and remove in constant time. The way the magic that makes this constant is gonna be our hashing function.
[00:00:18] And yeah, so when we insert, we send a key value pair through, when we remove, we just send the key. Retrieving is just also a key, and this is the same as an object, right? If you're assigning a value to an object, you need the key and the value, right?
[00:00:35] So you can say that myobject.prop = 2, something like that. So the prop is the key, 2 is the value, so you need both. And then when you're retrieving, right, you say myob.prop, right? So you just need the key in that case. So that is the interface that we're going to be copying.
[00:01:01] So let's just get started with creating an instance, so my instance is called my hash table. This is gonna be an object that has some properties on it. In this case, it's gonna be a storage property that's assigned to an array. The reason we're using an array is because we can't implement an object with an object, so we're gonna use an array.
[00:01:27] Otherwise, that would be cheating. Okay, so let's think about this. So let's just console.log my hash table and see what it looks like. All right, so just like we expected, it has our storage on there. Great, now, let's insert something into myHT.insert. So let's inert a, ('a', 1).
[00:01:59] So a is our key, 1 is gonna be our value. So then I will imagine that my hash table would look something like this. So at some index, let's say 3. We'll get to how this happens exactly, let's say 3. We will save the key and the value here.
[00:02:35] So this is how I would imagine this would go down. And maybe we have some other 0s here. So we have an array initialized with some 0s or undefineds, it doesn't really matter. But when I insert it, I need to add this into my sorts array, it needs to be constant time.
[00:02:54] Our only option with constant time is that we need a numerical index for our array. That's the only way we can do that. Otherwise, every other thing that you're thinking of would require looping through and finding the value. And we can't do that because hash tables constant time.
[00:03:12] So we're storing that there. If we have multiple items, so let's say that we also insert b. And we wanna give it a value 2. And let's say for some reason we have the same hash. So this is how I'm gonna handle the collisions, is that I have an array with two items in it.
[00:03:38] So that's kind of what I'm thinking in terms of how I'm gonna work with this.
>> Speaker 2: Why did you just go smack dab in the middle rather than the front?
>> Bianca Gandolfo: So we'll get to like how it gets its location in a second. But that's a good question.
[00:03:55] Great, okay. So let's get started, and we'll talk about how we get to the specific index. Because that's the really important aspect of a hash table, is how do we get to that index. Okay, so our insert, we need to have our key and our value.
>> Bianca Gandolfo: So,
>> Bianca Gandolfo: So we are going to initialize our index. We need to get an index. And what we're gonna do is we're gonna use our hash function that I implemented for us. And that is gonna take a string and a number. So the string will be hashed to a number between 0 and n.
[00:04:46] Again, a core property of a hashing is for whatever string you input, it's always gonna give you the same output. And in this case, the output is gonna be between a range. And that range it's what we want the size of our hash table to initially be. So I like to start with 25.
[00:05:05] And of course, this is something that you might wanna think about, how much data are we trying to work with? Say we're gonna have a hash table, and we only have two pieces of data ever. And that's just something we know somehow. You don't need to have 25.
>> Speaker 3: Well, but is it gonna take up on your space, the space holders so to speak?
>> Bianca Gandolfo: Not really.
>> Speaker 3: So then it's really more unnecessary.
>> Bianca Gandolfo: Yeah, yeah.
>> Bianca Gandolfo: So we have our tableSize, which is gonna be 25
>> Bianca Gandolfo: So we'll just keep that in mind, and so great.
>> Speaker 2: Why did we hard code that in?
>> Bianca Gandolfo: We just are initializing it.
>> Speaker 3: So doesn't feel so consistent.
>> Bianca Gandolfo: Yeah, you could parse it.
>> Bianca Gandolfo: You want in.
>> Bianca Gandolfo: Probably better, I like that better. Okay, so now we want to hash our key to an index. So we have our key and then we're also gonna pass this.tableSize.
>> Speaker 3: Won't you have an underscore before, like in front of tableSize?
>> Bianca Gandolfo: Cuz I forgot, thanks.
>> Bianca Gandolfo: Okay, so this.storage [index] = value. So basically we wanna do something like this, right?
[00:06:42] Where our key, which is a hash to an index between 0 and 25, is going to be assigned to the value. However, we need to think about collisions, which is what happens in the case of, for example, what if b runs through a hash function, and also returns 0, 1, 2, 3.
[00:07:05] And also returns 3. We need to make sure that we're not overriding and losing data. So we need to create an array and,
>> Bianca Gandolfo: Add these values. So I would say something like if. So if this already exist,
>> Bianca Gandolfo: So if this already exists, then that means we have an array already.
[00:07:39] If this doesn't exist, then I wanna say this._storage[index] = an empty array. So I'm initializing like a container array inside of our storage array. So right now, it will look something like this. Storage looks something like this. Let's say our index is 3 and there's 25 or something, maybe.
[00:08:14] Actually, it just looks like this. We'll go with this.
>> Bianca Gandolfo: So that's what that does. So we're just initializing and it's empty. We haven't added anything yet. So now we have this set index is gonna return this array. So what I wanna do is push. I'm gonna push another array with the key and the value.
[00:08:45] And so what this does is it will push something like this.
>> Bianca Gandolfo: And the reason is, and so once we do this again. So if that doesn't exist, then we initialize it. Otherwise, we push it, hm?
>> Speaker 3: Is it just me or that's weird syntax, where's your curly brackets after if statement?
>> Speaker 2: You don't need it if it's on one line.
>> Bianca Gandolfo: Yeah, I just one-lined it, but you could do like this if you wanted.
>> Bianca Gandolfo: Yeah, they do the same thing though.
>> Bianca Gandolfo: Okay, so we're initializing that. So now, the second time that we call it, we have this already.
[00:09:33] We hash the key of b, and let's say it returns the same 0, 1, 2, returns 3. So what we need to do is push the key and the value as an array. So here's the array, here's the key, and here's the value. Okay, so that's what we just did, we're handling our collision.
>> Bianca Gandolfo: So collision is when your hash function returns the same index for a different key. If you didn't handle a collision, what we would do is instead of all of this,
>> Bianca Gandolfo: What we would simply do is we would just have the value there. So the first time when we look up 2, and we get 2 and we pass a, the value will be 1, so then here this will be 1.
[00:10:36] The second time b hashes, and then it would overwrite it to 2. And so this is very dangerous. You don't wanna lose any of your data in a data structure. So that is why we're doing it like this. And we're also storing the key because when we do a lookup later, we wanna remember which one is associated with which.
>> Bianca Gandolfo: Okay, great, mh-mm?
>> Speaker 4: There's a comment from chat. If this _storage index is 0, then that would coerce the false.
>> Speaker 4: If the value of-
>> Bianca Gandolfo: It should be an array or not an array.
>> Bianca Gandolfo: So if you do something like you just say new Array[val).fill
. In this case, were you initializing your storages and arrays with a bunch of 0s, then, that would be a problem.
[00:11:52] No, actually no, it is still the same. Cuz 0 would still be false.
>> Bianca Gandolfo: Okay, but I don't see the need to initialize it with 0s or anything. Okay, so where were we? I think this is gonna work. What do you think?
>> Bianca Gandolfo: Guess we can try it.
>> Bianca Gandolfo: All right, let's see what we got.
>> Bianca Gandolfo: Nice, so we don't have a collision, but we do have our nested arrays.
>> Bianca Gandolfo: And you can see this one 16 empty items, a couple empty items in between.
>> Bianca Gandolfo: Okay.