Transcript from the "Pseudocoding a Hashing Function" Lesson
>> Bianca Gandolfo: Now we're gonna pseudocode a hashing function. We're gonna create our own hashing function right now. So the first thing we need to do is take an input, and the second thing we need to do is create an output. And this is, you have all of the creative freedom that you want to explore what it might look like to create a hashing function.
[00:00:19] In real life, you're probably not gonna have to create a hashing function, this is more of a exploratory exercise to understand how a hashing function might work underneath. But people study this kind of thing and write their dissertations on this kind of thing. And there's all kinds of types of hashing functions.
[00:00:36] But let's try our own, sound good? So input, we have to have an output and we want it to,
>> Bianca Gandolfo: We want the input to always the same output. So if I put in ABCD and it returns three, if I put ABCD in again, it needs to return three again.
[00:01:05] So it's not random, it has to be able to map. This is the input, this is the output, that's my dance for it.
>> Bianca Gandolfo: We had a mission. We needed to write a function that takes an input-
>> Speaker 2: Can you pump up the font a little bit?
>> Bianca Gandolfo: Yeah, and always returns the same output,
>> Bianca Gandolfo: Right? However, not the same as the input [LAUGH], right? If we put in two, we don't want it to return two because that would be useless, so it has to be something different.
>> Bianca Gandolfo: Okay, okay, so this is our hashing function. A hashing function is gonna take some input, it's gonna do an operation on it and return a value at the end, right?
[00:02:07] Whether that value is used to encrypt your data so that people who are watching your data go by can't intercept it and understand it in plain text. You always wanna do that with passwords, for example, or if we're using it to return a value that's gonna map to a memory address when we're saving data.
[00:02:37] So those are some different reasons why we'd wanna do this.
>> Bianca Gandolfo: So probably wanna write a function that says myHash, and it's gonna take your input, right? We could put some constraints on it, like must be a string, or whatever. You can put whatever constraints you want on it as long as it's gonna return the same thing.
[00:03:05] So if we're gonna put in, HELLO ARE WE THERE YET?, as an input and for some reason it's gonna return us 764.32a or z, z, okay, z [LAUGH]. And when you do it a second time, you want it to return the same thing every time.
>> Bianca Gandolfo: And then THIS OTHER THING THAT I'M YELLING,
>> Bianca Gandolfo: You want IT to init returns. 43, you want it to, again, you wanna be able to recreate that every time.
>> Bianca Gandolfo: And what these values mean to you could vary.
>> Bianca Gandolfo: Right, depending in what context you're using this hash in function, it's a very broad topic. Okay, so that was our mission.
>> Bianca Gandolfo: So how might we do this?
>> Bianca Gandolfo: What did you do when you were pseudocoding out a fake hashing function?
>> Speaker 2: So I got the ASCII value of each character
>> Bianca Gandolfo: Cool.
>> Speaker 2: And-
>> Bianca Gandolfo: And then what?
>> Speaker 2: At first I was just adding them all together, but then I felt like my output wasn't impressively long enough.
[00:04:42] So then I started multiplying them all together.
>> Bianca Gandolfo: Cool, so-
>> Speaker 3: Yeah, so that's what I-
>> Speaker 2: [LAUGH].
>> Bianca Gandolfo: So you got the ASCII value and added or multiplied them together, cool.
>> Bianca Gandolfo: Awesome.
>> Speaker 2: And then to make it weirder, I returned the hex number. It returned it as hex instead of binary or decimal one.
>> Bianca Gandolfo: How did you do that? Is there just a method that does that?
>> Speaker 2: Called two string and pass 16 in the parameters. So it returns a base 16.
>> Bianca Gandolfo: Cool, that's fancy, I like that. And then returned the hex instead of decimal.
>> Bianca Gandolfo: Cool, so that would do it.
[00:05:33] [LAUGH] I like that. I especially like the third step, cool. Anyone do anything different?
>> Bianca Gandolfo: No? Cool, yeah, these are all legitimate ways of doing it.
>> Bianca Gandolfo: Cool.
>> Bianca Gandolfo: Any questions? So this a hashing function, right? What actually happens in the middle of a hashing function is like a very academic pursuit and I encourage you to research different ways that hashing functions work.
[00:06:13] It's kinda cool, but beyond the scope of today. We just need to understand that a hashing function, you're gonna put in a value and it's gonna return the same value every time. And so here is a diagram of our hash table interface, and how we're gonna work with it for our hash function.
[00:06:33] So we can see here, we are gonna put in a key into a hash table. It's gonna pass through a hash function, then it's gonna map to some location. So Lisa Smith, it's going to hash to one, then John Smith two, and then Sarah Dee maybe 14, cool?
[00:07:02] And this might be like a memory address or a memory bucket, if you will. And so this is just sort of like a high level kinda visualization of how this might work.
>> Bianca Gandolfo: That make sense?
>> Bianca Gandolfo: Cool,
>> Bianca Gandolfo: What else do I wanna tell you?
>> Bianca Gandolfo: Okay, I think that's all I wanted to tell you.
>> Speaker 2: One question.
>> Bianca Gandolfo: Yeah, sure. In the picture?
>> Speaker 2: Yeah, if for instance, if we give some key value, that key value map to the same address, how do we?
>> Bianca Gandolfo: So you're only gonna map the key, not the value. And you're saying, what if it maps to the same address?
>> Speaker 2: Same address, yeah.
>> Bianca Gandolfo: Yeah, collision, they call it a collision. I was gonna wait till later, but I can start talking about it now.
>> Speaker 2: But if it's a function, you can't have a collision.
>> Bianca Gandolfo: Well, so the the hashing function is the thing that is gonna give you your hash value, right?
[00:08:12] Let's say, for example, our hash value's between 0 and 15. And eventually, if you can only return something between 0 and 15, eventually, you are going to get the same value, right? And then the question is, what do you do when your hash function for two different keys is returning the same number?
[00:08:35] Yeah, we call it a collision. And so when we have a collision, how we handle that is we put both of them in the same place.
>> Bianca Gandolfo: And then the specifics of like okay, what do you mean you put both of them in the same place? Varies based on the implementation, and we'll explore one of them.