Transcript from the "Data Structures Overview" Lesson
>> All right, I wanted to start with a quick primer on DSA, data structures and algorithms. This is the few minute version of what is full courses for other people, right? So, I'm definitely not going deep into detail. We will get into many of these topics and come back to them as we code along, but I just wanted to present for those of you that maybe don't know some of the terms, just so you have a little bit of basic familiarity with the terms, how they relate to each other.
[00:00:30] These are the sorts of terms that you go and google and find the Wikipedia pages for. And just read the first three paragraphs of any Wikipedia page because beyond that, it's probably too much depth for you. But the first three paragraphs of any one of these Wikipedia pages is probably enough to get you a familiarity like, okay, I know where that's gonna fit into the overall scheme of things.
[00:00:52] Some common data structures to be aware of, these may be familiar to many of you, maybe not familiar to some of you, I just picked a few of them that you definitely wanna have on your radar screen and have done some at least minimal research on. Arrays. We talked about that, the concept of sequentially ordering a list of values, they go by lots of different names, lists, and arrays, and vectors and, there's all kinds of subnuances and caveats underneath each one of these.
[00:01:22] But arrays are sequentially, generally indexed order, numerically indexed collections of values. Stacks are most always built on top of arrays, although they don't have to be built on top of arrays, they can be built on top of other primitive data structures. But a stack is kind of like an array with an additional property which is that we only put on to the end and we only take off of the end.
[00:01:47] So if I start out with two values in a stack, and I want to take one value off, I don't take the first item off, I take the last item off cuz it's last in first out, so called LIFO. Last In First Out is what we have with stacks.
[00:02:02] Stacks are immensely important to understand. We use terminology like pushing and popping, we push on to a stack, my favorite visualization of that is the stack of pancakes. I got four pancakes, that's not enough. Give me a fifth pancake, you put a fifth pancake on top, okay? And then I eat a pancake, and then we've popped it off the stack, and now I'm back to four, and now I need a fifth one again.
[00:02:27] Just tell the waiter, always keep five pancakes on top. Okay, so there's your stack, right? And this is immensely common in programming. We'll see this over and over and over again, you'll hear that word stack over and over again used, so be familiar with it. Queues, similar to stack, but it's a first in first out instead of last in first out.
[00:02:47] In the queue world, again, often implemented with an array, can be implemented with other things like linked lists and so forth. But in a queue, the first thing that we put in, no matter how much else we put in, the first thing that we put in is the thing we're gonna take out.
[00:03:02] Queue being like a line. Just think of it like a line at the, waiting to get on a roller coaster. Generally, we don't let people cut in line, so the next person in line is the one that gets to get on the roller coaster. Just think of it like that.
[00:03:14] And again, we use those a lot, we'll see queues today in use in some of our algorithms. These are kinda like Lego pieces that you end up putting algorithms on top of. So data structures are the mechanisms by which we implement our various algorithms. A couple of others, we've got sets, so what is a set?
[00:03:31] Again, often implemented as an array, but can be implemented with other primitive data structures, sets are basically like, I want an unordered collection that is unique. unordered is important. In arrays they're ordered. So if you were implementing a set with an array, you would actually have to go to a little bit of extra trouble to make it seem as if the array was set like, because two sets with the same values in a different order should be equivalent.
[00:04:00] And two arrays with the values in different orders would definitely be different, right? So you have to go to a little bit of extra trouble, and that might be why you didn't actually implement a set with an array. Maybe you implemented it with something else. Again, lots of choices there.
[00:04:13] But sets are an unordered collection of unique values. Extremely helpful. We will implement traversal algorithms and talk about traversal algorithms, where we need to remember that we visited something before. You throw it into a set and then you ask the set, have I seen this thing before? Extremely useful.
[00:04:37] Could you do that with an array? Of course you could. We've got array, we've got array includes, or index have or whatever, but that's not the right tool for the job, the right tool for the job is set. Why? Because you would not wanna just keep throwing the same value into that array over and over and have it grow and grow and waste your memory.
[00:04:52] A set's gonna make sure we only have one copy of the thing. I'll talk about map in a moment and then we can answer the question about, there's a question being asked about, when would you use a map versus set? They're very different data structures so, you would almost never use one interchangeably with the other.
[00:05:43] And so, you're probably familiar with these sorts of things, JSON, objects, or how most people kind of visualize this, but an object is simply a place to store some data in a collection and give it a unique name, generally a string. You technically can use numbers cuz they just, string of I is the number for the key but, usually you stick to numeric indexing on arrays, and string indexing on objects.
[00:06:10] Now what's a map? A lot of people get confused object versus map. Maps are very similar, but maps under the covers fundamentally say we can use any value as our key not simply a string. And that is why there it gives rise to the more formal concept which is called a HashMap, what does that mean?
[00:06:29] There's other terms that you'll find as you google around for this but, a map has to take any value, and if it's a string or a number, it's really easy, right? We just know where to store stuff if we've got strings and numbers, or at least we can think in our minds, I know where I would probably store something if the key was a number or a string.
[00:06:49] But what if the key is a function? Function's a weird value, what would I do? Would I string of I the function or whatever? So there's this notion of hashing something. A hash is a one way mathematical transform on a value, that produces ideally a unique value. Unfortunately, there's no perfect hashing, so we always have what's called collisions.
[00:07:14] And the job of an implementer of something like a HashMap is, what do I do If two different values I push through some transform, and I get a hash to use as the key, and both values end up with the same key? They say end up with the same hash, what do I do?
[00:07:31] And so there's lots of different strategies for how to deal with that. You have these kind of exotic data structures where there's a hash key mechanism used to do the single slot and that in it, and at any given slot, then it's simply an array, a list of all the values that had a collision in their key or something.
[00:07:51] So, it's lots of different complexities they're not relevant to our discussion in this workshop, but now understanding maps and sets, you can see that in a set, its key is completely irrelevant. It's simply the inclusion in the data structure, that's the only question we want to ask of the data structure is, does it have it or not?
[00:08:13] I don't care where it is, I don't care how you stored it, I don't care how you implemented it, just tell me, do you have it or not? And ideally, also there needs to only be one of it, right? Don't ever let there be more than one of it.
[00:08:26] So the kinds of questions we'd ask, like inclusion or exclusion, is it there or is it not, we'd ask that of a set, you technically could ask that of a map, like does it have the key, but again, that's not the right data structure for that kind of question.
[00:08:41] So you often would not, it's not to say that you couldn't and sometimes we do stick a key into an object to represent that we've seen it or not seen it or whatever but, usually only use something like a map or a HashMap when you're going to retrieve it by its index.
[00:08:58] In a set you don't need to retrieve it because you already have the value, and you're asking, is the value already in there? But with, something like a map or a HashMap, you wouldn't have the value but you would have the key, you would have the index, the property name, whatever, you would have that, and you would be saying, tell me what the value is for this thing.
[00:09:19] So it's more of a lookup, as opposed to an inclusion exclusion question. Just different kinds of problems that we solve they're both useful, we do use them for different things. Moving on, we've got trees and graphs, trees are extremely common we use them quite a bit. And many people if pressed, can't actually tell you what's the difference between a tree and a graph.
[00:09:40] They are very different, and some of you listening probably are like, yeah, they're used for very, very different things, they have different properties. What's interesting about the algorithms that we are gonna look at, and the algorithms that you're likely to run across, is that many times there is a version of the algorithm for kind of each of these different classes of data structure.
[00:10:03] So there are algorithms that work on trees, that have an adaptation, sometimes just a one-for-one, and sometimes a slight tweak, an adaptation that works on a graph. But graphs are gonna throw more complexity in, so the algorithm has to generally be a little bit smarter to handle, what am I doing with a graph?
[00:10:19] And there's lots of different kinds of graphs, there's directed and undirected, there's cyclic and acyclic, there's all kinds of stuff. But generally, if you learn an algorithm in a tree algorithm, there's at least some way to apply that or translate that concept over into the graph world. So, what's the difference between a graph and a tree?
[00:10:38] A graph is going to allow all of those sorts of multiple connections. So, you can have multiple parents, one node can have multiple inbound arrows if you were drawing it out in your mind, that's a graph. In a tree, the relationship is always unidirectional. A child only has one parent, it never has two parents.
[00:11:00] So there's no such thing as cycles, trees are inherently directed from parent to child relationship, graphs are not inherently directed, meaning that the arrows have a certain way that they go, but you can create directed trees. I mean directed graphs, or you can create kind of undirected, uniconnected, universally connected graphs.
[00:11:24] So, some words that I've been spouting out, words that you see here in the slide, these are things, if any of them are unfamiliar to you, just make yourself a note, I should go and read the first few paragraphs on a Wikipedia page just to kinda get a little bit more comfortable with them.
[00:11:39] We'll see them in practice as we go. There was a question here, yes?
>> Yeah, so if we look at the categories, the array, set, queue, set, object, map, and tree graph, it's another way of thinking about the differences between those groups. You have a change in the permissiveness, say, of ordering and indexing.
[00:12:01] So, Array, stacks, queues, they're pretty friendly to what you throw at them. Sets, objects, and maps are going to be much more reactive to what you've tried to put in and you have to know kinda how they're going to behave and what you want out of them. And trees and graphs are an absolute mess if you don't go in with a plan.
[00:12:23] Because they'll do anything you want and the tree will turn into a graph, so we have to have a way of accessing the, we have to kind of an expectation for ourselves to access the information, cuz the data structures provide less rules as you move from one to the next.
[00:12:40] Is that a fair interpretation here?
>> I don't disagree with any of what you said, I think it's a pretty reasonable way of summarizing things. I won't take credit that I had any deep thought exactly in exactly how I categorized these or whatever, but there is definitely a difference between whether order matters in the data structure or not.
[00:13:01] That's one of the big characteristics that will differentiate data structures, and you see that array, stacks, and queues order absolutely matters in set, object, and map order does not matter at all, in fact, the concept of order is somewhat undefined. In trees, order absolutely matters and in graphs It depends.
[00:13:20] It depends on the kind of graph. So ordering is certainly one of the characteristics that we could distinguish data structures with. I think there are various other ones, but I agree with your general interpretation that that's a primary characteristic to keep in mind.