Transcript from the "Binary Search Tree" Lesson

[00:00:00]

>> Trees are another way to structure data that they're not arrays so that's the first thing that you should be aware here, like we're not going to be re implementing arrays like we were with ArrayList and linked lists. Trees are usually used for special case kind of things.

[00:00:21] So one we're gonna be doing binary search trees. Not all trees necessary have to be in sorted order but binary search trees actually do have to be right? So for example, like an array of this 1, 5, 2, 7, 3, this is a totally valid array. But this would not be a valid binary search tree.

[00:00:41] Binary search trees have ordered to them and they must be sorted at all times or it's not technically a binary search tree. If we add 5 to this tree here that had 1, 4, 6, 7, the 5 actually has to come between these two or it will not be a binary search tree.

[00:01:02] There's many, many variations of tree structures. There's B trees, there's red black trees, there's ABL trees, there's binary search trees, there's N trees. There's tons, and tons, and tons of types of trees. We're just gonna be looking at two of them today, which are binary search trees, as well as Avielle trees.

[00:01:21] At its core tree is pretty similar to a linked list, that you're gonna have a node. And those nodes have next counters or next pointers, right? We're gonna call those children, right? Because there's some inherent idea of hierarchy here that you have one node and that node has two children, and that has two children, and that has two children.

[00:01:43] That's important to know about trees as they have these sorts of hierarchies. The first one we're gonna be looking at is a binary search tree. Are often abbreviate abbreviated as BSTs. So you'll probably hear me call that. The binary part means that the node only has two children, right?

[00:02:05] As opposed to a three tree which will have three children, right. So binary means two children at most. So it can have zero children, one child or two children. And then the search part means that it's made for searching. Are particularly well suited for searching anyways. So let's take a look at one such binary search tree that we have here in our notes.

[00:02:31] So we have 8. This 8 is gonna be considered the root of the tree. I don't know why trees always go from top down but that's just the way that they're always modeled. So here at the root, 8, everything left. So in this subtree here is all smaller than eight.

[00:02:50] Everything that's in the right tree, so the 10, 14, 13 that's all bigger than 8. That's always true for every node in the tree always or it's not a binary search tree. Right, so if you look here in this 3, sub tree here, right? One is smaller than 3, 6, 4, 7 are all bigger than 3, but all smaller than 8.

[00:03:12] That makes sense so far? So for example, if I put, I don't know, 9 here as the right child of the 7, that's no longer a valid binary search tree because 9 is in the left subtree of 8, and it's bigger than 8, therefore, it is not a compliant binary search tree.

[00:03:33] So I just wanna, again, you need to be able to make those sorts of assumptions about your code, or else, this is not gonna work, right? So why is this useful? Let's say I want the data that's stored in this 4 node. The nice thing about a binary search tree is you start with the root and then you say, okay, is 4 smaller than or bigger than 8?

[00:04:01] Smaller, bigger, smaller, found, right? So lookups in binary search trees are no worse than login. Well, it's not totally true. The average case scenario of lookups in a binary search tree or login, right because we can skip a bunch of nodes. So for example, if I'm looking for 4, I didn't have to look at 10.

[00:04:25] I didn't have to look at 14. I didn't have to look at 13. I didn't have to look at 7 and I didn't have to look at 1, right? So the idea here is they become really easy to find. So basically, you're prematurely optimizing your data structure here for really fast lookups.

[00:04:43] The place that famously uses trees for lookups is database indexes. So when you say hey MongoDB make me an index of my IDs, what MongoDB does is it builds a tree out of all of your IDs, and then it has pointers to all the various different objects that it stores inside of MongoDB.

[00:05:03] Now, it's probably not a binary search tree. It's probably some other variation of a tree but it's almost certainly a tree. So that's why we wanna know how to use these trees because it's really useful for really fast lookups. So again .find is called with 4 here, right?

[00:05:24] Start with the root 8 for smaller so go left, then you're here on route 3, right? Or no 3, 4 is larger so go right. Okay, so you're a node 6 now, 4 is smaller go left. And then you found here, this is node 4. So you found the thing that you're looking at.

[00:05:41] So that's the logic of a lookup inside of a binary search tree. So big O here, best case scenario is that it's the root which would be a constant 10 look up, right? So that's the best case scenario. The worst case scenario is that you have to go through every single item in the array.

[00:06:01] Right, so actually I have it down here that the worst case scenario looks like this. So if you add things in a poor order, you can get 1, 2, 3, 4, 5 and that means I have to go through everything to find the 5. We'll talk about how to address that later but that's the worst case for lookups.

[00:06:27] So how do we add elements to the array? Well, it's relatively the same. As a find, so let's say here I wanted to add 7 to this tree right here. We start at 10. 7 is smaller, so go left. It's larger than 5 so go right it's smaller than it, go left.

[00:06:52] And there's no node here as a left child of 8. So we just add it, right, we put a brand new node there, and we end up with 7 right there. Okay, so delete a few extra steps. The Delete. So let's say we have the same tree here, and we wanna delete 5.

[00:07:15] So that ends up being a problem. Because what do I put there instead, I can't just not do anything there. Well, there's kind of two different options there. You can put the greatest left sub child there. So in this particular case, the greatest left sub child would be 3.

[00:07:34] Okay, and I can put 3 there. And now the tree still works, right? So 10, 3, 8, 6, 7 that all still works. Or I can put the least right child. So the least right child in here is gonna be 6, right? So, if I put 6 here, the tree still works, right?

[00:07:58] So, in that particular case, I'd have to move 6 up to here and then I have to move 7 to be the child of 8. So in this particular case, I would end up with 6 being here and 7 being down here. So kind of the cheap way of doing that as you call delete on 5.

[00:08:21] First thing you do is gonna call find on 5, which you're gonna find it here, right? Right there. Okay, we found 5. Then you do find least right sub child with 5. In that particular case, you're gonna find the 6 down here, this 6. And then all you're gonna do is you are going to move the value of what was in this one up to this one.

[00:08:47] So that's why I put this marking here. So you can see this used to be 5, but now we've just replaced the value of that 5 there with the 6. And then we took everything that's the right child of this one and we just move it up. So in that particular case you end up with 8 having a left child of 7.

[00:09:16] Make sense? Cool. So, there's a couple other variations of this scenario, for example, if you'd call delete 15, and it has no children, you just delete it. [LAUGH] Right? So here, you would just set the right child of 10 to be nothing. So that's pretty easy one. So if I call delete 15, right?

[00:09:52] 15 has no right child but it does have a left child. So all you do is you just move the left child up, and then you end up with that. Okay, we talked about worst case BSTs. If you add everything in a sorted order to a BST, you end up with this kind of straight line.

[00:10:16] Which is just like a really crappy linked list right? We'll talk about how to mitigate that with what's called a self balancing tree in the next lesson, but for now we're gonna have you build BSTs. So we talked about why to use a tree you basically optimizing for search ability.

[00:10:38] Because most trees have an average case look up of log n, which is good Even on extremely large data sets. Okay, so anyone have any questions about BSTs?

>> Yes sir I have.

>> Okay.

>> Is there any duplication is possible in BST duplicate element like two.

>> Let me make sure I'm understanding you're asking, can you have duplicate elements in a BST?

[00:11:21] Yep, so you can either put them in the left tree or the right tree, right? So let's say I added 15 to this one, right? I can either put it in the left tree or the right tree. And it doesn't matter just be consistent. The route is just the first elements that you added to the tree.

[00:11:43] So if I start with an empty tree and I call dot add with. 6, right, then 6 is my root element. So it just depends on the order that you added things to the tree.

>> Is a doubly linked list, technically a BST since both seem like that, but left and right?

[00:12:02]

>> Is a doubly linked list technically a BST? It's not for a couple of reasons. One, doubly linked lists don't necessarily have an order, right? So, Right, if I have a doubly linked list, it's not necessarily going to have everything smaller to the left and everything bigger to the right.

[00:12:26] The other thing about a doubly linked list is, it's circular in the sense of three or cyclical I think is actually the term, 3 would point to 8 and 8 would point to 3 ,and that's not true for BST. 3 actually doesn't have a pointer up, right? It only has its pointer to its children, only 8 has a pointer to this, right?

[00:12:47] So they're very similar in the sense that they both have two pointers to other items. But not necessarily similar to the fact that a doubly linked list is not a tree.