Transcript from the "ArrayList" Lesson
>> So the question is, can we do better? Can we have array access with the ability to grow, right? That's kind of what we want in life is that it's something that I assume at least we all want that, right? And the answer, of course, is yes. Now remember, I did say with an array we can write these operations in, right?
[00:00:38] It's an ArrayList where instead of having underlying containers, we have an array of generic T, and we can use that. And when we need to grow we have to grow it. So let me explain how this one works. Cuz it's actually pretty neat idea and it's also gonna be the basis of a lot of other algorithms.
[00:00:54] Maps are gonna use these, ring buffers are gonna use these, it's a very pretty simple concept. So let's just say we have an array that is from 0 to 3, and let's just say we're storing numbers, right? It doesn't matter what we're storing. And so we have in here, we have 2, right?
[00:01:12] That's all we have in here so far. That means we have a length of 1, right? We have one item in there, but we have a capacity of 3. Does that make sense? We can store up to three numbers but we have a length of one. So that means if we were to do a get operation, say a get with an index, well, we could do a couple things.
[00:01:34] A, we can use our length that we're kind of holding on to that we're maintaining, as a way to say hey, you're either accessing in, or within or without of our array. So if you exceed this index, so if IDX is greater than or equal to length, well, then we could throw some sort of error, whatever we want to do, right?
[00:02:13] You're within the capacity? Awesome, I'm going to push the value in then, or right there, and then increment my length to point to the next part. So we're actually creating a push operation on top of an array. So that's what an array list is, is using an array as the fundamental datatype instead of a node-based structure, and then doing these pushes, these pull or pops, and being able to do these operations.
[00:02:37] So obviously, push is pretty dang good, right? We are just able to write that, it's pretty quick, how fast, what would be the big O of a push operation? So let's break it down. Remember, use your length, write it to an array, increment your length. What would be the runtime of that?
>> Constant, that's right, because none of those things depend on the input. Notice I didn't ever say the input, I only said the length, so that's not traversing the length, it's just simply using the length, and adding a number constant, we consider it constant. There we go.
[00:03:10] Now, what about pop? Well, pop is the exact same thing, right? Which can return either T, or undefined, right? So exact same thing, we're gonna use length minus 1, we'll grab out that value, then we can decrement the length. So yeah, this value, if it's, say just for storing numbers, it can technically remain within the array.
[00:03:35] We don't actually zero out the data. You could zero out the data if you want to, but you also don't have to zero out the data with primitive structures. If you're using a language that uses pointers, you can technically still leave them in there because there's nothing that you're actually doing, right?
[00:03:49] Unless if it's of course a shared pointer, using some like rust, blah blah, there could be all different ways in which you would want to clean this up. But in general, if we're just talking about a simple array underneath the hood, you don't actually have to clean it up because you are using length to keep track of what you own, right?
[00:04:04] And you only clean up as you need to. In Java land you'd probably want to set it to null, because nulls are you releasing a pointer to an object, thus it can be garbage collected, right? There is some dangers there, but understand your environment and what you're doing, and why you'd want to clean it up or not clean it up is always a good plan.
[00:04:21] But this is effectively an ArrayList. Makes a lot of sense. Now the real question you're probably asking yourself is, what happens with push when we exceed capacity, right? So if I were to push push, then push once more, well, something has to happen, because our length is currently equal to our capacity at the moment of push.
[00:04:44] We've ran out of room to do this. So one thing we can do is we can create a new array of some longer length. Let's say six, say we double our capacity. Then we could mem copy or just do a for loop, and copy them in one at a time, where we actually move over all of our values.
[00:05:02] Then our length would still point right here, our capacity now becomes six, and we can now easily handle the push operation. So this is effectively what an ArrayList does underneath the hood, is it contains some actual array structure, and then helps you make it so that it's dynamic, that it grows.
[00:05:22] That's why you're always gonna see a capacity on a lot of these more traditional structures, is because it actually is starting off with hey, give me a hint about how much you want to use. In that way I don't allocate too much memory, and that way you're not using a bunch of empty space.
[00:05:36] Cuz if it starts off by saying hey my initial capacity is 30,000, that could be way too much. It's just wasting a bunch of memory. So it's a game here. There's definitely a game you have to play in which you want to be as, use the least amount of memory but do the least amount of growing operations.
[00:05:52] I spent pop obviously redact, I mean, I've never seen a data structure that actually reduces its capacity once it reaches a certain point. I mean I guess you could if you had to be very, very efficient about memory, say you're using an extremely low memory device. You could do that, but I've never seen that personally.
[00:06:11] So there we go, that's the effects of an ArrayList. But let's think about it, cuz we've only covered two operations, push and pop, which are stack like operations. What about queue-like operations? queue or enqueue and dequeue. Well, if we were to do that on an ArrayList, if we were to take this and say we had the values 2, 3 and 5 in our underlying ArrayList, we have a length of 3, we have a capacity of 3, and we do a enqueue.
[00:06:42] What's gonna have to happen? Well, first, we're gonna have to grow our capacity, right? We're not able to actually sustain any further items. So let's say we grow our capacity, awesome. Our capacity is now 6, right, and so now we can insert at length, correct? Cuz our length is 3, so it's actually pointing at the empty spot, awesome, we can now insert.
[00:07:04] Well, I guess that's for push for n and q. What's the problem here? The problem is actually quite simple. You can't just right, this is an array. if you write, it writes over the value. So what are you gonna have to do? You're gonna have to start here and go move over one, move over one, move over one.
[00:07:24] Now we have created a spot in that position for you to be able to insert into the ArrayList. And so that's why often you'll see people getting up in arms about using insert into the front of a list, because if it's an ArrayList you have to do an N like operation, and actually shift over all of your underlying values.
[00:07:42] And they're very, very not performant. And that's why you see people liking, say queues with node-based items, because you can do this in O(1) operation, not O(N). So exact same thing with dequeue as well. If you do a dequeue, what happens? Well, the dequeue, you're gonna have to literally shift over every single item, and you can right over and then you're gonna have to shift over this direction, right?
[00:08:07] And now you've just shifted everything back, cuz you want to be able to do get zero, and you want to expect the zeroeth item to be an item. If you don't do that you have the first item being an item and then the zeroeth item being one, you've either already pushed or a value you've already cleaned up, which is gonna be fundamentally wrong.
[00:08:23] And so this makes ArrayList really, really bad with enqueue and dequeue, but really, really good with push and pop, right? You see why this is the way it is. And you could effectively say that as long as you're pushing and popping and you reach some sort of stable state, you're gonna have a O(1) performance pushing and popping.
[00:08:40] It never has to grow the array, it never has to do anything, it's just very, very contained in a certain capacity. Awesome. Same thing applies obviously to the middle of a list, just in case I didn't cover it. If you said hey, delete at the fifth index, everything beyond the fifth index needs to be shifted down.
[00:08:58] So those are O(N) operations. All right, let's see, did I miss anything? I don't think so. Running time, yep, we talked about all the deletion/insertions, they're the same thing, right? If you insert, you have to shift everything over. If you delete, you have to shift everything over the other direction.
[00:09:14] They're pretty straightforward, I think I'm gonna skip the implementation on this one, just because it is, again, a very lengthy implementation and it's not very hard. You see the for loops you have to do, you go to that index, and you just have to shift over values. Not too hard on this one.
[00:09:31] There we go. So which one is better, right? Which one would you rather use, an Array List or a Linked List?
>> Does it depend on the situation?
>> Man, it's like you guys knew the thing! It should always be it depends, right? It is always it depends.
[00:09:48] Whenever someone asks that in an interview, say it depends. It depends on the situation. That's why I always do so terrible. I emotionally do so terrible at Google interviews cuz they're always like, let's say we're inserting and removing millions of items. I'm like, well, I mean, you're really doing 100 million items on a single machine, are you really doing this?
[00:10:04] This already seems so not even feasible. Let's talk about why you're doing that, right? I've already lost the battle. It just seems so fake, but experimentally, it depends. If you're pushing and popping from the end, well, either a Linked List or an ArrayList can work quite well. You get a little bit more conveniences cuz often with an ArrayList you also have like say angle bracket accessing.
[00:10:26] You can have that with a Linked List, but doesn't often happen. But at the exact same time you have random access with an ArrayList. Give me index three. That's constant time. That's really really good, but you do not remove from the front. So there's definitely a trade-off, there's definitely a trade-off graph somewhere on there that you have to be careful of.
[00:10:47] Getting sucks on Linked List; removing from the front, sucks on ArrayList. And so long as you can explain that to your interviewer, they're usually pretty happy about that. I in fact, even have a question that Netflix I used to ask all the time, which was implement an async request queue, in which only three requests can be out at any one time.
[00:11:20] And they go, yeah, why not? And I know for a fact they haven't thought about the fact, that when you remove or add to the front, you are fundamentally having to shift everything. I want to hear these words out of their mouth because it is very, very important, you understand what you do and aren't doing, right?
[00:11:34] I don't care that you don't know how to implement. Who cares you don't know how to implement it? You just have to understand that there is a cost at some point in the system.