The Rust Programming Language

Vectors Recap and Q&A

Richard Feldman

Vendr, Inc.
Here's what you'd learn in this lesson:

Richard reviews vectors, usize, stack memory, and heap memory, then opens the floor for student questions. Questions covered in this segment include whether usize is a pointer, whether there is a macro that instantiates a Vec and sets its capacity at the same time without needing to push, whether capacity is up to Rust, what to do if the size of a list is unknown, and whether it would be better to use a linked list data structure.


Transcript from the "Vectors Recap and Q&A" Lesson

>> So first we talked about Vec. If I wanna make a Vec of numbers, like u8 numbers, I can do it like this, using the vec! macro. We also mentioned that you might wanna use Vec::with_capacity if you anticipate doing a lot of pushing, to avoid having to potentially recopy things a bunch of times.

We talked about usize. This is a type that's either a u64 on a 64-bit system or a u32 on a 32-bit system. In practice, you're probably only gonna encounter 32-bit usize on WebAssembly, and encountering something that's neither 32-bit nor 64-bit is very rare in this day and age.

We talked about stack memory, where basically, when I'm calling functions, everything I'm doing is happening on the stack. So functions will do something like take the stack length minus 1 to get the current argument, or the stack length minus 2 if there's more than one argument.

And this is how functions really use memory in order to pass values around between one another. Finally, we talked about heap memory, which is for things that don't have a size that's known at compile time. That's the purpose of the heap. What we'll do instead is write down on the stack, in some sort of struct like our Vec metadata struct, for example, the index into the heap slot that holds the first element of that dynamically sized thing. Okay, questions about any of this?
>> So is usize essentially a pointer?
>> So, pointer is a term from C and C++; if this were C, you would use a pointer for this.

I don't wanna get into an explanation of pointers, which is why I chose not to do that, but this is the correct size. If this were C or C++, this would be a pointer type rather than usize, but at runtime, it's all the same thing.
>> But is that like ::size_t?

>> Yeah, that's the length type, size_t-
>> usize is to Rust what size_t is to C and C++?
>> Right, that sounds right [LAUGH].
>> And then somebody else put, I think first_elem_index is the pointer to an index in the heap?
>> Correct, then again, I'm trying to keep this accessible to people who don't know C and C++, so I don't wanna get into that whole terminology.

>> Is there a macro that instantiates a Vec and sets its capacity at the same time, without needing to push?
>> That's a good question. I don't know if there's a macro that does that. Presumably, if there isn't one, you could define your own once you learn how to do macros, which, again, is outside the scope of this course. But there's no reason someone couldn't make a macro that did both.

>> Isn't capacity up to Rust?
>> No, you can definitely choose the capacity yourself. So basically, when you say Vec::with_capacity(5), when you call this function, it's gonna compile to code that says, hey, bookkeeping system, find me 5 u8s in a row that are available, that are not in use by some other Vec.

And that is gonna give you back the index of the first element where that's true. So yeah, you're definitely in charge of this. Honestly, the vec! macro behind the scenes is essentially calling Vec::with_capacity and then push a bunch of times. It's a convenience for calling with_capacity followed by push a bunch of times, essentially.

>> What would you do if you have no idea what the size of a list of items is?
>> Yeah, it's a good question. Basically, the trade-off there is, if you pick a really high capacity, it's gonna use more memory, and maybe wastefully so.

There is no real hard and fast rule where it's like, if you do this, you're always fine. This is one of those things where, if there's unknown information, you kind of just take your best shot. It's worth noting what happens in practice when you run out of capacity: Rust, or Vec I should say, is not just going to say, well, I hope this is the last one, and allocate capacity plus 1 for next time. It's actually gonna over-allocate in case there are more pushes coming, so it'll double the capacity. So it wouldn't go from 4 to 5 here. In actuality, Vec would go from 4 to 8 automatically, just so you have room for another four pushes if you want them.

So knowing that, if you set something that's reasonably close to what you want, you're probably going to be off by at most a factor of 2, unless you're way off, in which case it might need to double more than once. So hopefully that's helpful guidance [LAUGH].

>> Would it be better to use something like a linked list data structure in that case?
>> Would it be better to use something like a linked list data structure in a case like this? That's a great question. Without going into too much detail: I'm not a C++ programmer, but I have watched a lot of talks at CppCon, and also Rust talks, about performance. To make a long story short, the first rule of fast performance is: never use linked lists. They're super slow [LAUGH]. Basically, every time you add a new element to a linked list, you have to go back to the bookkeeping system and say, hey, find me a new index for the next element in the linked list.

Asking the bookkeeping system to find you available memory is quite often the most expensive thing the entire data structure ends up doing. So a lot of performance optimization, as we're gonna see in future sections, is basically about avoiding asking that bookkeeping system to find available memory on the heap.

And so linked lists are much more often the cause of performance problems than the solution to them. That's not to say they're useless; there are absolutely some situations where a linked list is the right choice. But that's a rule I've internalized from watching lots of experts talk about this stuff: if you wanna go fast, avoid linked lists like the plague.

