Transcript from the "Sets" Lesson
>> Nina Zakharenko: Sets in Python are a data type that allow you to store other immutable types in an unsorted way. And all of the simple objects, or simple types that we talked about earlier in the day, those are all immutable. Strings, numbers, etcetera.
>> Nina Zakharenko: The interesting thing about an item is that an item can only be stored in a set once.
[00:00:28] There are no duplicates allowed. And a set doesn't have order to it. It just kind of has this collection of items but, you won't be able to access them by position, because there is none. The benefits of a set are very fast membership testing. So in a set, it stores the kind of hashed value of that type and I won't dive into what hash is but.
[00:01:08] To check if an item is an asset it's a very very fast operation you don't have to examine each item and say are you it, are you it no, okay next item just very fast membership. And on sets you can use powerful set operations for those of you who have a math background like union difference intersection.
[00:01:29] If you don't have a math background it's okay, I'll explain what those mean. So looking at sets, let's create our first few sets. There's something important to note here. I'll talk about dictionaries next. But, you think we might make an empty set with two curly brackets if that's how sets are defined?
[00:02:00] What is the type of this? It's a dictionary, yeah. So curly braces are used by sets, and by dictionaries. If you want to make an empty set, you have to be explicit about it, and you have to call the set type. As soon as you start adding items to it,
>> Nina Zakharenko: You're good to go, but just the empty set needs to be created by calling set.
>> Nina Zakharenko: And If you feel like you're having a hard time remembering some of the quirks, don't forget you always have type available. So you can test your assumptions. The REPL isn't just something for beginners or for learning, even as a experienced Python programmer with many years of experience, I use the REPL everyday.
[00:03:01] All the time to test my assumptions, to test out little bits of code. As you progress in Python, there are other types of REPLs available that have syntax highlighting and fancier stuff like copy and paste and there's even one that kind of uses this library called causes to make a terminal GUI for your Python REPL.
[00:03:31] So this is just kinda the most basic one. You can upgrade when you feel like it. But the REPL is just a super powerful tool for Python programmers, so don't forget to use type if you're unsure. Okay, so sets with items in them, let's say I have a list of names.
>> Nina Zakharenko: Sets can contain duplicates. So what would my name set look like?
>> Nina Zakharenko: How many items will I have? Two items, yeah.
>> Nina Zakharenko: The duplicate value is just ignored.
>> Nina Zakharenko: We can check the length of our names.
>> Nina Zakharenko: Sets can't contain mutable data types
>> Nina Zakharenko: The way that they allow you to quickly check if an item that is contained in it or not is with an algorithm called a hash.
[00:04:41] And there's a built-in hash function in Python. Generally, you're not really gonna call it manually. But we can call it just kind of to see what the hash of an item might be.
>> Nina Zakharenko: Okay, so if I check the hash of a string, we see a number. Doesn't really matter what it is or what the algorithm is.
[00:05:11] It's there.
>> Nina Zakharenko: What happens if you try to get the hash off a list?
>> Student: [INAUDIBLE] will come up with an error.
>> Nina Zakharenko: Yeah, what error?
>> Student: TypeError.
>> Nina Zakharenko: The TypeError Unhashable type list.. This is maybe one of those errors that could be a little bit clearer but now that you know what a hash is you know what unhashable is right?
[00:05:39] Can't hash it. So if I try to put a list in my set, I'm gonna see that same error unhashable type list, right? Can only put immutable items in your set.
>> Nina Zakharenko: Okay. Because sets can't have duplicate items in them, it's a quick way of de-duplicating a list.
[00:06:15] So if I had a list of names and it has some duplicates in it.
>> Nina Zakharenko: And I quickly wanted to see, what are the unique items in this list? I showed you I can create an empty set with a despite calling set right. But I can also pass in a list into the set creation or a tuple or other kind of container data types, and I'm that's gonna return a set without the duplicates in there.
>> Student: What if one of the duplicates had different casing,-
>> Nina Zakharenko: Then it's not a duplicate, it's different, yeah. Sets have a length and sets are mutable. We can add to them and remove from them. I'll show you that in an order, in a minute, excuse me. But sets don't have an order.
[00:07:11] That means that when you print them out the items in them aren't necessarily going to be displayed in the order that they were entered. So I can say my set is 1. And again, just like less contain whatever types I want.
>> Nina Zakharenko: If I look at my set or the representation of it.
[00:07:37] You know, there's no order. Things aren't necessarily printed out in the same way that they are represented. Right? It's just not what a set is good at. What do you think will happen, and by what do you think, I mean. What do you think, when you type it in and you see that, what if I tried to access my set by position, by index, sets don't have an order, what's gonna happen?
>> Nina Zakharenko: Yeah, type error. Set object does not support indexing, right? Makes sense, there's no position.