Sets, Tuples, and Dictionaries

Site: Saylor Academy
Course: CS250: Python for Data Science
Book: Sets, Tuples, and Dictionaries

Description

Here is more practice with tuples and dictionaries. In addition, the Python built-in data structure known as a set is also covered. Sets are not ordered, and their elements cannot be indexed (sets are not lists). To understand Python set operations, remind yourself of basic operations such as the union and intersection. Use this tutorial to compare and contrast the syntax and programming uses for lists, tuples, sets, and dictionaries.

Introduction

Now that we've got the basics of strings and numbers and lists down, let's talk about the advanced data types - tuple, dict and set. These are container objects that let us organize other types of objects into one data structure.


Source: Nina Zakharenko, https://practical.learnpython.dev/03_sets_tuples_dicts/
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

Tuples

Tuples are light-weight collections used to keep track of related, but different items. Tuples are immutable, meaning that once a tuple has been created, the items in it can't change. You can't add, remove, or update items like you can with a list.


tuple cheat sheet

type: tuple
use: Used for storing a snapshot of related items when we don't plan on modifying, adding, or removing data.
creation: () or tuple() for an empty tuple. (1, ) for one item, or (1, 2, 3) for a tuple with several items.
search methods: my_tuple.index(item) or item in my_tuple
search speed: Searching for an item in a large tuple is slow. Each item must be checked.
common methods: Can't add or remove from tuples.
order preserved? Yes. Items can be accessed by index.
mutable? No
in-place sortable? No

Uses

You might ask, why use a tuple when Python already has lists? A tuple is different in a few ways. While lists are generally used to store collections of similar items together, tuples, by contrast, can be used to contain a snapshot of data. A good use of a tuple might be for storing the information for a row in a spreadsheet. We don't necessarily care about updating or manipulating that data; we just want a read-only snapshot.

tuple is an interesting and powerful datatype, and one of the more distinctive aspects of Python. Most other programming languages have ways of representing lists and dictionaries, but only a subset of them have tuples. Use them to your advantage.


Examples

Empty and one-item tuples
One important thing to note about tuples is that there's a quirk to their creation. Let's check the type of an empty tuple created with ().

>>> a = ()
>>> type(a)
<class 'tuple'>

That looks like we'd expect it to. What about if we tried to create a one-item tuple using the same syntax?

>>> b = (1)
>>> type(b)
<class 'int'>

It didn't work! type((1)) is an int. In order to create a one-item tuple, you'll need to include a trailing comma inside the parentheses.

>>> c = (1, )
>>> type(c)
<class 'tuple'>


Tip
If you're creating a one-item tuple, you must include a trailing comma, like this: (1, )


Creation

Let's say we have a spreadsheet of students, and we'd like to represent each row as a tuple.

>>> student = ("Marcy", 8, "History", 3.5)


Access by index

We can access items in the tuple by index.

>>> student = ("Marcy", 8, "History", 3.5)
>>> student[0]
'Marcy'


But if we try to change the contents, we'll get an error. Remember, you can't change the items in a tuple.

>>> student[0] = "Bob"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment


Info
We'll see TypeError: 'tuple' object does not support item assignment if we try to change the items in a tuple.

Tuples also don't have an append or extend method available on them like lists do, because they can't be changed.


tuple unpacking

Sounds like a lot of work for not a lot of benefit, right? Not so. tuple is great when you depend on your data staying unchanged. Because of this guarantee, we can use tuples in other types of containers like sets and dictionaries that expect immutable keys. We'll learn more about this later in the chapter.

It's also a great way to quickly consolidate information.

You can also use tuples for something called unpacking. Let's see it in action:

>>> student = ("Marcy", 8, "History", 3.5)
>>>
>>> name, age, subject, grade = student
>>> name
'Marcy'
>>> age
8
>>> subject
'History'
>>> grade
3.5

You can return tuples from functions, and use unpacking to get the values back.

>>> def http_status_code():
...     return 200, "OK"
...
>>> code, value = http_status_code()
>>> code
200
>>> value
'OK'
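
The point made earlier, that tuples can be used where immutable keys are expected, can be sketched as follows. This is only an illustration: the names grid_labels, visited, and min_max are invented for this example and do not come from the tutorial.

```python
# Hypothetical example: (row, column) tuples used as dictionary keys.
grid_labels = {(0, 0): "top-left", (0, 1): "top-right"}
label = grid_labels[(0, 1)]
print(label)  # top-right

# Tuples can also be stored in a set.
visited = {(0, 0)}
visited.add((0, 1))
visited.add((0, 0))  # already present, so the set doesn't grow
print(len(visited))  # 2


def min_max(values):
    """Return the smallest and largest value as a two-item tuple."""
    return min(values), max(values)


# Unpack the returned tuple into two names.
lowest, highest = min_max([4, 1, 7])
print(lowest, highest)  # 1 7
```

A list in place of any of these tuples would raise a TypeError, because lists can't be used as keys or set members.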

Sets

A set is a mutable datatype that allows you to store immutable types in an unordered way. Sets are mutable because you can add and remove items from them. They can contain immutable items, like tuples and other primitive types, but not lists, sets, or dictionaries, which are themselves mutable.

Unlike a list or a tuple, a set can only contain one instance of any given item. No duplicates are allowed.

The benefits of a set are: very fast membership testing along with being able to use powerful set operations, like union and intersection.

set cheat sheet

type: set
use: Used for storing immutable data types uniquely. Easy to compare the items in sets.
creation: set() for an empty set ({} makes an empty dict) and {1, 2, 3} for a set with items in it
search methods: item in my_set
search speed: Searching for an item in a large set is very fast.
common methods: my_set.add(item), my_set.discard(item) to remove the item if it's present, my_set.update(other_set)
order preserved? No. Items can't be accessed by index.
mutable? Yes. Can add to or remove from sets.
in-place sortable? No, because items aren't ordered.

Examples

Empty sets
Let's create our first few sets.

The first thing we might try to do is create an empty set with {}, but we'll come across a hurdle.

>>> my_new_set = {}
>>> type(my_new_set)
<class 'dict'>
>>> my_set = set()
>>> type(my_set)
<class 'set'>


Info
You can't create an empty set with {}. That creates a dict. Create an empty set with set() instead.

Tip
While you're learning Python, it's useful to use type(), dir() and help() as often as possible.

sets with items

Now, let's make a new set with some items in it, and test out important set concepts.


sets can't contain duplicate values

>>> numbers = {3, 3, 2, 2, 1}
>>> numbers
{1, 2, 3}


sets can be used to de-duplicate the items in a list

Tip: Use this to your advantage when you need to quickly deduplicate the items in a list if you don't care about order. Pass the list into the set constructor.

>>> names = ['Nina', 'Max', 'Nina']
>>> set(names)
{'Nina', 'Max'}
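
If you do care about order, one common alternative (not part of the original tutorial) is dict.fromkeys(), which relies on dictionaries keeping insertion order in Python 3.7 and later. A minimal sketch, with variable names chosen just for illustration:

```python
names = ['Nina', 'Max', 'Nina', 'Ada', 'Max']

# dict keys are unique and keep insertion order (guaranteed in Python 3.7+),
# so this drops duplicates while keeping the first occurrence of each name.
unique_in_order = list(dict.fromkeys(names))
print(unique_in_order)  # ['Nina', 'Max', 'Ada']

# Converting through a set also removes duplicates, but the order is arbitrary.
unique_any_order = set(names)
print(len(unique_any_order))  # 3
```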


sets can't contain mutable types

You can consider a set to behave a lot like a dictionary that only contains keys (and no values). Sets let you quickly check whether an item is contained in them by using a hash. I won't cover the details, but a hash is a way of representing an immutable value as a number, and objects that compare equal always have the same hash. An immutable data type is one where the contents can't be changed after creation.

If you're not sure if a type or an object is hashable, you can call the built-in hash() function on it.

>>> hash("Nina")
3509074130763756174
>>> hash([1, 2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'


Info
You'll see a TypeError: unhashable type: 'list' if you call hash() on a mutable data type (like a list).

If you try to add a mutable data type (like a list) to a set, you'll see the same TypeError, complaining about an unhashable type.

>>> {"Nina"}
{'Nina'}
>>> {[]}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
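
To check several kinds of objects at once, the hash() test can be wrapped in a small helper. The function name is_hashable is made up for this sketch; it simply reports whether hash() succeeds.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable("Nina"))          # True
print(is_hashable((1, 2)))          # True: a tuple of hashable items
print(is_hashable(frozenset({1})))  # True: frozenset is immutable
print(is_hashable([1, 2]))          # False: lists are mutable
print(is_hashable({1, 2}))          # False: sets are mutable
print(is_hashable((1, [2, 3])))     # False: the tuple contains a list
```

Note the last case: a tuple is only hashable if everything inside it is hashable too.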


sets don't have an order

Sets don't have an order. That means that when you print them, the items won't necessarily be displayed in the order they were entered.

>>> my_set = {1, "a", 2, "b", "cat"}
>>> my_set
{1, 2, 'cat', 'a', 'b'}

It also means that you can't access items in the set by position in subscript [] notation.

>>> my_set = {"Red", "Green", "Blue"}
>>> my_set[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable


Info

You'll see TypeError: 'set' object is not subscriptable (older Python versions word this as 'set' object does not support indexing) if you try to access the items in a set by index with my_set[pos]

Tip: If your set contains items of the same type, and you want to sort the items, you'll need to convert the set to a list first.

Or, you can use the built-in sorted(iterable) function, which will do the conversion for you.

>>> my_set = {"a", "b", "cat", "dog", "red"}
>>> my_set
{'b', 'red', 'a', 'cat', 'dog'}
>>> sorted(my_set)
['a', 'b', 'cat', 'dog', 'red']


adding to and removing from sets

Since a set has no order, we can't add items to it or remove items from it by index.

Instead, the operations are called with items.
Add items to a set with my_set.add(item).

>>> colors = {"Red", "Green", "Blue"}
>>> colors.add("Orange")
>>> colors
{'Orange', 'Green', 'Blue', 'Red'}


Remove items with my_set.discard(item)

You can remove an item from a set if it's present with my_set.discard(item). If the set doesn't contain the item, no error occurs.

>>> colors = {"Red", "Green", "Blue"}
>>> colors.discard("Green")
>>> colors
{'Blue', 'Red'}
>>> colors.discard("Green")
>>> colors
{'Blue', 'Red'}

You can also remove items from a set with my_set.remove(item), which will raise a KeyError if the item doesn't exist.
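
The difference between the two removal methods can be sketched like this. The removed flag is just an illustrative variable.

```python
colors = {"Red", "Green", "Blue"}

colors.discard("Purple")  # not present: nothing happens

try:
    colors.remove("Purple")  # not present: raises KeyError
    removed = True
except KeyError:
    removed = False

print(removed)         # False
print(sorted(colors))  # ['Blue', 'Green', 'Red']
```

Use remove() when a missing item indicates a bug you want to hear about, and discard() when a missing item is fine.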


Update a set with another iterable using my_set.update(iterable)

You can add to the items in a set by passing another iterable, such as a set, list, or tuple, to the update() method.

>>> colors = {"Red", "Green"}
>>> numbers = {1, 3, 5}
>>> colors.update(numbers)
>>> colors
{1, 3, 'Red', 5, 'Green'}


Info

Be careful when passing a string to my_set.update(). A string is a sequence of characters under the hood.

>>> numbers = {1, 3, 5}
>>> numbers.update("hello")
>>> numbers
{1, 3, 'h', 5, 'o', 'e', 'l'}

Your set will update with each character of the string, which was probably not your intended result.
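
If the goal is to add the whole string as one item, either of the following approaches should work. This is a small sketch; the tags set is invented for the example.

```python
tags = {"python", "sets"}

# add() treats the string as a single item.
tags.add("hello")

# update() iterates over its argument, so wrap a lone string in a list.
tags.update(["world"])

print(sorted(tags))  # ['hello', 'python', 'sets', 'world']
```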

set operations

sets allow quick and easy operations to compare items between two sets.

set operations cheat sheet
union: s.union(t) or s | t. Creates a new set with all the items from both s and t.
intersection: s.intersection(t) or s & t. Creates a new set containing only the items that are in both s and t.
difference: s.difference(t) or s - t. Creates a new set containing the items that are in s but not in t.
symmetric difference: s.symmetric_difference(t) or s ^ t. Creates a new set containing the items that are in s or t, but not in both.

Examples

Let's see it in action.
We have two sets: rainbow_colors, which contains the colors of the rainbow, and favorite_colors, which contains my favorite colors.

>>> rainbow_colors = {"Red", "Orange", "Yellow", "Green", "Blue", "Violet"}
>>> favorite_colors = {"Blue", "Pink", "Black"}

First, let's combine the sets and create a new set that contains all of the items from rainbow_colors and favorite_colors using the union operation. You can use the my_set.union(other_set) method, or you can just use the symbol for union | from the table above.

>>> rainbow_colors | favorite_colors
{'Orange', 'Red', 'Yellow', 'Green', 'Violet', 'Blue', 'Black', 'Pink'}

Next, let's find the intersection. We'll create a new set with only the items in both sets.

>>> rainbow_colors & favorite_colors
{'Blue'}

There are other useful operations available on sets, such as checking whether one set is a subset or superset of another, finding differences, and more, but I don't have time to cover them all. Python also has a frozenset type, if you need the functionality of a set in an immutable package (meaning that the contents can't be changed after creation).
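
As a brief sketch of a few of those other operations, using the same two sets (warm_colors and the result variable names are made up for illustration):

```python
rainbow_colors = {"Red", "Orange", "Yellow", "Green", "Blue", "Violet"}
favorite_colors = {"Blue", "Pink", "Black"}

# Difference: items in the first set that are not in the second.
only_rainbow = rainbow_colors - favorite_colors
print(sorted(only_rainbow))  # ['Green', 'Orange', 'Red', 'Violet', 'Yellow']

# Symmetric difference: items in exactly one of the two sets.
in_one_only = rainbow_colors ^ favorite_colors
print(len(in_one_only))  # 7

# Subset and superset checks.
warm_colors = {"Red", "Orange"}
print(warm_colors.issubset(rainbow_colors))    # True
print(rainbow_colors.issuperset(warm_colors))  # True

# A frozenset supports the same read-only operations but can't be modified,
# which makes it hashable and usable as a set member or dictionary key.
frozen = frozenset(favorite_colors)
print("Pink" in frozen)  # True
```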

Dictionaries

Dictionaries are a useful type that allow us to store our data in key, value pairs. Dictionaries themselves are mutable, but, just like sets, dictionary keys can only be immutable types.

We use dictionaries when we want to be able to quickly access data associated with a key.

A great practical application for dictionaries is memoization. Let's say you want to save computing power, and store the result for a function called with particular arguments. The arguments could be the key, with the result stored as the value. Next time someone calls your function, you can check your dictionary to see if the answer is pre-computed.

Looking for a key in a large dictionary is extremely fast. Unlike lists, we don't have to check every item for a match.
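
The memoization idea described above can be sketched roughly as follows. The names square_cache, computed, and slow_square are hypothetical, and squaring a number stands in for a genuinely expensive calculation.

```python
square_cache = {}   # hypothetical cache: argument -> stored result
computed = []       # records which arguments were actually calculated


def slow_square(n):
    """Return n squared, reusing a stored result when one exists."""
    if n in square_cache:
        return square_cache[n]  # the answer is pre-computed
    computed.append(n)          # stands in for an expensive calculation
    result = n * n
    square_cache[n] = result
    return result


first = slow_square(12)
second = slow_square(12)  # served from the dictionary
print(first, second)      # 144 144
print(computed)           # [12]
```

The standard library's functools.lru_cache decorator offers a ready-made version of the same idea.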


dictionary cheat sheet

type: dict
use: Use for storing data in key, value pairs. Keys used must be immutable data types.
creation: {} or dict() for an empty dict. {1: "one", 2: "two"} for a dict with items.
search methods: key in my_dict
search speed: Searching for a key in a large dictionary is fast.
common methods: my_dict[key] to get the value by key, and throw a KeyError if key is not in the dictionary. Use my_dict.get(key) to fail silently if key is not in my_dict. my_dict.items() for all key, value pairs, my_dict.keys() for all keys, and my_dict.values() for all values.
order preserved? Yes, by insertion order. This is guaranteed as of Python 3.7 (in CPython 3.6 it was an implementation detail). Items can't be accessed by index, only by key.
mutable? Yes. You can add or remove keys from dicts.
in-place sortable? No. dicts don't have an index, only keys.

Examples

Empty dicts

We already learned one of the ways of creating an empty dict when we tried (and failed) to create an empty set with {}. The other way is to use the dict() constructor.

>>> my_dict = {}
>>> type(my_dict)
<class 'dict'>

>>> my_dict = dict()
>>> type(my_dict)
<class 'dict'>

Creating dicts with items

If we want to create dicts with items in them, we need to pass in key and value pairs. A dict is declared with curly braces {}, followed by a key and a value, separated with a colon :. Multiple key and value pairs are separated with commas ,.

We can use familiar functions on our dictionary, like finding out how many key / value pairs it contains with the built-in len() function.

>>> nums = {1: "one", 2: "two", 3: "three"}

>>> len(nums)
3

Side note: What can be used as keys?

Any type of object, mutable or immutable, can be used as a value but just like sets, dictionaries can only use immutable types as keys. That means you can use int, str, or even tuple as a key, but not a set, list, or other dictionary.

The following is OK:

>>> my_dict = {1: 1}
>>> my_dict = {1: []}

Info

You'll see a TypeError: unhashable type: 'list' if you try to use a mutable type, like a list, as a dictionary key.

>>> my_dict = {[]: 1}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Accessing

Our dict contains key, value pairs. Because a dictionary is indexed by key rather than by position, we can't access the items in it by position. Instead, we use square-bracket my_dict[key] notation, similar to how we access items in a list.

>>> nums = {"one": 1, "two": 2, "three": 3}
>>> nums["one"]
1
>>> nums["two"]
2

Question: What happens when we try to access a key that doesn't exist in a dict?

Info

We'll get a KeyError: key if we try to access my_dict[key] but key isn't in the dictionary.

>>> nums = {1: "one", 2: "two", 3: "three"}
>>> nums[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 4

One way to get around this is to use the my_dict.get(key) method. With this method, if the key isn't present, no error is raised and None is returned.

>>> nums = {1: "one", 2: "two", 3: "three"}
>>> nums.get(4)

>>> result = nums.get(4)
>>> type(result)
<class 'NoneType'>

If we want to provide a default value for a missing key, we can also pass an optional second argument to the get() method, like so: my_dict.get(key, default_val)

>>> nums = {1: "one", 2: "two", 3: "three"}
>>> nums.get(4, "default")
'default'
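
One place where the default argument comes in handy is counting. The following is a minimal sketch; the words list and counts dictionary are made up for the example.

```python
words = ["red", "blue", "red", "green", "red"]
counts = {}

for word in words:
    # get() returns 0 the first time a word is seen, so no KeyError occurs.
    counts[word] = counts.get(word, 0) + 1

print(counts)  # {'red': 3, 'blue': 1, 'green': 1}
```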

Adding and Removing Items

To add a new key value pair to the dictionary, you'll use square-bracket notation.

If you assign to a key that's already in the dictionary, you'll just end up replacing its value.

Tip: To avoid subtle bugs, you can check if a particular key is in a dictionary with the in keyword.

>>> nums = {1: "one", 2: "two", 3: "three"}
>>> nums[8] = "eight"

>>> nums
{1: 'one', 2: 'two', 3: 'three', 8: 'eight'}

>>> 8 in nums
True

>>> nums[8] = "oops, overwritten"
>>> nums
{1: 'one', 2: 'two', 3: 'three', 8: 'oops, overwritten'}


Updating

Just like with sets, you can add the contents of one dictionary to another with the update() method.

>>> colors = {"r": "Red", "g": "Green"}
>>> numbers = {1: "one", 2: "two"}
>>> colors.update(numbers)
>>> colors
{'r': 'Red', 'g': 'Green', 1: 'one', 2: 'two'}
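
One thing worth knowing is what happens when both dictionaries share a key. In this sketch (the defaults and overrides dictionaries are invented for illustration), the value from the dictionary passed to update() wins.

```python
defaults = {"color": "Red", "size": "M"}
overrides = {"size": "L", "price": 10}

# Keys present in both dictionaries take the value from the argument.
defaults.update(overrides)
print(defaults)  # {'color': 'Red', 'size': 'L', 'price': 10}
```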


Complex Dictionaries

One incredibly useful scenario for dictionaries is storing a list or other sequence as a value.

>>> colors = {"Green": ["Spinach"]}
>>> colors
{'Green': ['Spinach']}
>>> colors["Green"].append("Apples")
>>> colors
{'Green': ['Spinach', 'Apples']}
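
When the keys aren't known ahead of time, one possible way to build such a dictionary is with the setdefault() method. This is a sketch; the foods list and by_color dictionary are made up for the example.

```python
foods = [("Green", "Spinach"), ("Red", "Tomato"), ("Green", "Apples")]
by_color = {}

for color, food in foods:
    # setdefault() returns the existing list for this color,
    # or stores and returns a new empty list if the key is missing.
    by_color.setdefault(color, []).append(food)

print(by_color)  # {'Green': ['Spinach', 'Apples'], 'Red': ['Tomato']}
```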


Working with items, keys, and values

There are three useful methods you need to remember about dictionary access:

  1. my_dict.keys()
  2. my_dict.values()
  3. my_dict.items()

1. my_dict.keys() gets all the keys in a dictionary

>>> nums = {1: 'one', 2: 'two', 3: 'three', 8: 'eight'}
>>> nums.keys()
dict_keys([1, 2, 3, 8])

2. my_dict.values() gets all the values in a dictionary.

>>> nums = {1: 'one', 2: 'two', 3: 'three', 8: 'eight'}
>>> nums.values()
dict_values(['one', 'two', 'three', 'eight'])

3. my_dict.items() gets all the items (key, value pairs) in a dictionary

Notice that my_dict.items() returns a type that looks like a list.

Note: this data structure looks like it contains two-item tuples containing the key, value pairs. Remember this, it'll be important later on!

>>> nums = {1: 'one', 2: 'two', 3: 'three', 8: 'eight'}
>>> nums.items()
dict_items([(1, 'one'), (2, 'two'), (3, 'three'), (8, 'eight')])
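
Because each item is a two-item tuple, it can be unpacked directly in a for loop. A short sketch, with the loop variable names number and word chosen for illustration:

```python
nums = {1: 'one', 2: 'two', 3: 'three', 8: 'eight'}

# Each item is a (key, value) tuple, so it can be unpacked in the loop header.
lines = []
for number, word in nums.items():
    lines.append(f"{number} -> {word}")

print(lines)  # ['1 -> one', '2 -> two', '3 -> three', '8 -> eight']

# The view can be turned into a real list of tuples when one is needed.
pairs = list(nums.items())
print(pairs[0])  # (1, 'one')
```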

Mutability

Mutability, simply put: the contents of a mutable object can be changed, while the contents of an immutable object cannot.


Simple Types

All of the simple data types we covered first are immutable.

You can rebind a variable to a new value, but you can't change the value itself in place.

type | use | mutable?
int, float, decimal | store numbers | no
str | store strings | no
bool | store True or False | no

Container Types

For the mutability of the container types we covered next, check this helpful list:

container type | use | mutable?
list | ordered group of items, accessible by position | yes
set | mutable unordered group consisting only of immutable items. useful for set operations (membership, intersection, difference, etc) | yes
tuple | immutable collection containing ordered groups of items | no
dict | contains key value pairs | yes
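
The practical consequence of mutability shows up when two names refer to the same object. The following sketch, with variable names invented for illustration, contrasts an immutable str with a mutable list.

```python
# Strings are immutable: "changing" one builds a new object.
greeting = "hello"
alias = greeting
greeting = greeting + " world"  # rebinds greeting; alias is untouched
print(alias)  # hello

# Lists are mutable: both names refer to the same object,
# so a change made through one name is visible through the other.
items = [1, 2, 3]
same_items = items
same_items.append(4)
print(items)  # [1, 2, 3, 4]

# Copy the list first if the original should stay unchanged.
independent = list(items)
independent.append(5)
print(items)  # [1, 2, 3, 4]
```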

Practice

Sets

Sets are a great data type for storing unique data - you can only have one of any given object in a set. Sets are unordered, thus you can't access them with [] indexing syntax, but they do have some handy functions.

Let's play with some set operations:

# Create an empty set
>>> my_set = {}
>>> type(my_set)
# Gotcha: using {} actually creates an empty dictionary. To create an empty set, use set()
>>> my_set = set()
>>> type(my_set)

# Let's create a non-empty set
>>> my_set = {1, 2, 3}
# We can add and remove items from the set
>>> my_set.add(4)
>>> my_set.remove(2)
# We can test if an item exists in the set
>>> 2 in my_set

# Unlike lists, every item in a set must be unique
>>> my_set
>>> my_set.add(3)
>>> my_set
# There is still only one 3 in the set
# my_set should equal {1, 3, 4}
>>> my_other_set = {1, 2, 3}
# We can combine two sets
>>> my_set.union(my_other_set)
# We can get the intersection of two sets
>>> my_set.intersection(my_other_set)

Tuples

Tuples are a lightweight way to hold information that describes something, like a person - their name, age, and hometown. You can think about it kind of like a row in a spreadsheet. Tuples are usually written inside parentheses. However, parentheses are not required to create a tuple; the commas separating the objects are what make it a tuple.

>>> my_tuple = 1,
>>> my_tuple
# Let's add to our tuple
>>> my_tuple[1] = 2

Oops! Remember that tuples are immutable, so you can't change them once they've been created. Tuples are great for moving data around in a lightweight way, because you can unpack them easily into multiple variables.

>>> person = ('Jim', 29, 'Austin, TX')
>>> name, age, hometown = person
>>> name
>>> age
>>> hometown

Dictionaries

Dictionaries are great for storing data that you can index with keys. The keys must be unique. Dictionaries keep items in the order you inserted them, but this is only guaranteed as of Python 3.7.

>>> my_dict = {"key": "value"}
# Remember, dictionaries don't have numerical indexes like lists, so if you try to use an index number
# (and 0 doesn't happen to be a key)...
>>> my_dict[0]
# You'll get a KeyError!

# Let's put some more things into our dictionary
>>> my_dict["hello"] = "world"
>>> my_dict["foo"] = "bar"
>>> my_dict

# What was the value for "hello" again?
>>> my_dict["hello"]
# You can also use get() to get the value for a key
>>> my_dict.get("hello")
# What if the key you want doesn't exist?
>>> my_dict["baz"]
# If you're not sure if a key exists, you can ask:
>>> "baz" in my_dict
# Or you can use a default value. If "baz" doesn't exist, return "default response":
>>> my_dict.get("baz", "default response")

# Let's try separating the dictionary into lists of keys and values:
>>> my_dict.keys()
>>> my_dict.values()

# What if we want to get both the key and value pairs? Then we need the dictionary's items. We can use the items() method to get a view of (key, value) tuples:
>>> my_dict.items()

Mutability

Remember, in Python, some data types are immutable – that means that once they're created, their contents can't be changed. Tuples are immutable - once you make one, you can't alter it, you can only make a new one. Conversely, lists, dictionaries, and sets are mutable - you can change them without making new ones.

Let's see this in practice:

# Lists are mutable
>>> my_list = [1, 2, 3]
>>> my_list[0] = 'a'
>>> my_list

# Dictionaries are also mutable
>>> my_dict = {"hello": "world"}
>>> my_dict["foo"] = "bar"
>>> my_dict

# Sets are mutable, but don't support indexing or item assignment, so you have to use add() and remove()
>>> my_set = {1, 2, 3}
>>> my_set[0] = 'a' # This will throw a TypeError
>>> my_set.add('a')
>>> my_set

# Tuples are immutable
>>> my_tuple = (1, 2, 3)
>>> my_tuple[0] = 'a' # This will throw a TypeError

Solutions

Here's what you should have seen while working through the exercises.


Sets

Here's what you should have seen in your REPL:

>>> my_set = {}
>>> type(my_set)
<class 'dict'>
>>> my_set = set()
>>> type(my_set)
<class 'set'>

>>> my_set = {1, 2, 3}
>>> my_set.add(4)
>>> my_set.remove(2)
>>> 2 in my_set
False

>>> my_set
{1, 3, 4}
>>> my_set.add(3)
>>> my_set
{1, 3, 4}

>>> my_other_set = {1, 2, 3}
>>> my_set.union(my_other_set)
{1, 2, 3, 4}
>>> my_set.intersection(my_other_set)
{1, 3}

Tuples

Here's what you should have seen in your REPL:

>>> my_tuple = 1,
>>> my_tuple
(1,)
>>> my_tuple[1] = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

>>> person = ('Jim', 29, 'Austin, TX')
>>> name, age, hometown = person
>>> name
'Jim'
>>> age
29
>>> hometown
'Austin, TX'

Dictionaries

Here's what you should have seen in your REPL:

>>> my_dict = {"key": "value"}
>>> my_dict[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 0

>>> my_dict["hello"] = "world"
>>> my_dict["foo"] = "bar"
>>> my_dict
{'key': 'value', 'hello': 'world', 'foo': 'bar'}

>>> my_dict["hello"]
'world'
>>> my_dict.get("hello")
'world'
>>> my_dict["baz"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'baz'
>>> "baz" in my_dict
False
>>> my_dict.get("baz", "default response")
'default response'

>>> my_dict.keys()
dict_keys(['key', 'hello', 'foo'])
>>> my_dict.values()
dict_values(['value', 'world', 'bar'])

>>> my_dict.items()
dict_items([('key', 'value'), ('hello', 'world'), ('foo', 'bar')])

Mutability

Here's what you should have seen in your REPL:

>>> my_list = [1, 2, 3]
>>> my_list[0] = 'a'
>>> my_list
['a', 2, 3]

>>> my_dict = {"hello": "world"}
>>> my_dict["foo"] = "bar"
>>> my_dict
{'hello': 'world', 'foo': 'bar'}

>>> my_set = {1, 2, 3}
>>> my_set[0] = 'a' # This will throw a TypeError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object does not support item assignment
>>> my_set.add('a')
>>> my_set
{1, 2, 3, 'a'}

>>> my_tuple = (1, 2, 3)
>>> my_tuple[0] = 'a' # This will throw a TypeError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment