7.1. Dictionaries#
This chapter presents a built-in type called a dictionary.
It is one of Python’s best features – and the building block of many efficient and elegant algorithms.
7.1.1. What Is a Dictionary?#
A dictionary in Python is a built-in data type used to store data in key–value pairs. A Python dictionary is like a real world dictionary, where you look up a word for its definition. In Python dictionary, you look up a key for the value.
The key characteristics of a dictionary are:
Dictionaries are mutable (can be changed)
Keys must be immutable (strings, numbers, tuples)
A dictionary represents a mapping from keys to values.
Keys must be unique
Values can be any data type
In terms formatting,
Each dictionary item consists of a key and a value separated by a colon.
Dictionary items are separated by commas and enclosed in curly braces.
A dictionary is like a list, but more general. In a list, the indices have to be integers; in a dictionary they can be (almost) any type.
Feature |
List |
Dictionary |
|---|---|---|
Access by |
Index (0,1,2) |
Key |
Structure |
Ordered sequence |
Key-value mapping |
Best for |
Ordered data |
Labeled data |
As seen in the example below:
A list of number words can be accessed using an integer as an index.
A dictionary goes in the other direction, and look up a word to get the corresponding integer; in other words, using the keys to look up the values.
nums_lst = ['zero', 'one', 'two']
print(nums_lst)
print(nums_lst[1])
nums_dic = {'zero': 0, 'one': 1, 'two': 2}
print(nums_dic)
print(nums_dic['one'])
['zero', 'one', 'two']
one
{'zero': 0, 'one': 1, 'two': 2}
1
In mathematical language, a dictionary represents a mapping from keys to values, so you can also say that each key “maps to” a value. In this example, each number word maps to the corresponding integer.
The following figure shows the state diagram for numbers.
A dictionary is represented by a box with the word “dict” outside and the items inside.
Each item is represented by a key and an arrow pointing to a value.
The quotation marks indicate that the keys here are strings, not variable names.
7.1.1.1. When to Use a Dictionary#
You use a dictionary when:
You need labeled data
You want fast lookups by name
You need a mapping relationship
Examples of dictionaries:
Student records
Configuration settings
Word counts
JSON data
7.1.2. Creating Dictionaries#
Python provides several ways to create a dictionary, depending on whether you already have the data or are building it incrementally. T
dict literal: the most common approach to create a dictionary
dict()constructor: to turn sequences into dictionariesdictionary comprehensions: are useful when keys and values come from other sources
Python also has a dict.fromkeys() method, which is a special case in creating dictionaries. Overall, common ways ways to create a dictionary include:
# |
Method |
Syntax |
Notes |
|---|---|---|---|
1 |
Empty dict literal |
|
Fastest way to create an empty dict |
2 |
Dict literal |
|
Keys and values known up front |
3 |
|
|
Keys must be valid identifiers |
4 |
From list of tuples |
|
Useful when pairs are already in a sequence |
5 |
|
|
All keys share the same default value |
6 |
Dict comprehension |
|
Build from any iterable with an expression |
Pay attention to ways 2, 3, and 4. We usually use dict literal for creating dictionaries but you should know how to use the dict() constructor to turn a sequence into a dictionary. Also, try to transfer your skill in list comprehension and tuple comprehension to dictionary comprehension.
# 1. Empty dict literal
d1 = {}
# 2. Dict literal with items
d2 = {'zero': 0, 'one': 1, 'two': 2}
# 3. dict() constructor with keyword arguments (keys must be valid identifiers)
d3 = dict(zero=0, one=1, two=2)
# 4. dict() from a list of (key, value) tuples
d4 = dict([('zero', 0), ('one', 1), ('two', 2)])
# 5. dict.fromkeys() — all keys share the same default value
d5 = dict.fromkeys(['zero', 'one', 'two'], 0)
# 6. Dict comprehension
words = ['zero', 'one', 'two']
d6 = {word: idx for idx, word in enumerate(words)} # note the two variables
print("Empty literal: ", d1)
print("Dict literal: ", d2)
print("dict() keywords: ", d3)
print("dict() from tuples: ", d4)
print("dict.fromkeys(): ", d5)
print("Dict comprehension: ", d6)
Empty literal: {}
Dict literal: {'zero': 0, 'one': 1, 'two': 2}
dict() keywords: {'zero': 0, 'one': 1, 'two': 2}
dict() from tuples: {'zero': 0, 'one': 1, 'two': 2}
dict.fromkeys(): {'zero': 0, 'one': 0, 'two': 0}
Dict comprehension: {'zero': 0, 'one': 1, 'two': 2}
To give some details about dictionary creation, let’s use the dictionary literal to create a dictionary, which is to put the items inside curly braces.
numbers = {'zero': 0, 'one': 1, 'two': 2}
Each item consists of a key and a value separated by a colon. The items are separated by commas and enclosed in curly braces.
Another way to create a dictionary is to use the dict function.
We can make an empty dictionary like this.
empty = dict()
empty
{}
Or, if you have a list like [ “zero”, “one”, “two”], you can use enumerate() to create a dictionary.
lst = ["zero", "one", "two "]
dict_words = {}
for idx, word in enumerate(lst):
dict_words[word] = idx
print(dict_words)
{'zero': 0, 'one': 1, 'two ': 2}
And we can make a copy of a dictionary like this.
numbers_copy = dict(numbers)
numbers_copy
{'zero': 0, 'one': 1, 'two': 2}
It is often useful to make a copy before performing operations that modify dictionaries.
7.1.3. Core Operations#
Common dictionary operations involve the following methods.
Group |
Method |
Description |
|---|---|---|
Reading |
|
Get value safely, no KeyError |
|
All keys |
|
|
All values |
|
|
All key-value pairs |
|
Adding/Updating |
|
Add or overwrite from another dict |
|
Set value only if key doesn’t exist |
|
Removing |
|
Remove & return a specific key |
|
Remove & return last inserted pair |
|
|
Remove everything |
|
Copying/Creating |
|
Shallow copy |
|
Create new dict from a list of keys |
|
Checking |
|
Check if key exists |
|
Count of key-value pairs |
7.1.3.1. Accessing Items#
Common ways to access values in a dictionary include:
dict[key]dict.get(key)dict.get(key, default)
We use square brackets [] to access and modify items in a Python dictionary by key. Reading a value looks like dict[key], which returns the value stored under key.
If a key you a key you try to access is missing, you will receive an error. To avoid the error, use dict.get(key), which returns None or a default value you provide when error.
person = {
"name": "Alice",
"age": 28,
"city": "Rolla"
}
# Access an item
print(person["name"]) # Alice
print(person)
Alice
{'name': 'Alice', 'age': 28, 'city': 'Rolla'}
KeyError when trying to access a nonexistent key.
%%expect KeyError
print(person["phone"]) # KeyError
KeyError: 'phone'
When accessing, it is safer to use dict.get() to avoid the error because it
returns
Nonewhen the key does not existreturns a default value you specify when the key does not exist
print(person.get("phone")) # None
print(person.get("phone", "N/A")) # N/A
None
N/A
To access all the keys and values at once:
keys()returns a view of all the dictionary’s keysvalues()returns a view of all the dictionary’s values.items()returns a view of all key-value pairs as tuples.
for k in numbers.keys():
print(k, end=' ')
print()
for v in numbers.values():
print(v, end=' ')
print()
for k, v in numbers.items():
print(k, v , end=' ')
zero one two
0 1 2
zero 0 one 1 two 2
The len function works on dictionaries; it returns the number of items.
len(numbers)
3
### Exercise: Accessing Dictionary Values
# Given the dictionary below:
person = {"name": "Bob", "age": 25, "city": "Chicago"}
# 1. Print the value associated with "name".
# 2. Use .get() to retrieve "email" with a default of "N/A".
# 3. Print all keys and values using .items().
### Your code starts here.
### Your code ends here.
Bob
N/A
name Bob
age 25
city Chicago
7.1.3.2. Adding/Updating Items#
You also use the square brackets to update values. An item assignment like person["age"] = 30 will either
updates an existing item, or
creates a new item if the key does not already exist.
Note that, since dictionary keys are immutable, you can not modify it directly.
# Modify an existing item
person["age"] = 29
# Add a new item
person["email"] = "alice@rolla.com"
print(person)
{'name': 'Bob', 'age': 29, 'city': 'Chicago', 'email': 'alice@rolla.com'}
d.update() is used to add new key-value pairs or change existing ones in a dictionary. d.update() behaves like above but can work with multiple key-value pairs:
if the key already exists, its value is replaced
if the key does not exist, it is added
d = {"a": 1, "b": 2}
d.update({"b": 20, "c": 3})
print(d) # {'a': 1, 'b': 20, 'c': 3}
{'a': 1, 'b': 20, 'c': 3}
An alternative syntax for d.update() is as follows.
d.update(a=10, d=4)
print(d) # {'a': 10, 'b': 20, 'c': 3, 'd': 4}
{'a': 10, 'b': 20, 'c': 3, 'd': 4}
You cannot rename or update a key in place because dictionary keys are immutable. Instead, you could:
Add a new key
Copy the value
Delete the old key
You can do all three steps in one line of code, because pop() will return the value that’s been removed, as shown below. I think that’s pretty neat.
student = {
"name": "Doris",
"age": 29
}
# Rename key "age" to "years"
student["years"] = student.pop("age")
print(student)
{'name': 'Doris', 'years': 29}
### Exercise: Adding and Updating Dictionary Items
# Start with this dictionary:
person = {"name": "Alice", "age": 28}
# 1. Add a new key "email" with value "alice@example.com".
# 2. Update "age" to 30.
# 3. Use d.update() to add "city": "Boston" and change "name" to "Alicia".
# 4. Print the final dictionary.
### Your code starts here.
### Your code ends here.
{'name': 'Alicia', 'age': 30, 'email': 'alice@example.com', 'city': 'Boston'}
7.1.3.3. Deleting Items#
You can delete items from a dictionary when you no longer need a key-value pair.
del: Thedelstatement removes an item by its key, such as del person[“city”].d.pop(): Another option ispop(), which removes the item and also returns its value, which is useful if you want to use that value later.d.popitem(): removes and returns the last inserted key–value pair from a dictionary.d.clear(): removes everything in the dictionary.
person = {
"name": "Alice",
"age": 29,
"city": "Rolla",
"email": "alice@rolla.com"
}
print("person:\t\t", person)
# Delete an item using del
del person["city"] # in place
print("city deleted:\t", person)
# Delete an item using pop()
email_popped = person.pop("email")
print("x_city_email:\t", person) # {'name': 'Alice', 'age': 29}
print("email_popped:\t", email_popped) # alice@rolla.com
print("person_updated\t", person)
# Delete the last added item using popitem()
key, value = person.popitem()
print("popped_k_v:\t", (key, value)) # ('age', 29) or ('name', 'Alice') depending on the order of items in the dict
print("after_popitem:\t", person) # {'name': 'Alice'}
# Clear all items from the dictionary
person.clear()
print("after_clear:\t", person) # {}
person: {'name': 'Alice', 'age': 29, 'city': 'Rolla', 'email': 'alice@rolla.com'}
city deleted: {'name': 'Alice', 'age': 29, 'email': 'alice@rolla.com'}
x_city_email: {'name': 'Alice', 'age': 29}
email_popped: alice@rolla.com
person_updated {'name': 'Alice', 'age': 29}
popped_k_v: ('age', 29)
after_popitem: {'name': 'Alice'}
after_clear: {}
### Exercise: Deleting Dictionary Items
# Start with this dictionary:
person = {"name": "Bob", "age": 25, "city": "Boston", "email": "bob@email.com"}
# 1. Delete "city" using del.
# 2. Remove and capture "email" using pop(). Print the popped value.
# 3. Print the remaining dictionary.
### Your code starts here.
### Your code ends here.
Popped email: bob@email.com
{'name': 'Bob', 'age': 25}
7.1.3.4. Membership Testing#
The in membership operator works on dictionaries, too; it tells you whether a key exists in the dictionary, which returns True if the key is present and False if it is not. You can also use not in to check that a key is missing.
Member testing is useful when you want to safely decide whether to access, update, or delete a dictionary item.
person = {
"name": "Alice",
"age": 29,
"city": "Rolla"
}
print("name" in person) # True
print("email" in person) # False
print("email" not in person) # True
if "age" in person: # safety: Check if key exists before accessing
print("Age exists:", person["age"])
True
False
True
Age exists: 29
The in operator does not check whether something appears as a value.
"Alice" in person
False
To see whether something appears as a value in a dictionary, you can use the method values, which returns a sequence of values, and then use the in operator.
"Alice" in person.values()
True
### Exercise: Membership Testing
# Use the dictionary below:
person = {"name": "Alice", "age": 29, "city": "Rolla"}
# 1. Check if "name" is in the dictionary and print the result.
# 2. Check if "phone" is NOT in the dictionary and print the result.
# 3. Check if "Alice" appears as a value and print the result.
### Your code starts here.
### Your code ends here.
True
True
True
7.1.4. Select Topics#
7.1.4.1. Dictionary Comprehension#
Dictionary comprehensions provide a concise way to build dictionaries from existing data. The pattern is similar to a list comprehension, but each result contains a key and a value.
The basic syntax:
{key_expr: value_expr for item in iterable}
# Dictionary comprehensions
# Create a dictionary of squares
square_dict = {x: x**2 for x in range(5)}
print("Square dict:", square_dict)
# From two lists
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]
combined_dict = {k: v for k, v in zip(keys, values)}
print("Combined dict:", combined_dict)
# Filter and transform
prices = {'apple': 0.50, 'banana': 0.30, 'orange': 0.80, 'grape': 1.20}
expensive_fruits = {fruit: price for fruit, price in prices.items() if price > 0.50}
print("Expensive fruits:", expensive_fruits)
Square dict: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Combined dict: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
Expensive fruits: {'orange': 0.8, 'grape': 1.2}
### Exercise: Dictionary Comprehension
# 1. Use a dict comprehension to create a dictionary that maps
# each number 1-5 to its cube (n**3).
# 2. Given the prices dict below, use a dict comprehension to keep
# only fruits that cost more than $1.00.
prices = {'apple': 1.20, 'banana': 0.50, 'cherry': 2.00, 'grape': 0.80}
### Your code starts here.
### Your code ends here.
Cubes: {1: 1, 2: 8, 3: 27, 4: 64, 5: 125}
Pricey: {'apple': 1.2, 'cherry': 2.0}
7.1.4.2. Counting with Dictionaries#
Suppose you are given a string and you want to count how many times each letter appears. A dictionary is a good tool for this job. We’ll start with an empty dictionary.
counter = {}
As we loop through the letters in the string, suppose we see the letter 'a' for the first time.
We can add it to the dictionary like this.
counter['a'] = 1
The value 1 indicates that we have seen the letter once.
Later, if we see the same letter again, we can increment the counter like this.
counter['a'] += 1
Now the value associated with 'a' is 2, because we’ve seen the letter twice.
counter
{'a': 2}
The following function uses these features to count the number of times each letter appears in a string.
def value_counts(string):
counter = {}
for letter in string:
if letter not in counter:
counter[letter] = 1
else:
counter[letter] += 1
return counter
Each time through the loop, if letter is not in the dictionary, we create a new item with key letter and value 1.
If letter is already in the dictionary we increment the value associated with letter.
Here’s an example.
counter = value_counts('brontosaurus')
counter
{'b': 1, 'r': 2, 'o': 2, 'n': 1, 't': 1, 's': 2, 'a': 1, 'u': 2}
The items in counter show that the letter 'b' appears once, 'r' appears twice, and so on.
### Exercise: Counting with a Dictionary
# 1. Use the value_counts() function defined above to count the
# frequency of each letter in the word "mississippi".
# 2. Print the letter that appears the most (highest count).
### Your code starts here.
### Your code ends here.
{'m': 1, 'i': 4, 's': 4, 'p': 2}
Most common letter: i -> 4
7.1.4.3. Iterating Through Dictionaries#
If you use a dictionary in a for statement, it traverses the keys of the dictionary.
To demonstrate, let’s make a dictionary that counts the letters in 'banana'.
counter = value_counts('banana')
counter
{'b': 1, 'a': 3, 'n': 2}
The following loop prints the keys, which are the letters.
for key in counter:
print(key)
b
a
n
To print the values, we can use the values method.
for value in counter.values():
print(value)
1
3
2
To print the keys and values, we can loop through the keys and look up the corresponding values.
for key in counter:
value = counter[key]
print(key, value)
b 1
a 3
n 2
### Exercise: Iterating Through a Dictionary
# Use this dictionary:
scores = {"alice": 92, "bob": 85, "charlie": 78}
# 1. Print only the names (keys).
# 2. Print only the scores (values).
# 3. Print each name and score together using .items().
### Your code starts here.
### Your code ends here.
alice
bob
charlie
92
85
78
alice 92
bob 85
charlie 78
7.1.4.4. Why Dictionary Lookup Is Fast#
The items in a Python dictionary are stored in a hash table, which is a way of organizing data that has a remarkable property: the in operator takes about the same amount of time no matter how many items are in the dictionary.
That makes it possible to write some remarkably efficient algorithms.
Note
Big O Notation
Big O notation is a way of describing how the time (or memory) an operation needs grows as the size of the data grows. We use the letter n to represent the number of items.
The two cases you’ll encounter most often here:
Notation |
Name |
Meaning |
|---|---|---|
O(1) |
Constant time |
Time stays the same regardless of n |
O(n) |
Linear time |
Time grows in proportion to n |
Imagine searching for a word in a book.
O(1) is like looking up a word in a dictionary: you go directly to the page. You know exactly which page to turn to, so the search takes the same time no matter how thick the book is, that’s O(1).
O(n) is like finding a word in a novel: you read until you find it. The worst case is you may have to scan every page until the end of the book, that’s O(n): the more pages you have, the more time it would likely to take.
Dictionary lookup is O(1) because Python computes a hash of the key and jumps directly to the right slot in memory, without scanning. List lookup with in is O(n) because Python checks each element in order until it finds a match (or reaches the end).
The full treatment of Big O notation — including O(log n), O(n²), and others — comes in the Algorithms chapter.
The following example builds a list and a dictionary with one million entries, then times a worst-case lookup in each. It shows the difference between O(n) linear search (list) and O(1) constant-time lookup (dictionary) in practice.
import time
# Build a large list and a large dictionary with the same 1 million keys
n = 1_000_000
big_list = list(range(n))
big_dict = {i: True for i in range(n)}
target = n - 1 # worst case: last element
# O(n): list search time scales up to n elements
start = time.time()
result = target in big_list
list_time = time.time() - start
# O(1): dict lookup jumps directly
start = time.time()
result = target in big_dict
dict_time = time.time() - start
print(f"List lookup (O(n)): {list_time:.6f} s")
print(f"Dict lookup (O(1)): {dict_time:.6f} s")
print(f"Dict was ~{list_time / dict_time:.0f}x faster")
List lookup (O(n)): 0.006752 s
Dict lookup (O(1)): 0.000019 s
Dict was ~354x faster
To demonstrate, we’ll compare two algorithms for finding pairs of words where one is the reverse of another – like stressed and desserts.
We’ll start by reading the word list.
from pathlib import Path
words_file = Path('../../data/words.txt')
if not words_file.exists():
download('https://raw.githubusercontent.com/AllenDowney/ThinkPython/v3/words.txt', words_file)
word_list = words_file.read_text().split()
print(len(word_list))
print(type(word_list))
113783
<class 'list'>
To check out the file content:
for word in word_list[:10]:
print(word)
aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
aardvark
And here’s the reverse_word function.
def reverse_word(word):
return ''.join(reversed(word))
reverse_word('hello')
'olleh'
The following function loops through the words in the list. For each one, it reverses the letters and then checks whether the reversed word is in the word list.
def too_slow():
count = 0
for word in word_list:
if reverse_word(word) in word_list: ### O(n) lookup in a list is slow
count += 1
return count
To measure how long a function takes, we can use %time which is one of Jupyter’s “built-in magic commands”.
These commands are not part of the Python language, so they might not work in other development environments.
%time too_slow()
Output:
CPU times: user 1min 9s, sys: 182 ms, total: 1min 9s
Wall time: 1min 9s
885
Uncomment the line to run the command in Live Code mode or in your editor.
# %time too_slow()
This function takes more than a minute to run.
The problem is that the in operator checks the words in the list one at a time, starting at the beginning.
If it doesn’t find what it’s looking for – which happens most of the time – it has to search all the way to the end.
And the in operator is inside the loop, so it runs once for each word.
Since there are more than 100,000 words in the list, and for each one we check more than 100,000 words, the total number of comparisons is the number of words squared – roughly – which is almost 13 billion.
len(word_list)**2
12946571089
We can make this function much faster with a dictionary. The following loop creates a dictionary that contains the words as keys.
word_dict = {}
for word in word_list:
word_dict[word] = 1
Or, you can use dictionary comprehension:
word_dict = {word: 1 for word in word_list}
The values in word_dict are all 1, but they could be anything, because we won’t ever look them up – we will only use this dictionary to check whether a key exists.
Now here’s a version of the previous function that replaces word_list with word_dict.
def much_faster():
count = 0
for word in word_dict:
if reverse_word(word) in word_dict:
count += 1
return count
This function takes less than one hundredth of a second, so it’s about 10,000 times faster than the previous version.
In general, the time it takes to find an element in a list is proportional to the length of the list. The time it takes to find a key in a dictionary is almost constant – regardless of the number of items.
%time much_faster()
CPU times: user 23.1 ms, sys: 190 μs, total: 23.3 ms
Wall time: 23.4 ms
885
### Exercise: Dictionary vs. List Lookup
# Given the list of fruits below:
fruits = ['apple', 'banana', 'cherry', 'date', 'elderberry']
# 1. Create a dictionary with fruits as keys and 1 as values using dict comprehension.
# 2. Use the `in` operator to check if 'cherry' is in:
# a) the list b) the dictionary
# 3. Print both results.
### Your code starts here.
### Your code ends here.
True
True