Sets Deep Dive
The set is a particularly useful but easily underestimated container in Python. Whenever one of these ideas shows up in a problem, sets are usually the first thing I think of:
- Deduplication
- Duplicate checking
- Whether something has been visited
- Intersection, union, difference
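As a rough sketch of the "duplicate checking" item, here are two ways I might write it with a set. The helper names has_duplicates and first_duplicate are my own illustrations, not library functions, and both assume the elements are hashable.

```python
def has_duplicates(items):
    """Return True if any element appears more than once."""
    # A set drops repeats, so a shorter set means something repeated.
    return len(set(items)) != len(items)


def first_duplicate(items):
    """Return the first element that is seen a second time, or None."""
    seen = set()
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None


print(has_duplicates([1, 2, 3]))         # False
print(has_duplicates([1, 2, 2]))         # True
print(first_duplicate([3, 1, 4, 1, 5]))  # 1
```

The second version stops at the first repeat, which can matter when the input is long.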
Creating Sets
numbers = {1, 2, 3, 4}
empty = set()
Note: {} creates an empty dictionary, not an empty set.
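A quick check makes the note above concrete. The variable names here are only for illustration.

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>

# set() also accepts any iterable, which is handy for building
# a set from existing data.
letters = set("hello")
print(len(letters))  # 4, because the repeated "l" is stored once
```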
Add, Remove, Search
visited = {1, 2, 3}
visited.add(4)
visited.remove(2)
visited.discard(10)
print(3 in visited)
The difference:
- remove(x): raises KeyError if the element doesn't exist
- discard(x): does nothing, and raises no error, if the element doesn't exist
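The difference is easiest to see by removing an element that isn't there. This is a minimal sketch using the same visited set as above:

```python
visited = {1, 2, 3}

# 10 is absent: discard() silently does nothing.
visited.discard(10)

# 10 is absent: remove() raises KeyError.
try:
    visited.remove(10)
except KeyError as exc:
    print("remove() failed for missing element:", exc)

print(visited == {1, 2, 3})  # True, the set is unchanged
```

I tend to use discard() when a missing element is fine, and remove() when a missing element would indicate a bug I want to hear about.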
Set Operations
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(a | b) # union
print(a & b) # intersection
print(a - b) # difference
print(a ^ b) # symmetric difference
Most common application: deduplication
numbers = [1, 2, 2, 3, 3, 4]
unique_numbers = list(set(numbers))
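If you also want to preserve the original order, plain set() isn't enough, because a set does not remember insertion order. You'll need an order-aware approach, such as dict.fromkeys() or a loop that tracks the elements already seen.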
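Two order-preserving approaches, sketched under the assumption that the elements are hashable. The function name dedupe_keep_order is hypothetical; dict.fromkeys relies on dictionaries keeping insertion order, which is guaranteed from Python 3.7 onward.

```python
def dedupe_keep_order(items):
    """Return the items with repeats removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


numbers = [3, 1, 3, 2, 1, 4]

print(dedupe_keep_order(numbers))    # [3, 1, 2, 4]
print(list(dict.fromkeys(numbers)))  # [3, 1, 2, 4]
```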
Second most common application: visited markers
In graph search, DFS, and BFS, I almost always write a visited set:
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}
visited = set()
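To show where the visited set actually earns its keep, here is one possible BFS over the same graph. The function name bfs and the choice to return the visit order are my assumptions for the sketch; a DFS would use the set the same way.

```python
from collections import deque

graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}


def bfs(graph, start):
    """Return the nodes reachable from start, in breadth-first order."""
    visited = {start}  # mark nodes when they are enqueued
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            # The membership test keeps us from looping forever
            # on edges like A -> B -> A.
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']
```

Marking a node as visited when it is enqueued, rather than when it is dequeued, keeps the same node from entering the queue twice.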
frozenset
If you need an "immutable set", you can use frozenset:
fs = frozenset([1, 2, 3])
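It cannot be modified after creation, but it still supports the non-mutating set operations (union, intersection, difference) and, because it is hashable, it can serve as a dictionary key or as an element of another set.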
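A small sketch of both properties. The pair_scores dictionary and the player names are made up for illustration.

```python
fs = frozenset([1, 2, 3])

# Mutating methods such as add() simply don't exist on frozenset.
try:
    fs.add(4)
except AttributeError:
    print("frozenset has no add()")

# Non-mutating operations return a new frozenset.
bigger = fs | {4}
print(type(bigger).__name__)  # frozenset
print(4 in bigger)            # True

# As a dictionary key: an unordered pair, so (alice, bob) and
# (bob, alice) refer to the same entry.
pair_scores = {}
pair_scores[frozenset(["alice", "bob"])] = 3
print(pair_scores[frozenset(["bob", "alice"])])  # 3
```

A regular set can't be used this way, because mutable sets are unhashable.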
A pitfall I often stumble on
Sets are unordered containers. Don't assume they preserve the order in which you inserted elements. Even if sometimes they "appear to preserve it", don't build logic around that behavior.
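When I do need a predictable order, I sort explicitly instead of trusting iteration order. A minimal sketch, assuming the elements are mutually comparable (the tags set is just sample data):

```python
tags = {"banana", "apple", "cherry"}

# Iterating over the set directly gives an arbitrary order; for
# strings it can even change between runs because of hash randomization.
# sorted() returns a list with a well-defined order.
for tag in sorted(tags):
    print(tag)

# Equality compares contents only, so order never matters here.
print({1, 2, 3} == {3, 2, 1})  # True
```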