Sets Deep Dive

The set is a particularly useful but easily underestimated container in Python. Whenever one of these ideas shows up in a problem, a set is usually the first thing I reach for:

  • Deduplication
  • Duplicate checking
  • Whether something has been visited
  • Intersection, union, difference

Creating Sets

numbers = {1, 2, 3, 4}
empty = set()

Note: {} creates an empty dictionary, not an empty set.
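
A quick check makes the distinction concrete. This is only a minimal sketch; the variable names are made up for illustration.

```python
# {} is a dict literal; set() is the way to spell an empty set.
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Braces with bare elements (no key: value pairs) do create a set.
letters = {"a", "b"}
print(type(letters).__name__)       # set
```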

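Adding and Removing Elements

visited = {1, 2, 3}

visited.add(4)        # add an element (no effect if already present)
visited.remove(2)     # remove an element that must exist
visited.discard(10)   # remove an element if present, otherwise do nothing
print(3 in visited)   # membership test: True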

The difference between remove and discard:

  • remove(x): Raises KeyError if the element doesn't exist
  • discard(x): Does nothing if the element doesn't exist
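
The two behaviors can be seen side by side in a small sketch. The `removed` flag is a name introduced here purely for illustration.

```python
visited = {1, 2, 3}

# discard() is safe when the element may be absent.
visited.discard(10)  # no error, the set is unchanged

# remove() signals a missing element with KeyError.
try:
    visited.remove(10)
    removed = True
except KeyError:
    removed = False

print(removed)               # False
print(visited == {1, 2, 3})  # True
```

A reasonable rule of thumb: use remove() when a missing element would indicate a bug, and discard() when absence is a normal case.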

Set Operations

a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

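print(a | b)  # union: elements in a or b
print(a & b)  # intersection: elements in both a and b
print(a - b)  # difference: elements in a but not in b
print(a ^ b)  # symmetric difference: elements in exactly one of a and b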
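
To make the four operators concrete, here is a sketch that checks each result against the set it should equal. It compares sets rather than printed text, since the display order of a set should not be relied on.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b
intersection = a & b
difference = a - b
reverse_difference = b - a
symmetric = a ^ b

print(union == {1, 2, 3, 4, 5, 6})   # True
print(intersection == {3, 4})        # True
print(difference == {1, 2})          # True
print(reverse_difference == {5, 6})  # True: difference depends on operand order
print(symmetric == {1, 2, 5, 6})     # True
```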

Most common application: deduplication

numbers = [1, 2, 2, 3, 3, 4]
unique_numbers = list(set(numbers))

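If you also need to preserve the original order, plain set() isn't enough, because a set does not remember insertion order. One common alternative is list(dict.fromkeys(numbers)), which keeps the first occurrence of each element.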
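
Two order-preserving approaches are sketched below, assuming the elements are hashable. The helper name `dedupe_keep_order` is hypothetical and chosen only for this example.

```python
numbers = [3, 1, 3, 2, 1, 4]

# Option 1: dict keys are unique and keep insertion order (Python 3.7+).
unique_ordered = list(dict.fromkeys(numbers))
print(unique_ordered)  # [3, 1, 2, 4]


# Option 2: a "seen" set for fast lookups plus a list for the output order.
def dedupe_keep_order(items):
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_keep_order(numbers))  # [3, 1, 2, 4]
```

The second form is longer, but it is easier to adapt, for example when deduplicating by some derived key instead of the element itself.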

Second most common application: visited markers

In graph searches such as DFS and BFS, I almost always keep a visited set so that no node is processed twice:

graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}

visited = set()
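
One way the visited set might be used is in a breadth-first traversal of the graph above. This is a sketch rather than the only way to write it; the function name `bfs_order` and the choice of "A" as the start node are assumptions made for the example.

```python
from collections import deque

graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}


def bfs_order(graph, start):
    """Return nodes in the order a breadth-first search visits them."""
    visited = {start}       # nodes already queued, so none is queued twice
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:  # average O(1) membership check
                visited.add(neighbor)
                queue.append(neighbor)
    return order


print(bfs_order(graph, "A"))  # ['A', 'B', 'C', 'D']
```

Without the visited set, the A-B and A-C back edges would make the loop revisit nodes forever.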

frozenset

If you need an "immutable set", you can use frozenset:

fs = frozenset([1, 2, 3])

It cannot be modified after creation, but it still supports the non-mutating set operations. Because it is hashable, it can also serve as a dictionary key or as an element of another set.
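
A short sketch of both properties follows. The `edge_weights` dictionary is a hypothetical example of using an unordered pair as a key.

```python
fs = frozenset([1, 2, 3])

# Non-mutating operations work and return new objects.
combined = fs | {4}
print(combined == frozenset([1, 2, 3, 4]))  # True
print(type(combined).__name__)              # frozenset

# Mutating methods such as add() simply do not exist on frozenset.
print(hasattr(fs, "add"))  # False

# Hashable, so it can be a dictionary key. Element order does not matter,
# which makes it handy for undirected pairs.
edge_weights = {frozenset(["A", "B"]): 5}
print(edge_weights[frozenset(["B", "A"])])  # 5
```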

A pitfall I often stumble on

Sets are unordered containers. Don't assume they preserve the order in which you inserted elements. Even if they sometimes appear to preserve it, don't build logic around that behavior.
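
When a predictable order is needed, one option is to impose it explicitly at the point of use, for example with sorted(). This sketch assumes the elements can be compared with each other; the `tags` set is made up for illustration.

```python
tags = {"python", "sets", "hashing"}

# sorted() returns a new list in a well-defined order,
# regardless of how the set happens to iterate.
ordered_tags = sorted(tags)
print(ordered_tags)  # ['hashing', 'python', 'sets']
```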