Core Data Types and Containers
One of the most characteristic experiences of working in Python is that the built-in containers are good enough. Most problems don't require implementing complex structures from scratch; getting comfortable with list, tuple, dict, and set first will make you much more efficient.
Build a selection framework first
- Ordered, mutable, allows duplicates: list
- Ordered, immutable: tuple
- Key-value mapping: dict
- Deduplication, set operations, fast membership testing: set
list: the most common sequential container
numbers = [1, 2, 3]
numbers.append(4)
numbers[0] = 100
print(numbers[1:3])
Common operations:
- append(): Add one element to the end
- extend(): Add every element of another iterable to the end
- pop(): Remove and return an element (the last one by default)
- sort(): Sort in place
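To make the difference between these four methods concrete, here is a minimal sketch. The sample values are made up purely for illustration.

```python
# Hypothetical sample data, chosen only to illustrate the four methods.
numbers = [3, 1, 2]

numbers.append(4)        # add a single element to the end -> [3, 1, 2, 4]
numbers.extend([6, 5])   # add every element of an iterable -> [3, 1, 2, 4, 6, 5]
last = numbers.pop()     # remove and return the last element (5)
first = numbers.pop(0)   # with an index, remove from that position instead (3)
numbers.sort()           # sort in place; the method itself returns None

print(numbers)       # [1, 2, 4, 6]
print(last, first)   # 5 3
```

Note that sort() modifies the list and returns None, so writing `numbers = numbers.sort()` loses the list. When you want a new sorted list and the original left alone, the built-in sorted() is the usual choice.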
tuple: lighter, more stable fixed data
point = (3, 5)
x, y = point
I typically use tuples in these scenarios:
- Returning multiple values from a function
- Expressing a structure that won't change, such as coordinates, colors, or config items
- When you need the object to serve as a dictionary key or set element
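The first and third scenarios can be sketched as follows. The helper `min_max` and the `grid` dictionary are hypothetical names invented for this example.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple (hypothetical helper)."""
    return min(values), max(values)


low, high = min_max([4, 9, 1])   # unpack the returned tuple into two names

# Tuples are hashable when all their elements are, so they can be dict keys.
grid = {(0, 0): "start", (3, 5): "goal"}

print(low, high)      # 1 9
print(grid[(3, 5)])   # goal
```

A list in the same position, such as `{[3, 5]: "goal"}`, raises TypeError because lists are mutable and therefore unhashable.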
dict: key-value mapping
user = {"name": "alice", "score": 95}
print(user["name"])
print(user.get("age", 0))
user["city"] = "Chengdu"
My most commonly used operations:
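- get(): Safe value retrieval (returns a default instead of raising KeyError)
- items(): Iterate over keys and values together
- keys() / values(): Iterate over only the keys or only the values
- update(): Batch update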
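A short sketch of these operations on the same `user` dictionary as above. The updated values are made up for illustration.

```python
user = {"name": "alice", "score": 95}

age = user.get("age", 0)   # missing key: returns the default instead of raising KeyError

# update() overwrites existing keys and adds new ones in a single call.
user.update({"score": 98, "city": "Chengdu"})

for key, value in user.items():   # iterate over keys and values together
    print(key, value)

print(list(user.keys()))     # ['name', 'score', 'city']
print(list(user.values()))   # ['alice', 98, 'Chengdu']
```

Dictionaries preserve insertion order (guaranteed since Python 3.7), which is why the updated key keeps its original position and the new key lands at the end.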
set: deduplication and membership testing
visited = {"a", "b", "c"}
visited.add("d")
print("a" in visited)
Sets are especially useful for:
- Deduplication
- Checking whether an element has appeared
- Union, intersection, difference, and other set operations
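These three uses can be sketched in a few lines. The names `seen` and `incoming` are hypothetical sample data.

```python
seen = {"a", "b", "c"}
incoming = ["b", "c", "c", "d"]

unique_incoming = set(incoming)         # deduplicate the list

already_seen = "b" in seen              # membership test: True
union = seen | unique_incoming          # elements in either set
common = seen & unique_incoming         # elements in both sets
only_seen = seen - unique_incoming      # elements in seen but not in incoming

print(already_seen)       # True
print(sorted(union))      # ['a', 'b', 'c', 'd']
print(sorted(common))     # ['b', 'c']
print(sorted(only_seen))  # ['a']
```

Sets are unordered, so the sketch sorts before printing to get a stable display. If the original order of the elements matters after deduplication, a plain set is not enough on its own.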
Container Comprehensions
Many "iterate and generate new results" operations in Python can be expressed with comprehensions:
squares = [x * x for x in range(5)]
score_map = {name: len(name) for name in ["alice", "bob"]}
unique_lengths = {len(name) for name in ["alice", "bob", "eve"]}
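A comprehension can also carry a filter condition. As a rough sketch, here is a loop and the comprehension that does the same thing. The threshold of three characters is arbitrary.

```python
names = ["alice", "bob", "eve"]

# Loop version: collect the lengths of names longer than three characters.
long_lengths_loop = []
for name in names:
    if len(name) > 3:
        long_lengths_loop.append(len(name))

# Comprehension with the same filter condition.
long_lengths = [len(name) for name in names if len(name) > 3]

print(long_lengths)                       # [5]
print(long_lengths == long_lengths_loop)  # True
```

When the body grows beyond one expression and one condition, I find the plain loop is usually easier to read.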
Distinguish mutable from immutable
Many pitfalls aren't syntax issues but rather "can this object be modified" issues.
- Mutable objects: list, dict, set
- Immutable objects: int, float, str, tuple
If you assign a list to another variable, you get "another name for the same object", not a copy.
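The aliasing behavior described above is easy to see in a small sketch. The variable names are made up for illustration.

```python
original = [1, 2, 3]
alias = original              # a second name for the same list object
snapshot = original.copy()    # shallow copy: a new, independent list

alias.append(4)               # modifies the one list both names refer to

print(original)               # [1, 2, 3, 4]
print(snapshot)               # [1, 2, 3]
print(alias is original)      # True
print(snapshot is original)   # False
```

The copy here is shallow: if the list contained other mutable objects such as nested lists, those inner objects would still be shared. In that case copy.deepcopy() from the standard library is the usual tool.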
A few useful additions from the standard library
Besides the four core containers, the following are also worth remembering:
- collections.deque: Double-ended queue
- collections.Counter: Counter
- collections.defaultdict: Dictionary with default values
- heapq: Heap
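A quick tour of these four, as a minimal sketch with made-up sample data:

```python
import heapq
from collections import Counter, defaultdict, deque

# deque: cheap appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
first = queue.popleft()          # 0

# Counter: count hashable items.
counts = Counter("banana")
top = counts.most_common(1)      # [('a', 3)]

# defaultdict: missing keys are created by the factory, here list().
groups = defaultdict(list)
for word in ["apple", "avocado", "bob"]:
    groups[word[0]].append(word)

# heapq: min-heap operations on a plain list.
heap = [5, 1, 4]
heapq.heapify(heap)              # rearrange in place so heap[0] is the smallest
smallest = heapq.heappop(heap)   # 1

print(first, list(queue))   # 0 [1, 2, 3, 4]
print(top)                  # [('a', 3)]
print(dict(groups))         # {'a': ['apple', 'avocado'], 'b': ['bob']}
print(smallest)             # 1
```

heapq does not define its own container type. It works on an ordinary list and only guarantees that the smallest item is at index 0.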
My usage habits
- Default to list first
- Switch to dict when you need to look up values by name
- Think of set immediately when you need deduplication or duplicate checking
- Use tuple when you need a fixed structure