Core Data Types and Containers

A defining part of working in Python is that the built-in containers are usually good enough. Many problems don't require implementing complex structures from scratch; getting comfortable with list, tuple, dict, and set first will make you much more efficient.

Build a selection framework first

  • Ordered, mutable, allows duplicates: list
  • Ordered, immutable: tuple
  • Key-value mapping: dict
  • Deduplication, set operations, fast membership testing: set

list: the most common sequential container

numbers = [1, 2, 3]
numbers.append(4)     # [1, 2, 3, 4]
numbers[0] = 100      # lists are mutable: [100, 2, 3, 4]
print(numbers[1:3])   # slice of indices 1 and 2: [2, 3]

Common operations:

  • append(): Add to the end
  • extend(): Append every element of an iterable
  • pop(): Remove and return an element (the last one by default)
  • sort(): Sort in place
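
As a minimal sketch of the four operations above (the `scores` data is made up for illustration), note that `sort()` changes the list in place and returns `None`, while the built-in `sorted()` returns a new list:

```python
# Hypothetical data, used only to illustrate the methods listed above.
scores = [72, 95]

scores.append(88)          # add one element to the end
scores.extend([60, 81])    # add every element of another iterable
last = scores.pop()        # remove and return the last element
first = scores.pop(0)      # pop(i) removes the element at index i instead

scores.sort()              # sorts in place and returns None
top_down = sorted(scores, reverse=True)  # sorted() builds a new list

print(last, first)   # 81 72
print(scores)        # [60, 88, 95]
print(top_down)      # [95, 88, 60]
```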

tuple: lighter, more stable fixed data

point = (3, 5)
x, y = point   # unpacking: x is 3, y is 5

I typically use tuples in these scenarios:

  • Returning multiple values from a function
  • Expressing a structure that won't change, such as coordinates, colors, or config items
  • When you need the object to serve as a dictionary key or set element (this works only if every item in the tuple is itself hashable)
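
A short sketch of the first and third scenarios. The function `min_max` and the `grid` and `seen` names are hypothetical, chosen only for this example:

```python
def min_max(values):
    """Return the smallest and largest value as one tuple."""
    return min(values), max(values)

low, high = min_max([4, 9, 1, 7])   # unpack the returned tuple

# Tuples of hashable items are hashable, so they can be dict keys or set members.
grid = {(0, 0): "start", (3, 5): "goal"}
seen = {(0, 0)}
seen.add((3, 5))

print(low, high)        # 1 9
print(grid[(3, 5)])     # goal
print((3, 5) in seen)   # True
```

A list in the same position, such as `{[0, 0]: "start"}`, would raise `TypeError` because lists are unhashable.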

dict: key-value mapping

user = {"name": "alice", "score": 95}

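print(user["name"])         # alice; a missing key here would raise KeyError
print(user.get("age", 0))   # 0, because "age" is not a key
user["city"] = "Chengdu"    # adds a new key-value pair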

My most commonly used operations:

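  • get(): Return the value for a key, or a default instead of raising KeyError when the key is missing
  • items(): Iterate over keys and values together
  • keys() / values(): Iterate over only the keys or only the values
  • update(): Add or overwrite several keys at once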
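
These operations can be sketched together on the same `user` dictionary as above. The updated score value is invented for the example, and the printed key order relies on dicts preserving insertion order (Python 3.7+):

```python
user = {"name": "alice", "score": 95}

age = user.get("age", 0)   # missing key: returns the default, no KeyError
user.update({"score": 98, "city": "Chengdu"})  # overwrite one key, add another

for key, value in user.items():   # iterate over keys and values together
    print(key, value)

print(list(user.keys()))     # ['name', 'score', 'city']
print(list(user.values()))   # ['alice', 98, 'Chengdu']
```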

set: deduplication and membership testing

visited = {"a", "b", "c"}
visited.add("d")        # adding an element that is already present does nothing
print("a" in visited)   # True

Sets are especially useful for:

  • Deduplication
  • Checking whether an element has appeared
  • Union, intersection, difference, and other set operations
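
A small sketch of these uses, with made-up sample data. Converting to a set drops the original order, so the example also shows `dict.fromkeys()` as one alternative (my addition, not something the sections above cover) for when first-seen order matters:

```python
tags = ["python", "list", "python", "set", "list"]

unique_tags = set(tags)                      # deduplicate; order is not preserved
ordered_unique = list(dict.fromkeys(tags))   # deduplicate, keeping first-seen order

a = {1, 2, 3}
b = {3, 4}
union = a | b          # elements in either set: 1, 2, 3, 4
common = a & b         # elements in both sets: 3
only_in_a = a - b      # elements in a but not in b: 1, 2

print(ordered_unique)  # ['python', 'list', 'set']
print(3 in common)     # True
```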

Container Comprehensions

Many "iterate and generate new results" operations in Python can be expressed with comprehensions:

squares = [x * x for x in range(5)]                                # list: [0, 1, 4, 9, 16]
score_map = {name: len(name) for name in ["alice", "bob"]}         # dict: {"alice": 5, "bob": 3}
unique_lengths = {len(name) for name in ["alice", "bob", "eve"]}   # set: {3, 5}

Distinguish mutable from immutable

Many pitfalls come not from syntax but from one question: can this object be modified in place?

  • Mutable objects: list, dict, set
  • Immutable objects: int, float, str, tuple (a tuple cannot be resized or reassigned, but a mutable item stored inside it can still change)

If you assign a list to another variable, you get "another name for the same object", not a copy.
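
A minimal sketch of that behavior, using hypothetical variable names. `list.copy()` is one way to get an independent list:

```python
original = [1, 2, 3]
alias = original            # a second name for the same list object
alias.append(4)
print(original)             # [1, 2, 3, 4]: the change shows through both names
print(alias is original)    # True

snapshot = original.copy()  # shallow copy: a new list with the same elements
snapshot.append(5)
print(original)             # [1, 2, 3, 4]: unaffected by changes to the copy
print(snapshot)             # [1, 2, 3, 4, 5]
```

The copy is shallow, so nested lists or dicts inside it are still shared with the original; `copy.deepcopy()` is the usual tool when those need to be independent too.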

A few useful additions from the standard library

Besides the four core containers, the following are also worth remembering:

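  • collections.deque: Double-ended queue with fast appends and pops at both ends
  • collections.Counter: Dictionary subclass that counts hashable items
  • collections.defaultdict: Dictionary that creates a default value for missing keys
  • heapq: Heap (priority queue) functions that operate on a plain list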
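
To give a feel for each of these, here is a brief sketch. The sample data and variable names are invented for illustration, and it only touches one or two calls per tool:

```python
import heapq
from collections import Counter, defaultdict, deque

words = ["red", "blue", "red", "green", "blue", "red"]

queue = deque([1, 2, 3])
queue.appendleft(0)           # fast insert at the left end
queue.append(4)
first = queue.popleft()       # fast removal from the left end: 0

counts = Counter(words)       # maps each word to how often it appears
most_common = counts.most_common(1)   # [('red', 3)]

by_length = defaultdict(list)         # missing keys start as an empty list
for word in words:
    by_length[len(word)].append(word)

heap = [5, 1, 4]
heapq.heapify(heap)           # rearrange the list into a min-heap in place
heapq.heappush(heap, 2)
smallest = heapq.heappop(heap)        # always the smallest item: 1

print(first, most_common, smallest)
print(dict(by_length))
```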

My usage habits

  • Default to list first
  • Switch to dict when you need to look up values by name
  • Think of set immediately when you need deduplication or duplicate checking
  • Use tuple when you need a fixed structure