dict Internals (Hash Tables)¶
Python dictionaries are implemented as hash tables, using hash functions to map keys to values with O(1) average lookup time. Understanding dict internals explains performance characteristics and behavioral quirks.
Hash Functions¶
How Hashing Works¶
# Hash of objects
print(f"Hash of 'key': {hash('key')}")
print(f"Hash of 42: {hash(42)}")
print(f"Hash of (1,2): {hash((1, 2))}")
# Lists can't be hashed (mutable)
try:
hash([1, 2, 3])
except TypeError as e:
print(f"Error: {e}")
Output:
Hash of 'key': 4567822475321
Hash of 42: 42
Hash of (1,2): 3713081631934410656
Error: unhashable type: 'list'
Hash Collisions¶
# Multiple keys can hash to same bucket
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3
print(f"Dict size: {d}")
Output:
Dict size: {'a': 1, 'b': 2, 'c': 3}
Dictionary Growth¶
Dynamic Resizing¶
d = {}
print(f"Initial capacity hint")
for i in range(10):
d[f'key_{i}'] = i
print(f"Keys: {len(d)}")
print(f"Dict: {d}")
Output:
Initial capacity hint
Keys: 10
Dict: {'key_0': 0, 'key_1': 1, 'key_2': 2, 'key_3': 3, 'key_4': 4, 'key_5': 5, 'key_6': 6, 'key_7': 7, 'key_8': 8, 'key_9': 9}
Performance Characteristics¶
O(1) Lookup¶
import time
d = {i: i**2 for i in range(100000)}
# Fast lookup
start = time.time()
for _ in range(1000):
value = d[50000]
lookup_time = time.time() - start
print(f"Lookup time: {lookup_time:.6f}s")
Output:
Lookup time: 0.000045s
Key Behavior¶
Keys Must Be Hashable¶
d = {(1, 2): "tuple_key"}
print(d[(1, 2)])
# This fails - lists aren't hashable
try:
d[[1, 2]] = "list_key"
except TypeError:
print("Lists cannot be dict keys")
Output:
tuple_key
Lists cannot be dict keys