Memory Management¶
Understanding copy semantics, memory views, and __slots__ for efficient memory usage.
Copy vs Deepcopy¶
Shallow Copy¶
A shallow copy creates a new container but references the same nested objects:
import copy
original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
# Top level is copied
print(shallow is original) # False
# Nested objects are shared
print(shallow[0] is original[0]) # True
Mutation affects both:
shallow[0].append(3)
print(original) # [[1, 2, 3], [3, 4]]
print(shallow) # [[1, 2, 3], [3, 4]]
# Both changed!
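For two-level structures there is a common middle ground: copy each row yourself. This is cheaper than `deepcopy` but still breaks the aliasing shown above (it stays shallow below the second level):

```python
original = [[1, 2], [3, 4]]

# Copy each inner list with a slice -- new outer list, new rows,
# but any objects *inside* the rows would still be shared.
row_copy = [row[:] for row in original]

row_copy[0].append(3)
print(original)  # [[1, 2], [3, 4]] -- unaffected
print(row_copy)  # [[1, 2, 3], [3, 4]]
```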
Deep Copy¶
A deep copy recursively copies all nested objects:
import copy
original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)
# Everything is copied
print(deep is original) # False
print(deep[0] is original[0]) # False
Changes are independent:
deep[0].append(3)
print(original) # [[1, 2], [3, 4]]
print(deep) # [[1, 2, 3], [3, 4]]
# Only deep changed
When to Use Each¶
| Situation | Use | Example |
|---|---|---|
| Flat data | Shallow | [1, 2, 3].copy() |
| Nested mutable | Deep | copy.deepcopy(nested) |
| Performance critical | Shallow | Large data, no nesting |
| Complete independence | Deep | Complex structures |
# Shallow: fast, less memory
data = [1, 2, 3]
backup = data.copy()
# Deep: slow, more memory, but safe
nested = [[1, 2], {'a': [3, 4]}]
backup = copy.deepcopy(nested)
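Classes can customize both operations by implementing `__copy__` and `__deepcopy__`, the hook names the `copy` module looks for. A minimal sketch (the `Box` class is illustrative):

```python
import copy

class Box:
    def __init__(self, items):
        self.items = items

    def __copy__(self):
        # Shallow: new Box, same items list
        return Box(self.items)

    def __deepcopy__(self, memo):
        # Deep: recursively copy items; passing memo lets the copy
        # module handle shared and recursive references correctly
        return Box(copy.deepcopy(self.items, memo))

b = Box([1, [2, 3]])
shallow = copy.copy(b)
deep = copy.deepcopy(b)
print(shallow.items is b.items)  # True
print(deep.items is b.items)     # False
```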
Memory Views¶
Memory views provide zero-copy access to buffer data.
Creating Memory Views¶
data = bytearray(b'Hello World')
view = memoryview(data)
# Access without copying
print(view[0]) # 72 (ASCII for 'H')
print(bytes(view[0:5])) # b'Hello'
Zero-Copy Slicing¶
data = bytearray(range(100))
view = memoryview(data)
# Slice creates a view, not a copy
slice_view = view[10:20]
print(bytes(slice_view)) # b'\n\x0b\x0c...'
# Verify no copy
slice_view[0] = 255
print(data[10]) # 255 (original modified)
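One consequence of zero-copy access: while a view is exported, the underlying `bytearray` cannot be resized. CPython raises `BufferError` until the view is released:

```python
data = bytearray(b'Hello')
view = memoryview(data)

try:
    data.append(33)  # resize attempt while a view is exported
except BufferError as e:
    print('resize blocked:', e)

view.release()       # explicitly release the buffer
data.append(33)      # now resizing works again (33 is '!')
print(data)          # bytearray(b'Hello!')
```

Memory views also support the `with` statement, which calls `release()` automatically on exit.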
Use Cases¶
Large Data Processing:
import array
arr = array.array('i', range(1000000))
view = memoryview(arr)
# Process in chunks without copying
for i in range(0, len(view), 1000):
    chunk = view[i:i+1000]
    process(chunk)  # process() is a placeholder for your own handler
Network Buffers:
buffer = bytearray(4096)
view = memoryview(buffer)
# Receive directly into the buffer (sock is an already-connected socket object)
bytes_received = sock.recv_into(view)
# Process without copying
data = view[:bytes_received]
In-Place Modification:
data = bytearray(b'Hello')
view = memoryview(data)
view[0] = ord('J')
print(data) # bytearray(b'Jello')
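Memory views also combine well with `struct.unpack_from()` for parsing binary records without slicing out copies. A sketch with a made-up 6-byte record layout (a 16-bit id and a 32-bit value, little-endian):

```python
import struct

# Hypothetical record layout: uint16 id, uint32 value (little-endian)
packet = bytearray(struct.pack('<HI', 7, 1000))
view = memoryview(packet)

# unpack_from reads directly from the buffer at the given offset
record_id, value = struct.unpack_from('<HI', view, 0)
print(record_id, value)  # 7 1000
```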
Memory View Properties¶
view = memoryview(bytearray(100))
print(view.nbytes) # Total bytes
print(view.readonly) # False for bytearray
print(view.format) # 'B' (unsigned char)
print(view.itemsize) # 1 byte per item
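`memoryview.cast()` reinterprets the same bytes under a different format, still without copying. For example, viewing eight bytes as two 32-bit unsigned integers (`'='` means native byte order, no padding):

```python
import struct

data = bytearray(struct.pack('=II', 1, 256))  # 8 bytes
view = memoryview(data).cast('I')             # reinterpret as unsigned ints

print(len(view))   # 2
print(view[0])     # 1
print(view[1])     # 256

view[0] = 42       # writes through to the underlying bytearray
print(struct.unpack('=II', data))  # (42, 256)
```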
__slots__¶
__slots__ reduces memory usage by eliminating the instance __dict__.
Without Slots (Default)¶
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(10, 20)
print(p.__dict__)  # {'x': 10, 'y': 20}
With Slots¶
class Point:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(10, 20)
# p.__dict__  # AttributeError: no __dict__
print(p.x, p.y)  # 10 20
Memory Comparison¶
import sys

class WithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class WithSlots:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

d = WithDict(1, 2)
s = WithSlots(1, 2)
print(sys.getsizeof(d))  # ~48 bytes (plus a separate __dict__, ~104 bytes)
print(sys.getsizeof(s))  # ~48 bytes (no dict overhead)
Restrictions¶
No Dynamic Attributes:
class Point:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(10, 20)
# p.z = 30  # AttributeError: 'Point' object has no attribute 'z'
Inheritance Considerations:
class Point:
    __slots__ = ['x', 'y']

class Point3D(Point):
    __slots__ = ['z']  # Only declare the new slots; x and y come from Point

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
Enabling Weak References:
class Point:
    __slots__ = ['x', 'y', '__weakref__']  # Add __weakref__ to allow weak references
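Without that slot, attempts to create a weak reference raise `TypeError`. A sketch of both cases:

```python
import weakref

class NoWeak:
    __slots__ = ['x']

class WithWeak:
    __slots__ = ['x', '__weakref__']

try:
    weakref.ref(NoWeak())
except TypeError as e:
    print('no __weakref__ slot:', e)

obj = WithWeak()
r = weakref.ref(obj)  # works: __weakref__ slot present
print(r() is obj)     # True
```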
When to Use __slots__¶
# Good: millions of instances
class Particle:
    __slots__ = ['x', 'y', 'vx', 'vy', 'mass']

particles = [Particle() for _ in range(1000000)]
# Saves significant memory

# Not needed: few instances
class Config:
    def __init__(self):
        self.debug = False
        self.verbose = True

# Only one instance; __slots__ unnecessary
Summary¶
| Technique | Purpose | Use When |
|---|---|---|
| Shallow copy | Quick duplicate | Flat structures |
| Deep copy | Full independence | Nested mutables |
| Memory views | Zero-copy access | Large buffers |
| __slots__ | Reduce memory | Many instances |
Key points:
- Understand shallow vs deep copy to avoid aliasing bugs
- Use memory views for efficient buffer manipulation
- Use __slots__ when creating many instances of a class
- __slots__ prevents dynamic attribute addition
- Always profile before optimizing
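Following the last point, the standard-library `tracemalloc` module can measure the actual allocation difference before you commit to `__slots__`. A sketch (the exact numbers vary by Python version and platform, so none are shown):

```python
import tracemalloc

class WithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class WithSlots:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

def measure(cls, n=100_000):
    # Trace allocations made while building n instances
    tracemalloc.start()
    objs = [cls(i, i) for i in range(n)]
    size, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del objs
    return size

print(f"dict:  {measure(WithDict):,} bytes")
print(f"slots: {measure(WithSlots):,} bytes")
```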
Runnable Example: list_internals.py¶
"""
01_list_internals.py
TOPIC: Python List Implementation Internals
LEVEL: Advanced
DURATION: 60-75 minutes
PREREQUISITES:
- Topic #24: Memory Deep Dive (mutable types, references, memory management)
LEARNING OBJECTIVES:
1. Understand list memory layout
2. Learn about over-allocation strategy
3. Explore capacity vs length
4. Understand why lists are fast for certain operations
This is ADVANCED implementation details - builds on memory concepts from #24
"""
import sys
if __name__ == "__main__":
    print("=" * 70)
    print("PYTHON LIST INTERNALS")
    print("=" * 70)

    # ========================================================================
    # SECTION 1: List Memory Layout
    # ========================================================================
    print("\nSECTION 1: How Lists Are Structured")
    print("-" * 70)
    print("""
PYTHON LIST STRUCTURE (CPython):

┌─────────────────────────────────────────┐
│ PyListObject                            │
├─────────────────────────────────────────┤
│ ob_refcnt: reference count              │
│ ob_type: pointer to list type           │
│ ob_size: number of elements (LENGTH)    │ ← What len() returns
│ allocated: capacity (slots)             │ ← Total allocated space
│ ob_item: pointer to array of pointers   │ ← Points to actual data
└─────────────────────────────────────────┘
          │
          ↓
┌──────────────────────┐
│ Array of Pointers    │
├──────────────────────┤
│ [0] → Object 1       │
│ [1] → Object 2       │
│ [2] → Object 3       │
│ [3] → Object 4       │
│ [4] → (unused)       │ ← OVER-ALLOCATION
│ [5] → (unused)       │ ← Extra capacity
└──────────────────────┘

KEY INSIGHT:
- Lists store POINTERS to objects, not the objects themselves
- ob_size (length) ≤ allocated (capacity)
- Extra capacity allows fast appends!
""")

    # ========================================================================
    # SECTION 2: Length vs Capacity
    # ========================================================================
    print("\nSECTION 2: Length vs Capacity")
    print("-" * 70)

    # Create an empty list
    my_list = []
    print(f"Empty list: {my_list}")
    print(f"  len() = {len(my_list)}")  # ob_size
    print(f"  sys.getsizeof() = {sys.getsizeof(my_list)} bytes")  # includes allocated capacity

    # Add one element
    my_list.append(1)
    print(f"\nAfter append(1): {my_list}")
    print(f"  len() = {len(my_list)}")
    print(f"  sys.getsizeof() = {sys.getsizeof(my_list)} bytes")

    # Add more elements, watching for capacity jumps.
    # Baseline is the *current* size: comparing against an empty
    # list's size would report a spurious resize on the first pass.
    sizes = [sys.getsizeof(my_list)]
    for i in range(2, 20):
        my_list.append(i)
        size = sys.getsizeof(my_list)
        if size > sizes[-1]:
            print(f"\nAfter append({i}): len={len(my_list)}, size={size} bytes (RESIZED!)")
            sizes.append(size)
        else:
            print(f"After append({i}): len={len(my_list)}, size={size} bytes")

    print("""
OBSERVATIONS:
- Size doesn't increase with EVERY append
- Size increases in JUMPS (when capacity is exceeded)
- This is the OVER-ALLOCATION strategy
- Trade-off: waste some memory for faster appends
""")
    # ========================================================================
    # SECTION 3: Over-Allocation Strategy
    # ========================================================================
    print("\n" + "=" * 70)
    print("SECTION 3: Why Over-Allocation?")
    print("-" * 70)
    print("""
WITHOUT OVER-ALLOCATION (naive approach):
─────────────────────────────────────────
lst = []        # Allocate 0 slots
lst.append(1)   # Allocate 1 slot,  copy 0 items
lst.append(2)   # Allocate 2 slots, copy 1 item
lst.append(3)   # Allocate 3 slots, copy 2 items
lst.append(4)   # Allocate 4 slots, copy 3 items
...
lst.append(n)   # Allocate n slots, copy n-1 items

Total copies: 0 + 1 + 2 + ... + (n-1) = O(n²)  ← TERRIBLE!

WITH OVER-ALLOCATION (Python's approach):
─────────────────────────────────────────
lst = []        # Allocate 0 slots
lst.append(1)   # Allocate 4 slots (over-allocate!)
lst.append(2)   # Use existing capacity
lst.append(3)   # Use existing capacity
lst.append(4)   # Use existing capacity
lst.append(5)   # Allocate 8 slots, copy 4 items
lst.append(6)   # Use existing capacity
...

Total copies: far fewer = O(n) overall  ← MUCH BETTER!

GROWTH PATTERN (approximately):
0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...

Formula (approximately): new_capacity = old_capacity + (old_capacity >> 3) + 6
Which is roughly:        new_capacity ≈ 1.125 * old_capacity + 6
""")

    # ========================================================================
    # SECTION 4: Performance Implications
    # ========================================================================
    print("\nSECTION 4: Performance Analysis")
    print("-" * 70)
    print("""
OPERATION PERFORMANCE:

FAST (O(1) amortized):
✓ append() - usually fits in existing capacity
✓ Access by index: lst[i]
✓ Modify by index: lst[i] = x
✓ len(lst)
✓ pop() - remove last element

SLOW (O(n)):
✗ insert(0, x) - shifts all elements right
✗ pop(0) - shifts all elements left
✗ remove(x) - search, then shift
✗ Concatenation: lst1 + lst2

WHY append() is O(1) amortized:
- Most appends use existing capacity: O(1)
- Occasional resize: O(n)
- But resizes become increasingly rare
- Averaged over many operations: O(1)
""")

    # Demonstrate fast appends
    import time

    n = 100000
    start = time.perf_counter()  # perf_counter() is the appropriate clock for timing
    test_list = []
    for i in range(n):
        test_list.append(i)
    append_time = time.perf_counter() - start

    print(f"\nAppending {n} elements: {append_time:.4f} seconds")
    print(f"Average per append: {append_time/n*1000000:.2f} microseconds")
    # ========================================================================
    # SECTION 5: Memory Trade-offs
    # ========================================================================
    print("\n" + "=" * 70)
    print("SECTION 5: Memory Trade-offs")
    print("-" * 70)

    # Show the wasted space. Grow the list by appending: building it with
    # list(range(100)) would allocate exactly and hide the spare capacity.
    test_list = []
    for i in range(100):
        test_list.append(i)
    actual_size = sys.getsizeof(test_list)
    ptr_size = 8  # pointer size on a 64-bit system
    base_size = sys.getsizeof([])
    used_size = base_size + (100 * ptr_size)

    print("List with 100 elements:")
    print(f"  Total memory: {actual_size} bytes")
    print(f"  Base overhead: {base_size} bytes")
    print(f"  Used for data: {100 * ptr_size} bytes (100 pointers × 8 bytes)")
    print(f"  Estimated waste: {actual_size - used_size} bytes"
          f" (~{(actual_size - used_size) / actual_size * 100:.1f}%)")

    print("""
TRADE-OFF:
- Waste ~10-30% of memory (varies with size)
- Gain O(1) amortized append performance
- For most applications: speed > memory

WHEN TO CARE:
- Creating MANY small lists → use tuples if immutable
- Memory-constrained environments
- Lists that rarely grow after creation
""")
    # ========================================================================
    # SECTION 6: Practical Implications
    # ========================================================================
    print("\nSECTION 6: Practical Programming Implications")
    print("-" * 70)
    print("""
BEST PRACTICES:

1. Pre-allocate if you know the size:
   ✓ lst = [None] * 1000              # Better than 1000 appends

2. Use list comprehensions:
   ✓ [x**2 for x in range(1000)]      # Optimized

3. Avoid repeated insertions at the beginning:
   ✗ for x in data: lst.insert(0, x)  # O(n²)
   ✓ Use collections.deque for queue operations

4. Don't pre-allocate unnecessarily:
   ✗ lst = [None] * 1000000           # Wastes memory if not all used

5. Concatenation alternatives:
   ✗ result = []
     for lst in many_lists:
         result = result + lst        # Creates a new list each time!
   ✓ result = []
     for lst in many_lists:
         result.extend(lst)           # Modifies in place
""")

    # ========================================================================
    # SECTION 7: Comparison to Other Data Structures
    # ========================================================================
    print("\nSECTION 7: When NOT to Use Lists")
    print("-" * 70)
    print("""
USE THESE INSTEAD:

collections.deque:
- Fast O(1) operations at both ends
- Use for queues, stacks
- Slower random access

array.array:
- Compact storage for numeric data
- No over-allocation waste
- Only homogeneous types

numpy.ndarray:
- Multi-dimensional arrays
- Vectorized operations
- Scientific computing

tuple:
- Immutable, no over-allocation
- Slightly faster, less memory
- Use when the size is fixed
""")

    # ========================================================================
    # SECTION 8: Key Takeaways
    # ========================================================================
    print("\nKEY TAKEAWAYS")
    print("-" * 70)
    print("""
1. Lists use OVER-ALLOCATION for performance
2. Length (ob_size) ≤ Capacity (allocated)
3. append() is O(1) amortized due to over-allocation
4. Occasional resize operations are O(n)
5. Growth pattern: ~1.125x + constant
6. Trades memory (10-30% waste) for speed
7. insert(0) and pop(0) are O(n) - avoid them for queues
8. Pre-allocate if the size is known
9. Use deque for queue operations
10. Understanding internals helps choose the right data structure

CONNECTS TO #24 MEMORY CONCEPTS:
- Mutable type behavior
- Reference storage
- Memory allocation patterns
- Performance vs memory trade-offs
""")

    print("\nSee exercises.py for practice!")