Memory Leaks¶
메모리 누수의 원인, 탐지, 예방 방법입니다.
Mental Model
A memory leak is like filling a bathtub with the drain plugged -- memory keeps rising because objects stay reachable even though your program no longer needs them. The usual culprits are global caches that grow forever, circular references with custom __del__ methods, and closures that accidentally capture large objects.
Common Causes¶
1. Global References¶
```python
Bad: accumulates forever¶
cache = []
def process(data): cache.append(data) # Never cleared!
Better: limit size¶
from collections import deque
cache = deque(maxlen=100) ```
2. Circular References¶
```python class Node: def init(self): self.ref = None
a = Node() b = Node() a.ref = b b.ref = a
Cycle keeps both alive¶
```
3. Closures Capturing Large Objects¶
```python
Leak: large object captured¶
def process_data(): large = [0] * 1000000
def get_first():
return large[0] # Captures entire list
return get_first # Keeps large alive
Better: capture only what's needed¶
def process_data(): large = [0] * 1000000 first = large[0]
def get_first():
return first # Only captures the value
return get_first
```
4. Event Handlers Not Removed¶
```python
Leak: handler keeps object alive¶
class Widget: def init(self, event_system): event_system.subscribe('click', self.handle)
def handle(self, event):
pass
# Widget never collected until unsubscribed
```
Finalizers (__del__)¶
__del__은 신뢰할 수 없고 문제를 일으킬 수 있습니다.
Problems with __del__¶
```python class Resource: def del(self): self.cleanup()
May not be called if:¶
- Program exits abruptly¶
- Circular references exist¶
- Exceptions in del¶
```
Object Resurrection¶
```python saved = None
class Item: def del(self): global saved saved = self # Resurrect!
obj = Item() del obj print(saved) # Object still exists! ```
Best Practices: Avoid __del__¶
```python
Instead of del, use explicit cleanup¶
class Resource: def close(self): # Explicit cleanup pass
r = Resource() try: use(r) finally: r.close() ```
Better: Context Managers¶
```python class Resource: def enter(self): return self
def __exit__(self, *args):
self.cleanup()
Guaranteed cleanup¶
with Resource() as r: use(r) ```
Or: weakref Callbacks¶
```python import weakref
def cleanup(ref): print("Object collected")
obj = SomeObject() ref = weakref.ref(obj, cleanup)
cleanup called on GC (more reliable than del)¶
```
Detection¶
1. Track Memory Growth¶
```python import tracemalloc
tracemalloc.start()
for i in range(100): process()
if i % 10 == 0:
current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 10**6:.1f} MB")
tracemalloc.stop() ```
2. Monitor Object Count¶
```python import gc
before = len(gc.get_objects())
process()
after = len(gc.get_objects()) print(f"Objects created: {after - before}") ```
3. Find Top Allocations¶
```python import tracemalloc
tracemalloc.start()
Run code¶
process()
snapshot = tracemalloc.take_snapshot() top_stats = snapshot.statistics('lineno')
print("Top 10 allocations:") for stat in top_stats[:10]: print(stat) ```
GC Performance¶
Collection Pauses¶
```python import gc import time
start = time.time() collected = gc.collect() pause = time.time() - start
print(f"Pause: {pause*1000:.2f}ms") print(f"Collected: {collected}") ```
Disable During Critical Sections¶
```python import gc
gc.disable() try: # Performance-critical code compute() finally: gc.enable() gc.collect() ```
Manual Collection Control¶
```python import gc
gc.disable()
for i in range(1000): process()
if i % 100 == 0:
gc.collect()
gc.enable() ```
GC Tuning¶
Adjust Thresholds¶
```python import gc
Default: (700, 10, 10)¶
Fewer collections, longer pauses:¶
gc.set_threshold(10000, 100, 100)
More collections, shorter pauses:¶
gc.set_threshold(100, 5, 5) ```
Reduce Allocations¶
```python
Bad: many allocations¶
result = [] for i in range(1000): result.append([i])
Better: pre-allocate¶
result = [None] * 1000 for i in range(1000): result[i] = i ```
Monitor GC Impact¶
```python import gc
Before¶
before = gc.get_count()
process()
After¶
after = gc.get_count() print(f"Gen0 delta: {after[0] - before[0]}") ```
Prevention Checklist¶
| Strategy | Implementation |
|---|---|
| Limit caches | functools.lru_cache(maxsize=128) |
| Break cycles | obj.parent = None |
| Weak references | weakref.WeakValueDictionary() |
| Context managers | with open_resource() as r: |
| Explicit cleanup | try/finally with .close() |
Avoid __del__ |
Use context managers instead |
Summary¶
- Causes: globals, cycles, closures, event handlers
- Detection:
tracemalloc,gc.get_objects() - Prevention: limit caches, break cycles, use weak refs
- Finalizers: avoid
__del__, use context managers - Performance: tune thresholds, control collection timing
Exercises¶
Exercise 1.
Create a class EventSystem that stores callbacks in a plain list (creating a memory leak because callbacks hold references to large objects). Demonstrate the leak by registering 1,000 callbacks that each capture a 10,000-element list. Measure memory with tracemalloc, then fix the leak using weakref.WeakMethod or weakref.ref and show the memory improvement.
Solution to Exercise 1
```python
import tracemalloc
import weakref
import gc
class EventSystem:
def __init__(self):
self.callbacks = []
def register(self, callback):
self.callbacks.append(callback)
class EventSystemWeak:
def __init__(self):
self.callbacks = []
def register(self, callback):
self.callbacks.append(weakref.ref(callback))
# Leaky version
tracemalloc.start()
es = EventSystem()
holders = []
for i in range(1_000):
data = list(range(10_000))
def cb(d=data):
return d[0]
es.register(cb)
holders.append(cb)
_, peak_leak = tracemalloc.get_traced_memory()
tracemalloc.stop()
# Fixed version using bounded approach
tracemalloc.start()
es2 = EventSystemWeak()
for i in range(1_000):
data = list(range(10_000))
first = data[0] # capture only what's needed
def cb2(v=first):
return v
es2.register(cb2)
_, peak_fixed = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"Leaky peak: {peak_leak / 1024 / 1024:.1f} MB")
print(f"Fixed peak: {peak_fixed / 1024 / 1024:.1f} MB")
```
Exercise 2.
Write a BoundedCache class that wraps a dictionary and limits it to a maximum of 100 entries using a FIFO eviction policy (use collections.OrderedDict). Demonstrate that unlike an unbounded dictionary, the cache's memory stays constant after inserting 10,000 items. Use sys.getsizeof() to compare sizes.
Solution to Exercise 2
```python
import sys
from collections import OrderedDict
class BoundedCache:
def __init__(self, maxsize=100):
self.maxsize = maxsize
self.cache = OrderedDict()
def put(self, key, value):
if key in self.cache:
self.cache.move_to_end(key)
self.cache[key] = value
if len(self.cache) > self.maxsize:
self.cache.popitem(last=False)
def get(self, key):
return self.cache.get(key)
# Unbounded dict
unbounded = {}
for i in range(10_000):
unbounded[i] = f"value_{i}" * 10
# Bounded cache
bounded = BoundedCache(maxsize=100)
for i in range(10_000):
bounded.put(i, f"value_{i}" * 10)
print(f"Unbounded dict size: {sys.getsizeof(unbounded):,} bytes, "
f"entries: {len(unbounded)}")
print(f"Bounded cache size: {sys.getsizeof(bounded.cache):,} bytes, "
f"entries: {len(bounded.cache)}")
```
Exercise 3.
Write a function that intentionally creates a closure-based memory leak: an inner function captures a large list but only uses its first element. Use tracemalloc to show the leak, then refactor to capture only the needed value. Compare peak memory before and after the fix.
Solution to Exercise 3
```python
import tracemalloc
# Leaky version: closure captures entire large list
def make_getter_leaky():
large = list(range(1_000_000))
def getter():
return large[0]
return getter
# Fixed version: capture only the needed value
def make_getter_fixed():
large = list(range(1_000_000))
first = large[0]
def getter():
return first
return getter
tracemalloc.start()
getters_leaky = [make_getter_leaky() for _ in range(10)]
_, peak_leaky = tracemalloc.get_traced_memory()
tracemalloc.stop()
tracemalloc.start()
getters_fixed = [make_getter_fixed() for _ in range(10)]
_, peak_fixed = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"Leaky closures peak: {peak_leaky / 1024 / 1024:.1f} MB")
print(f"Fixed closures peak: {peak_fixed / 1024 / 1024:.1f} MB")
```