memory_profiler

The memory_profiler package tracks a process's memory usage line by line, helping you identify memory leaks and lines that allocate more memory than expected.

Mental Model

memory_profiler does for memory what line_profiler does for time: it shows the memory increment of each line. Decorate a function with @profile, run the script, and you get a line-by-line ledger of memory consumption. Lines with large positive increments are your biggest memory consumers.
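To make the ledger idea concrete, here is a minimal sketch that records the memory increment after each step of a small pipeline. It uses the standard library's tracemalloc rather than memory_profiler, so it counts only Python-level allocations instead of total process memory. The helper name memory_ledger and the step labels are illustrative assumptions, not part of either library.

```python
import tracemalloc

def memory_ledger(steps):
    """Run (label, callable) steps in order.

    Returns rows of (label, total_bytes, increment_bytes).
    memory_ledger is a hypothetical helper, not a library function.
    """
    rows = []
    results = []  # keep each result alive so its allocation stays on the ledger
    tracemalloc.start()
    previous, _ = tracemalloc.get_traced_memory()
    for label, step in steps:
        results.append(step())
        current, _ = tracemalloc.get_traced_memory()
        rows.append((label, current, current - previous))
        previous = current
    tracemalloc.stop()
    return rows

rows = memory_ledger([
    ("build list", lambda: [i ** 2 for i in range(100_000)]),
    ("small tuple", lambda: (1, 2, 3)),
])
for label, total, increment in rows:
    print(f"{label:<12} total={total / 1024:8.1f} KiB  increment={increment / 1024:8.1f} KiB")
```

The first row should show a large increment and the second a negligible one, which is the same pattern to look for in memory_profiler's Increment column.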

Installation

```bash
pip install memory_profiler
```

Basic Usage

```python
# memory_example.py
from memory_profiler import profile

@profile
def create_list():
    large_list = [i ** 2 for i in range(100000)]
    filtered = [x for x in large_list if x % 2 == 0]
    return filtered

if __name__ == "__main__":
    result = create_list()
```

Run with: python -m memory_profiler memory_example.py

Output Format

```
Filename: memory_example.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     3     38.4 MiB      0.0 MiB           1   @profile
     4                                         def create_list():
     5     42.8 MiB      4.4 MiB           1       large_list = [i ** 2 for i in range(100000)]
     6     43.2 MiB      0.4 MiB           1       filtered = [x for x in large_list if x % 2 == 0]
     7                                             return filtered
```

Programmatic Memory Profiling

```python
from memory_profiler import profile

def process_arrays():
    # This will be tracked
    arr1 = list(range(1000000))
    arr2 = [x ** 2 for x in arr1]
    del arr1  # Memory freed
    return arr2

# Get memory without decorator: wrap the function at call time
profile(process_arrays)()
```

Memory Optimization Patterns

```python
# Inefficient: creates multiple intermediate lists
def inefficient():
    data = [i for i in range(100000)]
    filtered = [x for x in data if x > 50000]
    squared = [x ** 2 for x in filtered]
    return squared

# Efficient: single pass generator
def efficient():
    return (x ** 2 for x in range(100000) if x > 50000)
```
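A quick way to see why the generator version is cheaper is to compare the size of the container objects themselves. The sketch below uses sys.getsizeof, which reports only the size of the list or generator object, not the integers it refers to. The variable names are illustrative.

```python
import sys

squares_list = [x ** 2 for x in range(100_000) if x > 50_000]
squares_gen = (x ** 2 for x in range(100_000) if x > 50_000)

list_bytes = sys.getsizeof(squares_list)  # grows with the number of elements
gen_bytes = sys.getsizeof(squares_gen)    # small, independent of the range size

print(f"list object:      {list_bytes:,} bytes")
print(f"generator object: {gen_bytes:,} bytes")

# The generator yields the same values, but only one at a time.
assert sum(squares_gen) == sum(squares_list)
```

The trade-off is that a generator can be consumed only once. If the caller needs to index the result or iterate over it repeatedly, a list is still the right choice.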

Practical Tips

- Use generators instead of list comprehensions when a large dataset only needs to be iterated once
- Delete large objects explicitly when done: del large_obj
- Profile before and after each optimization to confirm the change helped
- Monitor peak memory usage, not just line-by-line increments
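The last tip matters because temporary objects can push memory up and release it before a function returns, so the final figure looks small. A minimal sketch of that effect follows, using the standard library's tracemalloc rather than memory_profiler. The function sum_of_squares is a hypothetical example.

```python
import tracemalloc

def sum_of_squares(n):
    temp = [i ** 2 for i in range(n)]  # large temporary list
    total = sum(temp)
    del temp                           # released before returning
    return total

tracemalloc.start()
result = sum_of_squares(200_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1024:.1f} KiB")
print(f"peak:    {peak / 1024:.1f} KiB")
```

Here the current figure is tiny because only the integer result survives, while the peak reflects the temporary list. A report that shows only memory at the end of the call would miss it.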

Exercises

Exercise 1. Write two versions of a function that reads a range of 1,000,000 integers and returns only the even squares: one using list comprehensions (creating intermediate lists) and one using a single generator expression. Use tracemalloc to compare peak memory for each approach.

Solution to Exercise 1

```python
import tracemalloc

def with_lists():
    nums = list(range(1_000_000))
    evens = [x for x in nums if x % 2 == 0]
    squares = [x ** 2 for x in evens]
    return squares

def with_generator():
    return list(
        x ** 2 for x in range(1_000_000) if x % 2 == 0
    )

tracemalloc.start()
with_lists()
_, peak_lists = tracemalloc.get_traced_memory()
tracemalloc.stop()

tracemalloc.start()
with_generator()
_, peak_gen = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"List approach peak:      {peak_lists / 1024 / 1024:.1f} MB")
print(f"Generator approach peak: {peak_gen / 1024 / 1024:.1f} MB")
```

Exercise 2. Write a function that builds a dictionary of 100,000 entries, then deletes it with del and calls gc.collect(). Use tracemalloc snapshots before creation, after creation, and after deletion to show the memory growth and reclamation.

Solution to Exercise 2

```python
import tracemalloc
import gc

tracemalloc.start()
snap1 = tracemalloc.take_snapshot()

d = {f"key_{i}": list(range(10)) for i in range(100_000)}
snap2 = tracemalloc.take_snapshot()

del d
gc.collect()
snap3 = tracemalloc.take_snapshot()

growth = snap2.compare_to(snap1, 'lineno')
reclaim = snap3.compare_to(snap2, 'lineno')

total_growth = sum(s.size_diff for s in growth if s.size_diff > 0)
total_freed = sum(s.size_diff for s in reclaim if s.size_diff < 0)

print(f"Growth:     +{total_growth / 1024 / 1024:.1f} MB")
print(f"Reclaimed:  {total_freed / 1024 / 1024:.1f} MB")
tracemalloc.stop()
```

Exercise 3. Create a function memory_report(func, *args) that wraps any callable with tracemalloc, runs it, and returns a dictionary with keys current_kb, peak_kb, and top_allocations (a list of the top 3 allocation lines). Test it with a function that creates a nested list of 1,000 sublists with 1,000 elements each.

Solution to Exercise 3

```python
import tracemalloc

def memory_report(func, *args):
    tracemalloc.start()
    result = func(*args)  # hold the result so its allocations are still live for the snapshot
    current, peak = tracemalloc.get_traced_memory()
    snapshot = tracemalloc.take_snapshot()
    top = snapshot.statistics('lineno')[:3]
    tracemalloc.stop()
    return {
        "current_kb": current / 1024,
        "peak_kb": peak / 1024,
        "top_allocations": [str(s) for s in top],
    }

def build_nested():
    return [[i * j for j in range(1_000)]
            for i in range(1_000)]

report = memory_report(build_nested)
print(f"Current: {report['current_kb']:.1f} KB")
print(f"Peak:    {report['peak_kb']:.1f} KB")
print("Top allocations:")
for entry in report["top_allocations"]:
    print(f"  {entry}")
```