
memory_profiler

The memory_profiler module reports a process's memory usage line by line, which helps locate memory leaks and code that allocates more than it needs.

Mental Model

memory_profiler does for memory what line_profiler does for time -- it shows you the memory increment of each line. Decorate a function with @profile, run the script, and you get a line-by-line ledger of memory consumption. Look for lines with large positive increments to find your biggest memory consumers.


Installation

```bash
pip install memory_profiler
```

Basic Usage

```python
# memory_example.py

from memory_profiler import profile

@profile
def create_list():
    large_list = [i ** 2 for i in range(100000)]
    filtered = [x for x in large_list if x % 2 == 0]
    return filtered

if __name__ == "__main__":
    result = create_list()
```

Run with: python -m memory_profiler memory_example.py

Output Format

```
Filename: memory_example.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     3     38.4 MiB      0.0 MiB            1   @profile
     4                                          def create_list():
     5     42.8 MiB      4.4 MiB            1       large_list = [i ** 2 for i in range(100000)]
     6     43.2 MiB      0.4 MiB            1       filtered = [x for x in large_list if x % 2 == 0]
     7                                              return filtered
```
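To read the table: "Mem usage" is the process total after a line has run, and "Increment" is the change relative to the previous reading. The short sketch below recomputes the Increment column from the Mem usage values in the sample output; the numbers are copied from that sample, not measured.

```python
# "Mem usage" readings (MiB) copied from the sample output:
# the baseline at the decorator line, then after lines 5 and 6.
mem_usage = [38.4, 42.8, 43.2]

# Each increment is the change relative to the previous reading.
increments = [
    round(after - before, 1)
    for before, after in zip(mem_usage, mem_usage[1:])
]

for line_no, inc in zip((5, 6), increments):
    print(f"line {line_no}: {inc:+.1f} MiB")
```

Lines with no reading (such as the def and return lines in the sample) simply have no entry in these columns.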

Programmatic Memory Profiling

```python
from memory_profiler import profile

def process_arrays():
    # Each line below is tracked once the function is wrapped
    arr1 = list(range(1000000))
    arr2 = [x ** 2 for x in arr1]
    del arr1  # Drops the reference so the list can be freed
    return arr2

# Profile without the decorator syntax: wrap the function, then call it
profile(process_arrays)()
```

Memory Optimization Patterns

```python
# Inefficient: creates multiple intermediate lists
def inefficient():
    data = [i for i in range(100000)]
    filtered = [x for x in data if x > 50000]
    squared = [x ** 2 for x in filtered]
    return squared

# Efficient: single-pass generator
def efficient():
    return (x ** 2 for x in range(100000) if x > 50000)
```
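To make the difference concrete, the sketch below compares the size of a list with the size of an equivalent generator object using sys.getsizeof. Note that getsizeof reports only the container itself, not the integers it refers to, so the real gap is larger than shown. The variable names are illustrative.

```python
import sys

squares_list = [x ** 2 for x in range(100_000) if x > 50_000]
squares_gen = (x ** 2 for x in range(100_000) if x > 50_000)

list_bytes = sys.getsizeof(squares_list)  # container only, not the ints it holds
gen_bytes = sys.getsizeof(squares_gen)    # small and independent of the range size

print(f"list container: {list_bytes:,} bytes")
print(f"generator:      {gen_bytes:,} bytes")

# Trade-off: a generator can be consumed only once.
first_total = sum(squares_gen)
second_total = sum(squares_gen)  # the generator is exhausted, so this is 0
```

The generator saves memory only while the caller streams through it; wrapping it in list() materializes every element again.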

Practical Tips

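  • Use generators instead of list comprehensions for large datasets that are iterated only once
  • Delete large objects explicitly when done (del large_obj); the memory is released only once no other references remain
  • Profile before and after each optimization to confirm it actually helped
  • Monitor peak memory usage, not just line-by-line increments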
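The last tip can be illustrated with the standard-library tracemalloc module, used here instead of memory_profiler so the sketch has no third-party dependency. The function name summarize is hypothetical. After the function returns, the current traced memory is small, yet the peak still records the large temporary list.

```python
import tracemalloc

def summarize():
    temp = [i * 2 for i in range(500_000)]  # large temporary list
    total = sum(temp)
    del temp  # drop the only reference before returning
    return total

tracemalloc.start()
total = summarize()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1024:.1f} KiB")
print(f"peak:    {peak / 1024:.1f} KiB")
```

A check that looked only at memory held after the call would miss the temporary allocation entirely, which is why the peak figure matters when sizing a machine or container.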

Exercises

Exercise 1. Write two versions of a function that reads a range of 1,000,000 integers and returns only the even squares: one using list comprehensions (creating intermediate lists) and one using a single generator expression. Use tracemalloc to compare peak memory for each approach.

Solution to Exercise 1
```python
import tracemalloc

def with_lists():
    nums = list(range(1_000_000))
    evens = [x for x in nums if x % 2 == 0]
    squares = [x ** 2 for x in evens]
    return squares

def with_generator():
    return list(
        x ** 2 for x in range(1_000_000) if x % 2 == 0
    )

tracemalloc.start()
with_lists()
_, peak_lists = tracemalloc.get_traced_memory()
tracemalloc.stop()

tracemalloc.start()
with_generator()
_, peak_gen = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"List approach peak:      {peak_lists / 1024 / 1024:.1f} MB")
print(f"Generator approach peak: {peak_gen / 1024 / 1024:.1f} MB")
```

Exercise 2. Write a function that builds a dictionary of 100,000 entries, then deletes it with del and calls gc.collect(). Use tracemalloc snapshots before creation, after creation, and after deletion to show the memory growth and reclamation.

Solution to Exercise 2
```python
import tracemalloc
import gc

tracemalloc.start()
snap1 = tracemalloc.take_snapshot()

d = {f"key_{i}": list(range(10)) for i in range(100_000)}
snap2 = tracemalloc.take_snapshot()

del d
gc.collect()
snap3 = tracemalloc.take_snapshot()

growth = snap2.compare_to(snap1, 'lineno')
reclaim = snap3.compare_to(snap2, 'lineno')

total_growth = sum(s.size_diff for s in growth if s.size_diff > 0)
total_freed = sum(s.size_diff for s in reclaim if s.size_diff < 0)

print(f"Growth:     +{total_growth / 1024 / 1024:.1f} MB")
print(f"Reclaimed:  {total_freed / 1024 / 1024:.1f} MB")
tracemalloc.stop()
```

Exercise 3. Create a function memory_report(func, *args) that wraps any callable with tracemalloc, runs it, and returns a dictionary with keys current_kb, peak_kb, and top_allocations (a list of the top 3 allocation lines). Test it with a function that creates a nested list of 1,000 sublists with 1,000 elements each.

Solution to Exercise 3
```python
import tracemalloc

def memory_report(func, *args):
    tracemalloc.start()
    # Keep a reference so the allocations are still live when the snapshot is taken
    result = func(*args)
    current, peak = tracemalloc.get_traced_memory()
    snapshot = tracemalloc.take_snapshot()
    top = snapshot.statistics('lineno')[:3]
    tracemalloc.stop()
    return {
        "current_kb": current / 1024,
        "peak_kb": peak / 1024,
        "top_allocations": [str(s) for s in top],
    }

def build_nested():
    return [[i * j for j in range(1_000)]
            for i in range(1_000)]

report = memory_report(build_nested)
print(f"Current: {report['current_kb']:.1f} KB")
print(f"Peak:    {report['peak_kb']:.1f} KB")
print("Top allocations:")
for entry in report["top_allocations"]:
    print(f"  {entry}")
```