Skip to content

Profiling Tools

Empirical performance analysis guides optimization efforts.

IPython %timeit

Quick timing in interactive environments.

1. Basic Usage

%timeit arr ** 2

Output:

1.23 µs ± 45.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

2. Multi-line Timing

%%timeit
result = np.zeros(1000)
for i in range(1000):
    result[i] = i ** 2

3. Control Runs

%timeit -n 100 -r 5 arr ** 2  # 100 loops, 5 runs

timeit Module

Scripted benchmarks outside IPython.

1. Basic Script

import timeit
import numpy as np

def main():
    setup = "import numpy as np; arr = np.random.randn(10000)"
    stmt = "arr ** 2"

    time = timeit.timeit(stmt, setup, number=1000)
    print(f"Total time: {time:.4f} sec")
    print(f"Per call:   {time/1000*1000:.4f} ms")

if __name__ == "__main__":
    main()

2. Compare Functions

import timeit
import numpy as np

def method_loop(arr):
    result = np.empty_like(arr)
    for i in range(len(arr)):
        result[i] = arr[i] ** 2
    return result

def method_vectorized(arr):
    return arr ** 2

def main():
    arr = np.random.randn(10000)

    loop_time = timeit.timeit(
        lambda: method_loop(arr), number=100
    )
    vec_time = timeit.timeit(
        lambda: method_vectorized(arr), number=100
    )

    print(f"Loop time:       {loop_time:.4f} sec")
    print(f"Vectorized time: {vec_time:.4f} sec")
    print(f"Speedup:         {loop_time/vec_time:.0f}x")

if __name__ == "__main__":
    main()

time.perf_counter

Manual timing with high precision.

1. Basic Pattern

import time
import numpy as np

def main():
    arr = np.random.randn(1_000_000)

    start = time.perf_counter()
    result = arr ** 2
    elapsed = time.perf_counter() - start

    print(f"Elapsed: {elapsed:.6f} sec")

if __name__ == "__main__":
    main()

2. Multiple Runs

import time
import numpy as np

def main():
    arr = np.random.randn(1_000_000)
    times = []

    for _ in range(10):
        start = time.perf_counter()
        result = arr ** 2
        times.append(time.perf_counter() - start)

    print(f"Mean: {np.mean(times):.6f} sec")
    print(f"Std:  {np.std(times):.6f} sec")

if __name__ == "__main__":
    main()

cProfile Module

Function-level profiling for entire programs.

1. Command Line

python -m cProfile -s cumtime my_script.py

2. In Script

import cProfile
import numpy as np

def my_function():
    arr = np.random.randn(100000)
    for _ in range(100):
        result = arr ** 2

cProfile.run('my_function()')

3. Output Example

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.050    0.001    0.050    0.001 {method 'random' ...}
      100    0.030    0.000    0.030    0.000 {built-in method numpy...}

line_profiler

Line-by-line timing for detailed analysis.

1. Installation

pip install line_profiler

2. Decorator Usage

@profile
def my_function():
    arr = np.random.randn(100000)  # Line 1
    result = arr ** 2               # Line 2
    total = np.sum(result)          # Line 3
    return total

3. Run Command

kernprof -l -v my_script.py

memory_profiler

Track memory usage during execution.

1. Installation

pip install memory_profiler

2. Usage

from memory_profiler import profile

@profile
def my_function():
    arr = np.random.randn(1_000_000)
    result = arr ** 2
    return result

3. Output Example

Line #    Mem usage    Increment  Line Contents
     3     50.0 MiB     50.0 MiB   arr = np.random.randn(1_000_000)
     4     57.6 MiB      7.6 MiB   result = arr ** 2

Best Practices

Guidelines for effective profiling.

1. Profile First

Identify bottlenecks before optimizing.

2. Representative Data

Use realistic data sizes for meaningful results.

3. Multiple Runs

Average over multiple runs to reduce variance.