Skip to content

Performance Traps

pandas is built for vectorized operations. Using Python loops or incorrect patterns can make code orders of magnitude slower.

The Golden Rule

Vectorized operations > apply() > itertuples() > iterrows() > for loops

Trap 1: Row-by-Row Iteration

The Problem

import pandas as pd
import numpy as np
import time

# Create test data
n = 100_000
df = pd.DataFrame({
    'A': np.random.randn(n),
    'B': np.random.randn(n)
})

# SLOW: Using iterrows
def slow_sum():
    result = []
    for idx, row in df.iterrows():
        result.append(row['A'] + row['B'])
    return pd.Series(result)

start = time.time()
slow_result = slow_sum()
print(f"iterrows: {time.time() - start:.2f}s")

The Solution: Vectorize

# FAST: Vectorized
start = time.time()
fast_result = df['A'] + df['B']
print(f"Vectorized: {time.time() - start:.4f}s")

Speedup: 100-1000x faster

Trap 2: Growing DataFrames in a Loop

The Problem

# SLOW: Appending in a loop
df = pd.DataFrame(columns=['A', 'B'])

for i in range(10000):
    df = pd.concat([df, pd.DataFrame({'A': [i], 'B': [i*2]})])
    # Each concat creates a new DataFrame!

The Solution: Build List, Then DataFrame

# FAST: Collect data, then create DataFrame
data = []
for i in range(10000):
    data.append({'A': i, 'B': i*2})

df = pd.DataFrame(data)

Speedup: 100x+ faster

Trap 3: Using apply() When Vectorized Option Exists

The Problem

# SLOW: Using apply
df['C'] = df['A'].apply(lambda x: x ** 2)

# SLOW: Row-wise apply
df['D'] = df.apply(lambda row: row['A'] + row['B'], axis=1)

The Solution: Use Built-in Vectorized Operations

# FAST: Vectorized
df['C'] = df['A'] ** 2

# FAST: Vectorized
df['D'] = df['A'] + df['B']

Speedup: 10-100x faster

Trap 4: String Operations in Loops

The Problem

df = pd.DataFrame({'text': ['hello', 'world', 'python'] * 10000})

# SLOW: Loop
result = []
for text in df['text']:
    result.append(text.upper())
df['upper'] = result

The Solution: Use str Accessor

# FAST: Vectorized string operation
df['upper'] = df['text'].str.upper()

Trap 5: Conditional Assignment in Loops

The Problem

df = pd.DataFrame({'value': np.random.randn(100000)})

# SLOW: Loop with condition
for idx in df.index:
    if df.loc[idx, 'value'] > 0:
        df.loc[idx, 'category'] = 'positive'
    else:
        df.loc[idx, 'category'] = 'negative'

The Solution: Vectorized Conditional

# FAST: np.where
df['category'] = np.where(df['value'] > 0, 'positive', 'negative')

# FAST: .loc with boolean indexing
df.loc[df['value'] > 0, 'category'] = 'positive'
df.loc[df['value'] <= 0, 'category'] = 'negative'

Trap 6: Using Object Dtype for Numbers

The Problem

# Accidental object dtype
df = pd.DataFrame({'values': ['1', '2', '3', '4', '5']})
print(df['values'].dtype)  # object

# Operations on object dtype are slow
df['values'].astype(float).sum()  # Conversion on every operation

The Solution: Use Correct Dtypes

# Convert once, use many times
df['values'] = pd.to_numeric(df['values'])
print(df['values'].dtype)  # int64 or float64

# Operations are now fast
df['values'].sum()

Trap 7: Inefficient GroupBy

The Problem

# SLOW: Multiple separate groupby calls
mean_by_group = df.groupby('group')['value'].mean()
std_by_group = df.groupby('group')['value'].std()
count_by_group = df.groupby('group')['value'].count()

The Solution: Single GroupBy with agg

# FAST: Single groupby with multiple aggregations
stats = df.groupby('group')['value'].agg(['mean', 'std', 'count'])

Trap 8: Large DataFrame Copies

The Problem

# SLOW: Unnecessary copies
df_copy = df.copy()
df_copy['new_col'] = df_copy['A'] * 2
# Then only use df_copy briefly

The Solution: Modify In-Place or Use Views

# Option 1: Modify original if appropriate
df['new_col'] = df['A'] * 2

# Option 2: Use assign for method chaining (creates copy only when needed)
result = df.assign(new_col=lambda x: x['A'] * 2)

Trap 9: Using Python max/min/sum

The Problem

# SLOW: Python built-ins
result = sum(df['A'])  # Python sum, iterates one by one
result = max(df['A'])  # Python max

The Solution: Use pandas/NumPy Methods

# FAST: pandas methods
result = df['A'].sum()  # Vectorized
result = df['A'].max()  # Vectorized

Benchmarking Your Code

import time

def benchmark(func, name, n_runs=3):
    """Simple benchmarking function."""
    times = []
    for _ in range(n_runs):
        start = time.time()
        func()
        times.append(time.time() - start)
    avg_time = sum(times) / n_runs
    print(f"{name}: {avg_time:.4f}s (avg of {n_runs} runs)")
    return avg_time

# Compare approaches
df = pd.DataFrame({'A': np.random.randn(100000)})

benchmark(lambda: df['A'].apply(lambda x: x**2), "apply")
benchmark(lambda: df['A'] ** 2, "vectorized")

When apply() is Acceptable

Sometimes apply() is necessary:

# Complex logic that can't be vectorized
def complex_business_logic(row):
    if row['status'] == 'A' and row['value'] > 100:
        return row['value'] * 1.1
    elif row['status'] == 'B':
        return row['value'] * 0.9
    else:
        return row['value']

# This is acceptable (but try to vectorize if possible)
df['adjusted'] = df.apply(complex_business_logic, axis=1)

Better alternative with np.select:

conditions = [
    (df['status'] == 'A') & (df['value'] > 100),
    df['status'] == 'B'
]
choices = [
    df['value'] * 1.1,
    df['value'] * 0.9
]
df['adjusted'] = np.select(conditions, choices, default=df['value'])

Summary: Performance Hierarchy

Approach Relative Speed Use When
Vectorized (NumPy/pandas) 1x (baseline) Always prefer
Boolean indexing ~1x Conditional assignment
str/dt accessor ~2-5x slower String/datetime ops
apply() on column ~10-100x slower Last resort for complex logic
apply() on rows (axis=1) ~100-500x slower Very last resort
itertuples() ~50-200x slower Need row access, can't vectorize
iterrows() ~500-2000x slower Avoid
Python for loop ~1000-5000x slower Never for data ops