Skip to content

Performance Traps

pandas is built for vectorized operations. Using Python loops or incorrect patterns can make code orders of magnitude slower.

Mental Model

The performance hierarchy is: vectorized ops > apply() > itertuples() > iterrows() > Python loops. Every time you write a loop over DataFrame rows, ask "can I express this as a column operation instead?" Other common traps include growing a DataFrame row-by-row (use a list of dicts then construct once) and calling .apply() when a built-in method exists.

The Golden Rule

Vectorized operations > apply() > itertuples() > iterrows() > for loops

Trap 1: Row-by-Row Iteration

The Problem

```python import pandas as pd import numpy as np import time

Create test data

n = 100_000 df = pd.DataFrame({ 'A': np.random.randn(n), 'B': np.random.randn(n) })

SLOW: Using iterrows

def slow_sum(): result = [] for idx, row in df.iterrows(): result.append(row['A'] + row['B']) return pd.Series(result)

start = time.time() slow_result = slow_sum() print(f"iterrows: {time.time() - start:.2f}s") ```

The Solution: Vectorize

```python

FAST: Vectorized

start = time.time() fast_result = df['A'] + df['B'] print(f"Vectorized: {time.time() - start:.4f}s") ```

Speedup: 100-1000x faster

Trap 2: Growing DataFrames in a Loop

The Problem

```python

SLOW: Appending in a loop

df = pd.DataFrame(columns=['A', 'B'])

for i in range(10000): df = pd.concat([df, pd.DataFrame({'A': [i], 'B': [i*2]})]) # Each concat creates a new DataFrame! ```

The Solution: Build List, Then DataFrame

```python

FAST: Collect data, then create DataFrame

data = [] for i in range(10000): data.append({'A': i, 'B': i*2})

df = pd.DataFrame(data) ```

Speedup: 100x+ faster

Trap 3: Using apply() When Vectorized Option Exists

The Problem

```python

SLOW: Using apply

df['C'] = df['A'].apply(lambda x: x ** 2)

SLOW: Row-wise apply

df['D'] = df.apply(lambda row: row['A'] + row['B'], axis=1) ```

The Solution: Use Built-in Vectorized Operations

```python

FAST: Vectorized

df['C'] = df['A'] ** 2

FAST: Vectorized

df['D'] = df['A'] + df['B'] ```

Speedup: 10-100x faster

Trap 4: String Operations in Loops

The Problem

```python df = pd.DataFrame({'text': ['hello', 'world', 'python'] * 10000})

SLOW: Loop

result = [] for text in df['text']: result.append(text.upper()) df['upper'] = result ```

The Solution: Use str Accessor

```python

FAST: Vectorized string operation

df['upper'] = df['text'].str.upper() ```

Trap 5: Conditional Assignment in Loops

The Problem

```python df = pd.DataFrame({'value': np.random.randn(100000)})

SLOW: Loop with condition

for idx in df.index: if df.loc[idx, 'value'] > 0: df.loc[idx, 'category'] = 'positive' else: df.loc[idx, 'category'] = 'negative' ```

The Solution: Vectorized Conditional

```python

FAST: np.where

df['category'] = np.where(df['value'] > 0, 'positive', 'negative')

FAST: .loc with boolean indexing

df.loc[df['value'] > 0, 'category'] = 'positive' df.loc[df['value'] <= 0, 'category'] = 'negative' ```

Trap 6: Using Object Dtype for Numbers

The Problem

```python

Accidental object dtype

df = pd.DataFrame({'values': ['1', '2', '3', '4', '5']}) print(df['values'].dtype) # object

Operations on object dtype are slow

df['values'].astype(float).sum() # Conversion on every operation ```

The Solution: Use Correct Dtypes

```python

Convert once, use many times

df['values'] = pd.to_numeric(df['values']) print(df['values'].dtype) # int64 or float64

Operations are now fast

df['values'].sum() ```

Trap 7: Inefficient GroupBy

The Problem

```python

SLOW: Multiple separate groupby calls

mean_by_group = df.groupby('group')['value'].mean() std_by_group = df.groupby('group')['value'].std() count_by_group = df.groupby('group')['value'].count() ```

The Solution: Single GroupBy with agg

```python

FAST: Single groupby with multiple aggregations

stats = df.groupby('group')['value'].agg(['mean', 'std', 'count']) ```

Trap 8: Large DataFrame Copies

The Problem

```python

SLOW: Unnecessary copies

df_copy = df.copy() df_copy['new_col'] = df_copy['A'] * 2

Then only use df_copy briefly

```

The Solution: Modify In-Place or Use Views

```python

Option 1: Modify original if appropriate

df['new_col'] = df['A'] * 2

Option 2: Use assign for method chaining (creates copy only when needed)

result = df.assign(new_col=lambda x: x['A'] * 2) ```

Trap 9: Using Python max/min/sum

The Problem

```python

SLOW: Python built-ins

result = sum(df['A']) # Python sum, iterates one by one result = max(df['A']) # Python max ```

The Solution: Use pandas/NumPy Methods

```python

FAST: pandas methods

result = df['A'].sum() # Vectorized result = df['A'].max() # Vectorized ```

Benchmarking Your Code

```python import time

def benchmark(func, name, n_runs=3): """Simple benchmarking function.""" times = [] for _ in range(n_runs): start = time.time() func() times.append(time.time() - start) avg_time = sum(times) / n_runs print(f"{name}: {avg_time:.4f}s (avg of {n_runs} runs)") return avg_time

Compare approaches

df = pd.DataFrame({'A': np.random.randn(100000)})

benchmark(lambda: df['A'].apply(lambda x: x**2), "apply") benchmark(lambda: df['A'] ** 2, "vectorized") ```

When apply() is Acceptable

Sometimes apply() is necessary:

```python

Complex logic that can't be vectorized

def complex_business_logic(row): if row['status'] == 'A' and row['value'] > 100: return row['value'] * 1.1 elif row['status'] == 'B': return row['value'] * 0.9 else: return row['value']

This is acceptable (but try to vectorize if possible)

df['adjusted'] = df.apply(complex_business_logic, axis=1) ```

Better alternative with np.select:

python conditions = [ (df['status'] == 'A') & (df['value'] > 100), df['status'] == 'B' ] choices = [ df['value'] * 1.1, df['value'] * 0.9 ] df['adjusted'] = np.select(conditions, choices, default=df['value'])

Summary: Performance Hierarchy

Approach Relative Speed Use When
Vectorized (NumPy/pandas) 1x (baseline) Always prefer
Boolean indexing ~1x Conditional assignment
str/dt accessor ~2-5x slower String/datetime ops
apply() on column ~10-100x slower Last resort for complex logic
apply() on rows (axis=1) ~100-500x slower Very last resort
itertuples() ~50-200x slower Need row access, can't vectorize
iterrows() ~500-2000x slower Avoid
Python for loop ~1000-5000x slower Never for data ops

Exercises

Exercise 1. Explain why growing a DataFrame row-by-row in a loop with pd.concat() inside the loop is slow. What is the correct alternative?

Solution to Exercise 1

```python import pandas as pd import numpy as np

Solution for the specific exercise

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(10), 'B': np.random.randn(10)}) print(df.head()) ```


Exercise 2. Write code showing the performance difference between iterrows() and vectorized operations for computing a column.

Solution to Exercise 2

See the main content for the detailed explanation. The key concept involves understanding the Pandas API and its behavior for this specific operation.


Exercise 3. Explain why apply() with a Python function is slower than using built-in Pandas/NumPy vectorized methods.

Solution to Exercise 3

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(20), 'B': np.random.randn(20)}) result = df.describe() print(result) ```


Exercise 4. Write code that collects results in a list and creates a DataFrame at the end, instead of appending to a DataFrame in a loop.

Solution to Exercise 4

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(50), 'group': np.random.choice(['X', 'Y'], 50)}) result = df.groupby('group').mean() print(result) ```