
fillna Keywords

The fillna() method accepts several keyword arguments that control how missing values are filled.

Mental Model

method='ffill' carries the last known value forward; method='bfill' pulls the next known value backward. limit caps how many consecutive NaN cells get filled. These keywords turn fillna from a simple constant-fill into a directional propagation tool, ideal for time series where "no new data means the old value still holds."
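To make the mental model concrete, here is a minimal sketch on a made-up Series. It uses the `ffill()` and `bfill()` methods, which give the same results as `fillna(method=...)` and run without deprecation warnings on recent pandas. The data and column labels are purely illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative readings with a two-value gap in the middle (made-up data).
s = pd.Series([10.0, np.nan, np.nan, 40.0])

comparison = pd.DataFrame({
    "original": s,
    "ffill": s.ffill(),                 # last known value carried forward
    "bfill": s.bfill(),                 # next known value pulled backward
    "ffill_limit_1": s.ffill(limit=1),  # only the first NaN of the run is filled
})
print(comparison)
```

Reading the columns side by side shows the direction of each fill and how `limit` leaves the rest of a gap untouched.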

method Keyword

Propagate non-null values forward or backward.

1. Forward Fill (ffill)

```python
import pandas as pd

url = "https://raw.githubusercontent.com/codebasics/py/master/pandas/5_handling_missing_data_fillna_dropna_interpolate/weather_data.csv"
df = pd.read_csv(url, index_col='day', parse_dates=True)
print(df)

dg = df.fillna(method='ffill')
print(dg)
```

Forward fill propagates the last valid observation forward.

2. Backward Fill (bfill)

```python
dg = df.fillna(method='bfill')
print(dg)
```

Backward fill uses the next valid observation to fill gaps.
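Both directions can also be tried offline. The sketch below builds a small stand-in for the weather data (the column names and values are invented, not taken from the CSV) and uses `ffill()`/`bfill()`, which behave like `fillna(method='ffill')` and `fillna(method='bfill')`.

```python
import numpy as np
import pandas as pd

# Small stand-in for the weather data (values are invented for illustration).
weather = pd.DataFrame(
    {
        "temperature": [np.nan, 32.0, np.nan, 28.0, np.nan],
        "windspeed": [6.0, np.nan, np.nan, 7.0, 4.0],
    },
    index=pd.date_range("2017-01-01", periods=5, freq="D", name="day"),
)

forward = weather.ffill()   # same result as fillna(method='ffill')
backward = weather.bfill()  # same result as fillna(method='bfill')

# Forward fill has nothing to carry into a leading NaN,
# and backward fill has nothing to pull into a trailing NaN.
print(forward)
print(backward)
```

The leading NaN in `temperature` survives the forward fill, and the trailing NaN survives the backward fill.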

3. pad and backfill Aliases

```python
df.fillna(method='pad')       # Same as ffill
df.fillna(method='backfill')  # Same as bfill
```

axis Keyword

Specify the axis along which to fill missing values.

1. Fill Along Rows (axis=0)

```python
dg = df.fillna(method='ffill', axis=0)
print(dg)
```

This is the default: values propagate down each column, from one row to the next.

2. Fill Along Columns (axis=1)

```python
dg = df.fillna(method='ffill', axis=1)
print(dg)
```

Values propagate across each row, from left to right.
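The difference between the two axes is easiest to see on a small all-numeric frame. The following sketch uses hypothetical quarterly figures and `ffill()`, which takes the same `axis` argument.

```python
import numpy as np
import pandas as pd

# Hypothetical quarterly figures; every column is numeric on purpose.
sales = pd.DataFrame(
    {"q1": [100.0, 200.0], "q2": [np.nan, 210.0], "q3": [120.0, np.nan]},
    index=["north", "south"],
)

down = sales.ffill(axis=0)    # fill down each column (the default)
across = sales.ffill(axis=1)  # fill across each row, left to right

print(down)
print(across)
```

With `axis=0` the missing `q2` value for `north` stays NaN because no row sits above it; with `axis=1` it is filled from `q1` in the same row.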

3. Numeric vs String Columns

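When filling along axis=1, values move between columns, so a number from one column can land in a neighboring text column (or the reverse). On a mixed-type DataFrame the result is usually no longer cleanly typed, so restrict axis=1 fills to columns that hold comparable values.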
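A small sketch of the problem, assuming a hypothetical frame with one numeric and one text column:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one numeric and one text column.
mixed = pd.DataFrame({"temperature": [32.0, 35.0], "event": ["Rain", None]})

across = mixed.ffill(axis=1)

# The missing event in row 1 is filled from the temperature to its left,
# so a number ends up in a column that is supposed to hold text.
print(across)
```

Selecting only the relevant columns first (for example with `select_dtypes`) is one way to avoid this kind of spill-over.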

limit Keyword

Limit the number of consecutive NaN values to fill.

1. Limit Forward Fill

```python
dg = df.fillna(method='ffill', limit=1)
print(dg)
```

Fills at most one NaN in each run of consecutive NaNs; the rest of the run stays NaN.

2. Preventing Over-filling

```python
# If there are 3 consecutive NaNs and limit=2,
# only the first 2 will be filled
```
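The case described in the comment can be checked directly. This is a minimal sketch on a made-up Series, using `ffill(limit=2)` as the equivalent of `fillna(method='ffill', limit=2)`.

```python
import numpy as np
import pandas as pd

# Three consecutive NaNs between two known values (made-up data).
s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

filled = s.ffill(limit=2)

# Positions 1 and 2 receive 1.0; position 3 is beyond the limit and stays NaN.
print(filled)
```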

3. Use Case

```python
# In time series, limit prevents filling across
# long gaps where forward fill may be inappropriate
df['price'].fillna(method='ffill', limit=5)
```
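As a runnable illustration of this use case, the sketch below uses a hypothetical daily price series with one short gap and one long gap. Note that `limit` counts rows, not elapsed time, so it only matches a duration when the index is regularly spaced.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices with a one-day gap and a four-day gap.
idx = pd.date_range("2024-01-01", periods=9, freq="D")
price = pd.Series(
    [100.0, np.nan, 101.0, np.nan, np.nan, np.nan, np.nan, 105.0, 106.0],
    index=idx,
    name="price",
)

filled = price.ffill(limit=2)

# The one-day gap is closed; the four-day gap is only filled for its first
# two days, so the stale price does not spread across the whole outage.
print(filled)
```

The remaining NaNs mark the part of the long gap that still needs a deliberate decision, such as interpolation or dropping the rows.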

Combined Keywords

Use multiple keywords together for precise control.

1. Forward Fill with Limit

```python
df.fillna(method='ffill', axis=0, limit=2)
```

2. Backward Fill with Limit

```python
df.fillna(method='bfill', axis=0, limit=1)
```
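The two combinations above can be compared on a small made-up frame. This sketch uses `ffill()` and `bfill()`, which accept the same `axis` and `limit` arguments.

```python
import numpy as np
import pandas as pd

# Made-up readings with gaps of different lengths in each column.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, np.nan, 5.0],
        "b": [np.nan, np.nan, 3.0, np.nan, 5.0],
    }
)

forward = df.ffill(axis=0, limit=2)   # like fillna(method='ffill', axis=0, limit=2)
backward = df.bfill(axis=0, limit=1)  # like fillna(method='bfill', axis=0, limit=1)

print(forward)
print(backward)
```

The limit is applied per run of NaNs and per column, so each column can end up with a different number of filled cells.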

3. Fill Strategy

```python
# First forward fill, then backward fill remaining
df_filled = df.fillna(method='ffill').fillna(method='bfill')
```
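The effect of chaining the two fills shows up most clearly on data that starts with missing values. Here is a minimal sketch on a made-up column, written with `ffill()` and `bfill()`.

```python
import numpy as np
import pandas as pd

# Made-up column that starts with two missing values.
df = pd.DataFrame({"val": [np.nan, np.nan, 3.0, np.nan, 5.0]})

forward_only = df.ffill()   # leading NaNs remain
both = df.ffill().bfill()   # leading NaNs are then filled from the first valid value

print(forward_only)
print(both)
```

A column that contains no valid values at all stays NaN under this strategy, since there is nothing to propagate in either direction.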

Modern Syntax

In recent pandas versions, prefer explicit methods over the method keyword.

1. ffill Method

```python
df.ffill()         # Forward fill
df.ffill(limit=2)  # With limit
```

2. bfill Method

```python
df.bfill()         # Backward fill
df.bfill(limit=1)  # With limit
```

3. Deprecation Note

The method parameter of fillna is deprecated as of pandas 2.1 and emits a FutureWarning; use ffill() or bfill() instead:

```python
# Deprecated
df.fillna(method='ffill')

# Preferred
df.ffill()
```
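When migrating code that passes the fill direction around as a string, one option is a thin wrapper. The helper below, `directional_fill`, is hypothetical and not part of pandas; it is a sketch that maps the old method strings, including the `pad` and `backfill` aliases, onto the dedicated methods.

```python
import numpy as np
import pandas as pd

# Hypothetical helper, not part of pandas: maps the old `method` strings
# (including the aliases) onto the dedicated ffill()/bfill() methods.
_METHOD_TO_NAME = {
    "ffill": "ffill",
    "pad": "ffill",
    "bfill": "bfill",
    "backfill": "bfill",
}


def directional_fill(obj, method, **kwargs):
    """Fill NaNs in a Series or DataFrame using an old-style method string."""
    if method not in _METHOD_TO_NAME:
        raise ValueError(f"unknown fill method: {method!r}")
    return getattr(obj, _METHOD_TO_NAME[method])(**kwargs)


s = pd.Series([1.0, np.nan, np.nan, 4.0])
print(directional_fill(s, "pad", limit=1))
print(directional_fill(s, "backfill"))
```

Keyword arguments such as `limit` and `axis` are passed through unchanged, so existing call sites only need to swap the function they call.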


Exercises

Exercise 1. Create a Series with multiple consecutive NaN values. Use fillna(method='ffill', limit=1) to forward-fill only one step. Verify that the second consecutive NaN remains unfilled.

Solution to Exercise 1

Forward fill with a limit of 1.

```python
import pandas as pd
import numpy as np

s = pd.Series([1, np.nan, np.nan, np.nan, 5])
result = s.fillna(method='ffill', limit=1)
print(result)
# Index 1 gets filled (1.0), index 2 and 3 stay NaN
```

Exercise 2. Create a DataFrame and use fillna() with a dictionary to apply different fill strategies per column (e.g., column A gets 0, column B gets the column mean).

Solution to Exercise 2

Per-column fill strategies using a dictionary.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [np.nan, 5, np.nan]
})
fill_values = {'A': 0, 'B': df['B'].mean()}
result = df.fillna(fill_values)
print(result)
```

Exercise 3. Create a DataFrame with NaN values. Compare the results of fillna(method='ffill') and fillna(method='bfill') when the first or last row has NaN. Identify which positions remain unfilled by each method.

Solution to Exercise 3

Compare ffill and bfill at edges.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'val': [np.nan, 2, np.nan, 4, np.nan]})
ffilled = df.fillna(method='ffill')
bfilled = df.fillna(method='bfill')
print("ffill:\n", ffilled)
print("\nbfill:\n", bfilled)
# ffill cannot fill the first row; bfill cannot fill the last row
```