replace Method¶
The replace() method substitutes values in a DataFrame or Series. It is more general than fillna() and can replace any value, not just NaN.
Basic Usage¶
Replace specific values with new values.
1. Replace NaN¶
import pandas as pd
import numpy as np
url = "https://raw.githubusercontent.com/codebasics/py/master/pandas/5_handling_missing_data_fillna_dropna_interpolate/weather_data.csv"
df = pd.read_csv(url, index_col='day', parse_dates=True)
print(df)
dg = df.replace(to_replace=np.nan, value=0)
print(dg)
2. Replace Single Value¶
df.replace(to_replace=-999, value=np.nan)
3. Shorthand Syntax¶
df.replace(-999, np.nan) # Positional arguments
Dictionary Replacement¶
Use dictionaries for multiple replacements.
1. Simple Dictionary¶
df.replace({'old_value': 'new_value'})
2. Multiple Replacements¶
df.replace({
-999: np.nan,
-1: 0,
'N/A': np.nan
})
3. Column-specific¶
df.replace({
'column_A': {0: 100},
'column_B': {'x': 'y'}
})
List Replacement¶
Replace multiple values with a single value.
1. List to Scalar¶
df.replace([np.nan, -999, 'NA'], 0)
2. List to List¶
df.replace(
to_replace=['low', 'medium', 'high'],
value=[1, 2, 3]
)
3. Order Matters¶
# Lists must have same length for one-to-one mapping
Regex Replacement¶
Use regular expressions for pattern matching.
1. Enable Regex¶
df['text'].replace(
to_replace=r'^ba.$',
value='new',
regex=True
)
2. Pattern Replacement¶
# Replace all strings starting with 'test'
df.replace(r'^test.*', 'replaced', regex=True)
3. Capture Groups¶
df['col'].replace(
r'(\d+)-(\d+)',
r'\2-\1',
regex=True
) # Swap groups
Comparison with fillna¶
When to use replace vs fillna.
1. fillna for NaN Only¶
df.fillna(0) # Only replaces NaN
2. replace for Any Value¶
df.replace(-999, np.nan) # Replaces any value
df.replace(np.nan, 0) # Also replaces NaN
3. Use Case Guidance¶
# Use fillna when specifically handling missing values
# Use replace when substituting specific values
Practical Examples¶
Common replacement scenarios.
1. Clean Sentinel Values¶
# Replace common missing value indicators
df.replace([-999, -1, 'NA', 'N/A', ''], np.nan)
2. Standardize Categories¶
df['status'].replace({
'Y': 'Yes', 'y': 'Yes', 'YES': 'Yes',
'N': 'No', 'n': 'No', 'NO': 'No'
})
3. Fix Data Entry Errors¶
df['country'].replace({
'USA': 'United States',
'U.S.A.': 'United States',
'US': 'United States'
})
Method Parameters¶
Additional parameters for replace.
1. inplace¶
df.replace(-999, np.nan, inplace=True)
2. limit¶
df.replace(-999, np.nan, limit=10) # Max 10 replacements
3. method (Deprecated)¶
# method parameter was used for forward/backward fill
# Now deprecated; use fillna instead