drop Method¶
The drop() method removes specified rows or columns from a DataFrame by label. By default it returns a new DataFrame and leaves the original unchanged.
Drop Columns¶
Remove columns by name.
1. Single Column¶
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]
})
result = df.drop('B', axis=1)
print(result)
   A  C
0  1  7
1  2  8
2  3  9
2. Multiple Columns¶
result = df.drop(['B', 'C'], axis=1)
3. columns Parameter¶
result = df.drop(columns=['B', 'C'])
# Equivalent to axis=1, but states the intent explicitly
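As a quick check of the claim that the two spellings are interchangeable, the following sketch compares their results on the same sample frame used above. The variable names by_axis and by_keyword are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# Both spellings remove the same columns and return a new DataFrame.
by_axis = df.drop(["B", "C"], axis=1)
by_keyword = df.drop(columns=["B", "C"])

print(by_axis.equals(by_keyword))  # True
print(list(df.columns))            # ['A', 'B', 'C'] (original is untouched)
```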
Drop Rows¶
Remove rows by index label (not by integer position).
1. Single Row¶
result = df.drop(0)  # axis=0 is the default, so 0 is treated as a row label
print(result)
   A  B  C
1  2  5  8
2  3  6  9
2. Multiple Rows¶
result = df.drop([0, 2])
3. index Parameter¶
result = df.drop(index=[0, 2])
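With the default integer index, labels and positions happen to coincide, which can hide the fact that drop() works on labels. The sketch below uses a hypothetical string index ('x', 'y', 'z') to make the difference visible, and shows one way to drop by position by first translating positions into labels.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]}, index=["x", "y", "z"])

# Drop the row whose label is 'y'.
by_label = df.drop(index="y")
print(list(by_label.index))  # ['x', 'z']

# 0 is not a label in this index, so drop(0) raises KeyError.
try:
    df.drop(0)
    positional_drop_failed = False
except KeyError:
    positional_drop_failed = True
print(positional_drop_failed)  # True

# To drop by position, look up the labels at those positions first.
by_position = df.drop(index=df.index[[0, 2]])
print(list(by_position.index))  # ['y']
```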
inplace Parameter¶
Control whether drop() returns a new DataFrame or modifies the existing one in place.
1. Without inplace¶
result = df.drop(columns=['B'])
# df is unchanged; result holds the DataFrame without column B
2. With inplace¶
df.drop(columns=['B'], inplace=True)
# df modified directly, returns None
3. Best Practice¶
# Prefer reassignment over inplace
df = df.drop(columns=['B'])
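The difference between the two modes is easiest to see by inspecting the return value. This is a minimal sketch on a small made-up frame; the name returned is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Without inplace: a new DataFrame comes back, df keeps both columns.
result = df.drop(columns=["B"])
print(list(df.columns))      # ['A', 'B']
print(list(result.columns))  # ['A']

# With inplace=True: df itself changes and the call returns None.
returned = df.drop(columns=["B"], inplace=True)
print(returned)              # None
print(list(df.columns))      # ['A']
```

Because the in-place call returns None, combining the two styles (df = df.drop(..., inplace=True)) replaces df with None, which is one reason reassignment is the safer habit.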
LeetCode Example: Tree Node¶
Drop unnecessary columns.
1. Sample Data¶
tree = pd.DataFrame({
'id': [1, 2, 3, 4, 5],
'p_id': [None, 1, 1, 2, 2],
'type': ['Root', 'Inner', 'Inner', 'Leaf', 'Leaf']
})
2. Drop Column¶
result = tree.drop(columns='p_id')
print(result)
   id   type
0   1   Root
1   2  Inner
2   3  Inner
3   4   Leaf
4   5   Leaf
3. axis=1 Syntax¶
result = tree.drop('p_id', axis=1)
LeetCode Example: Top Travellers¶
Drop the id column from the rides table.
1. Sample Data¶
rides = pd.DataFrame({
'id': [1, 2, 3, 4],
'user_id': [1, 1, 2, 3],
'distance': [10, 15, 20, 25]
})
2. Drop ID¶
result = rides.drop('id', axis=1)
3. Result¶
print(result)
   user_id  distance
0        1        10
1        1        15
2        2        20
3        3        25
Drop by Condition¶
Remove rows based on conditions.
1. Drop NaN Rows¶
# Use dropna instead
df = df.dropna()
2. Drop Specific Values¶
# Filter with a boolean mask instead of calling drop (assumes a 'status' column)
df = df[df['status'] != 'Deleted']
3. Get Index Then Drop¶
# Find rows to drop
to_drop = df[df['value'] < 0].index
df = df.drop(to_drop)
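On a frame with unique index labels, the index-then-drop approach and a boolean filter should give the same rows. The sketch below checks this on a made-up frame with a 'value' column; the names dropped and filtered are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"value": [5, -1, 3, -7]})

# Approach 1: collect the labels of the unwanted rows, then drop them.
to_drop = df[df["value"] < 0].index
dropped = df.drop(to_drop)

# Approach 2: keep the complement of the same condition with a boolean mask.
filtered = df[~(df["value"] < 0)]

print(dropped.equals(filtered))  # True
print(list(dropped.index))       # [0, 2]
```

The two approaches can diverge when index labels are duplicated: drop() removes every row that shares a dropped label, while a boolean mask is evaluated row by row.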
errors Parameter¶
Control what happens when a label to drop does not exist.
1. Default (raise)¶
# Raises KeyError if column doesn't exist
df.drop(columns=['NonExistent']) # Error!
2. Ignore Errors¶
df.drop(columns=['NonExistent'], errors='ignore')
# No error; returns a new DataFrame with the same contents as df
3. Safe Drop¶
# Drop each column only if it exists: B is removed, NonExistent is silently skipped
columns_to_drop = ['B', 'NonExistent']
result = df.drop(columns=columns_to_drop, errors='ignore')
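Both behaviours can be seen side by side in a short, self-contained sketch. The frame and the name safe are made up for illustration; the exact wording of the KeyError message is not shown because it can vary between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Default errors='raise': a missing label raises KeyError.
try:
    df.drop(columns=["NonExistent"])
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# errors='ignore': existing labels are dropped, missing ones are skipped.
safe = df.drop(columns=["B", "NonExistent"], errors="ignore")
print(list(safe.columns))  # ['A']
```

Note that errors='ignore' also hides typos in column names, so it is best reserved for cases where a column may legitimately be absent.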
Drop Duplicates Context¶
Use drop_duplicates for duplicate removal.
1. drop vs drop_duplicates¶
# drop: remove by label
df.drop(index=[0, 1])
# drop_duplicates: remove duplicate rows
df.drop_duplicates()
2. Different Purposes¶
# drop: known indices/columns
# drop_duplicates: based on values
3. See drop_duplicates¶
Refer to drop_duplicates.md for duplicate removal.
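To make the contrast concrete, the sketch below applies both methods to a small made-up frame in which rows 0 and 1 are identical. The column names k and v are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"k": ["a", "a", "b"], "v": [1, 1, 2]})

# drop: removes the rows with the given labels, whatever they contain.
by_label = df.drop(index=[0, 1])
print(list(by_label.index))  # [2]

# drop_duplicates: compares row values and keeps the first of each duplicate group.
by_value = df.drop_duplicates()
print(list(by_value.index))  # [0, 2]
```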
Method Chaining¶
Because drop() returns a new DataFrame, it fits naturally into method chains.
1. Chain Operations¶
result = (
df
.drop(columns=['temp_col'])
.drop(index=[0])
.reset_index(drop=True)
)
2. With Other Methods¶
result = (
df
.assign(calculated=lambda d: d['a'] + d['b'])  # lambda receives the DataFrame at this point in the chain
.drop(columns=['a', 'b'])
.rename(columns={'calculated': 'sum'})
)
3. Clean Pipeline¶
result = (
raw_df
.dropna()
.drop(columns=['unnecessary_col'])
.reset_index(drop=True)
)
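The pipeline patterns above can be combined into one runnable sketch. The data below is made up, and the column names a, b, unnecessary_col and total are assumptions chosen to mirror the snippets in this section.

```python
import pandas as pd

raw_df = pd.DataFrame({
    "a": [1.0, None, 3.0],
    "b": [10.0, 20.0, 30.0],
    "unnecessary_col": ["x", "y", "z"],
})

result = (
    raw_df
    .dropna()                                      # remove the row with a missing 'a'
    .assign(total=lambda d: d["a"] + d["b"])       # computed on the already-cleaned rows
    .drop(columns=["a", "b", "unnecessary_col"])   # keep only the derived column
    .reset_index(drop=True)                        # renumber rows 0..n-1
)
print(result)
```

Using a lambda inside assign means the sum is computed on the frame as it exists at that step of the chain, rather than on the original raw_df.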