drop Method¶
The drop() method removes specified rows or columns from a DataFrame.
Mental Model
drop() removes by label, not by condition. Pass column names with axis=1 to remove columns, or index labels with axis=0 to remove rows. It returns a new DataFrame by default -- the original is unchanged unless you set inplace=True.
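The mental model above can be sketched with a small, hypothetical DataFrame. The string row labels and the names `no_b` and `no_x` are illustrative choices, not part of the original examples; they only make it easier to see that `drop()` matches labels and leaves the original untouched.

```python
import pandas as pd

# Hypothetical data: string row labels make "label, not position" visible
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['x', 'y', 'z'])

no_b = df.drop('B', axis=1)   # remove a column by its label
no_x = df.drop('x', axis=0)   # remove a row by its index label

print(no_b.columns.tolist())  # ['A']
print(no_x.index.tolist())    # ['y', 'z']
print(df.shape)               # (3, 2): the original is unchanged
```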
Drop Columns¶
Remove columns by name.
1. Single Column¶
```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

result = df.drop('B', axis=1)
print(result)
```

```
   A  C
0  1  7
1  2  8
2  3  9
```
2. Multiple Columns¶
```python
result = df.drop(['B', 'C'], axis=1)
```
3. columns Parameter¶
```python
result = df.drop(columns=['B', 'C'])
# More explicit than axis=1
```
Drop Rows¶
Remove rows by index label.
1. Single Row¶
```python
result = df.drop(0)  # axis=0 is the default
print(result)
```

```
   A  B  C
1  2  5  8
2  3  6  9
```
2. Multiple Rows¶
```python
result = df.drop([0, 2])
```
3. index Parameter¶
```python
result = df.drop(index=[0, 2])
```
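With the default `RangeIndex`, labels and positions coincide, which can hide the fact that `drop()` works on labels. The sketch below uses a hypothetical frame with a shuffled integer index to show the difference; passing `df.index[0]` is one way to drop by position, assuming the index labels are unique.

```python
import pandas as pd

# Hypothetical frame whose index labels do not match row positions
df = pd.DataFrame({'val': [10, 20, 30]}, index=[2, 0, 1])

by_label = df.drop(0)               # removes the row labelled 0 (the second row)
by_position = df.drop(df.index[0])  # removes the first row, whose label is 2

print(by_label['val'].tolist())     # [10, 30]
print(by_position['val'].tolist())  # [20, 30]
```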
inplace Parameter¶
Choose whether drop() returns a new DataFrame or modifies the original directly.
1. Without inplace¶
```python
result = df.drop(columns=['B'])
# df is unchanged; result holds the DataFrame without 'B'
```
2. With inplace¶
```python
df.drop(columns=['B'], inplace=True)
# df is modified directly; the call returns None
```
3. Best Practice¶
```python
# Prefer reassignment over inplace
df = df.drop(columns=['B'])
```
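One reason reassignment is the safer habit: a call with `inplace=True` returns `None`, so accidentally assigning its result replaces the DataFrame with `None`. A minimal sketch, with the variable name `returned` chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# inplace=True mutates df and returns None
returned = df.drop(columns=['B'], inplace=True)

print(returned)             # None
print(df.columns.tolist())  # ['A']
```

Writing `df = df.drop(columns=['B'], inplace=True)` would therefore leave `df` bound to `None`.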
LeetCode Example: Tree Node¶
Drop unnecessary columns.
1. Sample Data¶
```python
tree = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'p_id': [None, 1, 1, 2, 2],
    'type': ['Root', 'Inner', 'Inner', 'Leaf', 'Leaf']
})
```
2. Drop Column¶
```python
result = tree.drop(columns='p_id')
print(result)
```

```
   id   type
0   1   Root
1   2  Inner
2   3  Inner
3   4   Leaf
4   5   Leaf
```
3. axis=1 Syntax¶
```python
result = tree.drop('p_id', axis=1)
```
LeetCode Example: Top Travellers¶
Drop ID column from rides.
1. Sample Data¶
```python
rides = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'user_id': [1, 1, 2, 3],
    'distance': [10, 15, 20, 25]
})
```
2. Drop ID¶
```python
result = rides.drop('id', axis=1)
```
3. Result¶
```python
print(result)
```

```
   user_id  distance
0        1        10
1        1        15
2        2        20
3        3        25
```
Drop by Condition¶
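drop() takes labels, not conditions. To remove rows by condition, use a dedicated method, filter with a boolean mask, or collect the matching index labels and pass them to drop().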
1. Drop NaN Rows¶
```python
# Use dropna instead
df = df.dropna()
```
2. Drop Specific Values¶
```python
# Filter instead of drop
df = df[df['status'] != 'Deleted']
```
3. Get Index Then Drop¶
```python
# Find the index labels of the rows to drop, then drop them
to_drop = df[df['value'] < 0].index
df = df.drop(to_drop)
```
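The index-then-drop pattern gives the same result as a boolean filter only when index labels are unique. The sketch below uses hypothetical data to check that equivalence and to show what happens with a duplicated label: because `drop()` matches labels, every row sharing the label is removed, including rows that did not meet the condition.

```python
import pandas as pd

# Unique index: index-then-drop matches the boolean filter
df = pd.DataFrame({'value': [5, -1, 7, -3]})
to_drop = df[df['value'] < 0].index
via_drop = df.drop(to_drop)
via_mask = df[df['value'] >= 0]
print(via_drop.equals(via_mask))      # True

# Duplicate label 0: dropping by label removes both rows labelled 0
dup = pd.DataFrame({'value': [5, -1, 7]}, index=[0, 0, 1])
dup_dropped = dup.drop(dup[dup['value'] < 0].index)
print(dup_dropped['value'].tolist())  # [7]
```

When the index may contain duplicates, the boolean filter is the more predictable choice.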
errors Parameter¶
Handle missing labels.
1. Default (raise)¶
```python
# Raises KeyError if the column doesn't exist
df.drop(columns=['NonExistent'])  # KeyError
```
2. Ignore Errors¶
```python
df.drop(columns=['NonExistent'], errors='ignore')
# No error; returns a new DataFrame with the same contents as df
```
3. Safe Drop¶
```python
# Drop each column only if it exists
columns_to_drop = ['B', 'NonExistent']
result = df.drop(columns=columns_to_drop, errors='ignore')
```
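`errors='ignore'` is silent, so a misspelled column name goes unnoticed. As an alternative sketch (not something the text above prescribes), the requested names can be intersected with the actual columns first, which also records what was missing. The names `present` and `missing` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
columns_to_drop = ['B', 'NonExistent']

# Keep only the requested names that actually exist in df
present = df.columns.intersection(columns_to_drop)
# Record the names that were requested but not found
missing = sorted(set(columns_to_drop) - set(df.columns))

result = df.drop(columns=present)

print(result.columns.tolist())  # ['A', 'C']
print(missing)                  # ['NonExistent']
```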
Drop Duplicates Context¶
Use drop_duplicates for duplicate removal.
1. drop vs drop_duplicates¶
```python
# drop: remove by label
df.drop(index=[0, 1])

# drop_duplicates: remove duplicate rows
df.drop_duplicates()
```
2. Different Purposes¶
```python
# drop: known indices/columns
# drop_duplicates: based on values
```
3. See drop_duplicates¶
Refer to drop_duplicates.md for duplicate removal.
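To make the contrast concrete, here is a small sketch on hypothetical data: with `drop()` you name the rows to remove, while `drop_duplicates()` inspects the values and, by default, keeps the first occurrence of each repeated row.

```python
import pandas as pd

# Hypothetical data: row 2 repeats the values of row 0
df = pd.DataFrame({'user': ['a', 'b', 'a'], 'score': [1, 2, 1]})

by_label = df.drop(index=[0, 1])  # you choose the labels to remove
by_value = df.drop_duplicates()   # pandas finds the repeated row

print(by_label.index.tolist())    # [2]
print(by_value.index.tolist())    # [0, 1]
```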
Method Chaining¶
Because drop() returns a new DataFrame by default, it fits naturally into method-chaining pipelines.
1. Chain Operations¶
```python
result = (
    df
    .drop(columns=['temp_col'])
    .drop(index=[0])
    .reset_index(drop=True)
)
```
2. With Other Methods¶
```python
result = (
    df
    .assign(calculated=df['a'] + df['b'])
    .drop(columns=['a', 'b'])
    .rename(columns={'calculated': 'sum'})
)
```
3. Clean Pipeline¶
```python
result = (
    raw_df
    .dropna()
    .drop(columns=['unnecessary_col'])
    .reset_index(drop=True)
)
```
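The clean pipeline can be run end to end on a small stand-in for `raw_df`. The column names `name` and `value` and the data itself are assumptions made up for this sketch; only `unnecessary_col` comes from the snippet above.

```python
import pandas as pd

# Hypothetical raw data with one missing value and one throwaway column
raw_df = pd.DataFrame({
    'name': ['a', None, 'c'],
    'unnecessary_col': [0, 0, 0],
    'value': [1.0, 2.0, 3.0],
})

result = (
    raw_df
    .dropna()                           # removes the row with the missing name
    .drop(columns=['unnecessary_col'])  # removes the throwaway column
    .reset_index(drop=True)             # renumbers the remaining rows from 0
)

print(result.columns.tolist())  # ['name', 'value']
print(result.index.tolist())    # [0, 1]
```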
Exercises¶
Exercise 1.
Create a DataFrame with 5 columns. Drop two columns using drop(columns=[...]). Verify the resulting DataFrame has 3 columns.
Solution to Exercise 1
Drop columns and verify the result.
```python
import pandas as pd

df = pd.DataFrame({
    'A': [1], 'B': [2], 'C': [3], 'D': [4], 'E': [5]
})
result = df.drop(columns=['B', 'D'])
print(result.columns.tolist())
assert len(result.columns) == 3
```
Exercise 2.
Create a DataFrame with rows indexed [0, 1, 2, 3, 4]. Drop rows at indices 1 and 3 using drop([1, 3]). Verify the remaining indices are [0, 2, 4].
Solution to Exercise 2
Drop rows by index label.
```python
import pandas as pd

df = pd.DataFrame({'val': [10, 20, 30, 40, 50]})
result = df.drop([1, 3])
print(result)
assert result.index.tolist() == [0, 2, 4]
```
Exercise 3.
Create a DataFrame and attempt to drop a non-existent column. Use the errors='ignore' parameter to suppress the KeyError. Then drop the column without errors='ignore' to see the error.
Solution to Exercise 3
Use errors='ignore' to handle missing labels gracefully.
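```python
import pandas as pd

df = pd.DataFrame({'A': [1], 'B': [2]})

# With errors='ignore', the missing label is skipped silently
result = df.drop(columns=['C'], errors='ignore')
print(result)  # No error; same contents as df

# Without it, the missing label raises KeyError
try:
    df.drop(columns=['C'])
except KeyError as e:
    print(f"KeyError: {e}")
```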