
apply Method

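The apply() method calls a function on each element of a Series, or on each row or column of a DataFrame. It is one of the most versatile pandas methods, though rarely the fastest, because the function runs in Python for every element, row, or column.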

Series apply

Apply a function to each element of a Series.

1. Lambda Function

import pandas as pd

url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url, index_col='PassengerId')

bool_mask = df.Sex.apply(lambda x: x == "female")
print(bool_mask.head())
PassengerId
1    False
2     True
3     True
4     True
5    False
Name: Sex, dtype: bool

2. Named Function

def classify_age(age):
    if pd.isna(age):
        return 'Unknown'
    elif age < 18:
        return 'Child'
    elif age < 65:
        return 'Adult'
    else:
        return 'Senior'

df['AgeGroup'] = df['Age'].apply(classify_age)
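
For fixed numeric bins like these, a vectorized alternative may be worth considering. The following is a minimal sketch using pd.cut on a small hypothetical `ages` Series (not the Titanic data), assuming the same boundaries as classify_age; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical ages for illustration, including a missing value.
ages = pd.Series([5, 18, 40, 65, np.nan], name="Age")

# Left-closed bins [-inf, 18), [18, 65), [65, inf) mirror the if/elif chain.
groups = pd.cut(
    ages,
    bins=[-np.inf, 18, 65, np.inf],
    right=False,
    labels=["Child", "Adult", "Senior"],
)

# Missing ages fall outside every bin; label them explicitly.
groups = groups.cat.add_categories("Unknown").fillna("Unknown")
print(groups.tolist())  # ['Child', 'Adult', 'Adult', 'Senior', 'Unknown']
```

The result is a categorical Series rather than plain strings, which is usually fine for grouping and counting.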

3. String Methods Alternative

# Instead of apply for simple string operations:
df['Name'].apply(lambda x: x.upper())

# Use vectorized string methods:
df['Name'].str.upper()
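
Besides speed, the two forms differ in how they treat missing values. The sketch below uses a hypothetical `names` Series with one missing entry to show the difference; it is an illustration, not part of the Titanic example.

```python
import numpy as np
import pandas as pd

# Hypothetical names; the second entry is missing.
names = pd.Series(["Braund", np.nan, "Heikkinen"])

# The vectorized accessor propagates the missing value.
upper = names.str.upper()
print(upper.tolist())

# apply calls x.upper() on the missing value, which has no such method.
try:
    names.apply(lambda x: x.upper())
    apply_failed = False
except AttributeError:
    apply_failed = True
print(apply_failed)  # True
```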

DataFrame apply

Apply a function to each column (axis=0, the default) or to each row (axis=1). In both cases the function receives a Series.

1. Column-wise (axis=0)

df[['Age', 'Fare']].apply(lambda x: x.mean())
Age     29.699118
Fare    32.204208
dtype: float64
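
To make the column-wise behavior concrete, the sketch below uses a small hypothetical `frame` (values chosen for illustration) and a function that needs the whole column at once.

```python
import pandas as pd

# Hypothetical frame for illustration.
frame = pd.DataFrame({"Age": [22.0, 38.0, 26.0], "Fare": [7.25, 71.25, 8.0]})

# With the default axis=0, the function receives each column as a Series
# and the results are collected into a Series indexed by column name.
spread = frame.apply(lambda col: col.max() - col.min())
print(spread.to_dict())  # {'Age': 16.0, 'Fare': 64.0}
```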

2. Row-wise (axis=1)

bool_mask = df.apply(lambda x: x.Sex == "female", axis=1)
print(bool_mask.head())
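
The row-wise form can be checked against a plain vectorized comparison. The sketch below uses a hypothetical three-row stand-in for the Titanic frame, so the names `people`, `row_mask`, and `vec_mask` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic frame.
people = pd.DataFrame(
    {"Sex": ["male", "female", "female"], "Age": [22.0, 38.0, 26.0]},
    index=pd.Index([1, 2, 3], name="PassengerId"),
)

# With axis=1 each row arrives as a Series indexed by column name.
row_mask = people.apply(lambda row: row["Sex"] == "female", axis=1)

# The same mask from a single vectorized comparison.
vec_mask = people["Sex"] == "female"

print(row_mask.tolist())              # [False, True, True]
print((row_mask == vec_mask).all())   # True
```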

3. Multiple Columns

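# The Titanic data has no Quantity or Price columns, so this uses a
# small hypothetical orders frame.
orders = pd.DataFrame({'Quantity': [2, 5], 'Price': [9.99, 4.50]})
orders['Total'] = orders.apply(
    lambda row: row['Quantity'] * row['Price'],
    axis=1
)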

LeetCode Example: Class Attendance

Count students per class using apply.

1. Sample Data

courses = pd.DataFrame({
    'class': ['Math', 'Science', 'Math', 'History', 
              'Math', 'Science', 'Math', 'History'],
    'student': ['Alice', 'Bob', 'Carol', 'Dave', 
                'Eve', 'Frank', 'Grace', 'Helen']
})

2. GroupBy with apply

result = courses.groupby('class')['student'].apply(len)
print(result)
class
History    2
Math       4
Science    2
Name: student, dtype: int64

3. Alternative with size

courses.groupby('class').size()
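
As a quick check that these alternatives agree, the sketch below compares the counts on a shortened, hypothetical version of the courses data. It also includes value_counts, which is another common way to get the same numbers.

```python
import pandas as pd

# Shortened sample data for illustration.
courses = pd.DataFrame({
    "class": ["Math", "Science", "Math", "History"],
    "student": ["Alice", "Bob", "Carol", "Dave"],
})

by_apply = courses.groupby("class")["student"].apply(len)
by_size = courses.groupby("class").size()
by_counts = courses["class"].value_counts().sort_index()

print(by_apply.to_dict())  # {'History': 1, 'Math': 2, 'Science': 1}
```

The three results carry the same counts; they differ only in the name attached to the resulting Series.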

LeetCode Example: Triangle Judgement

Convert boolean to Yes/No string.

1. Apply with Lambda

triangle = pd.DataFrame({
    'x': [3, 1, 5],
    'y': [4, 2, 10],
    'z': [5, 3, 7],
    'is_valid': [True, False, True]
})

triangle["result"] = triangle["is_valid"].apply(
    lambda x: "Yes" if x else "No"
)
print(triangle)

2. Result

   x   y   z  is_valid result
0  3   4   5      True    Yes
1  1   2   3     False     No
2  5  10   7      True    Yes

3. Alternative with map

triangle["result"] = triangle["is_valid"].map({True: "Yes", False: "No"})
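
A third option is numpy.where, which picks between two values based on a boolean array. The sketch below compares it with map on a hypothetical `flags` Series; note that with a dict, map turns any value missing from the dict into NaN, whereas np.where always yields one of the two choices.

```python
import numpy as np
import pandas as pd

# Hypothetical boolean column for illustration.
flags = pd.Series([True, False, True], name="is_valid")

via_map = flags.map({True: "Yes", False: "No"})
via_where = pd.Series(np.where(flags, "Yes", "No"), index=flags.index)

print(via_map.tolist())    # ['Yes', 'No', 'Yes']
print(via_where.tolist())  # ['Yes', 'No', 'Yes']
```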

LeetCode Example: Special Bonus

Apply with multiple conditions.

1. Bonus Criteria Function

def bonus_criteria(employee_id, name):
    return employee_id % 2 != 0 and not name.startswith('M')

2. Apply Row-wise

employees = pd.DataFrame({
    'employee_id': [1, 2, 3, 4, 5],
    'name': ['Alice', 'Bob', 'Mike', 'Molly', 'Eve'],
    'salary': [50000, 60000, 70000, 80000, 90000]
})

employees['bonus'] = employees.apply(
    lambda row: row['salary'] if bonus_criteria(row['employee_id'], row['name']) else 0,
    axis=1
)
print(employees)

3. Result

   employee_id   name  salary  bonus
0            1  Alice   50000  50000
1            2    Bob   60000      0
2            3   Mike   70000      0
3            4  Molly   80000      0
4            5    Eve   90000  90000
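
The same rule can also be written without a row-wise apply. The following is a sketch under the same sample data, combining two boolean masks and using Series.where; the name `eligible` is introduced here for illustration.

```python
import pandas as pd

employees = pd.DataFrame({
    "employee_id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Mike", "Molly", "Eve"],
    "salary": [50000, 60000, 70000, 80000, 90000],
})

# Odd ID and a name that does not start with 'M'.
eligible = (employees["employee_id"] % 2 != 0) & ~employees["name"].str.startswith("M")

# Keep the salary where eligible, otherwise 0.
employees["bonus"] = employees["salary"].where(eligible, 0)
print(employees["bonus"].tolist())  # [50000, 0, 0, 0, 90000]
```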

Performance Considerations

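apply calls a Python function once per element, row, or column, so it is usually much slower than an equivalent vectorized operation. The guidelines below cover when to avoid it and when it is still the right tool.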

1. Prefer Vectorized Operations

# Slow (calls the lambda once per element); assumes df has a numeric column 'value'
df['double'] = df['value'].apply(lambda x: x * 2)

# Fast
df['double'] = df['value'] * 2

2. Avoid Row-wise When Possible

# Slow (iterates rows); assumes df has numeric columns 'a' and 'b'
df.apply(lambda row: row['a'] + row['b'], axis=1)

# Fast (vectorized)
df['a'] + df['b']
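
The difference can be measured directly. The sketch below is a rough, single-run timing on hypothetical random data; the frame size and the names `frame`, `t_apply`, and `t_vec` are assumptions, and the absolute numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 20,000 rows of random floats.
rng = np.random.default_rng(0)
frame = pd.DataFrame({"a": rng.random(20_000), "b": rng.random(20_000)})

row_wise = frame.apply(lambda row: row["a"] + row["b"], axis=1)
vectorized = frame["a"] + frame["b"]

# Both approaches compute the same values.
print(np.allclose(row_wise, vectorized))  # True

# Time one run of each approach.
t_apply = timeit.timeit(
    lambda: frame.apply(lambda row: row["a"] + row["b"], axis=1), number=1
)
t_vec = timeit.timeit(lambda: frame["a"] + frame["b"], number=1)
print(f"apply: {t_apply:.4f}s, vectorized: {t_vec:.4f}s")
```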

3. Use apply When Necessary

  • Complex logic that cannot be vectorized
  • Custom aggregation functions
  • Row-wise logic whose conditions span several columns and are awkward to express with boolean masks or np.where
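
As an illustration of the custom aggregation case, the sketch below applies a user-defined function to each group. The `scores` data and the `score_range` helper are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical scores for illustration.
scores = pd.DataFrame({
    "class": ["Math", "Math", "Science", "Science", "Science"],
    "score": [90, 70, 85, 60, 95],
})

def score_range(group):
    """Custom aggregation: spread between the best and worst score."""
    return group.max() - group.min()

# The function receives each group's 'score' values as a Series.
spread = scores.groupby("class")["score"].apply(score_range)
print(spread.to_dict())  # {'Math': 20, 'Science': 35}
```

Because score_range returns a single value per group, passing it to agg instead of apply should give the same result here.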