Keyword - left_on right_on¶
The left_on and right_on parameters specify join columns when they have different names in each DataFrame.
Mental Model
left_on and right_on solve the "same concept, different name" problem. When the left table calls it city and the right calls it town, you tell pandas to match city against town. Both columns survive in the result, so you may want to drop the redundant one afterward.
Basic Usage¶
Join DataFrames with differently named columns.
1. Different Column Names¶
```python import pandas as pd
df1 = pd.DataFrame({ 'city': ['NY', 'SF', 'LA'], 'temperature': [21, 14, 35] })
df2 = pd.DataFrame({ 'CITY': ['SF', 'NY', 'ICN'], 'humidity': [65, 68, 75] })
df = pd.merge(df1, df2, left_on='city', right_on='CITY') print(df) ```
city temperature CITY humidity
0 NY 21 NY 68
1 SF 14 SF 65
2. Both Columns Kept¶
```python
Both join columns appear in result¶
Use drop to remove duplicate¶
df = df.drop('CITY', axis=1) ```
3. Cannot Use with on¶
```python
left_on/right_on and on are mutually exclusive¶
pd.merge(df1, df2, on='city', left_on='city') # Error¶
```
Multiple Columns¶
Specify multiple columns for each side.
1. List of Columns¶
```python df1 = pd.DataFrame({ 'year': [2023, 2024], 'qtr': [1, 1], 'sales': [100, 200] })
df2 = pd.DataFrame({ 'fiscal_year': [2023, 2024], 'quarter': [1, 1], 'expenses': [80, 120] })
df = pd.merge( df1, df2, left_on=['year', 'qtr'], right_on=['fiscal_year', 'quarter'] ) ```
2. Lists Must Match Length¶
```python
left_on and right_on must have same number of columns¶
```
3. Order Matters¶
```python
First column in left_on matches first in right_on¶
left_on=['a', 'b'] right_on=['x', 'y'] # a matches x, b matches y ```
Index as Join Key¶
Use index instead of columns.
1. left_index and right_index¶
```python df1 = pd.DataFrame({'value': [1, 2]}, index=['A', 'B']) df2 = pd.DataFrame({'other': [3, 4]}, index=['A', 'C'])
pd.merge(df1, df2, left_index=True, right_index=True) ```
2. Column to Index¶
```python
Join df1's column with df2's index¶
pd.merge(df1, df2, left_on='key', right_index=True) ```
3. Index to Column¶
```python
Join df1's index with df2's column¶
pd.merge(df1, df2, left_index=True, right_on='key') ```
LeetCode Example: Employees and Managers¶
Self-join with different column references.
1. Sample Data¶
python
employee = pd.DataFrame({
'id': [1, 2, 3, 4],
'name': ['John', 'Doe', 'Jane', 'Smith'],
'salary': [50000, 40000, 60000, 30000],
'managerId': [None, 1, 1, 2]
})
2. Self Merge¶
python
merged = pd.merge(
left=employee,
right=employee,
left_on='managerId',
right_on='id',
how='inner',
suffixes=('_employee', '_manager')
)
print(merged)
3. Result¶
id_employee name_employee salary_employee managerId_employee id_manager name_manager salary_manager managerId_manager
0 2 Doe 40000 1.0 1 John 50000 NaN
1 3 Jane 60000 1.0 1 John 50000 NaN
2 4 Smith 30000 2.0 2 Doe 40000 1.0
Cleanup After Merge¶
Remove redundant columns after merge.
1. Drop Duplicate Column¶
python
df = pd.merge(df1, df2, left_on='city', right_on='CITY')
df = df.drop('CITY', axis=1)
2. Rename Columns¶
python
df = df.rename(columns={'city': 'location'})
3. Select Columns¶
python
df = df[['city', 'temperature', 'humidity']]
Exercises¶
Exercise 1.
Create two DataFrames where the key column is named 'emp_id' in one and 'employee_id' in the other. Use left_on and right_on to merge them. Drop the duplicate key column afterward.
Solution to Exercise 1
Merge on differently named key columns.
import pandas as pd
df1 = pd.DataFrame({'emp_id': [1, 2, 3], 'salary': [50000, 60000, 70000]})
df2 = pd.DataFrame({'employee_id': [1, 2, 3], 'dept': ['HR', 'IT', 'Sales']})
result = pd.merge(df1, df2, left_on='emp_id', right_on='employee_id')
result = result.drop(columns='employee_id')
print(result)
Exercise 2.
Create a DataFrame with a 'manager_id' column and another with 'id' and 'name'. Use left_on='manager_id' and right_on='id' to look up each person's manager name.
Solution to Exercise 2
Look up manager names via left_on and right_on.
import pandas as pd
employees = pd.DataFrame({
'id': [1, 2, 3],
'name': ['Alice', 'Bob', 'Carol'],
'manager_id': [None, 1, 1]
})
result = pd.merge(
employees, employees[['id', 'name']],
left_on='manager_id', right_on='id',
suffixes=('', '_manager'), how='left'
)
print(result[['name', 'name_manager']])
Exercise 3.
Merge two DataFrames where one has the key in a column and the other has the key as its index. Use left_on='key_col' and right_index=True to perform the merge.
Solution to Exercise 3
Merge column key against index key.
import pandas as pd
orders = pd.DataFrame({'order_id': [1, 2, 3], 'product_id': [101, 102, 101]})
products = pd.DataFrame(
{'product_name': ['Widget', 'Gadget']},
index=pd.Index([101, 102], name='product_id')
)
result = pd.merge(orders, products, left_on='product_id', right_index=True)
print(result)