Skip to content

join Method

The join() method combines DataFrames based on their index. It provides a simpler syntax than merge for index-based operations.

Basic Usage

Join two DataFrames by index.

1. Simple Join

import pandas as pd

df1 = pd.DataFrame({
    'A': [1, 2, 3]
}, index=['a', 'b', 'c'])

df2 = pd.DataFrame({
    'B': [4, 5, 6]
}, index=['a', 'b', 'd'])

result = df1.join(df2)
print(result)
     A    B
a  1.0  4.0
b  2.0  5.0
c  3.0  NaN

2. Default Left Join

join defaults to left join, keeping all left DataFrame rows.

3. Method vs Function

df1.join(df2)           # Method syntax
# No pd.join() function  # Unlike merge

Financial Example

Combine stock price data by date index.

1. Download Data

import yfinance as yf

def download(ticker):
    return yf.Ticker(ticker).history(period="max")

tickers = ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']

2. Iterative Join

for i, ticker in enumerate(tickers):
    if i == 0:
        df = download(ticker).rename(columns={'Close': ticker})[[ticker]]
    else:
        dg = download(ticker).rename(columns={'Close': ticker})[[ticker]]
        df = df.join(dg, how="inner")

print(df.head(3))
print(df.tail(3))

3. Result

FAANG closing prices aligned by date.

Joining Multiple DataFrames

Join several DataFrames at once.

1. List of DataFrames

df1.join([df2, df3, df4])

2. Sequential Join

result = df1.join(df2).join(df3).join(df4)

3. With Different How

# All must use same how parameter
df1.join([df2, df3], how='outer')

Suffix Handling

Handle overlapping column names.

1. lsuffix and rsuffix

df1 = pd.DataFrame({'A': [1, 2]}, index=['x', 'y'])
df2 = pd.DataFrame({'A': [3, 4]}, index=['x', 'y'])

df1.join(df2, lsuffix='_left', rsuffix='_right')
   A_left  A_right
x       1        3
y       2        4

2. Required for Overlaps

# Without suffixes, overlapping columns raise error
# df1.join(df2)  # ValueError: columns overlap

3. Clear Naming

df1.join(df2, lsuffix='_2023', rsuffix='_2024')

Index Alignment

join aligns on index, not columns.

1. Different Index Types

# Works with any compatible index types
# DatetimeIndex, RangeIndex, string index

2. Partial Overlap

# Unmatched indices get NaN (or excluded in inner join)

3. on Parameter

# Join left's column to right's index
df1.join(df2, on='key_column')