concat Function¶

The pd.concat() function joins pandas objects (Series or DataFrames) along one axis. With axis=0, the default, it stacks them vertically by appending rows. With axis=1 it places them side by side by appending columns.

Basic Usage¶

Concatenate DataFrames vertically.

1. Vertical Concatenation¶

import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
print(df1)

df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))
print(df2)

df = pd.concat([df1, df2])
print(df)

2. List of DataFrames¶

# df3 and df4 are assumed to be further DataFrames with the same columns as df1
pd.concat([df1, df2, df3, df4])
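A common way to use the list form is to collect the pieces first and call concat() once at the end. The sketch below assumes four small, made-up frames standing in for batches of data; the names `frames` and `combined` are illustrative only.

```python
import pandas as pd

# Hypothetical per-batch frames; in practice these might come from files or query pages.
frames = [pd.DataFrame({'A': [i], 'B': [i * 10]}) for i in range(4)]

# One concat call over the whole list, renumbering the rows from 0.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Collecting frames in a list and concatenating once is generally cheaper than concatenating inside a loop, since each concat() call copies the data.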

3. Index Preserved¶

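Each input keeps its original index labels, so the result can contain duplicate labels (here 0 and 1 each appear twice). Pass ignore_index=True to replace them with a fresh 0..n-1 index.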
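A minimal sketch of the difference, reusing the two frames from the vertical example (the variable names `kept` and `renumbered` are illustrative):

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

# Default: each input keeps its own labels, so 0 and 1 appear twice.
kept = pd.concat([df1, df2])
print(kept.index.tolist())        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

With duplicate labels, a lookup such as `kept.loc[0]` returns two rows rather than one, which is often the reason to renumber.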

LeetCode Example: Friend Requests¶

Concatenate two columns into one Series.

1. Sample Data¶

request_accepted = pd.DataFrame({
    'requester_id': [1, 2, 3, 4, 1, 2],
    'accepter_id': [2, 3, 4, 1, 3, 4]
})

2. Concatenate Columns¶

combined_ids = pd.concat([
    request_accepted['requester_id'],
    request_accepted['accepter_id']
])
print(combined_ids)

0    1
1    2
2    3
3    4
4    1
5    2
0    2
1    3
2    4
3    1
4    3
5    4
dtype: int64

3. Count Occurrences¶

friend_counts = combined_ids.value_counts()
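Put together, the three steps can be run end to end as in the sketch below. It uses the sample data from this section; in this particular sample every id ends up appearing three times, so there is no single winner.

```python
import pandas as pd

request_accepted = pd.DataFrame({
    'requester_id': [1, 2, 3, 4, 1, 2],
    'accepter_id': [2, 3, 4, 1, 3, 4]
})

# Stack both id columns into one Series so each accepted request
# counts once for the requester and once for the accepter.
combined_ids = pd.concat([
    request_accepted['requester_id'],
    request_accepted['accepter_id']
])

# value_counts() maps each id to the number of times it appears.
friend_counts = combined_ids.value_counts()
print(friend_counts.max())  # 3
```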

Horizontal Concatenation¶

Stack DataFrames side by side.

1. axis=1¶

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('CD'))

dg = pd.concat([df1, df2], axis=1)
print(dg)

   A  B  C  D
0  1  2  5  6
1  3  4  7  8

2. Index Alignment¶

# With axis=1, rows are aligned by index label. A label missing from one
# input gets NaN in that input's columns (join='outer' is the default).
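To make the alignment concrete, here is a small sketch with two frames whose indices only partly overlap. The frames `left` and `right` are made up for illustration.

```python
import pandas as pd

left = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
right = pd.DataFrame({'C': [5, 6]}, index=[1, 2])

# Default outer join on the index: the result covers labels 0, 1 and 2.
aligned = pd.concat([left, right], axis=1)
print(aligned)

# Label 0 exists only in `left`, so C is NaN there;
# label 2 exists only in `right`, so A is NaN there.
print(aligned.index.tolist())  # [0, 1, 2]
```

Because NaN is a float, the integer columns are converted to float64 in the result.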

3. Use join Parameter¶

pd.concat([df1, df2], axis=1, join='inner')  # Keep only index labels present in every input
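Since df1 and df2 above share the same index, join='inner' changes nothing for them. The sketch below uses two made-up frames with partly overlapping indices, where the effect is visible.

```python
import pandas as pd

left = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
right = pd.DataFrame({'C': [5, 6]}, index=[1, 2])

# Inner join on the index: only label 1 appears in both inputs.
inner = pd.concat([left, right], axis=1, join='inner')
print(inner.index.tolist())  # [1]
print(inner.loc[1].tolist())  # [2, 5]
```

No NaN is introduced here, so the columns keep their integer dtype.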

Handling Mismatched Columns¶

When the inputs do not share the same columns, concat() keeps the union of the columns by default and fills the gaps with NaN.

1. Union of Columns¶

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'B': [5, 6], 'C': [7, 8]})

result = pd.concat([df1, df2])
# Columns: A, B, C with NaN where missing

2. join='inner'¶

result = pd.concat([df1, df2], join='inner')
# Only column B (common to both)
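The two join modes can be compared side by side. This sketch reuses the frames defined above; `outer` and `inner` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'B': [5, 6], 'C': [7, 8]})

# Default (join='outer'): union of columns, NaN where a frame lacks one.
outer = pd.concat([df1, df2])
print(outer.columns.tolist())  # ['A', 'B', 'C']

# join='inner': only the columns every frame has.
inner = pd.concat([df1, df2], join='inner')
print(inner.columns.tolist())  # ['B']
print(inner['B'].tolist())     # [3, 4, 5, 6]
```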

3. Specify Columns¶

# Force df2 onto df1's columns before concat:
# columns df2 lacks (A) are added as NaN, extra columns (C) are dropped
df2 = df2.reindex(columns=df1.columns)
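A short sketch of what the reindex step does to df2 and to the final result. It writes to a new name, `df2_aligned`, instead of overwriting df2; that name is an assumption made for the example.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'B': [5, 6], 'C': [7, 8]})

# reindex keeps exactly df1's columns: A is added (all NaN), C is dropped.
df2_aligned = df2.reindex(columns=df1.columns)
print(df2_aligned.columns.tolist())  # ['A', 'B']

# The concatenated frame now has only df1's columns.
result = pd.concat([df1, df2_aligned], ignore_index=True)
print(result.columns.tolist())       # ['A', 'B']
```

Note that the data in column C is lost, so this suits cases where df1's layout is the one that should be kept.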

keys Parameter¶

The keys parameter adds an outer index level that records which input each row came from.

1. Label Sources¶

result = pd.concat([df1, df2], keys=['first', 'second'])

2. MultiIndex Result¶

print(result.index)
# MultiIndex([('first', 0), ('first', 1), ...])

3. Access by Key¶

result.loc['first']  # Get first DataFrame's rows
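The following self-contained sketch shows the keys workflow from labelling to lookup. It uses two small frames with identical columns, defined here so that the example does not depend on earlier reassignments of df1 and df2.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

# Each key becomes the outer level of a two-level row index.
result = pd.concat([df1, df2], keys=['first', 'second'])
print(result.index.tolist())
# [('first', 0), ('first', 1), ('second', 0), ('second', 1)]

# Selecting by the outer key returns that source's rows with their original index.
first_rows = result.loc['first']
print(first_rows.index.tolist())  # [0, 1]

# A (key, label) tuple selects a single row.
print(result.loc[('second', 0)].tolist())  # [5, 6]
```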