pivot Method¶
The pivot() method reshapes data from long format to wide format, spreading values across columns.
Basic Usage¶
Pivot a DataFrame.
1. Long to Wide¶
import pandas as pd
df = pd.DataFrame({
'date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
'city': ['NY', 'LA', 'NY', 'LA'],
'temperature': [30, 70, 32, 72]
})
print("Long format:")
print(df)
wide = df.pivot(index='date', columns='city', values='temperature')
print("\nWide format:")
print(wide)
Long format:
date city temperature
0 2024-01-01 NY 30
1 2024-01-01 LA 70
2 2024-01-02 NY 32
3 2024-01-02 LA 72
Wide format:
city LA NY
date
2024-01-01 70 30
2024-01-02 72 32
2. Parameters¶
# index: column to become row index
# columns: column whose values become column headers
# values: column containing data values
3. Result Structure¶
Each unique value in 'city' becomes a column.
LeetCode Example: Department Table¶
Reshape department revenue by month.
1. Sample Data¶
department = pd.DataFrame({
'id': [1, 1, 1, 2, 2],
'month': ['Jan', 'Feb', 'Mar', 'Jan', 'Feb'],
'revenue': [100, 150, 200, 80, 120]
})
2. Pivot Transform¶
bymonth = department.pivot(
index='id',
columns='month',
values='revenue'
)
print(bymonth)
month Feb Jan Mar
id
1 150 100 200.0
2 120 80 NaN
3. Handle Missing¶
bymonth = bymonth.fillna(0)
Limitations¶
pivot has strict requirements.
1. No Duplicate Entries¶
# pivot fails if index-column combination has duplicates
df = pd.DataFrame({
'date': ['2024-01-01', '2024-01-01'],
'city': ['NY', 'NY'], # Duplicate!
'temp': [30, 31]
})
# df.pivot(index='date', columns='city', values='temp') # Error!
2. Use pivot_table for Duplicates¶
# pivot_table handles duplicates with aggregation
df.pivot_table(index='date', columns='city', values='temp', aggfunc='mean')
3. Single Value Required¶
Each index-column pair must have exactly one value.
Financial Example¶
Pivot stock price data.
1. Sample Data¶
prices = pd.DataFrame({
'date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
'ticker': ['AAPL', 'MSFT', 'AAPL', 'MSFT'],
'close': [150, 350, 152, 355]
})
2. Pivot to Wide¶
price_matrix = prices.pivot(
index='date',
columns='ticker',
values='close'
)
3. Use for Analysis¶
# Calculate correlation
price_matrix.corr()
# Calculate returns
price_matrix.pct_change()
reset_index After Pivot¶
Flatten the result.
1. Index as Column¶
result = df.pivot(index='date', columns='city', values='temp')
result = result.reset_index()
2. Remove Column Name¶
result.columns.name = None
3. Rename Columns¶
result.columns = ['date', 'los_angeles', 'new_york']