Skip to content

unstack Method

The unstack() method pivots rows into columns, moving an index level to become a column level. This is useful for converting long-format data to wide-format.

Mental Model

unstack pulls an index level out of the row axis and spreads its unique values across as new column headers. The result is wider and shorter -- more columns, fewer rows. It is the inverse of stack and the index-based counterpart of pivot.

Basic Concept

Before unstack(): After unstack(): A B 0 A 1.0 0 1.0 2.0 B 2.0 1 3.0 4.0 1 A 3.0 B 4.0

Basic Usage

```python import pandas as pd import numpy as np

Create a Series with MultiIndex

index = pd.MultiIndex.from_tuples([ (0, 'A'), (0, 'B'), (1, 'A'), (1, 'B') ]) s = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

print("Original Series:") print(s) print()

unstacked = s.unstack() print("After unstack():") print(unstacked) ```

``` Original Series: 0 A 1.0 B 2.0 1 A 3.0 B 4.0 dtype: float64

After unstack(): A B 0 1.0 2.0 1 3.0 4.0 ```

Specifying Level to Unstack

By default, unstack() operates on the innermost level (-1):

```python

Create MultiIndex with named levels

index = pd.MultiIndex.from_tuples([ ('row1', 'A'), ('row1', 'B'), ('row2', 'A'), ('row2', 'B') ], names=['row', 'col'])

s = pd.Series([1, 2, 3, 4], index=index)

print("Original:") print(s) print()

Unstack inner level (default)

print("unstack() [default, level=-1]:") print(s.unstack()) print()

Unstack outer level

print("unstack(level=0):") print(s.unstack(level=0)) ```

``` Original: row col row1 A 1 B 2 row2 A 3 B 4 dtype: int64

unstack() [default, level=-1]: col A B row
row1 1 2 row2 3 4

unstack(level=0): row row1 row2 col
A 1 3 B 2 4 ```

Unstack by Level Name

```python

Unstack by name instead of position

print(s.unstack(level='col')) print(s.unstack(level='row')) ```

Handling Missing Values

unstack() introduces NaN when combinations don't exist:

```python

Incomplete MultiIndex

index = pd.MultiIndex.from_tuples([ ('A', 1), ('A', 2), ('B', 1) # Missing ('B', 2) ]) s = pd.Series([10, 20, 30], index=index)

print("Original (missing B-2):") print(s) print()

print("After unstack():") print(s.unstack()) ```

``` Original (missing B-2): A 1 10 2 20 B 1 30 dtype: int64

After unstack(): 1 2 A 10.0 20.0 B 30.0 NaN ```

Fill Missing Values

```python

Fill NaN with a specific value

print(s.unstack(fill_value=0)) ```

1 2 A 10 20 B 30 0

DataFrame unstack

Works on DataFrame index levels:

```python

DataFrame with MultiIndex

index = pd.MultiIndex.from_tuples([ ('AAPL', 'day1'), ('AAPL', 'day2'), ('MSFT', 'day1'), ('MSFT', 'day2') ], names=['ticker', 'date'])

df = pd.DataFrame({ 'price': [150, 151, 300, 301], 'volume': [1000, 1100, 2000, 2100] }, index=index)

print("Original:") print(df) print()

Unstack the date level

print("unstack('date'):") print(df.unstack('date')) ```

``` Original: price volume ticker date
AAPL day1 150 1000 day2 151 1100 MSFT day1 300 2000 day2 301 2100

unstack('date'): price volume
date day1 day2 day1 day2 ticker
AAPL 150 151 1000 1100 MSFT 300 301 2000 2100 ```

Practical Examples

Panel Data Reshaping

```python

Long format panel data

np.random.seed(42) tickers = ['AAPL', 'MSFT', 'GOOGL'] dates = pd.date_range('2024-01-01', periods=5)

Create MultiIndex panel

index = pd.MultiIndex.from_product([tickers, dates], names=['ticker', 'date']) returns = pd.Series(np.random.randn(15) * 0.02, index=index, name='return')

print("Long format (panel):") print(returns.head(10)) print()

Unstack to wide format (each ticker as column)

returns_wide = returns.unstack('ticker') print("Wide format:") print(returns_wide) ```

Cross-Sectional Analysis

```python

Wide format is useful for correlation analysis

correlation = returns_wide.corr() print("Correlation matrix:") print(correlation) ```

Time Series Pivot

```python

Data by sector and date

index = pd.MultiIndex.from_product([ ['Tech', 'Finance', 'Health'], pd.date_range('2024-01-01', periods=3) ], names=['sector', 'date'])

df = pd.DataFrame({ 'returns': np.random.randn(9) * 0.02, 'volume': np.random.randint(100, 1000, 9) }, index=index)

print("Original:") print(df) print()

Pivot: dates as columns, sectors as rows

returns_pivot = df['returns'].unstack('date') print("Returns pivoted:") print(returns_pivot) ```

Multiple Unstacks

You can unstack multiple times:

```python

Three-level MultiIndex

index = pd.MultiIndex.from_product([ ['2024', '2025'], ['Q1', 'Q2'], ['AAPL', 'MSFT'] ], names=['year', 'quarter', 'ticker'])

s = pd.Series(range(8), index=index) print("Original (3 levels):") print(s) print()

Unstack once

print("After one unstack():") print(s.unstack()) print()

Unstack twice

print("After two unstacks:") print(s.unstack().unstack()) ```

unstack vs pivot

Both reshape data, but differently:

Aspect unstack() pivot()
Input Series/DataFrame with MultiIndex DataFrame with columns
Source Index levels Column values
Operation Index level → columns Column → columns

```python

Using unstack (from MultiIndex)

s = pd.Series([1, 2, 3, 4], index=pd.MultiIndex.from_product([['A', 'B'], ['x', 'y']])) print("unstack():") print(s.unstack()) print()

Using pivot (from columns)

df = pd.DataFrame({ 'row': ['A', 'A', 'B', 'B'], 'col': ['x', 'y', 'x', 'y'], 'value': [1, 2, 3, 4] }) print("pivot():") print(df.pivot(index='row', columns='col', values='value')) ```

stack and unstack are Inverses

```python df = pd.DataFrame({ 'A': [1, 2], 'B': [3, 4] }, index=['x', 'y'])

print("Original:") print(df) print()

Stack then unstack returns original

stacked = df.stack() unstacked = stacked.unstack() print("After stack().unstack():") print(unstacked) ```

Summary

Parameter Description Default
level Which level to unstack -1 (innermost)
fill_value Value for missing combinations NaN

Key Points:

  • unstack() pivots index level to columns
  • Inverse of stack()
  • Creates NaN for missing combinations
  • Use fill_value to handle missing data
  • Essential for converting long to wide format

Exercises

Exercise 1. Write code that creates a Series with a MultiIndex and uses .unstack() to convert it to a DataFrame.

Solution to Exercise 1

```python import pandas as pd import numpy as np

Solution for the specific exercise

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(10), 'B': np.random.randn(10)}) print(df.head()) ```


Exercise 2. Explain what .unstack() does. Which index level is moved to columns by default?

Solution to Exercise 2

See the main content for the detailed explanation. The key concept involves understanding the Pandas API and its behavior for this specific operation.


Exercise 3. Create a MultiIndex DataFrame and use .unstack(fill_value=0) to fill NaN values created by the reshape.

Solution to Exercise 3

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(20), 'B': np.random.randn(20)}) result = df.describe() print(result) ```


Exercise 4. Write code that unstacks a grouped aggregation result (e.g., df.groupby(['a', 'b']).mean().unstack()).

Solution to Exercise 4

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(50), 'group': np.random.choice(['X', 'Y'], 50)}) result = df.groupby('group').mean() print(result) ```