Skip to content

pivot vs melt

pivot and melt are inverse operations for reshaping DataFrames between wide and long formats.

Mental Model

pivot spreads long data into wide columns; melt gathers wide columns back into long rows. They are inverses: melt(pivot(df)) recovers the original shape (up to sorting). If your data is too wide for plotting or groupby, melt it; if it is too long for correlation or comparison, pivot it.

Conceptual Relationship

pivot and melt transform data in opposite directions.

1. melt: Wide to Long

```python import pandas as pd

Wide format

wide = pd.DataFrame({ 'month': ['Jan', 'Feb'], 'New York': [5, 3], 'Los Angeles': [15, 17] }) print("Wide format:") print(wide) ```

month New York Los Angeles 0 Jan 5 15 1 Feb 3 17

2. Apply melt

python long = pd.melt( wide, id_vars=['month'], var_name='city', value_name='temperature' ) print("Long format (after melt):") print(long)

month city temperature 0 Jan New York 5 1 Feb New York 3 2 Jan Los Angeles 15 3 Feb Los Angeles 17

3. Apply pivot

python back_to_wide = long.pivot( index='month', columns='city', values='temperature' ) print("Back to wide (after pivot):") print(back_to_wide)

city Los Angeles New York month Feb 17 3 Jan 15 5

When to Use Each

Guidelines for choosing between pivot and melt.

1. Use melt When

```python

Converting columns to rows

Preparing data for visualization (seaborn, plotly)

Normalizing for database storage

Input for groupby operations

```

2. Use pivot When

```python

Creating summary tables

Converting rows to columns

Preparing data for comparison

Creating cross-tabulation

```

3. Format Characteristics

Format Rows Columns Use Case
Wide Few Many Display, comparison
Long Many Few Analysis, storage

Complete Round-trip

Transform and reverse without data loss.

1. Start with Wide

python original = pd.DataFrame({ 'product': ['A', 'B'], 'Q1': [100, 150], 'Q2': [200, 250] })

2. Melt to Long

python long_form = original.melt( id_vars=['product'], var_name='quarter', value_name='sales' )

3. Pivot Back to Wide

```python wide_again = long_form.pivot( index='product', columns='quarter', values='sales' ).reset_index()

wide_again.columns.name = None # Remove column name ```

Practical Example

Temperature data transformation.

1. Original Wide Data

python temps = pd.DataFrame({ 'date': ['2024-01-01', '2024-01-02'], 'city_A': [20, 22], 'city_B': [15, 17], 'city_C': [25, 27] })

2. Melt for Analysis

```python temps_long = temps.melt( id_vars=['date'], var_name='city', value_name='temperature' )

Now can easily compute:

temps_long.groupby('city')['temperature'].mean() ```

3. Pivot for Display

```python temps_wide = temps_long.pivot( index='date', columns='city', values='temperature' )

Good for side-by-side comparison

```

Key Differences

Summary of differences.

1. Direction

```python

melt: columns → rows (wide to long)

pivot: rows → columns (long to wide)

```

2. Data Volume

```python

melt: increases row count

pivot: decreases row count (typically)

```

3. Required Parameters

```python

melt: id_vars (optional), var_name, value_name

pivot: index, columns, values

```


Runnable Example: pivot_reshape_tutorial.py

```python """ Pandas Tutorial: Pivot Tables and Reshaping Data.

Covers pivot, pivot_table, melt, stack, and unstack. """

import pandas as pd import numpy as np

=============================================================================

Main

=============================================================================

if name == "main":

print("="*70)
print("PIVOT TABLES AND RESHAPING")
print("="*70)

# Create sample data
np.random.seed(42)
data = pd.DataFrame({
    'Date': pd.date_range('2024-01-01', periods=12, freq='M'),
    'Product': ['A', 'B'] * 6,
    'Region': ['North', 'North', 'South', 'South'] * 3,
    'Sales': np.random.randint(100, 1000, 12)
})

print("\nOriginal Data:")
print(data)

# Pivot
print("\n1. Pivot (reshape from long to wide):")
pivoted = data.pivot(index='Date', columns='Product', values='Sales')
print(pivoted.head())

# Pivot table (with aggregation)
print("\n2. Pivot Table with aggregation:")
pivot_table = data.pivot_table(values='Sales', 
                                index='Product', 
                                columns='Region',
                                aggfunc='mean')
print(pivot_table)

# Multiple aggregation functions
print("\n3. Pivot table with multiple functions:")
pivot_multi = data.pivot_table(values='Sales',
                               index='Product',
                               columns='Region',
                               aggfunc=['sum', 'mean', 'count'])
print(pivot_multi)

# Melt (reshape from wide to long)
wide_data = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 92, 78],
    'Science': [88, 95, 82],
    'English': [90, 87, 85]
})

print("\n4. Wide format data:")
print(wide_data)

print("\n5. Melt (reshape to long format):")
melted = pd.melt(wide_data, 
                 id_vars=['ID', 'Name'],
                 value_vars=['Math', 'Science', 'English'],
                 var_name='Subject',
                 value_name='Score')
print(melted)

# Stack and Unstack
df_multi = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
}, index=['X', 'Y', 'Z'])

print("\n6. Original DataFrame:")
print(df_multi)

print("\n7. Stack (columns to rows):")
stacked = df_multi.stack()
print(stacked)

print("\n8. Unstack (rows to columns):")
unstacked = stacked.unstack()
print(unstacked)

# Cross-tabulation
df_survey = pd.DataFrame({
    'Gender': ['M', 'F', 'M', 'F', 'M', 'F'],
    'Age_Group': ['Young', 'Young', 'Old', 'Old', 'Young', 'Old'],
    'Response': ['Yes', 'No', 'Yes', 'Yes', 'No', 'Yes']
})

print("\n9. Cross-tabulation:")
crosstab = pd.crosstab(df_survey['Gender'], 
                       df_survey['Response'],
                       margins=True)
print(crosstab)

print("\nKEY TAKEAWAYS:")
print("- pivot(): Reshape data (needs unique index/column combinations)")
print("- pivot_table(): Pivot with aggregation")
print("- melt(): Convert wide to long format")
print("- stack(): Pivot columns to row index")
print("- unstack(): Pivot row index to columns")
print("- crosstab(): Compute frequency table")

```


Exercises

Exercise 1. Explain the relationship between pivot() and melt(). Are they inverse operations?

Solution to Exercise 1

```python import pandas as pd import numpy as np

Solution for the specific exercise

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(10), 'B': np.random.randn(10)}) print(df.head()) ```


Exercise 2. Write code that converts wide to long with melt() and then back to wide with pivot(). Is the result identical?

Solution to Exercise 2

See the main content for the detailed explanation. The key concept involves understanding the Pandas API and its behavior for this specific operation.


Exercise 3. Create a long-format DataFrame and reshape it to wide format using pivot(). Then reverse it with melt().

Solution to Exercise 3

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(20), 'B': np.random.randn(20)}) result = df.describe() print(result) ```


Exercise 4. Explain when you would choose melt() over stack() for reshaping. Give a concrete example.

Solution to Exercise 4

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(50), 'group': np.random.choice(['X', 'Y'], 50)}) result = df.groupby('group').mean() print(result) ```