pivot vs melt¶

pivot and melt are inverse operations for reshaping DataFrames between wide and long formats.

Mental Model

pivot spreads long data into wide columns; melt gathers wide columns back into long rows. They are inverses: melt(pivot(df)) recovers the original shape (up to sorting). If your data is too wide for plotting or groupby, melt it; if it is too long for correlation or comparison, pivot it.

Conceptual Relationship¶

pivot and melt transform data in opposite directions.

1. melt: Wide to Long¶

```python import pandas as pd

Wide format¶

wide = pd.DataFrame({ 'month': ['Jan', 'Feb'], 'New York': [5, 3], 'Los Angeles': [15, 17] }) print("Wide format:") print(wide) ```

month New York Los Angeles 0 Jan 5 15 1 Feb 3 17

2. Apply melt¶

python long = pd.melt( wide, id_vars=['month'], var_name='city', value_name='temperature' ) print("Long format (after melt):") print(long)

month city temperature 0 Jan New York 5 1 Feb New York 3 2 Jan Los Angeles 15 3 Feb Los Angeles 17

3. Apply pivot¶

python back_to_wide = long.pivot( index='month', columns='city', values='temperature' ) print("Back to wide (after pivot):") print(back_to_wide)

city Los Angeles New York month Feb 17 3 Jan 15 5

When to Use Each¶

Guidelines for choosing between pivot and melt.

1. Use melt When¶

```python

Converting columns to rows¶

Preparing data for visualization (seaborn, plotly)¶

Normalizing for database storage¶

Input for groupby operations¶

```

2. Use pivot When¶

```python

Creating summary tables¶

Converting rows to columns¶

Preparing data for comparison¶

Creating cross-tabulation¶

```

3. Format Characteristics¶

Format	Rows	Columns	Use Case
Wide	Few	Many	Display, comparison
Long	Many	Few	Analysis, storage

Complete Round-trip¶

Transform and reverse without data loss.

1. Start with Wide¶

python original = pd.DataFrame({ 'product': ['A', 'B'], 'Q1': [100, 150], 'Q2': [200, 250] })

2. Melt to Long¶

python long_form = original.melt( id_vars=['product'], var_name='quarter', value_name='sales' )

3. Pivot Back to Wide¶

```python wide_again = long_form.pivot( index='product', columns='quarter', values='sales' ).reset_index()

wide_again.columns.name = None # Remove column name ```

Practical Example¶

Temperature data transformation.

1. Original Wide Data¶

python temps = pd.DataFrame({ 'date': ['2024-01-01', '2024-01-02'], 'city_A': [20, 22], 'city_B': [15, 17], 'city_C': [25, 27] })

2. Melt for Analysis¶

```python temps_long = temps.melt( id_vars=['date'], var_name='city', value_name='temperature' )

Now can easily compute:¶

temps_long.groupby('city')['temperature'].mean() ```

3. Pivot for Display¶

```python temps_wide = temps_long.pivot( index='date', columns='city', values='temperature' )

Good for side-by-side comparison¶

```

Key Differences¶

Summary of differences.

1. Direction¶

```python

melt: columns → rows (wide to long)¶

pivot: rows → columns (long to wide)¶

```

2. Data Volume¶

```python

melt: increases row count¶

pivot: decreases row count (typically)¶

```

3. Required Parameters¶

```python

melt: id_vars (optional), var_name, value_name¶

pivot: index, columns, values¶

```

Runnable Example: `pivot_reshape_tutorial.py`¶

```python """ Pandas Tutorial: Pivot Tables and Reshaping Data.

Covers pivot, pivot_table, melt, stack, and unstack. """

import pandas as pd import numpy as np

=============================================================================¶

Main¶

=============================================================================¶

if name == "main":

print("="*70)
print("PIVOT TABLES AND RESHAPING")
print("="*70)

# Create sample data
np.random.seed(42)
data = pd.DataFrame({
    'Date': pd.date_range('2024-01-01', periods=12, freq='M'),
    'Product': ['A', 'B'] * 6,
    'Region': ['North', 'North', 'South', 'South'] * 3,
    'Sales': np.random.randint(100, 1000, 12)
})

print("\nOriginal Data:")
print(data)

# Pivot
print("\n1. Pivot (reshape from long to wide):")
pivoted = data.pivot(index='Date', columns='Product', values='Sales')
print(pivoted.head())

# Pivot table (with aggregation)
print("\n2. Pivot Table with aggregation:")
pivot_table = data.pivot_table(values='Sales', 
                                index='Product', 
                                columns='Region',
                                aggfunc='mean')
print(pivot_table)

# Multiple aggregation functions
print("\n3. Pivot table with multiple functions:")
pivot_multi = data.pivot_table(values='Sales',
                               index='Product',
                               columns='Region',
                               aggfunc=['sum', 'mean', 'count'])
print(pivot_multi)

# Melt (reshape from wide to long)
wide_data = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 92, 78],
    'Science': [88, 95, 82],
    'English': [90, 87, 85]
})

print("\n4. Wide format data:")
print(wide_data)

print("\n5. Melt (reshape to long format):")
melted = pd.melt(wide_data, 
                 id_vars=['ID', 'Name'],
                 value_vars=['Math', 'Science', 'English'],
                 var_name='Subject',
                 value_name='Score')
print(melted)

# Stack and Unstack
df_multi = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
}, index=['X', 'Y', 'Z'])

print("\n6. Original DataFrame:")
print(df_multi)

print("\n7. Stack (columns to rows):")
stacked = df_multi.stack()
print(stacked)

print("\n8. Unstack (rows to columns):")
unstacked = stacked.unstack()
print(unstacked)

# Cross-tabulation
df_survey = pd.DataFrame({
    'Gender': ['M', 'F', 'M', 'F', 'M', 'F'],
    'Age_Group': ['Young', 'Young', 'Old', 'Old', 'Young', 'Old'],
    'Response': ['Yes', 'No', 'Yes', 'Yes', 'No', 'Yes']
})

print("\n9. Cross-tabulation:")
crosstab = pd.crosstab(df_survey['Gender'], 
                       df_survey['Response'],
                       margins=True)
print(crosstab)

print("\nKEY TAKEAWAYS:")
print("- pivot(): Reshape data (needs unique index/column combinations)")
print("- pivot_table(): Pivot with aggregation")
print("- melt(): Convert wide to long format")
print("- stack(): Pivot columns to row index")
print("- unstack(): Pivot row index to columns")
print("- crosstab(): Compute frequency table")

```

Exercises¶

Exercise 1. Explain the relationship between pivot() and melt(). Are they inverse operations?

Solution to Exercise 1

```python import pandas as pd import numpy as np

Solution for the specific exercise¶

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(10), 'B': np.random.randn(10)}) print(df.head()) ```

Exercise 2. Write code that converts wide to long with melt() and then back to wide with pivot(). Is the result identical?

Solution to Exercise 2

See the main content for the detailed explanation. The key concept involves understanding the Pandas API and its behavior for this specific operation.

Exercise 3. Create a long-format DataFrame and reshape it to wide format using pivot(). Then reverse it with melt().

Solution to Exercise 3

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(20), 'B': np.random.randn(20)}) result = df.describe() print(result) ```

Exercise 4. Explain when you would choose melt() over stack() for reshaping. Give a concrete example.

Solution to Exercise 4

```python import pandas as pd import numpy as np

np.random.seed(42) df = pd.DataFrame({'A': np.random.randn(50), 'group': np.random.choice(['X', 'Y'], 50)}) result = df.groupby('group').mean() print(result) ```

pivot vs melt¶

Conceptual Relationship¶

1. melt: Wide to Long¶

Wide format¶

2. Apply melt¶

3. Apply pivot¶

When to Use Each¶

1. Use melt When¶

Converting columns to rows¶

Preparing data for visualization (seaborn, plotly)¶

Normalizing for database storage¶

Input for groupby operations¶

2. Use pivot When¶

Creating summary tables¶

Converting rows to columns¶

Preparing data for comparison¶

Creating cross-tabulation¶

3. Format Characteristics¶

Complete Round-trip¶

1. Start with Wide¶

2. Melt to Long¶

3. Pivot Back to Wide¶

Practical Example¶

1. Original Wide Data¶

2. Melt for Analysis¶

Now can easily compute:¶

3. Pivot for Display¶

Good for side-by-side comparison¶

Key Differences¶

1. Direction¶

melt: columns → rows (wide to long)¶

pivot: rows → columns (long to wide)¶

2. Data Volume¶

melt: increases row count¶

pivot: decreases row count (typically)¶

3. Required Parameters¶

melt: id_vars (optional), var_name, value_name¶

pivot: index, columns, values¶

Runnable Example: pivot_reshape_tutorial.py¶

=============================================================================¶

Main¶

=============================================================================¶

Exercises¶

Solution for the specific exercise¶

Runnable Example: `pivot_reshape_tutorial.py`¶