Skip to content

pivot_table Method

The pivot_table() method creates a spreadsheet-style pivot table with aggregation support, handling duplicate entries and providing more flexibility than pivot().

Basic Usage

Create a pivot table with aggregation.

1. Simple Pivot Table

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'date': ['5/1/2017', '5/1/2017', '5/2/2017', '5/2/2017',
             '5/1/2017', '5/2/2017'],
    'city': ['new york', 'new york', 'new york', 'new york',
             'mumbai', 'mumbai'],
    'temperature': [65, 61, 70, 72, 75, 81],
    'humidity': [56, 54, 60, 62, 80, 55]
})
print(df.head())

2. Create Pivot Table

dg = df.pivot_table(
    index="city",
    columns="date",
    aggfunc=np.mean
)
print(dg)
         humidity          temperature         
date     5/1/2017 5/2/2017    5/1/2017 5/2/2017
city                                           
mumbai       80.0     55.0        75.0     81.0
new york     55.0     61.0        63.0     71.0

3. Automatic Aggregation

pivot_table aggregates duplicate index-column pairs.

pivot_table vs pivot

Key differences between the methods.

1. Duplicates

# pivot: fails with duplicate entries
# pivot_table: aggregates duplicates

2. Aggregation Function

# pivot: no aggregation
# pivot_table: aggfunc parameter (default='mean')

3. Multiple Values

# pivot_table can show multiple columns automatically

aggfunc Parameter

Specify how to aggregate values.

1. Built-in Functions

df.pivot_table(index='city', columns='date', values='temperature', aggfunc='sum')
df.pivot_table(index='city', columns='date', values='temperature', aggfunc='count')
df.pivot_table(index='city', columns='date', values='temperature', aggfunc='max')

2. NumPy Functions

df.pivot_table(index='city', columns='date', values='temperature', aggfunc=np.mean)
df.pivot_table(index='city', columns='date', values='temperature', aggfunc=np.std)

3. Multiple Functions

df.pivot_table(
    index='city',
    columns='date',
    values='temperature',
    aggfunc=['mean', 'std', 'count']
)

values Parameter

Specify which columns to aggregate.

1. Single Column

df.pivot_table(
    index='city',
    columns='date',
    values='temperature'
)

2. Multiple Columns

df.pivot_table(
    index='city',
    columns='date',
    values=['temperature', 'humidity']
)

3. All Numeric (Default)

# If values not specified, all numeric columns are used
df.pivot_table(index='city', columns='date')

fill_value Parameter

Replace missing values.

1. Fill with Zero

df.pivot_table(
    index='city',
    columns='date',
    values='temperature',
    fill_value=0
)

2. Fill with NaN (Default)

Missing combinations show NaN.

3. Custom Fill

df.pivot_table(..., fill_value=-999)

Practical Example

Sales transactions analysis.

1. Sample Data

sales = pd.DataFrame({
    'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
    'Product': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Sales': [100, 150, 200, 300, 250, 180],
    'Profit': [20, 30, 50, 80, 60, 40]
})

2. Create Summary

pivot_sales = sales.pivot_table(
    values='Sales',
    index='Region',
    columns='Product',
    aggfunc='sum',
    fill_value=0
)
print(pivot_sales)

3. Add Margins

pivot_sales = sales.pivot_table(
    values='Sales',
    index='Region',
    columns='Product',
    aggfunc='sum',
    margins=True,
    margins_name='Total'
)