Plot Types (kind Parameter)¶

The kind parameter in pandas plot() determines the type of visualization. This document covers all available plot types.

Available Plot Types¶

kind	Plot Type	Use Case
`'line'`	Line plot	Time series, trends
`'bar'`	Vertical bar	Category comparison
`'barh'`	Horizontal bar	Category comparison
`'hist'`	Histogram	Distribution
`'box'`	Box plot	Distribution summary
`'kde'`/`'density'`	Kernel density	Smooth distribution
`'area'`	Stacked area	Composition over time
`'pie'`	Pie chart	Proportions
`'scatter'`	Scatter plot	Relationship between variables
`'hexbin'`	Hexbin plot	Dense scatter alternative

Line Plot (Default)¶

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df = pd.DataFrame({
    'A': np.random.randn(50).cumsum(),
    'B': np.random.randn(50).cumsum()
})

df.plot(kind='line')  # or just df.plot()
plt.show()

Bar Plot¶

Vertical Bar (kind='bar')¶

# Count categories
url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/drinks.csv'
df = pd.read_csv(url)

fig, ax = plt.subplots(figsize=(10, 4))
df['continent'].value_counts().plot(kind='bar', ax=ax)
ax.set_title('Countries by Continent')
plt.show()

Horizontal Bar (kind='barh')¶

fig, ax = plt.subplots(figsize=(8, 5))
df['continent'].value_counts().plot(kind='barh', ax=ax)
ax.set_title('Countries by Continent')
plt.show()

Histogram (kind='hist')¶

url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)

fig, ax = plt.subplots(figsize=(8, 4))
df['Age'].plot(kind='hist', bins=20, ax=ax, edgecolor='black')
ax.set_title('Age Distribution')
ax.set_xlabel('Age')
plt.show()

Histogram Keywords¶

df['Age'].plot(
    kind='hist',
    bins=30,           # Number of bins
    density=True,      # Normalize to density
    alpha=0.7,         # Transparency
    edgecolor='black'  # Bar edge color
)

Box Plot (kind='box')¶

url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)

fig, ax = plt.subplots(figsize=(5, 4))
df['Age'].plot(kind='box', ax=ax)
ax.set_title('Age Distribution')
plt.show()

Horizontal Box Plot¶

fig, ax = plt.subplots(figsize=(8, 3))
df['Age'].plot(kind='box', ax=ax, vert=False)
ax.set_title('Horizontal Boxplot of Passenger Ages')
ax.set_xlabel('Age')
plt.show()

Multiple Box Plots¶

fig, ax = plt.subplots(figsize=(10, 4))
df[['Age', 'Fare']].plot(kind='box', ax=ax)
plt.show()

Density Plot (kind='density' or kind='kde')¶

Kernel Density Estimation shows a smooth distribution curve:

url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/drinks.csv'
df = pd.read_csv(url)

fig, ax = plt.subplots(figsize=(10, 4))

# Histogram with density overlay
df['beer_servings'].plot(kind='hist', bins=20, density=True, alpha=0.5, ax=ax)
df['beer_servings'].plot(kind='density', ax=ax)

ax.set_xlabel('Beer Servings')
ax.set_title('Distribution of Beer Servings')
plt.show()

Scatter Plot (kind='scatter')¶

Requires both x and y parameters:

df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv')

fig, ax = plt.subplots(figsize=(8, 5))
df.plot(
    kind='scatter',
    x='wt',
    y='mpg',
    ax=ax
)
ax.set_title('Weight vs MPG')
plt.show()

Scatter with Size and Color¶

fig, ax = plt.subplots(figsize=(10, 6))
df.plot(
    kind='scatter',
    x='wt',
    y='mpg',
    s=df['hp'],           # Point size by horsepower
    c='disp',             # Color by displacement
    colormap='Blues',
    alpha=0.6,
    ax=ax
)
ax.set_title('Weight vs MPG (size=HP, color=Displacement)')
plt.show()

Area Plot (kind='area')¶

Stacked area chart for composition over time:

df = pd.DataFrame({
    'A': np.random.rand(10) * 10,
    'B': np.random.rand(10) * 10,
    'C': np.random.rand(10) * 10
}, index=pd.date_range('2024-01-01', periods=10))

fig, ax = plt.subplots(figsize=(10, 5))
df.plot(kind='area', ax=ax, alpha=0.5)
ax.set_title('Stacked Area Plot')
plt.show()

Unstacked Area¶

df.plot(kind='area', stacked=False, alpha=0.4)

Pie Chart (kind='pie')¶

For Series data showing proportions:

data = pd.Series([30, 25, 20, 15, 10], 
                 index=['A', 'B', 'C', 'D', 'E'])

fig, ax = plt.subplots(figsize=(6, 6))
data.plot(kind='pie', ax=ax, autopct='%1.1f%%')
ax.set_ylabel('')  # Remove default ylabel
ax.set_title('Category Proportions')
plt.show()

Hexbin Plot (kind='hexbin')¶

For large scatter datasets, hexbin aggregates points:

n = 10000
df = pd.DataFrame({
    'x': np.random.randn(n),
    'y': np.random.randn(n)
})

fig, ax = plt.subplots(figsize=(8, 6))
df.plot(
    kind='hexbin',
    x='x',
    y='y',
    gridsize=25,
    cmap='YlOrRd',
    ax=ax
)
ax.set_title('Hexbin Density Plot')
plt.show()

Choosing the Right Plot Type¶

Data Type	Goal	Recommended kind
Time series	Show trend	`'line'`
Categories	Compare counts	`'bar'` or `'barh'`
Single numeric	Show distribution	`'hist'` or `'kde'`
Single numeric	Summary stats	`'box'`
Two numeric	Show relationship	`'scatter'`
Two numeric (large n)	Density	`'hexbin'`
Proportions	Part of whole	`'pie'`
Multiple series	Composition	`'area'`

Quick Reference¶

# Line (default)
df.plot()
df.plot(kind='line')

# Bar
df['col'].value_counts().plot(kind='bar')
df['col'].value_counts().plot(kind='barh')

# Histogram
df['col'].plot(kind='hist', bins=20)

# Box
df['col'].plot(kind='box')
df[['col1', 'col2']].plot(kind='box')

# Density
df['col'].plot(kind='density')
df['col'].plot(kind='kde')

# Scatter
df.plot(kind='scatter', x='col1', y='col2')

# Area
df.plot(kind='area')

# Pie
series.plot(kind='pie')

# Hexbin
df.plot(kind='hexbin', x='col1', y='col2')