
Basic Bar Chart

Bar charts display categorical data with rectangular bars, where bar length represents the value for each category.

Mental Model

ax.bar(categories, values) draws one rectangle per category, with height proportional to the value. Think of it as a scatter plot for categorical x-axes: instead of dots, you get bars that make magnitude differences visually obvious. Use ax.barh() for horizontal bars when category labels are long.
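To make the mental model concrete, here is a minimal sketch that inspects what ax.bar() returns. The labels and values are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

categories = ["A", "B", "C"]  # illustrative labels
values = [3, 7, 5]            # illustrative values

fig, ax = plt.subplots()
bars = ax.bar(categories, values)  # returns a container of Rectangle patches

# One rectangle per category; each height equals the corresponding value.
heights = [rect.get_height() for rect in bars]
print(len(bars), heights)

# Categorical labels are placed at integer positions 0, 1, 2, ...
centers = [rect.get_x() + rect.get_width() / 2 for rect in bars]
print(centers)
```

Because the bars sit at integer positions, anything that needs an x-coordinate (annotations, extra lines) can address a category by its index.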

When to Use Bar Charts

Bar charts are for discrete comparison, not continuous relationships. Choose your plot type by the question you are answering:

| Question | Plot |
|---|---|
| How do categories compare? | Bar chart |
| How does a value change over a continuous domain? | Line plot |
| What is the relationship between two variables? | Scatter plot |
| How is a single variable distributed? | Histogram |

If your x-axis is categorical (names, groups, labels), bars are almost always the right choice.

Why Bar Charts Are So Effective

Bar charts encode value as length, which is one of the most accurate visual channels humans have. We compare lengths far more precisely than areas (pie charts), angles, or colors. This perceptual advantage is why bar charts remain the default for categorical comparison.

Readability Limits

  • More than ~10 categories → the chart becomes cluttered and hard to read. Consider grouping, filtering, or switching to a horizontal layout.
  • Long category labels → use ax.barh() (horizontal bars) so labels read naturally left-to-right.
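One way to apply the grouping advice above is to keep the largest categories and fold the rest into a single "Other" bar. The sketch below uses hypothetical counts, and the cutoff `top_n` is an arbitrary choice, not a rule.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: 14 categories, too many for one readable chart.
counts = pd.Series(
    [120, 95, 80, 64, 50, 41, 33, 20, 12, 9, 7, 5, 3, 2],
    index=[f"Category {i}" for i in range(1, 15)],
)

top_n = 6  # assumption: how many bars to keep is a judgment call
ordered = counts.sort_values(ascending=False)
top = ordered.iloc[:top_n]
other_total = ordered.iloc[top_n:].sum()

# Append the combined remainder as a single "Other" bar.
grouped = pd.concat([top, pd.Series({"Other": other_total})])

fig, ax = plt.subplots(figsize=(8, 4))
# barh draws the first item at the bottom, so reverse to put the largest on top.
ax.barh(grouped.index[::-1], grouped.values[::-1])
ax.set_xlabel("Count")
fig.tight_layout()
```

The horizontal layout also keeps the longer labels readable, which covers the second point in the list.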

Simple Bar Chart

Create a basic vertical bar chart with ax.bar().

1. Import and Setup

```python
import matplotlib.pyplot as plt
import numpy as np
```

2. Define Categories and Values

```python
categories = ['A', 'B', 'C', 'D', 'E']
values = [23, 45, 56, 78, 32]
```

3. Create Bar Chart

```python
fig, ax = plt.subplots()
ax.bar(categories, values)
ax.set_xlabel('Category')
ax.set_ylabel('Value')
ax.set_title('Basic Bar Chart')
plt.show()
```

Horizontal Bar Chart

Use ax.barh() for horizontal orientation.

1. Basic Horizontal Bars

```python
fig, ax = plt.subplots()
ax.barh(categories, values)
ax.set_xlabel('Value')
ax.set_ylabel('Category')
plt.show()
```

2. Long Category Names

```python
categories = ['Category Alpha', 'Category Beta', 'Category Gamma',
              'Category Delta', 'Category Epsilon']
values = [23, 45, 56, 78, 32]

fig, ax = plt.subplots()
ax.barh(categories, values)
plt.tight_layout()  # leave room for the long labels
plt.show()
```

3. Reversed Order

```python
fig, ax = plt.subplots()
# barh draws the first item at the bottom; reversing puts it at the top
ax.barh(categories[::-1], values[::-1])
plt.show()
```

Pandas Plot Method

Use DataFrame's built-in plotting for quick visualizations.

1. Single Column

```python
import pandas as pd

data = {
    'Courses': ['Language', 'History', 'Math', 'Chemistry', 'Physics'],
    'Number of Teachers': [7, 3, 9, 3, 4],
}
df = pd.DataFrame(data).set_index('Courses')

fig, ax = plt.subplots(figsize=(12, 3))
df.plot(kind='bar', ax=ax)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
plt.show()
```

2. Multiple Columns

```python
data = {
    'Student': ['Brandon', 'Vanessa', 'Daniel', 'Kevin', 'William'],
    'Midterm': [85, 60, 60, 65, 100],
    'Final': [90, 90, 65, 80, 95],
}
df = pd.DataFrame(data).set_index('Student')

fig, ax = plt.subplots(figsize=(12, 3))
df.plot(kind='bar', ax=ax)  # one group of bars per student, one bar per column
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
plt.show()
```

3. Horizontal with Pandas

```python
fig, ax = plt.subplots(figsize=(12, 3))
df.plot(kind='barh', ax=ax)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
plt.show()
```

Numeric X-Axis

Use numeric positions instead of categorical labels.

1. Integer Positions

```python
x = np.arange(5)
values = [23, 45, 56, 78, 32]

fig, ax = plt.subplots()
ax.bar(x, values)
ax.set_xticks(x)
ax.set_xticklabels(['A', 'B', 'C', 'D', 'E'])
plt.show()
```

2. Custom Spacing

```python
x = [0, 1, 3, 4, 6]  # Non-uniform spacing
values = [23, 45, 56, 78, 32]

fig, ax = plt.subplots()
ax.bar(x, values, width=0.8)
plt.show()
```

3. Centered Labels

```python
x = np.arange(5)
fig, ax = plt.subplots()
ax.bar(x, values)
ax.set_xticks(x)
ax.set_xticklabels(['A', 'B', 'C', 'D', 'E'], ha='center')
plt.show()
```

Data Sources

Various ways to provide data to bar charts.

1. Lists

```python
categories = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
values = [10, 25, 15, 30, 20]

fig, ax = plt.subplots()
ax.bar(categories, values)
plt.show()
```

2. NumPy Arrays

```python
x = np.arange(5)
values = np.array([10, 25, 15, 30, 20])

fig, ax = plt.subplots()
ax.bar(x, values)
plt.show()
```

3. Pandas DataFrame

```python
import pandas as pd

df = pd.DataFrame({
    'day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
    'sales': [10, 25, 15, 30, 20],
})

fig, ax = plt.subplots()
ax.bar(df['day'], df['sales'])
plt.show()
```

Adding Value Labels

Display values on top of bars.

1. Text Annotation

```python
fig, ax = plt.subplots()
bars = ax.bar(categories, values)

for bar, value in zip(bars, values):
    # Center the label horizontally and place it just above the bar
    ax.text(bar.get_x() + bar.get_width() / 2,
            bar.get_height() + 1,
            str(value),
            ha='center', va='bottom')

plt.show()
```

2. Using bar_label

```python
fig, ax = plt.subplots()
bars = ax.bar(categories, values)
ax.bar_label(bars)
plt.show()
```

3. Formatted Labels

```python
fig, ax = plt.subplots()
bars = ax.bar(categories, values)
ax.bar_label(bars, fmt='%.1f', padding=3)
plt.show()
```

Sorted Bar Charts

Order bars by value for better visualization.

1. Ascending Order

```python
categories = ['A', 'B', 'C', 'D', 'E']
values = [23, 45, 56, 78, 32]

sorted_idx = np.argsort(values)
sorted_categories = [categories[i] for i in sorted_idx]
sorted_values = [values[i] for i in sorted_idx]

fig, ax = plt.subplots()
ax.barh(sorted_categories, sorted_values)
plt.show()
```

2. Descending Order

```python
sorted_idx = np.argsort(values)[::-1]
sorted_categories = [categories[i] for i in sorted_idx]
sorted_values = [values[i] for i in sorted_idx]

fig, ax = plt.subplots()
ax.barh(sorted_categories, sorted_values)
plt.show()
```

3. Pandas Sorting

```python
df = pd.DataFrame({'category': categories, 'value': values})
df_sorted = df.sort_values('value', ascending=True)

fig, ax = plt.subplots()
ax.barh(df_sorted['category'], df_sorted['value'])
plt.show()
```

Practical Example

Create a complete bar chart with styling.

1. Prepare Data

```python
products = ['Product A', 'Product B', 'Product C', 'Product D', 'Product E']
sales = [150, 230, 180, 310, 275]
```

2. Create Styled Chart

```python
fig, ax = plt.subplots(figsize=(10, 6))

bars = ax.bar(products, sales, color='steelblue', edgecolor='navy', linewidth=1.5)

ax.set_xlabel('Product', fontsize=12)
ax.set_ylabel('Sales ($K)', fontsize=12)
ax.set_title('Quarterly Sales by Product', fontsize=14)
ax.set_ylim(0, max(sales) * 1.15)  # headroom for the value labels
```

3. Add Annotations

```python
ax.bar_label(bars, fmt='$%.0fK', padding=3)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()
```


Runnable Example: seaborn_categorical_plots.py

```python
"""
Tutorial 04: Categorical Plots in Seaborn

This tutorial covers plots designed specifically for categorical data.
These plots help compare distributions and values across different categories.

Learning Objectives:
- Create box plots and violin plots
- Use strip plots and swarm plots
- Build point plots and count plots
- Understand when to use each categorical plot
- Combine multiple plot types

Author: Educational Python Package
Level: Intermediate
Prerequisites: Tutorial 01-03
"""

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Global plot styling
sns.set_style("whitegrid")
sns.set_context("notebook")

# =============================================================================
# SECTION 1: BOX PLOTS - SHOWING DISTRIBUTION SUMMARY
# =============================================================================

"""
BOX PLOTS display the five-number summary, plus outliers:
- Minimum (excluding outliers)
- First quartile (Q1, 25th percentile)
- Median (Q2, 50th percentile)
- Third quartile (Q3, 75th percentile)
- Maximum (excluding outliers)
- Outliers shown as individual points

Box anatomy:
- Box: Interquartile range (IQR = Q3 - Q1)
- Line in box: Median
- Whiskers: Extend to the most extreme data point within 1.5 * IQR of the box
- Points beyond whiskers: Outliers

Function: sns.boxplot()
"""

print("="*80)
print("SECTION 1: BOX PLOTS")
print("="*80)

tips = sns.load_dataset('tips')

# Example 1.1: Basic box plot
plt.figure(figsize=(10, 6))
sns.boxplot(data=tips, x='day', y='total_bill')
plt.title('Box Plot: Total Bill by Day', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Basic box plot created")

# Example 1.2: Box plot with grouping
plt.figure(figsize=(12, 6))
sns.boxplot(data=tips, x='day', y='total_bill', hue='time')
plt.title('Box Plot with Grouping: Bill by Day and Time', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.legend(title='Time of Day')
plt.tight_layout()
plt.show()

print("✓ Grouped box plot created")

# Example 1.3: Horizontal box plot
plt.figure(figsize=(10, 8))
sns.boxplot(data=tips, y='day', x='total_bill', orient='h')
plt.title('Horizontal Box Plot', fontsize=14, fontweight='bold')
plt.ylabel('Day of Week', fontsize=12)
plt.xlabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Horizontal box plot created")

# Example 1.4: Customized box plot
plt.figure(figsize=(10, 6))
sns.boxplot(
    data=tips, 
    x='day', 
    y='total_bill',
    palette='Set2',
    linewidth=2.5,
    width=0.6,  # Width of boxes
    fliersize=5  # Size of outlier points
)
plt.title('Customized Box Plot', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Customized box plot created\n")

# =============================================================================
# SECTION 2: VIOLIN PLOTS - DISTRIBUTION SHAPE
# =============================================================================

"""
VIOLIN PLOTS combine box plots with KDE plots. They show:
- Distribution shape (width of violin)
- Quartiles and median (inner box)
- Full data density

Advantages over box plots:
- Show bimodal distributions
- Display distribution shape
- More informative about data density

Function: sns.violinplot()
"""

print("="*80)
print("SECTION 2: VIOLIN PLOTS")
print("="*80)

# Example 2.1: Basic violin plot
plt.figure(figsize=(10, 6))
sns.violinplot(data=tips, x='day', y='total_bill')
plt.title('Violin Plot: Total Bill by Day', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Basic violin plot created")

# Example 2.2: Violin plot with inner representation options
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Box: Shows box plot inside
sns.violinplot(data=tips, x='day', y='total_bill', inner='box', ax=axes[0, 0])
axes[0, 0].set_title("inner='box' - Box plot inside")

# Quartile: Shows quartile lines
sns.violinplot(data=tips, x='day', y='total_bill', inner='quartile', ax=axes[0, 1])
axes[0, 1].set_title("inner='quartile' - Quartile lines")

# Point: Shows all data points
sns.violinplot(data=tips, x='day', y='total_bill', inner='point', ax=axes[1, 0])
axes[1, 0].set_title("inner='point' - All data points")

# Stick: Shows each observation
sns.violinplot(data=tips, x='day', y='total_bill', inner='stick', ax=axes[1, 1])
axes[1, 1].set_title("inner='stick' - Individual observations")

plt.tight_layout()
plt.show()

print("✓ Violin plots with different inner types created")

# Example 2.3: Split violin plot (comparing two groups)
plt.figure(figsize=(10, 6))
sns.violinplot(
    data=tips, 
    x='day', 
    y='total_bill', 
    hue='sex',
    split=True,  # Split violin in half for comparison
    palette='Set2'
)
plt.title('Split Violin Plot: Comparing by Gender', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.legend(title='Gender')
plt.tight_layout()
plt.show()

print("✓ Split violin plot created\n")

# =============================================================================
# SECTION 3: STRIP AND SWARM PLOTS - INDIVIDUAL POINTS
# =============================================================================

"""
STRIP PLOTS show all individual data points, optionally with jitter.
SWARM PLOTS arrange points to avoid overlap, showing density.

When to use:
- Strip: Simple, fast, good for small-medium datasets
- Swarm: More informative about density, slower for large datasets

Functions: sns.stripplot(), sns.swarmplot()
"""

print("="*80)
print("SECTION 3: STRIP AND SWARM PLOTS")
print("="*80)

# Example 3.1: Strip plot
plt.figure(figsize=(10, 6))
sns.stripplot(data=tips, x='day', y='total_bill', alpha=0.5)
plt.title('Strip Plot: Individual Data Points', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Strip plot created")

# Example 3.2: Strip plot with jitter
plt.figure(figsize=(10, 6))
sns.stripplot(
    data=tips, 
    x='day', 
    y='total_bill',
    jitter=True,  # Add random noise to x-position
    alpha=0.5,
    size=4
)
plt.title('Strip Plot with Jitter', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Strip plot with jitter created")

# Example 3.3: Swarm plot
plt.figure(figsize=(10, 6))
sns.swarmplot(data=tips, x='day', y='total_bill', size=4)
plt.title('Swarm Plot: Non-Overlapping Points', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Swarm plot created")

# Example 3.4: Swarm plot with grouping
plt.figure(figsize=(12, 6))
sns.swarmplot(data=tips, x='day', y='total_bill', hue='time', size=4)
plt.title('Swarm Plot with Grouping', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.legend(title='Time of Day')
plt.tight_layout()
plt.show()

print("✓ Grouped swarm plot created\n")

# =============================================================================
# SECTION 4: COMBINING PLOT TYPES
# =============================================================================

"""
Combining different plot types gives the most complete picture.
Common combinations:
- Box + Strip: Shows summary and individual points
- Violin + Swarm: Shows distribution and all data
"""

print("="*80)
print("SECTION 4: COMBINING PLOT TYPES")
print("="*80)

# Example 4.1: Box plot with strip plot overlay
plt.figure(figsize=(10, 6))
sns.boxplot(data=tips, x='day', y='total_bill', color='lightgray', width=0.5)
sns.stripplot(data=tips, x='day', y='total_bill', color='black', alpha=0.3, size=3)
plt.title('Box Plot with Strip Plot Overlay', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Combined box and strip plot created")

# Example 4.2: Violin plot with swarm plot overlay
plt.figure(figsize=(10, 6))
sns.violinplot(data=tips, x='day', y='total_bill', inner=None, color='lightblue', alpha=0.6)
sns.swarmplot(data=tips, x='day', y='total_bill', color='black', alpha=0.5, size=3)
plt.title('Violin Plot with Swarm Plot Overlay', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.tight_layout()
plt.show()

print("✓ Combined violin and swarm plot created\n")

# =============================================================================
# KEY TAKEAWAYS
# =============================================================================

"""
🎯 KEY TAKEAWAYS:

1. BOX PLOTS: Best for showing 5-number summary and outliers
2. VIOLIN PLOTS: Show distribution shape, good for multimodal data
3. STRIP PLOTS: Show all individual points, use jitter to reduce overlap
4. SWARM PLOTS: Like strip plots but automatically avoid overlap
5. COMBINE plots for comprehensive visualization
6. Use 'hue' for grouping, 'split' for split violins
7. Choose plot based on: data size, question, audience

NEXT: Tutorial 05 - Regression Plots
"""

print("="*80)
print("TUTORIAL 04 COMPLETE!")
print("="*80)

```
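The box anatomy described in Section 1 of the script can be checked numerically. The following is a minimal sketch on a made-up sample (not the tips dataset), assuming the common 1.5 * IQR rule and NumPy's default linearly interpolated percentiles; it is not a reproduction of seaborn's internals.

```python
import numpy as np

# Illustrative sample, chosen so that one value is an obvious outlier.
data = np.array([3, 5, 7, 8, 9, 10, 12, 13, 15, 40])

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1  # height of the box

# Fences sit 1.5 * IQR beyond the box edges.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme observations inside the fences.
inside = data[(data >= lower_fence) & (data <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences is drawn as an individual outlier point.
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(q1, median, q3, iqr)
print(whisker_low, whisker_high, outliers)
```

Note that the whiskers end at actual data points, so they are usually shorter than the fences themselves.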

In Machine Learning

Bar charts are the standard way to compare model performance metrics side by side: accuracy, precision, recall, and F1 score across multiple models. Grouped bar charts compare metrics within models; stacked bars show class-level contributions to overall performance.
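A grouped bar chart of this kind can be sketched with numeric x positions and a per-metric offset. The model names and scores below are placeholders for illustration only, not results from real models.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Placeholder scores for illustration only.
models = ["Model A", "Model B", "Model C"]
scores = {
    "Accuracy":  [0.90, 0.85, 0.88],
    "Precision": [0.87, 0.80, 0.91],
    "Recall":    [0.82, 0.89, 0.84],
    "F1":        [0.84, 0.84, 0.87],
}

x = np.arange(len(models))   # one group per model
width = 0.8 / len(scores)    # split each group's total width across the metrics

fig, ax = plt.subplots(figsize=(8, 4))
for i, (metric, vals) in enumerate(scores.items()):
    # Shift each metric's bars so the whole group stays centered on its tick.
    offset = (i - (len(scores) - 1) / 2) * width
    bars = ax.bar(x + offset, vals, width, label=metric)
    ax.bar_label(bars, fmt="%.2f", padding=2, fontsize=8)

ax.set_xticks(x)
ax.set_xticklabels(models)
ax.set_ylim(0, 1.1)          # headroom for labels and legend
ax.set_ylabel("Score")
ax.legend(ncol=len(scores), loc="upper center")
fig.tight_layout()
```

Starting the y-axis at zero keeps bar lengths honest; if the differences between models are small, a table or a dot plot may show them more clearly than a truncated axis would.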


Exercises

Exercise 1. Create a vertical bar chart showing the populations (in millions) of five countries: China (1412), India (1408), USA (332), Indonesia (276), and Brazil (215). Add value labels on top of each bar using ax.bar_label().

Solution to Exercise 1
```python
import matplotlib.pyplot as plt

countries = ['China', 'India', 'USA', 'Indonesia', 'Brazil']
populations = [1412, 1408, 332, 276, 215]

fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.bar(countries, populations, color='steelblue', edgecolor='navy')
ax.bar_label(bars, padding=3)
ax.set_ylabel('Population (millions)')
ax.set_title('Population by Country')
plt.tight_layout()
plt.show()
```

Exercise 2. Create a horizontal bar chart of the same data from Exercise 1 using ax.barh(). Sort the bars by population in ascending order (smallest at top) and use a different color for each bar.

Solution to Exercise 2
```python
import matplotlib.pyplot as plt

countries = ['China', 'India', 'USA', 'Indonesia', 'Brazil']
populations = [1412, 1408, 332, 276, 215]

# Sort (population, country) pairs in ascending order of population
sorted_pairs = sorted(zip(populations, countries))
sorted_pops, sorted_countries = zip(*sorted_pairs)

colors = ['#e74c3c', '#3498db', '#2ecc71', '#f39c12', '#9b59b6']

fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(sorted_countries, sorted_pops, color=colors)
# barh draws the first item at the bottom; invert so the smallest is at the top
ax.invert_yaxis()
ax.set_xlabel('Population (millions)')
ax.set_title('Population by Country (Sorted)')
plt.tight_layout()
plt.show()
```

Exercise 3. Create side-by-side vertical and horizontal bar charts in a 1x2 subplot layout. Use fruit sales data: categories ['Apples', 'Bananas', 'Cherries', 'Dates'] with values [45, 62, 28, 51]. Style the vertical chart with color='coral' and the horizontal chart with color='steelblue'. Add grid lines on the value axis for both.

Solution to Exercise 3
```python
import matplotlib.pyplot as plt

categories = ['Apples', 'Bananas', 'Cherries', 'Dates']
values = [45, 62, 28, 51]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

ax1.bar(categories, values, color='coral', edgecolor='black')
ax1.set_ylabel('Sales')
ax1.set_title('Vertical Bar Chart')
ax1.grid(axis='y', alpha=0.3)

ax2.barh(categories, values, color='steelblue', edgecolor='black')
ax2.set_xlabel('Sales')
ax2.set_title('Horizontal Bar Chart')
ax2.grid(axis='x', alpha=0.3)

plt.tight_layout()
plt.show()
```