Box Plot Anatomy

Understanding the visual components of a box plot is essential for proper interpretation of distributional data.

Visual Components

A box plot consists of five primary visual elements that summarize the distribution.

1. The Box

The rectangular box spans from the first quartile (Q1, 25th percentile) to the third quartile (Q3, 75th percentile). This range is called the Interquartile Range (IQR).

IQR = Q3 - Q1
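
As a quick check of the formula, the following minimal sketch computes the quartiles and the IQR with NumPy. The `sample` array is a hypothetical data set chosen so the result is easy to verify by hand; `np.percentile` uses linear interpolation by default, and other quartile conventions can give slightly different values.

```python
import numpy as np

# Hypothetical sample chosen so the quartiles are easy to verify by hand.
sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Q1 and Q3 with NumPy's default (linear interpolation) method.
q1, q3 = np.percentile(sample, [25, 75])
iqr = q3 - q1

print(q1, q3, iqr)  # 3.0 7.0 4.0
```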

2. The Median Line

In a vertical box plot, the horizontal line inside the box marks the median (Q2, 50th percentile). A median that sits off-center within the box suggests skewness in the middle 50% of the data.

3. The Whiskers

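Vertical lines extend from the box to show the range of non-outlier data. By default, each whisker ends at the most extreme data point that lies within 1.5 × IQR of the box edge, so the whiskers never extend past actual observations. The limits of that range (the fences) are:

lower_fence = Q1 - 1.5 * IQR
upper_fence = Q3 + 1.5 * IQR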
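
The sketch below illustrates how the drawn whisker ends can differ from the 1.5 × IQR limits themselves. It assumes the common convention (used by matplotlib's default `whis=1.5`) that each whisker stops at the most extreme observation still inside the limit. The `sample` array and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical sample with one extreme value (30).
sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 30])

q1, q3 = np.percentile(sample, [25, 75])
iqr = q3 - q1

# Limits of the non-outlier range (often called fences).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the limits.
inside = sample[(sample >= lower_fence) & (sample <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

print(lower_fence, upper_fence)   # -3.0 13.0
print(whisker_low, whisker_high)  # 1 8
```

Here the upper limit is 13, but the upper whisker is drawn at 8 because that is the largest observation not exceeding the limit.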

4. The Fliers (Outliers)

Points beyond the whiskers are plotted individually as outliers (fliers). These represent extreme values in the distribution.
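
To see which points would be drawn as fliers without rendering a figure, one option is matplotlib's helper `matplotlib.cbook.boxplot_stats`, which returns the statistics that a box plot is drawn from. This is a minimal sketch on a hypothetical `sample`; it relies on the default `whis=1.5`.

```python
import numpy as np
from matplotlib import cbook

# Hypothetical sample with one extreme value (30).
sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 30])

# boxplot_stats returns one dict per data set; take the first (only) one.
stats = cbook.boxplot_stats(sample)[0]

print("whisker ends:", stats["whislo"], stats["whishi"])
print("fliers:", stats["fliers"])
```

For this sample the only flier should be 30, since it lies above Q3 + 1.5 × IQR = 13.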

5. The Caps

Short horizontal lines at whisker ends mark the extent of non-outlier data.
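
The five components above map onto the dictionary of artists that `Axes.boxplot` returns. The following sketch lists how many artists of each kind are created for a single box. The data are randomly generated for illustration, and the `Agg` backend is selected only so the snippet runs without a display (an assumption; remove that line for interactive use).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: a normal sample plus one obvious extreme value.
rng = np.random.default_rng(0)
values = np.append(rng.normal(100, 15, 200), 200.0)

fig, ax = plt.subplots()
artists = ax.boxplot(values)

# One box and one median line, two whiskers and two caps,
# and a single line artist holding all flier markers.
for name in ("boxes", "medians", "whiskers", "caps", "fliers"):
    print(name, len(artists[name]))
```

Each entry is a list of matplotlib artists, so individual components can be restyled after plotting, for example with `artists["medians"][0].set_color("red")`.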

Statistical Interpretation

The box plot encodes the five-number summary plus outlier detection.

1. Five-Number Summary

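import numpy as np

np.random.seed(42)  # fixed seed so the summary is reproducible
data = np.random.normal(100, 15, 200)

# Five-number summary: minimum, Q1, median, Q3, maximum
minimum = np.min(data)
q1 = np.percentile(data, 25)
median = np.percentile(data, 50)
q3 = np.percentile(data, 75)
maximum = np.max(data)

print(minimum, q1, median, q3, maximum)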

2. Spread Indicators

The box length (the IQR) spans the middle 50% of the data. In a vertical box plot, a tall box indicates high variability in that middle half; a short box indicates that the central values are tightly clustered.

3. Skewness Detection

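When the median line is not centered in the box, the middle 50% of the distribution is asymmetric. A median closer to Q1 suggests right (positive) skew; a median closer to Q3 suggests left (negative) skew. Unequal whisker lengths point the same way: a longer upper whisker is consistent with right skew.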
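
One way to put a number on the median's position within the box is the quartile (Bowley) skewness coefficient, (Q3 + Q1 - 2 × median) / (Q3 - Q1). This measure is not part of the box plot itself and is introduced here only as an illustration; the `sample` array is hypothetical. The value ranges from -1 to 1, with positive values indicating a median closer to Q1.

```python
import numpy as np

# Hypothetical right-skewed sample: values bunch up low and stretch out high.
sample = np.array([1, 2, 3, 4, 5, 7, 10, 15, 25])

q1, median, q3 = np.percentile(sample, [25, 50, 75])

# Quartile (Bowley) skewness: 0 when the median is centered in the box,
# positive when it is closer to Q1, negative when it is closer to Q3.
bowley = (q3 + q1 - 2 * median) / (q3 - q1)

print(q1, median, q3)  # 3.0 5.0 10.0
print(bowley > 0)      # True: median is closer to Q1 (right skew)
```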

Comparison with Histogram

Box plots and histograms both show distributions but emphasize different aspects.

1. Box Plot Strengths

Box plots excel at comparing multiple distributions side-by-side, identifying outliers, and showing quartile information compactly.

2. Histogram Strengths

Histograms reveal the shape of the distribution, multimodality, and density patterns that box plots cannot show.
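
The sketch below illustrates the multimodality point with a deliberately constructed, hypothetical data set made of two separate clusters. The median falls in the empty gap between the clusters, so the box plot's summary gives no hint of the two modes, while the histogram counts make the gap obvious.

```python
import numpy as np

# Hypothetical bimodal data: two evenly spaced clusters with a gap between them.
low_cluster = np.linspace(22, 38, 50)
high_cluster = np.linspace(62, 78, 50)
bimodal = np.concatenate([low_cluster, high_cluster])

# The quartile summary a box plot would draw.
q1, median, q3 = np.percentile(bimodal, [25, 50, 75])
print(median)  # 50.0, a value where no data actually lie

# Histogram counts over six equal-width bins from 20 to 80.
counts, edges = np.histogram(bimodal, bins=6, range=(20, 80))
print(counts)  # [25 25  0  0 25 25]
```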

3. Combined View

import matplotlib.pyplot as plt
import numpy as np

np.random.seed(42)
data = np.random.normal(100, 15, 200)

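# Two panels side by side: histogram on the left, box plot on the right
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# The histogram shows the shape of the distribution
ax1.hist(data, bins=20, edgecolor='black', alpha=0.7)
ax1.set_title('Histogram')

# The box plot summarizes the same data by quartiles and fliers
ax2.boxplot(data)
ax2.set_title('Box Plot')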

plt.tight_layout()
plt.show()