Skip to content

Kernel Density Estimation with imshow

Kernel Density Estimation (KDE) provides a smooth estimate of the probability density function from sample data. This document covers how to visualize 2D KDE using imshow.

Mental Model

KDE turns scattered data points into a smooth density surface by placing a bell curve (kernel) at each point and summing them up. The result is like a soft-focus photograph of your scatter plot -- dense clusters glow brightly while sparse regions fade. Visualize the 2D density with imshow to reveal the underlying probability landscape.

In the density estimation spectrum, KDE is the continuous end: where hist2d and hexbin produce piecewise-constant approximations, KDE produces a smooth function that can be evaluated at any point and integrated analytically. The cost is computation — KDE scales with the number of data points, while binning methods scale with the number of bins.

KDE is an Estimate, Not Truth

KDE is a model of the data, not the data itself. The bandwidth parameter controls how much structure is revealed:

  • Small bandwidth → noisy, overfitting (shows artifacts as peaks)
  • Large bandwidth → oversmoothed, underfitting (hides real clusters)

Choosing bandwidth = choosing how much structure to believe. Always compare multiple bandwidths before interpreting peaks as real features.

Interpreting KDE Output

Visual feature Meaning
Bright region High probability — data is concentrated here
Smooth hill A cluster
Multiple peaks Multimodal distribution — distinct subgroups
Flat area Uniform distribution — no structure
Contour lines close together Sharp density gradient — abrupt boundary

What is KDE?

KDE estimates the probability density function by placing a kernel (typically Gaussian) at each data point and summing them:

\[\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)\]

where:

  • \(K\) is the kernel function
  • \(h\) is the bandwidth (smoothing parameter)
  • \(n\) is the number of data points

Setup

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

np.random.seed(42) ```


Basic 2D KDE

1. Generate Sample Data

```python

Generate bivariate data

n = 500 mean = [0, 0] cov = [[1, 0.5], [0.5, 1]] data = np.random.multivariate_normal(mean, cov, n)

x = data[:, 0] y = data[:, 1]

fig, ax = plt.subplots() ax.scatter(x, y, alpha=0.5, s=10) ax.set_title('Raw Data Points') ax.set_xlabel('x') ax.set_ylabel('y') plt.show() ```

2. Compute KDE

```python

Create grid

xmin, xmax = x.min() - 1, x.max() + 1 ymin, ymax = y.min() - 1, y.max() + 1 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()])

Compute KDE

kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape) ```

3. Visualize with imshow

python fig, ax = plt.subplots(figsize=(8, 6)) im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto' ) ax.set_xlabel('x') ax.set_ylabel('y') ax.set_title('2D Kernel Density Estimation') plt.colorbar(im, ax=ax, label='Density') plt.show()


KDE with Data Points Overlay

1. Basic Overlay

```python fig, ax = plt.subplots(figsize=(8, 6))

KDE background

im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='Blues', aspect='auto' )

Scatter points

ax.scatter(x, y, c='red', s=5, alpha=0.3)

ax.set_xlabel('x') ax.set_ylabel('y') ax.set_title('KDE with Data Points') plt.colorbar(im, ax=ax, label='Density') plt.show() ```

2. Transparent KDE

```python fig, ax = plt.subplots(figsize=(8, 6))

ax.scatter(x, y, c='black', s=10, alpha=0.5, label='Data') im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='Reds', aspect='auto', alpha=0.6 )

ax.set_xlabel('x') ax.set_ylabel('y') ax.set_title('Transparent KDE Overlay') plt.colorbar(im, ax=ax, label='Density') plt.show() ```


Bandwidth Selection

The bandwidth parameter controls smoothing. Larger values = smoother estimates.

1. Bandwidth Comparison

```python fig, axes = plt.subplots(1, 3, figsize=(15, 4))

bandwidths = [0.1, 0.3, 0.8]

for ax, bw in zip(axes, bandwidths): kernel = stats.gaussian_kde(np.vstack([x, y]), bw_method=bw) density = kernel(positions).reshape(xx.shape)

im = ax.imshow(
    density.T,
    extent=[xmin, xmax, ymin, ymax],
    origin='lower',
    cmap='viridis',
    aspect='auto'
)
ax.set_title(f'Bandwidth = {bw}')
plt.colorbar(im, ax=ax)

plt.suptitle('Effect of Bandwidth on KDE', fontsize=14) plt.tight_layout() plt.show() ```

2. Scott's Rule vs Silverman's Rule

```python fig, axes = plt.subplots(1, 3, figsize=(15, 4))

methods = ['scott', 'silverman', 0.3] titles = ["Scott's Rule", "Silverman's Rule", "Manual (0.3)"]

for ax, method, title in zip(axes, methods, titles): kernel = stats.gaussian_kde(np.vstack([x, y]), bw_method=method) density = kernel(positions).reshape(xx.shape)

im = ax.imshow(
    density.T,
    extent=[xmin, xmax, ymin, ymax],
    origin='lower',
    cmap='viridis',
    aspect='auto'
)
ax.set_title(title)
plt.colorbar(im, ax=ax)

plt.tight_layout() plt.show() ```


Different Distributions

1. Clustered Data

```python

Generate clustered data

np.random.seed(42) n1 = 300 n2 = 200

cluster1 = np.random.multivariate_normal([2, 2], [[0.5, 0], [0, 0.5]], n1) cluster2 = np.random.multivariate_normal([-1, -1], [[0.3, 0.2], [0.2, 0.3]], n2) data_clustered = np.vstack([cluster1, cluster2])

x_c = data_clustered[:, 0] y_c = data_clustered[:, 1]

Compute KDE

xmin, xmax = x_c.min() - 1, x_c.max() + 1 ymin, ymax = y_c.min() - 1, y_c.max() + 1 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()])

kernel = stats.gaussian_kde(np.vstack([x_c, y_c])) density = kernel(positions).reshape(xx.shape)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

Scatter

axes[0].scatter(x_c, y_c, alpha=0.5, s=10) axes[0].set_title('Scatter Plot')

KDE

im = axes[1].imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='hot', aspect='auto' ) axes[1].set_title('KDE') plt.colorbar(im, ax=axes[1])

plt.suptitle('Clustered Data', fontsize=14) plt.tight_layout() plt.show() ```

2. Ring Distribution

```python

Generate ring-shaped data

n = 1000 theta = np.random.uniform(0, 2 * np.pi, n) r = np.random.normal(2, 0.2, n) x_ring = r * np.cos(theta) y_ring = r * np.sin(theta)

Compute KDE

xmin, xmax = -4, 4 ymin, ymax = -4, 4 xx, yy = np.mgrid[xmin:xmax:150j, ymin:ymax:150j] positions = np.vstack([xx.ravel(), yy.ravel()])

kernel = stats.gaussian_kde(np.vstack([x_ring, y_ring]), bw_method=0.15) density = kernel(positions).reshape(xx.shape)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

axes[0].scatter(x_ring, y_ring, alpha=0.3, s=5) axes[0].set_aspect('equal') axes[0].set_title('Scatter Plot')

im = axes[1].imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='magma', aspect='equal' ) axes[1].set_title('KDE') plt.colorbar(im, ax=axes[1])

plt.suptitle('Ring Distribution', fontsize=14) plt.tight_layout() plt.show() ```


Comparison Methods

KDE vs hist2d vs hexbin

```python np.random.seed(42) n = 2000 x = np.random.normal(0, 1, n) y = x + np.random.normal(0, 0.5, n)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

hist2d

h = axes[0].hist2d(x, y, bins=30, cmap='viridis') axes[0].set_title('hist2d') plt.colorbar(h[3], ax=axes[0])

hexbin

hb = axes[1].hexbin(x, y, gridsize=20, cmap='viridis') axes[1].set_title('hexbin') plt.colorbar(hb, ax=axes[1])

KDE

xmin, xmax = x.min() - 1, x.max() + 1 ymin, ymax = y.min() - 1, y.max() + 1 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()]) kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape)

im = axes[2].imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto' ) axes[2].set_title('KDE (imshow)') plt.colorbar(im, ax=axes[2])

plt.suptitle('Density Visualization Methods', fontsize=14) plt.tight_layout() plt.show() ```

Method Comparison Table

Method Pros Cons
hist2d Fast, simple Bin edge artifacts
hexbin Better packing, no axis alignment Less familiar
KDE Smooth, accurate Computationally expensive

Combining KDE with Contours

1. KDE with Contour Lines

```python fig, ax = plt.subplots(figsize=(8, 6))

im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='Blues', aspect='auto' )

Add contour lines

cs = ax.contour( xx, yy, density, levels=5, colors='darkblue', linewidths=0.8 ) ax.clabel(cs, inline=True, fontsize=8, fmt='%.3f')

ax.set_xlabel('x') ax.set_ylabel('y') ax.set_title('KDE with Contour Lines') plt.colorbar(im, ax=ax, label='Density') plt.show() ```

2. KDE with Percentile Contours

```python

Calculate percentile levels

levels_pct = np.percentile(density.ravel(), [10, 25, 50, 75, 90])

fig, ax = plt.subplots(figsize=(8, 6))

im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='YlOrRd', aspect='auto' )

cs = ax.contour(xx, yy, density, levels=levels_pct, colors='black', linewidths=0.8)

ax.set_xlabel('x') ax.set_ylabel('y') ax.set_title('KDE with Percentile Contours') plt.colorbar(im, ax=ax, label='Density') plt.show() ```


Marginal Distributions

KDE with Marginal Histograms

```python from mpl_toolkits.axes_grid1 import make_axes_locatable

np.random.seed(42) n = 500 data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], n) x, y = data[:, 0], data[:, 1]

Compute KDE

xmin, xmax = x.min() - 0.5, x.max() + 0.5 ymin, ymax = y.min() - 0.5, y.max() + 0.5 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()]) kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape)

Create figure with marginals

fig, ax_main = plt.subplots(figsize=(8, 8))

divider = make_axes_locatable(ax_main) ax_top = divider.append_axes("top", 1.2, pad=0.1, sharex=ax_main) ax_right = divider.append_axes("right", 1.2, pad=0.1, sharey=ax_main)

Main KDE plot

im = ax_main.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='Blues', aspect='auto' ) ax_main.set_xlabel('x') ax_main.set_ylabel('y')

Top marginal

ax_top.hist(x, bins=30, density=True, alpha=0.7, color='steelblue') x_kde = np.linspace(xmin, xmax, 200) ax_top.plot(x_kde, stats.gaussian_kde(x)(x_kde), 'darkblue', linewidth=2) ax_top.set_ylabel('Density') plt.setp(ax_top.get_xticklabels(), visible=False)

Right marginal

ax_right.hist(y, bins=30, density=True, alpha=0.7, color='steelblue', orientation='horizontal') y_kde = np.linspace(ymin, ymax, 200) ax_right.plot(stats.gaussian_kde(y)(y_kde), y_kde, 'darkblue', linewidth=2) ax_right.set_xlabel('Density') plt.setp(ax_right.get_yticklabels(), visible=False)

plt.suptitle('2D KDE with Marginal Distributions', fontsize=14, y=1.02) plt.show() ```


Colorbar Customization

1. Discrete Colorbar

```python fig, ax = plt.subplots(figsize=(8, 6))

levels = np.linspace(density.min(), density.max(), 10) im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto' )

cbar = plt.colorbar(im, ax=ax, ticks=levels[::2]) cbar.set_label('Density')

ax.set_title('KDE with Custom Colorbar') plt.show() ```

2. Logarithmic Scale

```python from matplotlib.colors import LogNorm

Ensure positive values

density_pos = np.maximum(density, 1e-10)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

Linear scale

im1 = axes[0].imshow( density_pos.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto' ) axes[0].set_title('Linear Scale') plt.colorbar(im1, ax=axes[0])

Log scale

im2 = axes[1].imshow( density_pos.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto', norm=LogNorm() ) axes[1].set_title('Logarithmic Scale') plt.colorbar(im2, ax=axes[1])

plt.tight_layout() plt.show() ```


Full Example: Publication Quality

```python np.random.seed(42)

Generate data with correlation

n = 1000 rho = 0.7 mean = [0, 0] cov = [[1, rho], [rho, 1]] data = np.random.multivariate_normal(mean, cov, n) x, y = data[:, 0], data[:, 1]

Compute KDE

xmin, xmax = -4, 4 ymin, ymax = -4, 4 xx, yy = np.mgrid[xmin:xmax:150j, ymin:ymax:150j] positions = np.vstack([xx.ravel(), yy.ravel()]) kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape)

Create figure

fig, ax = plt.subplots(figsize=(9, 7))

KDE heatmap

im = ax.imshow( density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='Blues', aspect='equal' )

Contour lines

levels = np.percentile(density.ravel(), [20, 40, 60, 80, 95]) cs = ax.contour(xx, yy, density, levels=levels, colors='navy', linewidths=0.8, alpha=0.7)

Scatter subset

idx = np.random.choice(n, 100, replace=False) ax.scatter(x[idx], y[idx], c='darkred', s=10, alpha=0.5, zorder=5)

Labels and title

ax.set_xlabel('\(X\)', fontsize=12) ax.set_ylabel('\(Y\)', fontsize=12) ax.set_title(f'Bivariate Normal Distribution (\(\\rho = {rho}\))', fontsize=14, fontweight='bold')

Colorbar

cbar = plt.colorbar(im, ax=ax, shrink=0.85) cbar.set_label('Estimated Density', fontsize=11)

ax.tick_params(labelsize=10) plt.tight_layout() plt.show() ```


Summary

Step Code
Create grid xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
Stack positions positions = np.vstack([xx.ravel(), yy.ravel()])
Compute KDE kernel = stats.gaussian_kde(np.vstack([x, y]))
Evaluate density = kernel(positions).reshape(xx.shape)
Display ax.imshow(density.T, extent=[...], origin='lower')

Exercises

Exercise 1. Write code that generates 500 points from a bivariate normal distribution, computes a 2D KDE using scipy.stats.gaussian_kde, and visualizes it with ax.imshow(). Include origin='lower', an appropriate extent, and a colorbar.

Solution to Exercise 1

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

np.random.seed(42) data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 500) x, y = data[:, 0], data[:, 1]

xmin, xmax = x.min() - 1, x.max() + 1 ymin, ymax = y.min() - 1, y.max() + 1 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()])

kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape)

fig, ax = plt.subplots(figsize=(8, 6)) im = ax.imshow(density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto') ax.set_xlabel('X') ax.set_ylabel('Y') ax.set_title('2D KDE Visualization') plt.colorbar(im, ax=ax, label='Density') plt.show() ```


Exercise 2. Explain the role of the bw_method parameter in scipy.stats.gaussian_kde. What happens visually when you use a very small bandwidth (e.g., 0.05) versus a very large bandwidth (e.g., 1.0)?

Solution to Exercise 2

The bw_method parameter controls the bandwidth (smoothing) of the kernel density estimate. It determines how wide each Gaussian kernel is around each data point.

  • Very small bandwidth (e.g., 0.05): Each data point contributes a very narrow Gaussian. The result is a spiky, noisy density estimate that closely follows individual data points. This is called underfitting (high variance, low bias).
  • Very large bandwidth (e.g., 1.0): Each data point contributes a very wide Gaussian. The result is an overly smooth density estimate that blurs out important features like clusters or modes. This is called oversmoothing (low variance, high bias).

The default methods ('scott' and 'silverman') attempt to find a balanced bandwidth that captures the true density structure without excessive noise.


Exercise 3. Create a figure that overlays scatter points on top of a KDE heatmap. Use alpha=0.3 for the scatter points and the 'Blues' colormap for the KDE. Demonstrate this with bimodal data (two clusters).

Solution to Exercise 3

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

np.random.seed(42) c1 = np.random.multivariate_normal([2, 2], [[0.5, 0], [0, 0.5]], 300) c2 = np.random.multivariate_normal([-1, -1], [[0.3, 0.2], [0.2, 0.3]], 200) data = np.vstack([c1, c2]) x, y = data[:, 0], data[:, 1]

xmin, xmax = x.min() - 1, x.max() + 1 ymin, ymax = y.min() - 1, y.max() + 1 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()])

kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape)

fig, ax = plt.subplots(figsize=(8, 6)) im = ax.imshow(density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='Blues', aspect='auto') ax.scatter(x, y, c='red', s=10, alpha=0.3) ax.set_xlabel('X') ax.set_ylabel('Y') ax.set_title('KDE with Scatter Overlay (Bimodal Data)') plt.colorbar(im, ax=ax, label='Density') plt.show() ```


Exercise 4. Write code that creates a 1x3 subplot figure comparing hist2d, hexbin, and KDE (via imshow) side by side for the same dataset of 2000 points. Add a colorbar and title to each subplot.

Solution to Exercise 4

```python import matplotlib.pyplot as plt import numpy as np from scipy import stats

np.random.seed(42) data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 2000) x, y = data[:, 0], data[:, 1]

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

hist2d

, , _, im1 = axes[0].hist2d(x, y, bins=30, cmap='viridis') axes[0].set_title('hist2d') fig.colorbar(im1, ax=axes[0])

hexbin

hb = axes[1].hexbin(x, y, gridsize=20, cmap='viridis') axes[1].set_title('hexbin') fig.colorbar(hb, ax=axes[1])

KDE

xmin, xmax = x.min() - 1, x.max() + 1 ymin, ymax = y.min() - 1, y.max() + 1 xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j] positions = np.vstack([xx.ravel(), yy.ravel()]) kernel = stats.gaussian_kde(np.vstack([x, y])) density = kernel(positions).reshape(xx.shape)

im3 = axes[2].imshow(density.T, extent=[xmin, xmax, ymin, ymax], origin='lower', cmap='viridis', aspect='auto') axes[2].set_title('KDE (imshow)') fig.colorbar(im3, ax=axes[2])

plt.tight_layout() plt.show() ```