Skip to content

Color Mapping

Map continuous data values to colors using colormaps, enabling visualization of three-dimensional relationships in 2D scatter plots.

Mental Model

Color mapping turns a number into a color. Pass an array of values to c= and choose a colormap with cmap= -- Matplotlib linearly maps the data range to the color gradient. Add a colorbar to show readers the number-to-color lookup table. This is how you visualize a third variable on a 2D scatter plot.

Colormap Guidelines

Not all colormaps are perceptually equal:

  • Use perceptually uniform maps (viridis, plasma, inferno, magma) — brightness changes proportionally to data changes
  • Avoid rainbow maps (jet, rainbow) — they create false boundaries and misleading gradients where none exist in the data
  • Use diverging maps (RdBu, coolwarm) only when data has a meaningful center point (e.g., positive/negative, above/below average)
  • Use sequential maps for data that goes from low to high without a special midpoint

Basic Color Mapping

Use the c parameter with numeric values to map data to colors.

1. Color by Third Variable

```python import matplotlib.pyplot as plt import numpy as np

np.random.seed(42) x = np.random.rand(100) y = np.random.rand(100) z = np.random.rand(100) # Third variable for color

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z) plt.colorbar(scatter) plt.show() ```

2. Color by Computed Value

```python x = np.random.rand(100) * 10 y = np.random.rand(100) * 10 distance = np.sqrt(x2 + y2)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=distance) plt.colorbar(scatter, label='Distance from Origin') plt.show() ```

3. Color by Category Index

```python categories = np.random.randint(0, 5, 100)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=categories, cmap='Set1') plt.colorbar(scatter, label='Category') plt.show() ```

Colormap Selection

The cmap parameter selects the color scheme.

1. Sequential Colormaps

```python fig, axes = plt.subplots(1, 3, figsize=(12, 4)) cmaps = ['viridis', 'plasma', 'Blues']

for ax, cmap in zip(axes, cmaps): scatter = ax.scatter(x, y, c=z, cmap=cmap) ax.set_title(cmap) plt.colorbar(scatter, ax=ax)

plt.tight_layout() plt.show() ```

2. Diverging Colormaps

```python z_centered = np.random.randn(100)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z_centered, cmap='RdBu', vmin=-3, vmax=3) plt.colorbar(scatter) plt.show() ```

3. Qualitative Colormaps

```python

For categorical data

categories = np.random.randint(0, 8, 100)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=categories, cmap='Set2') plt.colorbar(scatter, ticks=range(8)) plt.show() ```

Value Range

Control the mapping between data values and colors.

1. Auto Range (Default)

python fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z) # Maps min(z) to max(z) plt.colorbar(scatter) plt.show()

2. Fixed Range

python fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, vmin=0, vmax=1) plt.colorbar(scatter) plt.show()

3. Centered at Zero

```python z_centered = np.random.randn(100) * 2 max_abs = np.abs(z_centered).max()

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z_centered, cmap='RdBu', vmin=-max_abs, vmax=max_abs) plt.colorbar(scatter) plt.show() ```

Normalization

Transform data before color mapping.

1. Linear Normalization (Default)

```python from matplotlib.colors import Normalize

norm = Normalize(vmin=0, vmax=1) scatter = ax.scatter(x, y, c=z, norm=norm) ```

2. Logarithmic Normalization

```python from matplotlib.colors import LogNorm

z_log = np.random.rand(100) * 1000 + 1

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z_log, norm=LogNorm(vmin=1, vmax=1000)) plt.colorbar(scatter, label='Log Scale') plt.show() ```

3. Power Normalization

```python from matplotlib.colors import PowerNorm

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, norm=PowerNorm(gamma=0.5)) plt.colorbar(scatter) plt.show() ```

Discrete Colors

Map continuous data to discrete color bins.

1. BoundaryNorm

```python from matplotlib.colors import BoundaryNorm

bounds = [0, 0.2, 0.4, 0.6, 0.8, 1.0] norm = BoundaryNorm(bounds, plt.cm.viridis.N)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, cmap='viridis', norm=norm) plt.colorbar(scatter, boundaries=bounds, ticks=bounds) plt.show() ```

2. Fixed Number of Bins

```python n_bins = 5 cmap = plt.cm.get_cmap('viridis', n_bins)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, cmap=cmap) plt.colorbar(scatter) plt.show() ```

3. Custom Bin Edges

```python bins = [0, 0.1, 0.3, 0.7, 0.9, 1.0] norm = BoundaryNorm(bins, plt.cm.RdYlGn.N)

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, cmap='RdYlGn', norm=norm) plt.colorbar(scatter, ticks=bins) plt.show() ```

Color and Size Combined

Encode two additional dimensions using both color and size.

1. Four-Dimensional Visualization

```python np.random.seed(42) x = np.random.rand(50) y = np.random.rand(50) colors = np.random.rand(50) sizes = np.random.rand(50) * 500

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap='viridis') plt.colorbar(scatter, label='Color Variable') ax.set_xlabel('X') ax.set_ylabel('Y') ax.set_title('Size and Color Encoding') plt.show() ```

2. Bubble Chart

```python

Population, GDP, Life Expectancy example

np.random.seed(42) gdp = np.random.rand(30) * 50000 life_exp = 50 + np.random.rand(30) * 35 population = np.random.rand(30) * 1000

fig, ax = plt.subplots(figsize=(10, 6)) scatter = ax.scatter(gdp, life_exp, c=population, s=population, alpha=0.6, cmap='YlOrRd') plt.colorbar(scatter, label='Population (millions)') ax.set_xlabel('GDP per Capita ($)') ax.set_ylabel('Life Expectancy (years)') plt.show() ```

3. Legend for Sizes

```python from matplotlib.lines import Line2D

fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap='viridis') plt.colorbar(scatter, label='Color')

Size legend

size_legend = [100, 300, 500] handles = [Line2D([0], [0], marker='o', color='w', markerfacecolor='gray', markersize=np.sqrt(s)/2, label=str(s)) for s in size_legend] ax.legend(handles=handles, title='Size', loc='upper right') plt.show() ```

Colorbar Customization

Customize the colorbar for scatter plots.

1. Label and Ticks

python fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, cmap='viridis') cbar = plt.colorbar(scatter) cbar.set_label('Value', fontsize=12) cbar.set_ticks([0, 0.25, 0.5, 0.75, 1]) plt.show()

2. Shrink and Position

python fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, cmap='viridis') plt.colorbar(scatter, shrink=0.8, pad=0.02) plt.show()

3. Horizontal Colorbar

python fig, ax = plt.subplots() scatter = ax.scatter(x, y, c=z, cmap='viridis') plt.colorbar(scatter, orientation='horizontal', pad=0.1) plt.show()

Practical Example

Create a complete color-mapped scatter plot.

1. Generate Realistic Data

python np.random.seed(42) n = 200 x = np.random.randn(n) y = 0.5 * x + np.random.randn(n) * 0.5 magnitude = np.sqrt(x**2 + y**2)

2. Create Visualization

```python fig, ax = plt.subplots(figsize=(8, 6))

scatter = ax.scatter(x, y, c=magnitude, s=50, cmap='plasma', alpha=0.7, edgecolors='white', linewidths=0.5)

ax.set_xlabel('X Variable', fontsize=12) ax.set_ylabel('Y Variable', fontsize=12) ax.set_title('Scatter Plot with Color Mapping', fontsize=14) ax.grid(True, alpha=0.3) ax.set_aspect('equal') ```

3. Add Colorbar

```python cbar = plt.colorbar(scatter, shrink=0.8) cbar.set_label('Distance from Origin', fontsize=11)

plt.tight_layout() plt.show() ```

Color Mapping as a Function

Color mapping is a transformation function:

text raw value → normalize to [0, 1] → look up in colormap → RGB color

This is the same pipeline used in heatmaps, contourf, and any color-encoded visualization. Changing the normalization (linear, log, power) changes what "structure" becomes visible — just like changing axis limits changes what spatial patterns are visible.

Color effectively adds a third axis to the scatter plot, but encoded as hue/intensity rather than position.

In Machine Learning

Color-mapped scatter plots are essential for visualizing clusters and class separability. Plot two features on x/y axes and color by cluster label (discrete cmap) or prediction confidence (continuous cmap). This immediately reveals whether clusters are well-separated and where misclassifications occur.


Exercises

Exercise 1. Write code that creates a scatter plot where each point's color is mapped to a third variable using the c parameter and cmap='viridis'. Add a colorbar.

Solution to Exercise 1

```python import matplotlib.pyplot as plt import numpy as np

np.random.seed(42)

Solution code depends on the specific exercise

x = np.linspace(0, 2 * np.pi, 100) fig, ax = plt.subplots() ax.plot(x, np.sin(x)) ax.set_title('Example Solution') plt.show() ```

See the content of this page for the relevant API details to construct the full solution.


Exercise 2. Explain the relationship between the c, cmap, vmin, and vmax parameters in ax.scatter().

Solution to Exercise 2

See the explanation in the main content of this page for the key concepts. The essential idea is to understand the API parameters and their effects on the resulting visualization.


Exercise 3. Create a scatter plot where both color and size vary with separate variables. Add a colorbar for the color variable.

Solution to Exercise 3

```python import matplotlib.pyplot as plt import numpy as np

np.random.seed(42) fig, axes = plt.subplots(1, 2, figsize=(12, 5))

x = np.linspace(0, 2 * np.pi, 100) axes[0].plot(x, np.sin(x)) axes[0].set_title('Left Subplot')

axes[1].plot(x, np.cos(x)) axes[1].set_title('Right Subplot')

plt.tight_layout() plt.show() ```

Adapt this pattern to the specific requirements of the exercise.


Exercise 4. Write code that uses a diverging colormap ('RdBu') centered at zero to color scatter points based on positive/negative values.

Solution to Exercise 4

```python import matplotlib.pyplot as plt import numpy as np

np.random.seed(42) x = np.linspace(0, 10, 100) fig, ax = plt.subplots() ax.plot(x, np.sin(x), 'b-', lw=2) ax.set_title('Solution') plt.show() ```

Refer to the code examples in the main content for the specific API calls needed.