Common Broadcasting Patterns¶

Once the broadcasting rules are understood, a small set of recurring patterns covers most practical use cases. These patterns replace explicit Python loops and manually tiled copies of the smaller operand, making numerical code both faster and more readable. This page collects the patterns that appear most frequently in data analysis and scientific computing.

Centering Data¶

Subtract the column mean from every row to produce zero-mean columns.

1. Column-wise Centering¶

import numpy as np

def main():
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])  # (3, 3)

    col_means = X.mean(axis=0)       # (3,)
    X_centered = X - col_means       # (3, 3) - (3,) broadcasts
    print("Column means:", col_means)
    print("Centered:\n", X_centered)
    print("New column means:", X_centered.mean(axis=0))

if __name__ == "__main__":
    main()

Output:

Column means: [4. 5. 6.]
Centered:
 [[-3. -3. -3.]
 [ 0.  0.  0.]
 [ 3.  3.  3.]]
New column means: [0. 0. 0.]

2. Row-wise Centering¶

import numpy as np

def main():
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])  # (2, 3)

    row_means = X.mean(axis=1, keepdims=True)  # (2, 1)
    X_centered = X - row_means                  # (2, 3) - (2, 1) broadcasts
    print("Row means:\n", row_means)
    print("Centered:\n", X_centered)

if __name__ == "__main__":
    main()

Output:

Row means:
 [[2.]
 [5.]]
Centered:
 [[-1.  0.  1.]
 [-1.  0.  1.]]

3. keepdims is Essential¶

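Without keepdims=True, X.mean(axis=1) returns shape (2,) instead of (2, 1). Broadcasting aligns shapes starting from the trailing axis, so (2, 3) - (2,) compares 3 with 2 and raises a ValueError. With a square input such as (3, 3) the subtraction runs without error, but the row means are applied along the columns and the result is silently wrong.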
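
The two failure modes can be reproduced directly. The following is a minimal sketch rather than part of the original examples: the helper names `center_rows` and `center_rows_without_keepdims` are hypothetical, and the input arrays are chosen only for illustration.

```python
import numpy as np

def center_rows_without_keepdims(X):
    # Hypothetical helper: deliberately omits keepdims to show the pitfall.
    return X - X.mean(axis=1)                   # (m, n) - (m,)

def center_rows(X):
    # Correct version: the means keep a trailing size-1 axis.
    return X - X.mean(axis=1, keepdims=True)    # (m, n) - (m, 1)

def main():
    X_rect = np.array([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])        # (2, 3)
    try:
        center_rows_without_keepdims(X_rect)
    except ValueError as exc:
        print("Non-square input fails:", exc)

    X_sq = np.array([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0],
                     [7.0, 8.0, 9.0]])          # (3, 3)
    wrong = center_rows_without_keepdims(X_sq)  # runs, but misaligned
    right = center_rows(X_sq)
    print("Row means without keepdims:", wrong.mean(axis=1))
    print("Row means with keepdims:   ", right.mean(axis=1))

if __name__ == "__main__":
    main()
```

For the square input, the version without keepdims leaves row means of -3, 0 and 3 instead of zeros, which is why the bug is easy to miss when test data happens to be square.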

Standardization¶

Center each column and rescale it, either to unit variance or to a fixed range, in a single broadcasting expression.

1. Z-score Normalization¶

import numpy as np

def main():
    X = np.random.randn(100, 5)          # (100, 5)
    mu = X.mean(axis=0)                   # (5,)
    sigma = X.std(axis=0)                 # (5,)
    Z = (X - mu) / sigma                  # each column: mean 0, std 1
    print("Column means after:", np.round(Z.mean(axis=0), 10))
    print("Column stds after: ", np.round(Z.std(axis=0), 10))

if __name__ == "__main__":
    main()

Output (the input is random and unseeded, so the signs of the rounded zeros vary between runs):

Column means after: [-0. -0.  0. -0.  0.]
Column stds after:  [1. 1. 1. 1. 1.]

2. Min-Max Scaling¶

import numpy as np

def main():
    X = np.random.randn(100, 5)
    X_min = X.min(axis=0)                         # (5,)
    X_max = X.max(axis=0)                         # (5,)
    X_scaled = (X - X_min) / (X_max - X_min)      # all values in [0, 1]
    print("Min per column:", X_scaled.min(axis=0))
    print("Max per column:", X_scaled.max(axis=0))

if __name__ == "__main__":
    main()
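
One edge case is worth keeping in mind: if a column is constant, X_max - X_min is zero for that column and the division produces NaN. A possible guard is sketched below. The function name `min_max_scale` and the choice to map constant columns to 0 are assumptions made for illustration, not part of the example above.

```python
import numpy as np

def min_max_scale(X):
    X_min = X.min(axis=0)                        # (n,)
    span = X.max(axis=0) - X_min                 # (n,)
    # Assumption: constant columns (span == 0) are mapped to 0 instead of NaN.
    safe_span = np.where(span == 0, 1.0, span)   # (n,)
    return (X - X_min) / safe_span               # (m, n), two broadcasts

def main():
    X = np.array([[1.0, 5.0],
                  [3.0, 5.0],
                  [2.0, 5.0]])                   # second column is constant
    print(min_max_scale(X))

if __name__ == "__main__":
    main()
```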

3. Two Broadcasts in One Line¶

The expression (X - mu) / sigma performs two broadcasts: first mu with shape (5,) is subtracted from X with shape (100, 5), then the (100, 5) result is divided by sigma with shape (5,).
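
To see what each broadcast does, the stretched operands can be built explicitly with np.broadcast_to, which returns a read-only view of the requested shape. This is an illustrative sketch only: the helper name `standardize_explicit` is hypothetical and the seed is arbitrary.

```python
import numpy as np

def standardize_explicit(X):
    mu = X.mean(axis=0)                           # (n,)
    sigma = X.std(axis=0)                         # (n,)
    # Spell out the stretching that broadcasting performs implicitly.
    mu_full = np.broadcast_to(mu, X.shape)        # (m, n) view, no copy
    sigma_full = np.broadcast_to(sigma, X.shape)  # (m, n) view, no copy
    return (X - mu_full) / sigma_full

def main():
    rng = np.random.default_rng(0)                # arbitrary seed
    X = rng.standard_normal((100, 5))
    Z_implicit = (X - X.mean(axis=0)) / X.std(axis=0)
    Z_explicit = standardize_explicit(X)
    print("Same result:", np.allclose(Z_implicit, Z_explicit))

if __name__ == "__main__":
    main()
```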

Outer Products¶

Combine a column vector and a row vector to produce a 2D result.

1. Addition Table¶

import numpy as np

def main():
    a = np.array([1, 2, 3])[:, np.newaxis]  # (3, 1)
    b = np.array([10, 20, 30])[np.newaxis, :]  # (1, 3)
    table = a + b                              # (3, 3)
    print(table)

if __name__ == "__main__":
    main()

Output:

[[11 21 31]
 [12 22 32]
 [13 23 33]]

2. Multiplication Table¶

import numpy as np

def main():
    a = np.arange(1, 10)[:, np.newaxis]  # (9, 1)
    b = np.arange(1, 10)[np.newaxis, :]  # (1, 9)
    table = a * b                         # (9, 9)
    print(table)

if __name__ == "__main__":
    main()

3. np.newaxis vs reshape¶

np.newaxis is an alias for None; used inside an index, it inserts a size-1 dimension. For a 1D array a, these are equivalent:

a[:, np.newaxis]   # (n,) → (n, 1)
a[:, None]         # same
a.reshape(-1, 1)   # same
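
The equivalence can be checked with a short script. This is a sketch with a hypothetical helper name, `column_variants`, and it assumes a contiguous 1D input, for which all three forms return views rather than copies.

```python
import numpy as np

def column_variants(a):
    # Three spellings of "turn a 1D array into a column".
    return a[:, np.newaxis], a[:, None], a.reshape(-1, 1)

def main():
    a = np.arange(4)                              # (4,)
    for col in column_variants(a):
        # Each variant has shape (4, 1) and shares memory with a.
        print(col.shape, np.shares_memory(a, col))
    print("np.newaxis is None:", np.newaxis is None)

if __name__ == "__main__":
    main()
```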

Pairwise Distances¶

Compute distances between all pairs of points without loops.

1. Euclidean Distance Matrix¶

import numpy as np

def main():
    # 4 points in 3D space
    X = np.random.randn(4, 3)             # (4, 3)

    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]  # (4, 1, 3) - (1, 4, 3) → (4, 4, 3)
    dist = np.sqrt((diff ** 2).sum(axis=2))            # (4, 4)
    print("Distance matrix:\n", np.round(dist, 2))

if __name__ == "__main__":
    main()

2. Shape Breakdown¶

Expression	Shape	Explanation
`X[:, np.newaxis, :]`	`(4, 1, 3)`	Point i varies along axis 0; axis 1 is a size-1 placeholder
`X[np.newaxis, :, :]`	`(1, 4, 3)`	Point j varies along axis 1; axis 0 is a size-1 placeholder
`diff`	`(4, 4, 3)`	`diff[i, j]` holds the coordinate differences `X[i] - X[j]`
`dist`	`(4, 4)`	Euclidean distances after sum and sqrt
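
As a cross-check of the shapes in the table, the broadcast version can be compared against an explicit double loop. This is a minimal sketch; the function names `pairwise_dist_broadcast` and `pairwise_dist_loops` are hypothetical, and the loop version is included only as a reference.

```python
import numpy as np

def pairwise_dist_broadcast(X):
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]   # (n, n, d)
    return np.sqrt((diff ** 2).sum(axis=2))            # (n, n)

def pairwise_dist_loops(X):
    # Reference implementation: one pair of points at a time.
    n = X.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            dist[i, j] = np.sqrt(((X[i] - X[j]) ** 2).sum())
    return dist

def main():
    rng = np.random.default_rng(0)                     # arbitrary seed
    X = rng.standard_normal((4, 3))
    same = np.allclose(pairwise_dist_broadcast(X), pairwise_dist_loops(X))
    print("Broadcast matches loops:", same)

if __name__ == "__main__":
    main()
```

Note that the intermediate diff array holds n * n * d values, so for large n the broadcast version trades memory for speed.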

3. Symmetry Check¶

import numpy as np

def main():
    X = np.random.randn(5, 3)
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    print("Symmetric:", np.allclose(dist, dist.T))
    print("Zero diagonal:", np.allclose(np.diag(dist), 0))

if __name__ == "__main__":
    main()

Row or Column Scaling¶

Multiply each row or column by a different weight.

1. Scale Columns¶

import numpy as np

def main():
    X = np.ones((3, 4))                   # (3, 4)
    weights = np.array([1, 2, 3, 4])      # (4,)
    result = X * weights                   # each column scaled
    print(result)

if __name__ == "__main__":
    main()

Output:

[[1. 2. 3. 4.]
 [1. 2. 3. 4.]
 [1. 2. 3. 4.]]

2. Scale Rows¶

import numpy as np

def main():
    X = np.ones((3, 4))                          # (3, 4)
    weights = np.array([10, 20, 30])[:, np.newaxis]  # (3, 1)
    result = X * weights                          # each row scaled
    print(result)

if __name__ == "__main__":
    main()

Output:

[[10. 10. 10. 10.]
 [20. 20. 20. 20.]
 [30. 30. 30. 30.]]

3. Key Difference¶

Column scaling uses a 1D vector with shape (n_cols,) that aligns with the last axis. Row scaling requires a column vector with shape (n_rows, 1) via np.newaxis or reshape.
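
The difference shows up immediately on a non-square array. In the sketch below, the helper names `scale_columns` and `scale_rows` are hypothetical; the point is that passing a length-n_rows vector without the extra axis fails on a (3, 4) input.

```python
import numpy as np

def scale_columns(X, weights):
    return X * weights                    # (m, n) * (n,)

def scale_rows(X, weights):
    return X * weights[:, np.newaxis]     # (m, n) * (m, 1)

def main():
    X = np.ones((3, 4))
    row_w = np.array([10, 20, 30])        # (3,)
    try:
        X * row_w                         # (3, 4) * (3,): trailing axes 4 and 3 differ
    except ValueError as exc:
        print("Missing newaxis:", exc)
    print(scale_rows(X, row_w))

if __name__ == "__main__":
    main()
```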

Boolean Masking with Broadcasting¶

Combine boolean conditions across different dimensions.

1. Threshold per Column¶

import numpy as np

def main():
    X = np.array([[1, 5, 3],
                  [4, 2, 6],
                  [7, 8, 1]])             # (3, 3)
    thresholds = np.array([3, 4, 5])      # (3,)
    mask = X > thresholds                  # (3, 3) > (3,) broadcasts
    print("Mask:\n", mask)
    print("Values above thresholds:", X[mask])

if __name__ == "__main__":
    main()

Output:

Mask:
 [[False  True False]
 [ True False  True]
 [ True  True False]]
Values above thresholds: [5 4 6 7 8]

2. Range Check¶

import numpy as np

def main():
    X = np.random.randn(5, 3)
    lower = np.array([-1, -0.5, 0])       # (3,)
    upper = np.array([1, 0.5, 2])         # (3,)
    in_range = (X >= lower) & (X <= upper)
    print("In range:\n", in_range)

if __name__ == "__main__":
    main()

3. Combining Row and Column Conditions¶

import numpy as np

def main():
    row_mask = np.array([True, False, True])[:, np.newaxis]  # (3, 1)
    col_mask = np.array([True, True, False])[np.newaxis, :]  # (1, 3)
    combined = row_mask & col_mask                            # (3, 3)
    print(combined)

if __name__ == "__main__":
    main()
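
A combined mask like this is typically used to index a data array of the same shape. The sketch below applies it to a small integer array; the helper name `select_block` and the sample array are assumptions made for illustration.

```python
import numpy as np

def select_block(X, keep_rows, keep_cols):
    # (m, 1) & (1, n) broadcasts to an (m, n) mask.
    mask = keep_rows[:, np.newaxis] & keep_cols[np.newaxis, :]
    return mask, X[mask]                  # boolean indexing flattens the result

def main():
    X = np.arange(9).reshape(3, 3)
    keep_rows = np.array([True, False, True])
    keep_cols = np.array([True, True, False])
    mask, values = select_block(X, keep_rows, keep_cols)
    print("Mask:\n", mask)
    print("Selected values:", values)     # elements in kept rows and kept columns

if __name__ == "__main__":
    main()
```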

Summary¶

The most common broadcasting patterns share the same underlying mechanism: axes that are missing or have size 1 in one operand are stretched to match the corresponding axes of the other.

Pattern	Typical Shapes	Key Technique
Column centering	`(m, n) - (n,)`	`mean(axis=0)`
Row centering	`(m, n) - (m, 1)`	`mean(axis=1, keepdims=True)`
Standardization	`(m, n) - (n,)` then `/ (n,)`	Two broadcasts in sequence
Outer product	`(m, 1) * (1, n)`	`np.newaxis`
Pairwise distance	`(m, 1, d) - (1, n, d)`	3D broadcasting
Row/column scaling	`(m, n) * (n,)` or `(m, n) * (m, 1)`	Weight vector alignment
Boolean masking	`(m, n) > (n,)`	Threshold per column
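
The shapes in the table can be checked without allocating any data by using np.broadcast_shapes. This is a sketch that assumes NumPy 1.20 or later, where that function is available; the helper name `result_shapes` and the sizes m, n, d are arbitrary choices for illustration.

```python
import numpy as np

def result_shapes(m, n, d):
    # Operand shapes for each pattern in the summary table.
    patterns = {
        "Column centering": [(m, n), (n,)],
        "Row centering": [(m, n), (m, 1)],
        "Outer product": [(m, 1), (1, n)],
        "Pairwise distance": [(m, 1, d), (1, n, d)],
        "Row scaling": [(m, n), (m, 1)],
        "Boolean masking": [(m, n), (n,)],
    }
    return {name: np.broadcast_shapes(*shapes)
            for name, shapes in patterns.items()}

def main():
    for name, shape in result_shapes(m=6, n=4, d=3).items():
        print(f"{name}: {shape}")

if __name__ == "__main__":
    main()
```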