Common Broadcasting Patterns
Once the broadcasting rules are understood, a small set of recurring patterns covers most practical use cases. These patterns replace explicit Python loops and manually tiled arrays with single array expressions, making numerical code both faster and more readable. This page collects the patterns that appear most frequently in data analysis and scientific computing.
Centering Data
Subtract the mean along one axis so that every column (or every row) has zero mean.
1. Column-wise Centering
import numpy as np

def main():
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])  # (3, 3)
    col_means = X.mean(axis=0)        # (3,)
    X_centered = X - col_means        # (3, 3) - (3,) broadcasts
    print("Column means:", col_means)
    print("Centered:\n", X_centered)
    print("New column means:", X_centered.mean(axis=0))

if __name__ == "__main__":
    main()
Output:
Column means: [4. 5. 6.]
Centered:
 [[-3. -3. -3.]
 [ 0.  0.  0.]
 [ 3.  3.  3.]]
New column means: [0. 0. 0.]
2. Row-wise Centering
import numpy as np

def main():
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])            # (2, 3)
    row_means = X.mean(axis=1, keepdims=True)  # (2, 1)
    X_centered = X - row_means                 # (2, 3) - (2, 1) broadcasts
    print("Row means:\n", row_means)
    print("Centered:\n", X_centered)

if __name__ == "__main__":
    main()
Output:
Row means:
 [[2.]
 [5.]]
Centered:
 [[-1.  0.  1.]
 [-1.  0.  1.]]
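3. keepdims is Essential
Without keepdims=True, X.mean(axis=1) returns shape (2,) instead of (2, 1). For the (2, 3) array above, the subtraction then raises a ValueError, because (2, 3) and (2,) are not broadcast-compatible. When X is square the shapes are compatible, so the subtraction runs but silently subtracts the mean of row j from column j instead of from row j.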
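The two failure modes can be reproduced with a short sketch. The helper names center_rows and center_rows_without_keepdims are made up for this illustration, and the example arrays are arbitrary.

```python
import numpy as np


def center_rows(X):
    """Subtract each row's mean; keepdims=True keeps the means as (m, 1)."""
    return X - X.mean(axis=1, keepdims=True)


def center_rows_without_keepdims(X):
    """Same idea without keepdims: the means have shape (m,)."""
    return X - X.mean(axis=1)


if __name__ == "__main__":
    rect = np.array([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])            # (2, 3)
    try:
        center_rows_without_keepdims(rect)       # (2, 3) - (2,) cannot broadcast
    except ValueError as exc:
        print("ValueError:", exc)

    square = np.array([[1.0, 2.0],
                       [5.0, 8.0]])              # (2, 2)
    print(center_rows(square))                   # every row sums to zero
    print(center_rows_without_keepdims(square))  # runs, but rows do not sum to zero
```

The square case is the more dangerous one, since nothing signals that the result is wrong.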
Standardization
Center and rescale every column in a single broadcasting expression.
1. Z-score Normalization
import numpy as np

def main():
    X = np.random.randn(100, 5)  # (100, 5)
    mu = X.mean(axis=0)          # (5,)
    sigma = X.std(axis=0)        # (5,)
    Z = (X - mu) / sigma         # each column: mean 0, std 1
    print("Column means after:", np.round(Z.mean(axis=0), 10))
    print("Column stds after: ", np.round(Z.std(axis=0), 10))

if __name__ == "__main__":
    main()
Output (the input is random, so the signs of the rounded zeros vary between runs):
Column means after: [-0. -0.  0. -0.  0.]
Column stds after:  [1. 1. 1. 1. 1.]
2. Min-Max Scaling
import numpy as np

def main():
    X = np.random.randn(100, 5)
    X_min = X.min(axis=0)                     # (5,)
    X_max = X.max(axis=0)                     # (5,)
    X_scaled = (X - X_min) / (X_max - X_min)  # all values in [0, 1]
    print("Min per column:", X_scaled.min(axis=0))
    print("Max per column:", X_scaled.max(axis=0))

if __name__ == "__main__":
    main()
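The expression above divides by X_max - X_min, which is zero for any constant column. A minimal sketch of one way to guard that case follows; the function name min_max_scale and the convention of mapping constant columns to 0 are assumptions of this example, not something the page prescribes.

```python
import numpy as np


def min_max_scale(X):
    """Scale each column to [0, 1]; constant columns map to 0 (assumed convention)."""
    X = np.asarray(X, dtype=float)
    X_min = X.min(axis=0)                       # (n,)
    span = X.max(axis=0) - X_min                # (n,)
    safe_span = np.where(span == 0, 1.0, span)  # avoid 0/0 for constant columns
    return (X - X_min) / safe_span              # (m, n) - (n,) then / (n,)


if __name__ == "__main__":
    X = np.array([[1, 5],
                  [3, 5],
                  [2, 5]])                      # second column is constant
    print(min_max_scale(X))
```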
3. Two Broadcasts in One Line
The expression (X - mu) / sigma performs two broadcasts: mu with shape (5,) is subtracted from X with shape (100, 5), and the (100, 5) result is then divided by sigma, which also has shape (5,).
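To make the implicit stretching visible, the sketch below builds the (100, 5) views explicitly with np.broadcast_to and checks that the result matches the broadcast expression. The seed and the variable names mu_full and sigma_full are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))           # (100, 5)
mu = X.mean(axis=0)                         # (5,)
sigma = X.std(axis=0)                       # (5,)

# What broadcasting does implicitly: view (5,) as (100, 5) without copying data
mu_full = np.broadcast_to(mu, X.shape)      # (100, 5), stride 0 along axis 0
sigma_full = np.broadcast_to(sigma, X.shape)

Z_explicit = (X - mu_full) / sigma_full     # no broadcasting needed, shapes match
Z_broadcast = (X - mu) / sigma              # two broadcasts in one line

print(mu_full.shape, mu_full.strides)
print(np.allclose(Z_explicit, Z_broadcast))
```

The zero stride along the first axis shows that the stretched array repeats the same five values for every row rather than storing 100 copies.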
Outer Products
Combine a column vector and a row vector to produce a 2D result.
1. Addition Table
import numpy as np

def main():
    a = np.array([1, 2, 3])[:, np.newaxis]     # (3, 1)
    b = np.array([10, 20, 30])[np.newaxis, :]  # (1, 3)
    table = a + b                              # (3, 3)
    print(table)

if __name__ == "__main__":
    main()
Output:
[[11 21 31]
 [12 22 32]
 [13 23 33]]
2. Multiplication Table
import numpy as np

def main():
    a = np.arange(1, 10)[:, np.newaxis]  # (9, 1)
    b = np.arange(1, 10)[np.newaxis, :]  # (1, 9)
    table = a * b                        # (9, 9)
    print(table)

if __name__ == "__main__":
    main()
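NumPy also provides built-in outer operations that should give the same tables for 1D inputs. The sketch below compares the broadcasting form against np.add.outer and np.outer as a cross-check; it is offered as an alternative spelling, not as the approach this page recommends.

```python
import numpy as np

a = np.array([1, 2, 3])       # (3,)
b = np.array([10, 20, 30])    # (3,)

# Broadcasting form: (3, 1) op (1, 3) -> (3, 3)
add_table = a[:, np.newaxis] + b[np.newaxis, :]
mul_table = a[:, np.newaxis] * b[np.newaxis, :]

# Built-in equivalents for 1D inputs
same_add = np.array_equal(add_table, np.add.outer(a, b))
same_mul = np.array_equal(mul_table, np.outer(a, b))

print(same_add, same_mul)
```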
3. np.newaxis vs reshape
np.newaxis (alias for None) inserts a size-1 dimension. These are equivalent:
a[:, np.newaxis]   # (n,) → (n, 1)
a[:, None]         # same
a.reshape(-1, 1)   # same
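A quick check of that equivalence for a small 1D array is sketched below. np.expand_dims is included as a fourth spelling that the text above does not mention, and the variable names are arbitrary.

```python
import numpy as np

a = np.arange(4)                          # (4,)

col_newaxis = a[:, np.newaxis]            # (4, 1)
col_none = a[:, None]                     # (4, 1)
col_reshape = a.reshape(-1, 1)            # (4, 1)
col_expand = np.expand_dims(a, axis=1)    # (4, 1), not mentioned in the text above

columns = [col_newaxis, col_none, col_reshape, col_expand]

print(np.newaxis is None)                                       # the alias is literal
print([c.shape for c in columns])
print(all(np.array_equal(col_newaxis, c) for c in columns))
print(all(np.shares_memory(a, c) for c in columns))             # views, not copies
```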
Pairwise Distances
Compute distances between all pairs of points without loops.
1. Euclidean Distance Matrix
import numpy as np

def main():
    # 4 points in 3D space
    X = np.random.randn(4, 3)                         # (4, 3)
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]  # (4, 1, 3) - (1, 4, 3) → (4, 4, 3)
    dist = np.sqrt((diff ** 2).sum(axis=2))           # (4, 4)
    print("Distance matrix:\n", np.round(dist, 2))

if __name__ == "__main__":
    main()
2. Shape Breakdown
| Expression | Shape | Explanation |
|---|---|---|
| X[:, np.newaxis, :] | (4, 1, 3) | Each point as a row-block |
| X[np.newaxis, :, :] | (1, 4, 3) | Each point as a column-block |
| diff | (4, 4, 3) | All pairwise coordinate differences |
| dist | (4, 4) | Euclidean distances after sum and sqrt |
3. Symmetry Check
import numpy as np

def main():
    X = np.random.randn(5, 3)
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    print("Symmetric:", np.allclose(dist, dist.T))
    print("Zero diagonal:", np.allclose(np.diag(dist), 0))

if __name__ == "__main__":
    main()
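The difference tensor has shape (n, n, d), so its memory grows with the square of the number of points. As a hedged alternative that this page does not itself use, the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b needs only (n, n) intermediates and still relies on broadcasting for the two norm terms. The function names below are made up for this sketch.

```python
import numpy as np


def pairwise_dist_broadcast(X):
    """Distance matrix via the (n, 1, d) - (1, n, d) difference tensor."""
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]   # (n, n, d)
    return np.sqrt((diff ** 2).sum(axis=2))            # (n, n)


def pairwise_dist_gram(X):
    """Distance matrix via ||a||^2 + ||b||^2 - 2 a.b, with no (n, n, d) temporary."""
    sq_norms = (X ** 2).sum(axis=1)                    # (n,)
    # (n, 1) + (1, n) broadcasts to (n, n); X @ X.T holds the dot products
    sq_dist = sq_norms[:, np.newaxis] + sq_norms[np.newaxis, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dist, 0.0, out=sq_dist)              # clip tiny negatives from rounding
    return np.sqrt(sq_dist)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 3))
    print(np.allclose(pairwise_dist_broadcast(X), pairwise_dist_gram(X), atol=1e-6))
```

The second form is less accurate for nearly identical points because of cancellation, which is why the comparison uses a loose absolute tolerance.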
Row or Column Scaling
Multiply each row or column by a different weight.
1. Scale Columns
import numpy as np

def main():
    X = np.ones((3, 4))               # (3, 4)
    weights = np.array([1, 2, 3, 4])  # (4,)
    result = X * weights              # each column scaled
    print(result)

if __name__ == "__main__":
    main()
Output:
[[1. 2. 3. 4.]
 [1. 2. 3. 4.]
 [1. 2. 3. 4.]]
2. Scale Rows
import numpy as np

def main():
    X = np.ones((3, 4))                              # (3, 4)
    weights = np.array([10, 20, 30])[:, np.newaxis]  # (3, 1)
    result = X * weights                             # each row scaled
    print(result)

if __name__ == "__main__":
    main()
Output:
[[10. 10. 10. 10.]
 [20. 20. 20. 20.]
 [30. 30. 30. 30.]]
3. Key Difference
Column scaling uses a 1D vector with shape (n_cols,) that aligns with the last axis. Row scaling requires a column vector with shape (n_rows, 1) via np.newaxis or reshape.
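One place this difference shows up is normalizing a table so that each row, or each column, sums to 1. The sketch below is an illustrative application; the function names and the sample array are assumptions of this example.

```python
import numpy as np


def normalize_rows(X):
    """Divide each row by its own sum: (m, n) / (m, 1)."""
    row_sums = X.sum(axis=1)               # (m,)
    return X / row_sums[:, np.newaxis]     # column vector aligns with the rows


def normalize_columns(X):
    """Divide each column by its own sum: (m, n) / (n,)."""
    return X / X.sum(axis=0)               # 1D vector aligns with the last axis


if __name__ == "__main__":
    counts = np.array([[1.0, 3.0],
                       [2.0, 2.0],
                       [4.0, 0.0]])         # (3, 2)
    print(normalize_rows(counts).sum(axis=1))     # one value per row
    print(normalize_columns(counts).sum(axis=0))  # one value per column
```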
Boolean Masking with Broadcasting
Combine boolean conditions across different dimensions.
1. Threshold per Column
import numpy as np

def main():
    X = np.array([[1, 5, 3],
                  [4, 2, 6],
                  [7, 8, 1]])         # (3, 3)
    thresholds = np.array([3, 4, 5])  # (3,)
    mask = X > thresholds             # (3, 3) > (3,) broadcasts
    print("Mask:\n", mask)
    print("Values above thresholds:", X[mask])

if __name__ == "__main__":
    main()
Output:
Mask:
 [[False  True False]
 [ True False  True]
 [ True  True False]]
Values above thresholds: [5 4 6 7 8]
2. Range Check
import numpy as np

def main():
    X = np.random.randn(5, 3)
    lower = np.array([-1, -0.5, 0])  # (3,)
    upper = np.array([1, 0.5, 2])    # (3,)
    in_range = (X >= lower) & (X <= upper)
    print("In range:\n", in_range)

if __name__ == "__main__":
    main()
3. Combining Row and Column Conditions
import numpy as np

def main():
    row_mask = np.array([True, False, True])[:, np.newaxis]  # (3, 1)
    col_mask = np.array([True, True, False])[np.newaxis, :]  # (1, 3)
    combined = row_mask & col_mask                           # (3, 3)
    print(combined)

if __name__ == "__main__":
    main()
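Once the combined mask exists, it can be applied to a data array of the same shape. The sketch below shows two common uses, selecting the kept values and zeroing out the rest; the array X and the fill value 0 are arbitrary choices for illustration.

```python
import numpy as np

X = np.arange(9).reshape(3, 3)                           # (3, 3), values 0..8
row_mask = np.array([True, False, True])[:, np.newaxis]  # (3, 1)
col_mask = np.array([True, True, False])[np.newaxis, :]  # (1, 3)

combined = row_mask & col_mask     # (3, 3): True where both row and column are kept
selected = X[combined]             # 1D array of the kept values, in row-major order
masked = np.where(combined, X, 0)  # same shape as X, everything else set to 0

print(combined)
print(selected)
print(masked)
```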
Summary
The most common broadcasting patterns share the same underlying mechanism: shapes are aligned from the last axis, and any axis that is missing or has size 1 is stretched to match the other operand.
| Pattern | Typical Shapes | Key Technique |
|---|---|---|
| Column centering | (m, n) - (n,) | mean(axis=0) |
| Row centering | (m, n) - (m, 1) | mean(axis=1, keepdims=True) |
| Standardization | (m, n) - (n,) then / (n,) | Two broadcasts in sequence |
| Outer product | (m, 1) * (1, n) | np.newaxis |
| Pairwise distance | (m, 1, d) - (1, n, d) | 3D broadcasting |
| Row/column scaling | (m, n) * (n,) or (m, 1) | Weight vector alignment |
| Boolean masking | (m, n) > (n,) | Threshold per column |
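The result shape of each pattern can be checked without allocating any data. The sketch below uses np.broadcast_shapes, which assumes NumPy 1.20 or newer; the sizes m, n, d and the dictionary layout are arbitrary choices for this illustration.

```python
import numpy as np

m, n, d = 4, 3, 2  # arbitrary example sizes

# Operand shapes for each pattern in the table above
patterns = {
    "column centering": ((m, n), (n,)),
    "row centering": ((m, n), (m, 1)),
    "outer product": ((m, 1), (1, n)),
    "pairwise distance": ((m, 1, d), (1, n, d)),
    "row scaling": ((m, n), (m, 1)),
    "boolean masking": ((m, n), (n,)),
}

# np.broadcast_shapes applies the broadcasting rules to shapes only
result_shapes = {name: np.broadcast_shapes(*shapes) for name, shapes in patterns.items()}

for name, shape in result_shapes.items():
    print(f"{name}: {shape}")
```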