Singular Value Decomposition¶
SVD factorizes any matrix into orthogonal components.
Mental Model
SVD decomposes any matrix \(A\) into \(U \Sigma V^T\), where \(U\) and \(V\) are rotations and \(\Sigma\) is a diagonal of scaling factors (singular values). The singular values tell you how much "signal" each component carries. Truncating to the top \(k\) singular values gives the best rank-\(k\) approximation -- the basis of PCA, image compression, and noise reduction.
SVD in Machine Learning and Data Science
SVD is the computational backbone behind many ML techniques:
- PCA: projecting data onto the top singular vectors
- Linear regression:
lstsquses SVD internally for numerical stability - Recommender systems: low-rank matrix factorization of user-item matrices
- Natural language processing: latent semantic analysis (LSA)
- Neural networks: weight initialization and analyzing layer Jacobians
If you learn one decomposition deeply, make it SVD — it is the most general factorization and the one you will encounter most often in practice.
np.linalg.svd¶
1. Basic Usage¶
```python import numpy as np
def main(): A = np.array([[1, 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A)
print(f"A shape: {A.shape}")
print(f"U shape: {U.shape}")
print(f"s shape: {s.shape}")
print(f"Vt shape: {Vt.shape}")
print()
print(f"Singular values: {s}")
if name == "main": main() ```
2. Mathematical Form¶
- \(U\): Left singular vectors (orthogonal)
- \(\Sigma\): Diagonal matrix of singular values
- \(V^T\): Right singular vectors (orthogonal)
3. Reconstruct Matrix¶
```python import numpy as np
def main(): A = np.array([[1, 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A)
# Reconstruct: A = U @ Sigma @ Vt
Sigma = np.zeros((U.shape[0], Vt.shape[0]))
np.fill_diagonal(Sigma, s)
A_reconstructed = U @ Sigma @ Vt
print("Original A:")
print(A)
print()
print("Reconstructed:")
print(A_reconstructed.round(10))
if name == "main": main() ```
Reduced SVD¶
1. full_matrices=False¶
```python import numpy as np
def main(): A = np.array([[1, 2], [3, 4], [5, 6]])
# Full SVD
U_full, s, Vt_full = np.linalg.svd(A, full_matrices=True)
print("Full SVD:")
print(f" U: {U_full.shape}, Vt: {Vt_full.shape}")
# Reduced SVD
U_red, s, Vt_red = np.linalg.svd(A, full_matrices=False)
print("Reduced SVD:")
print(f" U: {U_red.shape}, Vt: {Vt_red.shape}")
if name == "main": main() ```
2. Memory Efficiency¶
Reduced SVD is more memory efficient for tall/wide matrices.
3. Simpler Reconstruction¶
```python import numpy as np
def main(): A = np.array([[1, 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
# Simpler reconstruction with reduced SVD
A_reconstructed = U @ np.diag(s) @ Vt
print("Reconstructed:")
print(A_reconstructed.round(10))
if name == "main": main() ```
Low-Rank Approximation¶
1. Truncated SVD¶
Keep only top k singular values.
```python import numpy as np
def main(): np.random.seed(42) A = np.random.randn(5, 4)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(f"Singular values: {s.round(3)}")
# Rank-2 approximation
k = 2
A_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
error = np.linalg.norm(A - A_approx, 'fro')
print(f"Approximation error: {error:.4f}")
if name == "main": main() ```
2. Optimal Approximation¶
SVD gives the best rank-k approximation (Eckart-Young theorem).
```python import numpy as np
def main(): np.random.seed(42) A = np.random.randn(10, 8)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("Rank | Error")
print("-" * 20)
for k in range(1, len(s) + 1):
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
error = np.linalg.norm(A - A_k, 'fro')
print(f" {k} | {error:.4f}")
if name == "main": main() ```
3. Compression Ratio¶
```python import numpy as np
def main(): m, n = 100, 80 A = np.random.randn(m, n)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
original_size = m * n
print("Rank | Compressed Size | Ratio | Error")
print("-" * 50)
for k in [5, 10, 20, 40]:
compressed_size = k * (m + n + 1)
ratio = compressed_size / original_size
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
error = np.linalg.norm(A - A_k, 'fro') / np.linalg.norm(A, 'fro')
print(f" {k:3} | {compressed_size:15} | {ratio:.2%} | {error:.4f}")
if name == "main": main() ```
Properties¶
1. Singular Values¶
Always non-negative and sorted descending.
```python import numpy as np
def main(): A = np.array([[1, 2, 3], [4, 5, 6]])
U, s, Vt = np.linalg.svd(A)
print(f"Singular values: {s}")
print(f"All non-negative: {np.all(s >= 0)}")
print(f"Sorted descending: {np.all(np.diff(s) <= 0)}")
if name == "main": main() ```
2. Matrix Rank¶
Count non-zero singular values.
```python import numpy as np
def main(): # Rank-deficient matrix A = np.array([[1, 2, 3], [2, 4, 6], [1, 2, 3]])
U, s, Vt = np.linalg.svd(A)
print(f"Singular values: {s}")
rank = np.sum(s > 1e-10)
print(f"Numerical rank: {rank}")
print(f"np.linalg.matrix_rank: {np.linalg.matrix_rank(A)}")
if name == "main": main() ```
3. Condition Number¶
Ratio of largest to smallest singular value.
```python import numpy as np
def main(): A = np.array([[1, 2], [3, 4]])
U, s, Vt = np.linalg.svd(A)
cond_svd = s[0] / s[-1]
cond_numpy = np.linalg.cond(A)
print(f"Singular values: {s}")
print(f"Condition (σ_max/σ_min): {cond_svd:.4f}")
print(f"np.linalg.cond: {cond_numpy:.4f}")
if name == "main": main() ```
Applications¶
1. Image Compression¶
```python import numpy as np import matplotlib.pyplot as plt
def main(): np.random.seed(42)
# Simulated grayscale image
img = np.random.randn(100, 100)
img = np.cumsum(np.cumsum(img, axis=0), axis=1)
img = (img - img.min()) / (img.max() - img.min())
U, s, Vt = np.linalg.svd(img, full_matrices=False)
fig, axes = plt.subplots(1, 4, figsize=(16, 4))
axes[0].imshow(img, cmap='gray')
axes[0].set_title('Original')
for ax, k in zip(axes[1:], [5, 20, 50]):
img_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
ax.imshow(img_k, cmap='gray')
ax.set_title(f'Rank {k}')
for ax in axes:
ax.axis('off')
plt.tight_layout()
plt.show()
if name == "main": main() ```
2. Pseudoinverse¶
```python import numpy as np
def main(): A = np.array([[1, 2], [3, 4], [5, 6]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
# Pseudoinverse: V @ S^(-1) @ U^T
s_inv = 1 / s
A_pinv_svd = Vt.T @ np.diag(s_inv) @ U.T
A_pinv_numpy = np.linalg.pinv(A)
print("Pseudoinverse via SVD:")
print(A_pinv_svd.round(4))
print()
print("np.linalg.pinv:")
print(A_pinv_numpy.round(4))
if name == "main": main() ```
3. Latent Semantic Analysis¶
```python import numpy as np
def main(): # Term-document matrix (simplified) # Rows: terms, Columns: documents A = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 1]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("Singular values:", s.round(3))
print()
# Reduce to 2 dimensions
k = 2
term_vectors = U[:, :k] @ np.diag(s[:k])
doc_vectors = Vt[:k, :].T @ np.diag(s[:k])
print("Term vectors (2D):")
print(term_vectors.round(3))
print()
print("Document vectors (2D):")
print(doc_vectors.round(3))
if name == "main": main() ```
scipy Alternative¶
1. Truncated SVD¶
More efficient for large sparse matrices.
```python import numpy as np from scipy.sparse.linalg import svds
def main(): np.random.seed(42) A = np.random.randn(100, 80)
# Compute only top 5 singular values
U, s, Vt = svds(A, k=5)
print(f"U shape: {U.shape}")
print(f"s: {s}")
print(f"Vt shape: {Vt.shape}")
if name == "main": main() ```
2. Randomized SVD¶
```python import numpy as np from sklearn.decomposition import TruncatedSVD
def main(): np.random.seed(42) A = np.random.randn(1000, 500)
svd = TruncatedSVD(n_components=10)
A_transformed = svd.fit_transform(A)
print(f"Singular values: {svd.singular_values_}")
print(f"Explained variance ratio: {svd.explained_variance_ratio_.sum():.4f}")
if name == "main": main() ```
3. When to Use Each¶
np.linalg.svd: Full SVD, small/medium matricesscipy.sparse.linalg.svds: Truncated SVD, sparse matricessklearn.decomposition.TruncatedSVD: Randomized, very large matrices
Exercises¶
Exercise 1.
Compute the SVD of A = np.array([[1, 2], [3, 4], [5, 6]]). Reconstruct A from U, s, and Vt by forming U[:, :2] @ np.diag(s) @ Vt. Verify the reconstruction matches A.
Solution to Exercise 1
import numpy as np
A = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)
U, s, Vt = np.linalg.svd(A)
A_reconstructed = U[:, :2] @ np.diag(s) @ Vt
print(f"Match: {np.allclose(A, A_reconstructed)}")
Exercise 2.
Use SVD to compute a rank-1 approximation of A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]). Keep only the first singular value and its corresponding vectors. Compute the approximation error as the Frobenius norm of the difference.
Solution to Exercise 2
import numpy as np
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
U, s, Vt = np.linalg.svd(A)
# Rank-1 approximation
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0, :])
error = np.linalg.norm(A - A_rank1, 'fro')
print(f"Rank-1 approximation error: {error:.4f}")
print(f"Second singular value (expected error): {s[1]:.4f}")
Exercise 3.
Compute the condition number of A = np.array([[1, 2], [3, 4]]) using SVD (ratio of largest to smallest singular value). Verify it matches np.linalg.cond(A).
Solution to Exercise 3
import numpy as np
A = np.array([[1, 2], [3, 4]], dtype=float)
_, s, _ = np.linalg.svd(A)
cond_svd = s[0] / s[-1]
cond_np = np.linalg.cond(A)
print(f"SVD condition number: {cond_svd:.4f}")
print(f"np.linalg.cond: {cond_np:.4f}")
print(f"Match: {np.allclose(cond_svd, cond_np)}")