Skip to content

Least Squares Connection

When np.polyfit fits a polynomial to data, it solves a least squares problem under the hood. Understanding this connection reveals why polynomial fitting works, when it fails, and how to control the numerical behavior. The same least squares machinery underlies linear regression, spline fitting, and many other approximation techniques.


The Fitting Problem

Given data points \((x_i, y_i)\) for \(i = 1, \ldots, m\), find the polynomial of degree \(n\) that best fits the data.

1. Polynomial Model

A polynomial of degree \(n\) has \(n + 1\) coefficients:

\[ p(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n \]

2. Residuals

The residual at each data point measures the fitting error:

\[ r_i = y_i - p(x_i) \]

3. Least Squares Objective

Find coefficients \(c_0, c_1, \ldots, c_n\) that minimize the sum of squared residuals:

\[ \min_{c_0, \ldots, c_n} \sum_{i=1}^{m} \left( y_i - \sum_{j=0}^{n} c_j x_i^j \right)^2 \]

The Vandermonde Matrix

The fitting problem becomes a linear system through the Vandermonde matrix.

1. Matrix Formulation

Stacking all data points into a matrix equation:

\[ \underbrace{\begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^n \\ 1 & x_2 & x_2^2 & \cdots & x_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^n \end{pmatrix}}_{V} \underbrace{\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix}}_{\mathbf{c}} = \underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}}_{\mathbf{y}} \]

2. Building the Vandermonde in NumPy

import numpy as np

def main():
    x = np.array([0, 1, 2, 3, 4], dtype=float)
    deg = 2

    # np.vander builds the Vandermonde matrix
    V = np.vander(x, N=deg + 1, increasing=True)
    print("Vandermonde matrix:")
    print(V)

if __name__ == "__main__":
    main()

Output:

Vandermonde matrix:
[[ 1.  0.  0.]
 [ 1.  1.  1.]
 [ 1.  2.  4.]
 [ 1.  3.  9.]
 [ 1.  4. 16.]]

3. Overdetermined System

When \(m > n + 1\) (more data points than coefficients), the system \(V \mathbf{c} = \mathbf{y}\) has no exact solution. The least squares solution minimizes \(\|V \mathbf{c} - \mathbf{y}\|_2^2\).

Normal Equations

The classical approach to solving least squares.

1. Derivation

Setting the gradient to zero yields the normal equations:

\[ V^T V \, \mathbf{c} = V^T \mathbf{y} \]

2. Direct Solution

import numpy as np

def main():
    x = np.array([0, 1, 2, 3, 4], dtype=float)
    y = np.array([1, 3, 7, 13, 21], dtype=float)
    deg = 2

    V = np.vander(x, N=deg + 1, increasing=True)
    # Normal equations: (V^T V) c = V^T y
    c = np.linalg.solve(V.T @ V, V.T @ y)
    print("Coefficients (ascending):", c)

if __name__ == "__main__":
    main()

Output:

Coefficients (ascending): [1. 1. 1.]

3. Numerical Issues

The normal equations square the condition number of \(V\). If \(V\) has condition number \(\kappa\), then \(V^T V\) has condition number \(\kappa^2\), amplifying numerical errors. For this reason, np.polyfit does not use normal equations directly.

QR Decomposition Approach

The method np.polyfit actually uses for better numerical stability.

1. QR Factorization

Decompose \(V = QR\) where \(Q\) is orthogonal and \(R\) is upper triangular:

\[ V \mathbf{c} = \mathbf{y} \implies QR \mathbf{c} = \mathbf{y} \implies R \mathbf{c} = Q^T \mathbf{y} \]

2. Implementation

import numpy as np

def main():
    x = np.array([0, 1, 2, 3, 4], dtype=float)
    y = np.array([1, 3, 7, 13, 21], dtype=float)
    deg = 2

    V = np.vander(x, N=deg + 1, increasing=True)
    Q, R = np.linalg.qr(V)
    c = np.linalg.solve(R, Q.T @ y)
    print("Coefficients (QR):", c)

if __name__ == "__main__":
    main()

3. Why QR is Better

The condition number of \(R\) equals the condition number of \(V\) (not its square), providing much better numerical stability for high-degree polynomials or poorly spaced data points.

np.lstsq — Direct Least Squares

NumPy provides np.linalg.lstsq for solving least squares problems directly.

1. Basic Usage

import numpy as np

def main():
    x = np.array([0, 1, 2, 3, 4], dtype=float)
    y = np.array([1, 3, 7, 13, 21], dtype=float)
    deg = 2

    V = np.vander(x, N=deg + 1, increasing=True)
    c, residuals, rank, sv = np.linalg.lstsq(V, y, rcond=None)
    print("Coefficients:", c)
    print("Residuals:", residuals)
    print("Rank:", rank)
    print("Singular values:", sv)

if __name__ == "__main__":
    main()

Output:

Coefficients: [1. 1. 1.]
Residuals: []
Rank: 3
Singular values: [18.76166304  2.07972297  0.26600784]

2. Residuals Interpretation

The residuals array contains \(\|V \mathbf{c} - \mathbf{y}\|_2^2\). When the fit is exact (as with degree = number of points - 1), the residuals array is empty.

3. Comparison with polyfit

import numpy as np

def main():
    x = np.array([1, 2, 3, 4, 5], dtype=float)
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1], dtype=float)

    # polyfit (descending order)
    coeffs_polyfit = np.polyfit(x, y, 2)

    # lstsq (ascending order with Vandermonde)
    V = np.vander(x, N=3, increasing=True)
    coeffs_lstsq, _, _, _ = np.linalg.lstsq(V, y, rcond=None)

    print("polyfit (descending):", coeffs_polyfit)
    print("lstsq   (ascending):", coeffs_lstsq)
    print("Match:", np.allclose(coeffs_polyfit, coeffs_lstsq[::-1]))

if __name__ == "__main__":
    main()

Conditioning and Stability

When polynomial fitting fails numerically.

1. Ill-Conditioned Vandermonde

import numpy as np

def main():
    # Wide-range x values create ill-conditioned Vandermonde
    x = np.array([1, 10, 100, 1000, 10000], dtype=float)
    y = np.array([1, 2, 3, 4, 5], dtype=float)

    V = np.vander(x, N=3, increasing=True)
    cond = np.linalg.cond(V)
    print(f"Condition number: {cond:.2e}")

if __name__ == "__main__":
    main()

2. Centering and Scaling

Improve conditioning by normalizing the x values:

import numpy as np

def main():
    x = np.array([1, 10, 100, 1000, 10000], dtype=float)
    y = np.array([1, 2, 3, 4, 5], dtype=float)

    # Center and scale
    x_mean = x.mean()
    x_std = x.std()
    x_norm = (x - x_mean) / x_std

    V_raw = np.vander(x, N=3, increasing=True)
    V_norm = np.vander(x_norm, N=3, increasing=True)

    print(f"Raw condition:        {np.linalg.cond(V_raw):.2e}")
    print(f"Normalized condition: {np.linalg.cond(V_norm):.2e}")

if __name__ == "__main__":
    main()

3. np.polynomial Uses This Automatically

The modern np.polynomial.Polynomial.fit method automatically centers and scales internally, which is one reason it is numerically superior to np.polyfit.

Weighted Least Squares

Assign different importance to different data points.

1. Weight Parameter in polyfit

import numpy as np

def main():
    x = np.array([0, 1, 2, 3, 4], dtype=float)
    y = np.array([1.0, 2.8, 5.1, 7.0, 20.0])  # last point is outlier

    # Unweighted fit
    c_unweighted = np.polyfit(x, y, 1)

    # Downweight the outlier
    w = np.array([1, 1, 1, 1, 0.1])
    c_weighted = np.polyfit(x, y, 1, w=w)

    print(f"Unweighted: y = {c_unweighted[0]:.2f}x + {c_unweighted[1]:.2f}")
    print(f"Weighted:   y = {c_weighted[0]:.2f}x + {c_weighted[1]:.2f}")

if __name__ == "__main__":
    main()

2. Mathematical Formulation

Weighted least squares minimizes:

\[ \sum_{i=1}^{m} w_i \left( y_i - p(x_i) \right)^2 \]

This is equivalent to solving \(W V \mathbf{c} = W \mathbf{y}\) where \(W = \text{diag}(\sqrt{w_1}, \ldots, \sqrt{w_m})\).

3. Use Cases

Weights are useful when data points have different measurement uncertainties. Set \(w_i = 1/\sigma_i^2\) where \(\sigma_i\) is the standard deviation of the measurement at \(x_i\).

Summary

Method Function Stability Notes
Normal equations np.linalg.solve(V.T @ V, V.T @ y) Poor Condition number squared
QR decomposition np.linalg.qr + solve Good Used by np.polyfit
SVD-based lstsq np.linalg.lstsq Best Returns rank and singular values
Modern API Polynomial.fit Best Auto-centering and scaling

The least squares framework unifies all polynomial fitting: np.polyfit builds a Vandermonde matrix and solves via QR, np.linalg.lstsq uses SVD for maximum stability, and np.polynomial.Polynomial.fit adds automatic conditioning improvements.