Polynomial Fitting: polyfit and poly1d¶

NumPy provides tools for fitting polynomials to data and working with polynomial objects.

Mental Model

polyfit finds the best polynomial curve through your data points by minimizing squared errors, and poly1d wraps those coefficients into a callable object. Think of polyfit as the fitting step and poly1d as the evaluation step -- together they give you a smooth function you can evaluate anywhere.

Under the hood, polyfit builds a Vandermonde matrix from your x-values and solves \(V\mathbf{c} \approx \mathbf{y}\) via QR decomposition — the same least squares machinery described in the previous page. Understanding this connection explains why high-degree fits on wide x-ranges become unstable (the Vandermonde matrix becomes ill-conditioned) and why the modern Polynomial.fit API, which scales x to \([-1, 1]\), is more reliable.

python import numpy as np

np.polyfit() — Fit Polynomial to Data¶

Fit a polynomial of specified degree to data points using least squares:

```python

Data points¶

x = np.array([0, 1, 2, 3, 4]) y = np.array([1, 3, 7, 13, 21])

Fit quadratic (degree 2): y = ax² + bx + c¶

coefficients = np.polyfit(x, y, deg=2) print(coefficients) # [1. 1. 1.] → y = x² + x + 1 ```

Understanding the Output¶

```python

coefficients are in descending order of power¶

[a, b, c] for ax² + bx + c¶

x = np.array([1, 2, 3, 4, 5]) y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

Linear fit (degree 1)¶

m, b = np.polyfit(x, y, 1) print(f"Slope: {m:.2f}, Intercept: {b:.2f}")

Slope: 2.00, Intercept: 0.02¶

```

np.poly1d() — Polynomial Object¶

Create a polynomial object from coefficients for easy evaluation:

```python

Create polynomial: x² + 2x + 3¶

p = np.poly1d([1, 2, 3]) print(p)

2¶

1 x + 2 x + 3¶

Evaluate at x = 2¶

print(p(2)) # 14 + 22 + 3 = 11

Evaluate at multiple points¶

x_vals = np.array([0, 1, 2, 3]) print(p(x_vals)) # [ 3 6 11 18] ```

Combining polyfit and poly1d¶

```python x = np.array([0, 1, 2, 3, 4]) y = np.array([1, 2.1, 4.9, 9.2, 16.1])

Fit and create polynomial object¶

coeffs = np.polyfit(x, y, 2) p = np.poly1d(coeffs)

Evaluate fitted polynomial¶

y_fitted = p(x) print(y_fitted) # [1.02 2.04 4.98 9.84 16.62]

Evaluate at new points¶

x_new = np.linspace(0, 5, 100) y_new = p(x_new) ```

Polynomial Operations¶

Arithmetic¶

```python p1 = np.poly1d([1, 2]) # x + 2 p2 = np.poly1d([1, 0, 1]) # x² + 1

Addition¶

print(p1 + p2) # x² + x + 3

Multiplication¶

print(p1 * p2) # x³ + 2x² + x + 2

Power¶

print(p1 ** 2) # x² + 4x + 4 ```

Derivative and Integral¶

```python p = np.poly1d([1, -2, 1]) # x² - 2x + 1 print(p)

2¶

1 x - 2 x + 1¶

Derivative¶

dp = p.deriv() print(dp) # 2x - 2

Second derivative¶

d2p = p.deriv(2) print(d2p) # 2

Integral (with constant of integration)¶

ip = p.integ() print(ip) # 0.3333 x³ - x² + x

Integral with specific constant¶

ip = p.integ(k=5) # constant = 5 ```

Roots¶

```python p = np.poly1d([1, -5, 6]) # x² - 5x + 6

Find roots¶

roots = p.r print(roots) # [3. 2.]

Verify: (x-2)(x-3) = x² - 5x + 6¶

```

Practical Examples¶

Linear Regression¶

```python

Simple linear regression¶

x = np.array([1, 2, 3, 4, 5]) y = np.array([2.2, 4.1, 5.9, 8.1, 9.8])

Fit line¶

slope, intercept = np.polyfit(x, y, 1) print(f"y = {slope:.2f}x + {intercept:.2f}")

y = 1.94x + 0.28¶

Create polynomial for predictions¶

line = np.poly1d([slope, intercept])

Predict new values¶

x_new = np.array([6, 7, 8]) y_pred = line(x_new) print(y_pred) # [11.92 13.86 15.80] ```

Quadratic Fit¶

```python

Projectile motion data¶

t = np.array([0, 0.5, 1.0, 1.5, 2.0]) # time h = np.array([0, 11.4, 15.2, 11.5, 0.1]) # height

Fit quadratic: h = at² + bt + c¶

a, b, c = np.polyfit(t, h, 2) print(f"h = {a:.2f}t² + {b:.2f}t + {c:.2f}")

h ≈ -9.8t² + 19.6t + 0.1¶

Find maximum height¶

trajectory = np.poly1d([a, b, c]) t_max = -b / (2*a) # vertex of parabola h_max = trajectory(t_max) print(f"Max height: {h_max:.2f} at t={t_max:.2f}") ```

Polynomial Interpolation¶

```python

Given points¶

x = np.array([0, 1, 2, 3]) y = np.array([1, 2, 1, 2])

Fit polynomial passing through all points¶

Need degree = n_points - 1 for exact fit¶

coeffs = np.polyfit(x, y, 3) p = np.poly1d(coeffs)

Verify: polynomial passes through all points¶

print(p(x)) # [1. 2. 1. 2.] ```

Trend Removal¶

```python

Time series with trend¶

t = np.arange(100) signal = 0.5 * t + 10 + np.random.randn(100) * 2

Fit and remove trend¶

trend_coeffs = np.polyfit(t, signal, 1) trend = np.poly1d(trend_coeffs) detrended = signal - trend(t)

detrended now has mean ≈ 0¶

```

Goodness of Fit¶

R-squared Calculation¶

```python def r_squared(y_true, y_pred): """Calculate R² (coefficient of determination).""" ss_res = np.sum((y_true - y_pred) ** 2) ss_tot = np.sum((y_true - y_true.mean()) ** 2) return 1 - (ss_res / ss_tot)

x = np.array([1, 2, 3, 4, 5]) y = np.array([2.1, 4.0, 5.9, 8.1, 10.0])

Fit and evaluate¶

p = np.poly1d(np.polyfit(x, y, 1)) y_pred = p(x) r2 = r_squared(y, y_pred) print(f"R² = {r2:.4f}") # R² ≈ 0.9988 ```

Residual Covariance¶

```python

polyfit can return residuals and other info¶

coeffs, residuals, rank, sv, rcond = np.polyfit( x, y, 1, full=True ) print(f"Residual sum of squares: {residuals[0]:.4f}") ```

Choosing Polynomial Degree¶

```python x = np.linspace(0, 10, 50) y = np.sin(x) + np.random.randn(50) * 0.1

Compare different degrees¶

for deg in [1, 3, 5, 10]: p = np.poly1d(np.polyfit(x, y, deg)) y_pred = p(x) r2 = r_squared(y, y_pred) print(f"Degree {deg:2d}: R² = {r2:.4f}")

Higher degree → better fit, but risk overfitting¶

```

Legacy API — Know the Limits

polyfit/poly1d are fine for quick, low-degree fits with moderate x-ranges. Avoid for:

High-degree polynomials (degree > 10) — numerical instability
Large x-ranges (e.g., [0, 10000]) — Vandermonde conditioning blows up
Production code — prefer Polynomial.fit from the modern np.polynomial module

See Least Squares Connection for why conditioning matters, and np.polynomial Module for the stable alternative.

Modern Alternative: np.polynomial¶

NumPy's newer polynomial module offers improved numerical stability:

```python from numpy.polynomial import polynomial as P

x = np.array([0, 1, 2, 3, 4]) y = np.array([1, 3, 7, 13, 21])

Fit polynomial (coefficients in ascending order!)¶

coeffs = P.polyfit(x, y, 2) print(coeffs) # [1. 1. 1.] → 1 + x + x²

Evaluate¶

y_fit = P.polyval(x, coeffs) ```

Key Differences¶

Feature	`np.polyfit`	`np.polynomial.polynomial`
Coefficient order	Descending (highest first)	Ascending (constant first)
Numerical stability	Can have issues	Better conditioning
Polynomial object	`np.poly1d`	`Polynomial` class

```python from numpy.polynomial import Polynomial

Create polynomial: 1 + 2x + 3x²¶

p = Polynomial([1, 2, 3]) # ascending order!

Fit from data¶

p_fit = Polynomial.fit(x, y, deg=2) ```

Summary¶

Function	Purpose
`np.polyfit(x, y, deg)`	Fit polynomial of degree `deg`
`np.poly1d(coeffs)`	Create polynomial object
`p(x)`	Evaluate polynomial at x
`p.deriv()`	Derivative of polynomial
`p.integ()`	Integral of polynomial
`p.r`	Roots of polynomial

Key Takeaways:

polyfit returns coefficients in descending order of power
poly1d creates callable polynomial objects
Degree = n_points - 1 for exact interpolation
Higher degree risks overfitting
Check fit quality with R² or residuals
For numerical stability, consider np.polynomial module
Use deriv() and integ() for calculus operations

Exercises¶

Exercise 1. Fit a quadratic polynomial to the projectile data t = [0, 0.5, 1.0, 1.5, 2.0] and h = [0, 11.4, 15.2, 11.5, 0.1]. Use the fitted poly1d to find the time at which the maximum height occurs and the value of that maximum height.

Solution to Exercise 1

import numpy as np

t = np.array([0, 0.5, 1.0, 1.5, 2.0])
h = np.array([0, 11.4, 15.2, 11.5, 0.1])

coeffs = np.polyfit(t, h, 2)
trajectory = np.poly1d(coeffs)

# Maximum at vertex: t = -b / (2a)
a, b, c = coeffs
t_max = -b / (2 * a)
h_max = trajectory(t_max)
print(f"Polynomial: {a:.2f}t^2 + {b:.2f}t + {c:.2f}")
print(f"Max height: {h_max:.2f} at t = {t_max:.2f}")

Exercise 2. Generate 100 points from y = sin(x) on [0, 2*pi] with Gaussian noise (sigma=0.1). Fit polynomials of degrees 3, 5, 7, and 9. Compute the R-squared value for each and print which degree gives the best fit without overfitting (hint: the R-squared should plateau).

Solution to Exercise 2

import numpy as np

np.random.seed(42)
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x) + 0.1 * np.random.randn(100)

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred)**2)
    ss_tot = np.sum((y_true - y_true.mean())**2)
    return 1 - ss_res / ss_tot

for deg in [3, 5, 7, 9]:
    p = np.poly1d(np.polyfit(x, y, deg))
    r2 = r_squared(y, p(x))
    print(f"Degree {deg}: R^2 = {r2:.6f}")

Exercise 3. Create a time series with a quadratic trend: signal = 0.01*t**2 + 0.5*t + 10 + noise for t = np.arange(200). Use np.polyfit and np.poly1d to fit and remove the quadratic trend. Verify that the detrended signal has approximately zero mean and no visible trend.

Solution to Exercise 3

import numpy as np

np.random.seed(42)
t = np.arange(200, dtype=float)
noise = np.random.randn(200) * 2
signal = 0.01*t**2 + 0.5*t + 10 + noise

# Fit and remove quadratic trend
trend_coeffs = np.polyfit(t, signal, 2)
trend = np.poly1d(trend_coeffs)
detrended = signal - trend(t)

print(f"Detrended mean: {detrended.mean():.4f}")
print(f"Detrended std:  {detrended.std():.4f}")