Chapter 12: SciPy Stats
This chapter covers statistical analysis with SciPy, including descriptive statistics, probability distributions, density estimation, hypothesis testing, correlation, regression, resampling methods, and information theory.
12.1 Descriptive Statistics
- Summary Statistics
- Moments and Skewness
- Percentiles and Quantiles
- Outlier Detection
- Robust Statistics (MAD, Trimmed)
12.2 Distribution Object Model
- rv_continuous and rv_discrete
- Frozen Distributions
- Sampling (rvs)
- Density and Mass (pdf, pmf)
- Cumulative Distribution (cdf, sf)
- Quantile Function (ppf, isf)
- Moments and Stats (mean, var, moment, stats)
- Fitting Distributions to Data (fit)
- Confidence Intervals (interval)
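
The methods listed in 12.2 all hang off one frozen distribution object. The following is a minimal sketch, assuming a normal distribution with illustrative parameters (`loc=10.0`, `scale=2.0` and the sample size are choices made here, not values from this chapter):

```python
import numpy as np
from scipy import stats

# Freeze a normal distribution; loc and scale are illustrative assumptions.
dist = stats.norm(loc=10.0, scale=2.0)

rng = np.random.default_rng(0)
sample = dist.rvs(size=500, random_state=rng)  # sampling (rvs)

density_at_mean = dist.pdf(10.0)       # density (pdf)
p_below_12 = dist.cdf(12.0)            # cumulative distribution, P(X <= 12)
p_above_12 = dist.sf(12.0)             # survival function, 1 - cdf
q975 = dist.ppf(0.975)                 # quantile function (inverse of cdf)
mean, var = dist.stats(moments="mv")   # mean and variance
low, high = dist.interval(0.95)        # central interval holding 95% of the mass

# Estimate loc and scale from the sample by maximum likelihood (fit).
loc_hat, scale_hat = stats.norm.fit(sample)
```

Note that `interval` describes the distribution itself: it returns the range containing the given fraction of probability mass, which is not the same thing as a confidence interval for an estimated parameter.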
12.3 Distribution Families
- Normal Distribution
- t-Distribution
- Chi-Square Distribution
- F-Distribution
- Exponential and Gamma
- Beta Distribution
- Uniform Distribution
- Lognormal and Weibull
- Binomial and Poisson
- Multivariate Normal
- Custom Distributions (rv_continuous subclass)
12.4 Density Estimation
12.5 Hypothesis Testing
- Null and Alternative Hypotheses
- p-values and Significance
- Type I and Type II Errors
- Power Analysis
- Confidence Intervals
- Multiple Testing Correction (Bonferroni, FDR)
- One-Sided vs Two-Sided Tests
- Effect Size
12.6 Statistical Tests
- t-Tests (ttest_1samp, ttest_ind, ttest_rel)
- ANOVA (f_oneway)
- Chi-Square Tests (chi2_contingency, chisquare)
- Normality Tests (Shapiro, Anderson, D'Agostino)
- Non-Parametric Tests (Mann-Whitney, Wilcoxon, Kruskal)
- Goodness of Fit (kstest, anderson)
- Variance Tests (Levene, Bartlett)
- Test Selection Guide
12.7 Correlation
- Pearson and Spearman
- Kendall's Tau
- Partial Correlation
- Autocorrelation
- Covariance and Covariance Matrix
- Correlation Pitfalls
12.8 Regression
- linregress
- OLS Fundamentals
- Residual Analysis
- Multiple Regression Concepts
- Polynomial Regression
- Regularization Preview (Ridge, Lasso)
- Logistic Regression Preview
12.9 Resampling Methods
- Bootstrap Basics (scipy.stats.bootstrap)
- Bootstrap Confidence Intervals
- Permutation Tests (permutation_test)
- Jackknife Method
- Monte Carlo Simulation
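
As a rough preview of the two SciPy entry points named in 12.9, the sketch below bootstraps a confidence interval for a mean and runs a permutation test on a difference in means. The synthetic data, the seed, the resample counts, and the helper `mean_diff` are all assumptions made for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=40)  # synthetic sample A
b = rng.normal(loc=1.0, scale=1.0, size=40)  # synthetic sample B

# Percentile bootstrap confidence interval for the mean of `a`.
boot = stats.bootstrap(
    (a,), np.mean,
    confidence_level=0.95, n_resamples=2000,
    method="percentile", random_state=rng,
)
ci = boot.confidence_interval

def mean_diff(x, y, axis):
    """Hypothetical helper: difference in sample means along `axis`."""
    return np.mean(x, axis=axis) - np.mean(y, axis=axis)

# Two-sided permutation test of the null that `a` and `b` are exchangeable.
perm = stats.permutation_test(
    (a, b), mean_diff,
    vectorized=True, n_resamples=2000,
    alternative="two-sided", random_state=rng,
)

print(f"95% CI for mean(a): [{ci.low:.3f}, {ci.high:.3f}]")
print(f"observed difference: {perm.statistic:.3f}, p-value: {perm.pvalue:.4f}")
```

Both functions accept a statistic that takes an `axis` argument, which lets SciPy evaluate all resamples in a single vectorized call instead of a Python loop.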
12.10 Probability Plots and Diagnostics
- QQ Plots (probplot)
- PP Plots
- Residual Diagnostics
- Distribution Comparison Visualization
- Normality Visualization
12.11 Information Theory
- Entropy (scipy.stats.entropy)
- KL Divergence
- Cross-Entropy
- Mutual Information
- Connection to ML Loss Functions
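
The first three topics in 12.11 are linked by the identity H(p, q) = H(p) + KL(p || q). Here is a minimal sketch using `scipy.stats.entropy`, which returns Shannon entropy for one distribution and KL divergence when a second is supplied. The probability vectors `p` and `q` are made up for illustration:

```python
import numpy as np
from scipy import stats

# Two illustrative discrete distributions over the same three outcomes.
p = np.array([0.5, 0.25, 0.25])
q = np.array([0.4, 0.4, 0.2])

h_p = stats.entropy(p, base=2)        # Shannon entropy H(p) in bits
kl_pq = stats.entropy(p, q, base=2)   # KL divergence KL(p || q) in bits

# Cross-entropy two ways: via the identity, and directly from the definition.
cross_from_identity = h_p + kl_pq
cross_direct = -np.sum(p * np.log2(q))

print(h_p, kl_pq, cross_from_identity, cross_direct)
```

Because H(p) does not depend on q, minimizing cross-entropy over q is equivalent to minimizing KL(p || q), which is the usual bridge to classification loss functions.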