Introductory Statistics
Maximum Likelihood for Censored Data
Home
0 Prerequisites
Chapter Overview
0.1 Mathematical Background (Review)
Sets, Functions, and Logic
Sequences, Limits, and Asymptotics
Linear Algebra Notation and Conventions
0.2 Computational Tools
Python and Jupyter Basics
NumPy Arrays
Data Handling with pandas
Basic Visualization with Matplotlib
0.3 Square Matrices
Chapter Overview
Similar Matrices
Diagonal Form of Diagonalizable Matrices
Jordan Canonical Form
Trace and Eigenvalues
Idempotent Matrices
Symmetric Matrices
Positive Definite Matrices
Gram Matrices
Projection Matrices
Orthogonal Projection Matrices
0.4 Linear Algebra and Statistics
Chi-Squared Distribution and Quadratic Forms
Sampling Distributions (Simple Ordinary Least Squares)
Sampling Distributions (General Ordinary Least Squares)
0.5 Exercises
1 Data Collection
Chapter Overview
1.1 Classical Approach—Design Your Data Collection
Populations and Samples
Parameters vs Statistics
Observational Studies
Confounding and Association vs Causation
Controlled Experiments
Randomization and Blinding
Sample Surveys and Sampling Methods
Bias and Nonresponse
Survivorship Bias
Strengths and Limitations
Design Your Data Collection
Analyze Available Data
1.2 Modern Approach—Analyze the Data You Have
Statistical Models vs Learning Algorithms
Prediction vs Inference
1.3 Three Learning Paradigms
Unsupervised Learning (Pattern Discovery)
Supervised Learning (Prediction with Labels)
Reinforcement Learning (Sequential Decisions)
1.4 Exercises
2 Descriptive Statistics
Chapter Overview
2.1 Exploratory Data Analysis
Histograms and Density Plots
Empirical Distribution and Quantile-Quantile Plots
2.2 Shape of Distributions
Modality
Skewness and Kurtosis
Outliers and Leverage
2.3 Numerical Summaries
Mean, Median, Mode
Variance and Standard Deviation
Interquartile Range and Robust Measures
Median Absolute Deviation
2.4 Visualization
Boxplots
Violin Plots
Group Comparisons
2.5 Code
Group Comparison Examples
Descriptive Statistics (Weed Prices)
Advanced Measures (Geometric Mean, Chebyshev)
2.6 Exercises
3 Foundations of Probability
Chapter Overview
3.1 Probability Theory
Sample Spaces and Events
Axioms of Probability
Conditional Probability
Bayes' Theorem
3.2 Independence
Independence of Events
Conditional Independence
3.3 Random Variables
Discrete Random Variables
Continuous Random Variables
Mass, Density, and Cumulative Distribution Functions
3.4 Expectation and Moments
Expectation and Linearity
Variance and Covariance
Moment Generating Functions
3.5 Limit Theorems
Law of Large Numbers
Central Limit Theorem
Berry–Esseen Theorem
3.6 Code
Central Limit Theorem Multi-Distribution Visualization
Gambler's Paradox (Law of Large Numbers Failure)
3.7 Exercises
4 Distributions
Chapter Overview
4.1 Discrete Distributions
Bernoulli and Binomial
Geometric and Negative Binomial
Poisson Distribution
4.2 Continuous Distributions
Uniform Distribution
Exponential Distribution
Normal Distribution
4.3 Multivariate Structure
Joint Distributions
Marginal and Conditional Distributions
Covariance and Correlation
Independence vs Zero Correlation
4.4 Code
Normal Density Function with scipy.stats
Normal Random Variates
Normal Cumulative Distribution and Quantiles
Student-t Density Function
Chi-Square Density Function
F-Distribution Density Function
Normal Percent Point Function (Quantile Function)
Normal Survival Function
Exponential Density Function
Uniform Density Function
Logistic Density Function (vs Normal)
Log-Normal Density Function
Weibull Density Function and Hazard
Discrete Distributions Suite
Bivariate Normal Distribution
Gaussian 2D Conditional Distributions
Gaussian 2D Eigendecomposition
Inverse Transform Sampling
4.5 Exercises
5 Sampling Distributions
Chapter Overview
5.1 Foundations
Statistics as Random Variables
Repeated Sampling Concept
5.2 The Four Fundamental Sampling Distributions
Normal Distribution (Z)
Student's t Distribution
Chi-Square Distribution (χ²)
F Distribution
5.3 Applications to Common Statistics
Sampling Distribution of the Mean
Standard Error
Sampling Distribution of Proportions
Sampling Distribution of the Variance
Difference of Two Sample Means
Difference of Two Sample Proportions
5.4 Visualization
Sampling Distribution Visualization
5.5 Code
Sampling Distribution of X̄ (Uniform)
Sampling Distribution of X̄ (Exponential)
Sampling Distribution of X̄ (Normal)
Sampling Distribution of X̄ (Bernoulli)
Sampling Distribution of S² (Normal)
Standard Error of X̄
Standard Error of S²
Sampling Distribution Income Visualization
Bootstrap Standard Error
Financial Crisis Central Limit Theorem Failure
5.6 Exercises
6 Statistical Estimation
Chapter Overview
6.1 Estimator Quality
Bias–Variance Tradeoff
Mean Squared Error
Consistency and Asymptotic Normality
Efficiency and Cramér–Rao Lower Bound
Sufficiency and Minimal Sufficiency
Bias–Variance
MSE
6.2 Maximum Likelihood Estimation
Likelihood Function
Introduction to Maximum Likelihood Estimation
Maximum Likelihood for Bernoulli Distribution
Maximum Likelihood for Normal Distribution
Maximum Likelihood for Poisson Distribution
Maximum Likelihood for Exponential Distribution
Capture-Recapture Method
Asymptotic Properties of Maximum Likelihood
Fisher Information and Standard Errors
6.3 Method of Moments
Method of Moments Foundations
Method of Moments for Common Distributions
Generalized Method of Moments
Method of Moments vs Maximum Likelihood Comparison
Likelihood and Estimation Overview
Method of Moments Overview
6.4 Bayesian Estimation
Prior, Likelihood, and Posterior
Conjugate Priors
Maximum a Posteriori Estimation
6.5 Code
Estimation Methods Comparison
Maximum Likelihood Optimization Examples
Fisher Information Computation
Bayesian Estimation Demonstrations
Capture-Recapture Maximum Likelihood
Log-Likelihood Visualization
Bayesian Beta Conjugate Prior
Geometric and Poisson Maximum Likelihood
6.6 Exercises
7 Estimation of μ and σ²
Chapter Overview
7.1 Estimation of the Mean
Sample Mean as Estimator
Bias and Consistency
Efficiency of the Sample Mean
Trimmed and Winsorized Means
7.2 Estimation of the Variance
Naive Variance Estimator
Bessel's Correction
Mean Squared Error of Variance Estimators
Robust Variance Estimators
7.3 Gaussian Maximum Likelihood
Maximum Likelihood Estimation of μ and σ²
Bias of Gaussian Maximum Likelihood for σ²
Sufficiency and Completeness
7.4 Estimation Under Non-Normality
Heavy-Tailed Distributions
Skewed Distributions
7.5 Code
Sample Mean Properties
Consistency and Convergence
Variance Estimators
Bessel's Correction
Gaussian Maximum Likelihood
Return Estimation
Robust Estimators Comparison
Cauchy Law of Large Numbers Failure
7.6 Exercises
8 Confidence Intervals
Chapter Overview
8.1 Foundations
Confidence Level and Coverage
Interpretation and Common Pitfalls
8.2 One-Sample Intervals
Confidence Interval for μ
Confidence Interval for p
Confidence Interval for σ²
8.3 Two-Sample Intervals
Confidence Interval for μ₁−μ₂
Confidence Interval for p₁−p₂
Confidence Interval for σ₁²/σ₂²
8.4 Paired-Sample Intervals
Confidence Interval for μ_D (Mean of Differences)
When to Use Paired vs Independent Designs
Paired Interval for Proportions (McNemar)
8.5 Sample Size Determination
Sample Size for Desired Margin of Error
Sample Size for Comparing Two Groups
8.6 Code
Confidence Interval Demonstrations
Sample Size Calculations
Mean Confidence Interval Coverage Simulation
Proportion Confidence Interval Coverage Simulation
Variance Confidence Interval Coverage Simulation
Paired Mean Confidence Interval Coverage Simulation
Difference of Means Confidence Interval Simulation
One-Sample Mean Confidence Interval Computation
One-Sample Proportion Confidence Interval Computation
8.7 Exercises
9 Hypothesis Testing
Chapter Overview
9.1 Foundations
Null and Alternative Hypotheses
Test Statistics and p-values
Significance Level and Decision Rules
9.2 One-Sample Tests
Z-Test for μ (Known σ)
t-Test for μ (Unknown σ)
Z-Test for p (Proportion)
Chi-Square Test for σ²
One-Sample Tests Overview
9.3 Two-Sample Tests
Two-Sample Z-Test for μ₁−μ₂
Two-Sample t-Test (Pooled and Welch)
Two-Sample Z-Test for p₁−p₂
F-Test for σ₁²/σ₂²
Two-Sample Tests Overview
9.4 Paired-Sample Tests
Paired t-test for μ_D
When to Use Paired vs Two-Sample
9.5 Errors and Power
Type I and II Errors
Power Analysis
Confidence Interval and Test Duality
9.6 Multiple Testing
Family-Wise Error Rate
Bonferroni and Holm Corrections
False Discovery Rate (Benjamini–Hochberg)
9.7 Code
Hypothesis Testing Demonstrations
Power Analysis and Sample Size
Multiple Testing Corrections
One-Sample Mean Test
One-Sample Proportion Test
One-Sample Variance Test
F-Test for Two Variances
Paired Mean Test
Two-Sample Mean Test
Two-Sample Proportion Test
Coin Toss Simulation
Two-Sample t-Test (Weed Prices)
Type I/II Error Visualization
Rejection Region Demo
Height/Weight Hypothesis Tests
P-Hacking Demonstration
False Discovery Rate and Resampling Multiple Testing
9.8 Exercises
10 Chi-Square Tests
Chapter Overview
10.1 Chi-Square Distribution and Asymptotics
Chi-Square Distribution
Degrees of Freedom and Asymptotic Theory
10.2 Chi-Square Tests for Categorical Data
Goodness-of-Fit Test
Test of Independence
Test of Homogeneity
10.3 Practical Considerations
Expected Cell Counts and Validity Conditions
Effect Size and Cramér's V
10.4 Code
Chi-Square Goodness-of-Fit (Manual)
Chi-Square Goodness-of-Fit (scipy)
Independence Test (Manual with Plot)
Independence Test Template Function
Homogeneity Test (scipy)
Homogeneity Residual Heatmap
Fisher's Exact Test (2×2)
McNemar's Test (Paired Binary)
Cochran's Q Test (k Related Outcomes)
10.5 Exercises
11 Analysis of Variance
Chapter Overview
11.1 One-Way Analysis of Variance
Model and Assumptions
F-Test Procedure
11.2 Two-Way Analysis of Variance
Main Effects and Blocking
Interaction Effects
11.3 Post-Hoc Comparisons
Tukey Honestly Significant Difference
Bonferroni and Scheffé Methods
Dunnett's Test (vs Control)
Games–Howell (Unequal Variances)
11.4 Welch's Analysis of Variance
Welch's One-Way Analysis of Variance
Welch's Two-Way Analysis of Variance
11.5 Assumptions
Assumptions Overview
Checking Normality of Residuals
Checking Independence of Observations
Checking Homoscedasticity
Checking Linearity
11.6 Diagnostics
Residual Analysis
Influential Data Points
Handling Assumption Violations
11.7 Practical Applications
A/B Testing and Experimental Design
Financial Applications of Analysis of Variance
Case Studies
11.8 Code
Analysis of Variance Diagnostics
Post-Hoc Comparison Examples
One-Way Analysis of Variance End-to-End Pipeline
One-Way Analysis of Variance with scipy and Plots
Two-Way Analysis of Variance End-to-End Pipeline
Interaction Effect Plot
Welch Analysis of Variance Type I Error and Power Simulation
Two-Way Welch Analysis of Variance (Robust HC3)
Analysis of Variance F-Statistic Simulation
Manual Analysis of Variance with Fisher Method
Bartlett Test
Chi-Square Test for Variance
Chi-Square Distribution
F-Test for Equality of Variances
Levene Test
11.9 Exercises
12 Correlation and Causation
Chapter Overview
12.1 Correlation
Pearson Correlation Coefficient
Spearman Rank Correlation
Kendall's Tau
Partial Correlation
Point-Biserial and Phi Coefficients
Understanding Correlation
12.2 Ecological Correlation
Ecological Fallacy
Simpson's Paradox
Aggregation Bias
Ecological Correlation Overview
12.3 Correlation, Causation, and Confounding
Confounding Variables
Spurious Correlations
Lurking Variables and Common Causes
Confounding Overview
12.4 Causation
Criteria for Causal Inference
Randomized Experiments and Causation
Instrumental Variables (Introduction)
Directed Acyclic Graphs
Causation Overview
12.5 Correlation Tests
Testing Pearson's r (t-Test for Correlation)
Testing Spearman's ρ
Testing Kendall's τ
Comparing Two Correlations
Correlation Tests Overview
12.6 Correlation Matrix and Visualization
Correlation Heatmaps
Pair Plots and Scatter Matrices
12.7 Code
Correlation Analysis Demonstrations
Causal Inference Simulations
Correlation Visualization
Correlation Ellipse Plot
Covariance from Scratch
Regression Correlation Plot
Confounding and Causation Demo
Correlation vs Causation
12.8 Exercises
13 Linear Regression
Chapter Overview
13.1 Linear Regression
Simple Linear Regression
Multiple Linear Regression
13.2 Estimation and Inference
Least Squares Estimation
Sampling Distributions (Simple Ordinary Least Squares)
Confidence Intervals for Coefficients
Sampling Distributions and Tests (General Least Squares)
13.3 Testing Coefficients
Hypothesis Tests (t-tests)
p-values and Confidence Intervals
Interpretation of Significance
13.4 Interaction and Polynomial Extensions
Interaction Terms
Polynomial Regression
13.5 Assumptions and Diagnostics
Assumptions Overview (LINE)
Linearity Assumption
Independence Assumption
Homoscedasticity Assumption
Normality Assumption
Checking Linearity
Checking Independence
Checking Homoscedasticity
Checking Normality
13.6 Diagnostics
Residual Analysis
Multicollinearity and Variance Inflation Factor
Influence and Leverage (Cook's Distance, DFFITS)
13.7 Performance Metrics
R² and Adjusted R²
Mean Absolute, Mean Squared, and Root Mean Squared Error
Mean Absolute Percentage Error and Other Relative Metrics
Performance Metrics Overview
13.8 Model Selection Criteria
Akaike Information Criterion
Bayesian Information Criterion
Cross-Validation for Model Selection
Stepwise and Best-Subset Selection
Akaike and Bayesian Information Criteria Overview
13.9 Splines and Generalized Additive Models
Splines and Generalized Additive Models Overview
Generalized Additive Models
13.10 Package Usage
statsmodels Ordinary Least Squares Interface
sklearn LinearRegression Interface
Comparison and When to Use Which
13.11 Code
Confidence Interval for Slope (Caffeine Example)
Confidence Interval and Prediction Bands
Ordinary Least Squares Regression Output Reproduction
Multiple Regression Diagnostics
Testing Coefficients Examples
Model Selection Comparison
Weighted Least Squares
Generalized Additive Model Housing Analysis
Regression Diagnostics (Housing)
Residual Sum of Squares Surface Visualization
3D Regression Plane
Cross-Validation Polynomial Model Selection
Step Functions
Splines with Patsy
Regression (Weed Price vs Demographics)
Ordinary Least Squares Simulation (Monte Carlo)
Subset and Stepwise Selection
13.12 Exercises
14 Normality Tests
Chapter Overview
14.1 Introduction to Normality
What Is Normality and Why It Matters
Central Role in Statistical Inference
Normality in Financial Data
14.2 Graphical Methods
Histogram and Density Plots
Quantile-Quantile Plots
Boxplots and Their Interpretation
Quantile-Quantile Plots for Financial Returns
14.3 Descriptive Statistics as Normality Indicators
Skewness and Kurtosis
Skewtest and Kurtosistest
D'Agostino's K-Squared Test
Jarque-Bera Test
14.4 Formal Tests for Normality
Kolmogorov-Smirnov and Lilliefors Tests
Anderson-Darling Test
Shapiro-Wilk Test
14.5 Limitations and Pitfalls
Sample Size Effects on Power
Sensitivity vs Practical Significance
Choosing the Right Test
Limitations Overview
14.6 Dealing with Non-Normal Data
Transformations to Achieve Normality
Bootstrapping as an Alternative
Non-Parametric Methods
14.7 Applications
Normality in t-Tests and Analysis of Variance
Normality in Regression (Residual Diagnostics)
Normality of Financial Returns
Applications Overview
14.8 Code
Graphical Normality Checks
Formal Normality Test Suite
Transformation Demonstrations
Quantile-Quantile Plot with Normality Tests
Quantile-Quantile Plot Confidence Band Simulation
Distribution Shapes via Boxplots
Skewness Test
Kurtosis Test
D'Agostino K² Test
Jarque-Bera Test
Kolmogorov-Smirnov Test
Lilliefors Test
Anderson-Darling Test
Shapiro-Wilk Test
Shapiro-Wilk Power Simulation
Quantile-Quantile Plot Financial Returns
14.9 Exercises
15 Variance Tests
Chapter Overview
15.1 Introduction to Variance Testing
Why Test Variances
Overview of Variance Tests
Assumptions Common to Variance Tests
15.2 Chi-Square Test for Variance
One-Sample Chi-Square Variance Test
Derivation and Distribution Theory
Confidence Interval for σ²
15.3 F-Test for Comparing Two Variances
Two-Sample F-Test
F-Distribution and Degrees of Freedom
Sensitivity to Non-Normality
15.4 Bartlett's Test
Bartlett's Test for Equality of Variances
Derivation and Chi-Square Approximation
Limitations Under Non-Normality
15.5 Robust Tests
Levene's Test
Brown-Forsythe Test
Fligner-Killeen Test
Comparison of Robust Methods
Robust Tests Overview
15.6 Advanced Methods
Bootstrap Variance Testing
Bayesian Variance Testing
Likelihood Ratio Test for Variances
Advanced Methods Overview
15.7 Applications
Pre-Test for Analysis of Variance Homoscedasticity
Variance Testing in Regression
Financial Volatility Comparisons
Applications Overview
15.8 Code
Chi-Squared Test for Variance
F-Test of Equality of Variances
Bartlett's Test
Levene's Test
Chi-Square Distribution
Robust Variance Tests Comparison
F-Test Tail Region Visualization
F-Test Normality Sensitivity and Robust Alternatives
F-Test Power Simulation
Bartlett Test Non-Normality Sensitivity
Levene Test Normal vs Skewed Simulation
Brown-Forsythe Test (scipy)
Fligner-Killeen Test (scipy)
Bootstrap Variance Test
Bayesian Variance Test
15.9 Exercises
16 Non-Parametric Tests
Chapter Overview
16.1 Foundations
When and Why to Use Non-Parametric Tests
Ranks and Rank Transformations
Power Comparison with Parametric Tests
16.2 One-Sample Non-Parametric Tests
Runs Test for Randomness
Sign Test
Wilcoxon Signed-Rank Test
Binomial Test
One-Sample Tests Overview
16.3 Paired-Sample Non-Parametric Tests
Paired Sign Test
Wilcoxon Signed-Rank for Paired Data
Paired Permutation Test
Paired Tests Overview
16.4 Two-Sample Non-Parametric Tests
Wilcoxon Rank-Sum Test
Mann-Whitney U Test
Kolmogorov-Smirnov Two-Sample Test
Two-Sample Tests Overview
16.5 Multi-Group Non-Parametric Tests
Kruskal-Wallis Test
Friedman Test (Repeated Measures)
Mood's Median Test
Post-Hoc Dunn's Test
16.6 Non-Parametric Correlation
Spearman's ρ (Revisited)
Kendall's τ (Revisited)
16.7 Code
Runs Test
Sign Test
Wilcoxon Tests
Two-Sample and Multi-Group Tests
Non-Parametric Test Suite
16.8 Exercises
17 Resampling Methods
Chapter Overview
17.1 Bootstrap Foundations
The Bootstrap Principle
Non-Parametric Bootstrap
Parametric Bootstrap
Resampling Method
Bootstrap Overview
17.2 Bootstrap Confidence Intervals
Percentile Method
Bias-Corrected and Accelerated Bootstrap
Bootstrap-t Method
Visualization of Confidence Levels
Comparison of Bootstrap Confidence Interval Methods
17.3 Bootstrap Hypothesis Testing
Bootstrap Test for a Single Mean
Bootstrap Test for Two Means
Bootstrap Test for Correlation
17.4 Permutation Tests
Permutation Test Foundations
Permutation Test for Two-Sample Location
Permutation Test for Correlation
Permutation Test for Paired Data
Permutation Tests Overview
17.5 Comparison and Practical Guidance
Bootstrap vs Permutation Tests
Number of Resamples and Convergence
When Resampling Fails
Comparison Overview
17.6 Code
Bootstrap Confidence Interval Methods
Bootstrap Hypothesis Tests
Permutation Test Demonstrations
Resampling Methods Comparison
A/B Testing Permutation
Bootstrap Confidence Interval Visualization
Bootstrap Median
Resampling (Shoe Sales A/B Test)
Cross-Validation Methods Comparison
17.7 Exercises
18 Regularization Techniques
Chapter Overview
18.1 Motivation for Regularization
Overfitting and the Bias–Variance Tradeoff
Ill-Conditioned Design Matrices
Multicollinearity and Regularization
18.2 Ridge Regression
Ridge Formulation and Closed-Form Solution
Geometric Interpretation (L2 Penalty)
Bayesian Interpretation (Gaussian Prior)
Ridge Trace and Choosing λ
Ridge Regression Overview
18.3 Lasso Regression
Lasso Formulation and Sparsity
Geometric Interpretation (L1 Penalty)
Bayesian Interpretation (Laplace Prior)
Coordinate Descent Algorithm
Lasso for Feature Selection
Lasso Regression Overview
18.4 Elastic Net
Elastic Net Formulation
Advantages over Pure Ridge and Lasso
Grouping Effect for Correlated Features
Elastic Net Overview
18.5 Hyperparameter Tuning
Cross-Validation for λ Selection
Regularization Path
Information Criteria for Regularized Models
18.6 Dimensionality Reduction Methods
Principal Components and Partial Least Squares Overview
Principal Components Regression
Partial Least Squares
18.7 Overview and Comparison
Ridge vs Lasso vs Elastic Net
Guidelines for Choosing a Method
Overview
18.8 Code
Ridge Regression Examples
Lasso Regression Examples
Elastic Net Examples
Ridge, Lasso, and Elastic Net Comparison
Cross-Validation and λ Tuning
Lasso Housing Regularization Path
Principal Components and Partial Least Squares Examples
18.9 Exercises
19 Logistic Regression
Chapter Overview
19.1 Logistic Regression
Logit Link and Odds
Odds Ratios and Interpretation
Likelihood for Logistic Regression
19.2 Estimation and Inference
Maximum Likelihood Estimation
Newton–Raphson and Iteratively Reweighted Least Squares
Wald and Likelihood Ratio Tests
Deviance and Goodness-of-Fit
19.3 Model Evaluation
Confusion Matrix
Precision, Recall, and F1-Score
Receiver Operating Characteristic and Area Under Curve
Decision Threshold Tuning
Calibration and Brier Score
Handling Imbalanced Data
Evaluation Metrics Overview
19.4 Regularized Logistic Regression
L1 and L2 Regularization for Logistic Regression
Feature Selection via Penalized Likelihood
19.5 Code
Logistic Regression Demonstrations
Receiver Operating Characteristic and Evaluation Metrics
Regularized Logistic Regression
Logistic vs Linear Visualization
Linear and Quadratic Discriminant Analysis Classification
19.6 Exercises
20 Softmax Regression
Chapter Overview
20.1 Softmax Regression
Multinomial Logistic Regression
Softmax Function and Probability Simplex
Numerical Stability (Log-Sum-Exp Trick)
20.2 Estimation and Optimization
Cross-Entropy Loss
Gradient-Based Optimization
Regularization for Softmax
20.3 Model Evaluation and Case Studies
Multiclass Metrics (Accuracy, Confusion Matrix)
Macro, Micro, and Weighted Averaging
MNIST Case Study
20.4 One-vs-Rest and One-vs-One Strategies
One-vs-Rest Approach
One-vs-One Approach
Comparison with Softmax
20.5 Code
Softmax Regression Implementation
Multiclass Evaluation Metrics
MNIST Classification
20.6 Exercises
21 Survival Models
Chapter Overview
21.1 Introduction to Survival Analysis
Time-to-Event Data and Censoring
Types of Censoring (Right, Left, Interval)
Survival and Hazard Functions
Financial Applications (Default, Churn, Duration)
21.2 Non-Parametric Methods
Kaplan-Meier Estimator
Nelson-Aalen Cumulative Hazard
Log-Rank Test
Confidence Intervals for Survival Curves
21.3 Parametric Survival Models
Exponential Model (Constant Hazard)
Weibull Model (Monotone Hazard)
Log-Normal and Log-Logistic Models
Maximum Likelihood for Censored Data
21.4 Cox Proportional Hazards Model
Partial Likelihood and the Cox Model
Interpreting Hazard Ratios
Proportional Hazards Assumption
Model Diagnostics (Schoenfeld Residuals)
21.5 Model Comparison and Selection
Non-Parametric vs Parametric vs Semi-Parametric
Akaike Information Criterion and Concordance Index
21.6 Code
Kaplan-Meier Survival Curves and Log-Rank Test
Parametric Survival Models
Cox Proportional Hazards
Survival Model Comparison
21.7 Exercises
Bonus Chapter
Code
Kaplan-Meier
Maximum Likelihood for Censored Data
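As a minimal sketch of the idea behind this section's topic: under right censoring, an observed event time contributes its density f(t) to the likelihood, while a censored time contributes only the survival function S(t), since we know just that the event had not yet occurred. For an exponential model with rate λ this yields the closed-form MLE λ̂ = (number of events) / (total time at risk), which a numerical optimizer reproduces. The data, function names, and censoring threshold below are illustrative assumptions, not an example from the book.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_likelihood(lam, times, observed):
    """Negative log-likelihood for right-censored exponential data.

    Events contribute log f(t) = log(lam) - lam*t;
    censored points contribute log S(t) = -lam*t.
    """
    times = np.asarray(times, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    if lam <= 0:
        return np.inf
    return -(observed.sum() * np.log(lam) - lam * times.sum())

# Simulated data: true rate 0.5 (mean 2), administrative censoring at t = 3
rng = np.random.default_rng(0)
t_true = rng.exponential(scale=2.0, size=500)
censor = 3.0
times = np.minimum(t_true, censor)       # what we actually observe
observed = t_true <= censor              # True = event, False = censored

# Numerical maximization of the censored log-likelihood
res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10.0),
                      args=(times, observed), method="bounded")
lam_hat = res.x

# Closed form for the exponential model: events / total time at risk
lam_closed = observed.sum() / times.sum()
print(lam_hat, lam_closed)  # both should be near the true rate 0.5
```

Note how ignoring censoring (treating every `times` entry as an event) would overstate the rate, because censored durations are systematically shorter than the true event times they truncate.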