Computational Statistics and Data Analysis (MVComp2)

Summer Semester 2024. April 15, 2024 to July 19, 2024

Lectures: Wed 11-13; Exercises: Wed 14-16

Lectures: INF 227 01.403/404; Exercises: INF 227 01.403/404

Lecturers:

Tutors: Luis Walter, Florian Hess, Alena Braendle, Max Ingo Thurm

6 credit points

Course description

This lecture introduces basic methods and approaches in computational statistics and data analysis that are of great importance for empirical problems in the natural sciences. It gives an overview of relevant concepts and theorems in probability theory and statistics and then proceeds to more modern approaches, including automatic differentiation and machine learning. Lectures will be accompanied by computational exercises in Python. Students will learn to analyze data sets and interpret the results from a solid, theoretically grounded statistical perspective; devise statistical and machine learning models of experimental situations; infer the parameters of these models from empirical observations; and test hypotheses.

Prerequisites

  • Linear (Matrix) Algebra
  • Basic calculus (derivatives & integrals)
  • Basic programming skills in Python

Tentative course outline

  1. Basic concepts in probability theory
  2. Random variables; expectations, variances, covariances, and their properties
  3. Discrete & continuous probability distributions
  4. Moment-generating functions, central limit theorem, and multivariate distributions
  5. Statistical models & inference: parameter estimation
  6. Hypothesis tests: tests, confidence intervals, bootstrap method
  7. Linear regression: least squares, generalized linear models
  8. Regularization: Ridge & LASSO regression, MAP estimation
  9. Nonlinear regression: basis expansions, neural networks
  10. Classification: k-nearest neighbors, logistic regression, linear discriminant analysis
  11. Kernel methods: Mercer kernels, Gaussian processes, support vector machines
  12. Model selection: Jeffreys scale, BIC, bias-variance tradeoff
  13. Dimensionality reduction: principal component analysis, factor analysis
  14. Information theory

Main references