STAT340: Introduction to Data Modeling II, Fall 2022

This course teaches students to apply statistical methods to learn from data. Topics include one- and two-sample inference; an introduction to Bayesian inference and associated probability theory; linear and logistic regression models; the bootstrap; and cross-validation. Students use an integrated statistical computing environment to explore and analyze data, develop models, make inferences, and communicate results in a reproducible manner through a project-oriented approach to learning.

Instructor: Keith Levin, kdlevin | at | wisc | dot | edu
TAs:
      Nursultan Azhimuratov, azhimuratov | at | wisc | dot | edu
      Alex Hayes, alex.hayes | at | wisc | dot | edu
      Shane Huang, shuang457 | at | wisc | dot | edu
      Joseph Salzer, jsalzer | at | wisc | dot | edu
Lectures:
      Section 001: TuTh, 11:00AM-12:15PM in Bardeen 140
      Section 002: TuTh, 2:30PM-3:45PM in Van Vleck B130
Office Hours:
      Keith Levin: Wednesdays 12pm-2pm in Medical Sciences Center 6170
      Nursultan Azhimuratov: Mondays 1pm-3pm in Medical Sciences Center 1274
      Alex Hayes: Wednesdays 10am-12pm in Medical Sciences Center 1475
      Shane Huang: Tuesdays and Thursdays 1pm-2pm in Medical Sciences Center 1274
      Joseph Salzer: Tuesdays 10am-11am in Medical Sciences Center 1217C and Tuesdays 5pm-6pm in Medical Sciences Center 1274
Textbook: We will make reference to a variety of textbooks this semester, all available online:
      Introduction to Data Science ("IntroDS") by Rafael Irizarry
      R for Data Science ("R4DS") by Hadley Wickham and Garrett Grolemund
      Introduction to Probability and Statistics Using R ("IPSUR") by G. Jay Kerns
      Introduction to Probability for Data Science ("Prob4DS") by Stanley H. Chan
      An Introduction to Statistical Learning, 2nd Edition ("ISLR") by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
Syllabus: available here
Prerequisites: STAT 240 and one of MATH 217, 221, or 275

Schedule (each week lists dates, topic, readings, and notes):
Week 0
Sep 8
Course introduction and administrivia
  • Review R materials from STAT240 (recommended)
  • Irizarry Chapters 2,3,8 or R4DS Chapters 19-21 (recommended)
  • Slides; Lecture notes; Lecture notes source files
  • HW00 due Sep 15
Week 1
Sep 13,15
Probability review and random variables
  • IPSUR Chapters 5 and 6 or Prob4DS Chapters 3 and 4 (recommended)
  • Lecture notes; Source; Discussion section
  • HW01 due Sep 22
Week 2
Sep 20,22
Introduction to Monte Carlo
  • Introduction to Monte Carlo Simulation by R. L. Harrison (recommended)
  • Lecture notes; Source; Discussion section
  • HW02 due Sep 29
Week 3
Sep 27,29
Hypothesis testing
  • Chan Sections 9.3 and 9.4 (recommended)
  • IPSUR Section 10.1 (recommended)
  • There is only one test! by Allen Downey (recommended)
  • Permutation Test: Visual Explanation by Jared Wilber (required)
  • Lecture notes; Source; Discussion section
  • HW03 due Oct 6
Week 4
Oct 4,6
Hypothesis testing, cont'd
  • Chan Sections 9.4 and 9.5 (recommended)
  • IPSUR Sections 10.2 and 10.4 (recommended)
  • Lecture notes; Source; Discussion section
  • HW04 due Oct 13
Week 5
Oct 11,13
Independence, Conditional Probability and Bayes' Rule
  • Chan Section 2.4 (recommended)
  • An introduction to Bayes' rule by James Stone (recommended)
  • Chapters 1 and 2 of Think Bayes by Allen Downey
  • Lecture notes; Source; Discussion section
  • HW05 due Oct 20
Week 6
Oct 18,20
Estimation
  • IntroDS Sections 15.1, 15.2 (recommended)
  • IPSUR 8.1 and 8.2 (recommended)
  • Lecture notes; Source; Discussion section
  • HW06 due Oct 27
Week 7
Oct 25,27
Estimation, cont'd
  • IntroDS Sections 15.4, 15.6 (recommended)
  • IPSUR Chapter 9 (recommended)
  • Lecture notes; Source; Discussion section
  • HW07 due Nov 3
Week 8
Nov 1,3
Prediction: simple linear regression
  • ISLR 3.1-3.3 (required)
  • Overview of simple linear regression based on material from Probability and Statistics for Engineering and the Sciences by Jay Devore (recommended)
  • IPSUR Chapter 11 (recommended)
  • Lecture notes; Source; Discussion section
  • HW08 due Nov 10
Week 9
Nov 8,10
Prediction: multiple linear regression
  • ISLR 3.1-3.3 (required)
  • ISLR Section 3.6 (required)
  • IPSUR Chapter 12 (recommended)
  • Prob4DS Chapter 7 (recommended)
  • IntroDS Section 18.7 Case study: Moneyball (recommended; see Section 18.1 for background)
  • Lecture notes; Source; Discussion section
  • HW09 due Nov 17
Week 10
Nov 15,17
Prediction: logistic regression
  • ISLR Sections 4.1-4.3 (required)
  • Overview of logistic regression by Ashutosh Tripathi (recommended)
  • Lecture notes; Source; Discussion section
  • HW10 due Dec 1
Week 11
Nov 22
One-off lecture: causal inference
  • Lecture notes; Source
Week 12
Nov 29, Dec 1
Cross-validation and model selection
  • ISLR Sections 5.1, 6.1, 6.2 (required)
  • Lecture notes; Source; Discussion section
  • HW11 due Dec 8
Week 13
Dec 6,8
The bootstrap
  • ISLR Sections 5.2, 5.3 (required)
  • Lecture notes; Source; Discussion section
Week 14
Dec 13
Recap and exam review
  • Lecture notes; Source