University of Wisconsin-Madison
Statistics 327-3: Advanced Data Analysis with R

Check for updates to this tentative syllabus.

Students will integrate R with high performance computing tools to do scientific computing at an introductory level. Here is a course map.

Name               Office                          Office hour                             Email (please use our Q&A forum for most things)
Yang, Bo           Medical Sciences Center 1227    Thur 11:50-12:50 am or by appointment
Li, Qing           Medical Sciences Center 1217A   Tues 3:00-4:00 pm
TA Kim, Yongjoon

Class Times
Lecture 327-003 (Teacher Li, Qing)   Tues, Thur, 11:00 am-12:15 pm   STERLING 2301
Lecture 327-006 (Teacher Yang, Bo)   Tues, Thur, 1:00-2:15 pm        STERLING 2301
Lecture 327-009 (Teacher Yang, Bo)   Tues, Thur, 2:30-3:45 pm        STERLING 2301

STAT 327: Intermediate Data Analysis with R

No textbook is required. We'll provide course notes, and we'll read R documentation and write R code.

Optional Online Reading
R for Data Science by Garrett Grolemund and Hadley Wickham
Advanced R by Hadley Wickham
An Introduction to R (pdf) by W. N. Venables, D. M. Smith and the R Development Core Team
Intro to R video lectures by Google Developers
R Programming wikibook
Using R for Data Analysis and Graphics by J. H. Maindonald
The R Inferno by Patrick Burns

Optional Reference Books
R for Data Science by Garrett Grolemund and Hadley Wickham
Advanced R by Hadley Wickham
Introductory Statistics with R by Peter Dalgaard (2008)
R in a Nutshell by Joseph Adler (2009)
A Beginner's Guide to R by Alain F. Zuur, Elena N. Ieno, and Erik Meesters (2009)
Software for Data Analysis: Programming with R by John Chambers (2008) (advanced)
Modern Applied Statistics with S by W.N. Venables and B.D. Ripley (2002)

A laptop is required in class.

Many questions outside of class should be posted at our Q&A forum. Please feel free to write answers when you know them. We are eager to help in class and office hours too.

These points are available (we might revise this as we write course materials):
≈ 3 R scripts or projects     ≈ 80
group practice exercises      ≈ 20
Answer questions in Piazza    ≈ 2

We'll assign grades according to the percentage scale, A = [92,100], AB = [88,92), B = [82,88), BC = [78,82), C = [70,78), D = [60,70), F = [0,60) (92% of points => A); and according to the percentile scale, A = 70, AB = 60, B = 45, BC = 30, C = 10, D = 5, F = 0 (performing better than 70% of the class => A). Your grade will be the higher of these two grades.
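
As a rough illustration of the "higher of two grades" rule above, here is a minimal R sketch. The function and variable names are hypothetical and not part of the course materials. It also assumes that each percentile cutoff is inclusive (a percentile of exactly 70 earns an A), which the syllabus does not state explicitly.

```r
# Cutoffs copied from the grading paragraph above, ordered from F up to A.
# All names here are illustrative, not part of any course code.
grade_labels       <- c("F", "D", "C", "BC", "B", "AB", "A")
percentage_cutoffs <- c(0, 60, 70, 78, 82, 88, 92)
percentile_cutoffs <- c(0, 5, 10, 30, 45, 60, 70)

# Index (1 = F, ..., 7 = A) of the highest grade whose lower cutoff is met.
# findInterval() counts how many cutoffs are <= score.
grade_index <- function(score, cutoffs) {
  findInterval(score, cutoffs)
}

# Final letter grade: the higher of the percentage-scale grade and the
# percentile-scale grade (assumes inclusive percentile cutoffs).
final_grade <- function(percentage, percentile) {
  idx <- max(grade_index(percentage, percentage_cutoffs),
             grade_index(percentile, percentile_cutoffs))
  grade_labels[idx]
}

final_grade(85, 72)  # percentage scale gives B, percentile scale gives A
final_grade(75, 50)  # percentage scale gives C, percentile scale gives B
```

In both sample calls the percentile scale yields the higher grade, so it determines the result.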

If you anticipate religious or other conflicts with course requirements, or if you require accommodation due to disability, you must notify us during the first two weeks of class. You may not make up missed quizzes, homework, or exams, except in the rare case of a documented, serious problem beyond your control.

We encourage you to discuss the course, including the online quizzes, with others, but you must write the R scripts and the exam by yourself and prevent others from copying your work. (See the UW Academic Misconduct policy.)

Tentative Schedule
Day #: Date    Subject    Homework Due (11:59 p.m.)
01: Tue 4/10/18 (Install R and RStudio)
    (Auditors: email sign up)
    Optimization (goldenSectionSearch.R)
    Group practice on optimization (optimization.Rmd, p. 1: optimize())
    preview hw1, below
02: Thu 4/12 Optimization, continued (gradientDescent.R, Newton.R, NelderMead.R)
    Discuss hw1
    Finish Group practice (submit one per group), p. 2: optim()
03: Tue 4/17 Generic function programming
    Creating an R package (jgUtilities, jgUtilities_0.1.tar.gz)
    hw1.Rmd (submit) (login help)
04: Thu 4/19 Discuss hw2
    Debugging (numbersBug.txt, baby.dbinom.R)
05: Tue 4/24 Profiling, timing, and code efficiency
    (5profile.R, nflProfile1.R, nflProfile2.R, loopTiming.R)
    hw2.tar.gz (submit)
06: Thu 4/26 Discuss hw3
    Multicore computing for embarrassingly parallel problems
    (nfl.R, mandelbrot.R, escape.time.R)
07: Tue 5/1 Group practice review (submit later)
08: Thu 5/3 Calling C++ from R via Rcpp (escapeTime.cpp, mandelbrotRcpp.R)
    Group practice, continued (submit one per group)
    hw3.Rmd (submit)