STAT 479 -- Deep Learning (Spring 2019)

Table of Contents
Course Logistics
- When
- Where
- Instructors
- Office Hours
Course Description
Resources
Class Project
Grading
Other Important Course Information
Schedule
- Topics Summary
- Calendar
Project Presentation Awards

Course Logistics

When

Mon 11:00-11:50 am
Wed 11:00-11:50 am
Fri 11:00-11:50 am

Where

Psychology 121

Instructors

Instructor: Sebastian Raschka
Teaching Assistant: Youran Qi

Office Hours

Sebastian Raschka:
- Wed 2:00-3:00 pm, Room MSC 1171
- Mon 2:30-3:30 pm, Room MSC 1171
Youran Qi:
- Wed 9:00-11:00 am (or by appointment), Room MSC B315

Course Description

Credits: 3

Course Description:

Deep learning is an exciting, young field that specializes in discovering and extracting intricate structures in large, unstructured datasets for parameterizing artificial neural networks with many layers. Since deep learning has pushed the state-of-the-art in many applications, it’s become indispensable for modern technology. This is owed to the vast utility of deep learning for tackling complex tasks in the fields of computer vision and natural language processing – tasks that humans are good at but are traditionally challenging for computers. This includes tasks such as image classification, object detection, and speech recognition.

The focus of this course will be on understanding artificial neural networks and deep learning algorithmically (discussing the math behind these methods on a basic level) and implementing network models in code as well as applying these to real-world datasets. Some of the topics that will be covered include convolutional neural networks for image classification and object detection, recurrent neural networks for modeling text, and generative adversarial networks for generating new data.

Familiarity with general machine learning concepts (such as those covered in the FS2018 STAT479: Machine Learning course) is recommended but not required. We will review relevant background concepts such as supervised learning, classification, and model evaluation. Furthermore, some lectures will focus on reviewing the use of Python’s stack for scientific computing (NumPy, SciPy, matplotlib) prior to the introduction of PyTorch as the main computational deep learning library that we are going to use in this course.

Learning Outcomes:

Developing an advanced understanding of deep learning and artificial intelligence, which represent state-of-the-art approaches for predictive modeling in today’s data-driven world, and identifying scenarios where it makes sense to apply deep learning for real-world problem-solving.
Building a repertoire of different algorithms and approaches to deep learning and understanding their various strengths and weaknesses.
Learning how to use the Python programming language and Python’s scientific computing stack for implementing machine learning algorithms to 1) enhance the learning experience, 2) conduct research and be able to develop novel algorithms, and 3) apply machine learning to problem-solving in various fields and application areas.
Being able to think about approaching problems with the desired outcome in mind, to navigate the typical trade-off between computational efficiency, model interpretability, and predictive accuracy effectively.
Applying both the theoretical and practical concepts taught in this class to creative, real-world problem solving, and completing a project that can optionally be shared on a resume.

Course Prerequisites: Consent of instructor.

Course Audience: Students majoring in math or statistics or those wishing to take additional statistics courses.


Resources

Deep Learning (mildly recommended)

“Deep Learning” by Ian Goodfellow and Yoshua Bengio and Aaron Courville, MIT Press

Deep learning is a relatively young field that is advancing at a rapid pace. Unfortunately, there is no good textbook resource available for this topic. This book is an “older” book (~2014) that covers some of the topics we will discuss in class. Personally, I think this book is not ideal for teaching and probably more of a summary or reference resource.

Hence, the lecture will not be based on this book, but you may find it useful still. A free digital version shared by the authors can be found at https://www.deeplearningbook.org.

However, in the field of deep learning, it is highly recommended to consider reading the original papers, which I will link in this course.

PyTorch (highly recommended)

Regarding computational technologies for deep learning, there is also no good textbook resource available yet. My deep learning background started with Theano, and I have been an avid TensorFlow user since its release in 2015. However, my own research is now more heavily focused on PyTorch, as it is more convenient to work with (and even a tad faster on single- and multi-GPU workstations).

While there is no good textbook available on PyTorch, the excellent official online documentation is the best go-to resource for PyTorch: https://pytorch.org

Tutorials: https://pytorch.org/tutorials/
API docs: https://pytorch.org/docs/stable/index.html
Note that we will be using PyTorch 1.0 (latest version as of today) in this class.

Illustrated Guide to Python (recommended)

“Illustrated Guide to Python 3: A Complete Walkthrough of Beginning Python with Unique Illustrations Showing how Python Really Works. Now covering Python 3.6 (Treading on Python) (Volume 1)” by Matt Harrison, ISBN-13: 978-1977921758.

This book will not be covered in class. However, some readers asked me for good Python resources as preparation for this class, and this is one of the resources I would recommend. There are also many other Python learning resources available online.

For instance, another great book is Allen Downey’s Think Python 2e (free PDF available at https://greenteapress.com/wp/think-python-2e/). Depending on your preferred learning style, also consider learning Python interactively instead of, or in addition to, reading a Python book. A great interactive resource for learning Python is Codecademy: https://www.codecademy.com. In particular, there is a free, < 10 hr interactive course: https://www.codecademy.com/learn/learn-python.

Python’s scientific computing stack (highly recommended)

While we will be primarily focusing on PyTorch, it will be extremely convenient if you develop a basic understanding of Python’s scientific computing stack: NumPy (linear algebra library), SciPy (additional scientific functions), Matplotlib (plotting), and Pandas (data wrangling).

Please have a look at last year’s lecture notes regarding Python: https://github.com/rasbt/stat479-machine-learning-fs18/blob/master/03_python/
Regarding NumPy, please take a look at the introduction from last year’s Machine Learning course for a concise summary: https://github.com/rasbt/stat479-machine-learning-fs18/tree/master/04_scipython

Class Project

Overview

The goal of working on a class project is three-fold. First, it will provide you with the opportunity to apply the concepts learned in this class creatively, which helps you understand the material more deeply. Second, designing and working on a unique project in a team is something that you will encounter, if you haven’t already, sooner rather than later in life, and this course project helps you prepare for that. Third, along with the opportunity to practice and the satisfaction of working creatively, students can use this project to enhance their portfolio or resume.

Note about grading

There is no “perfect project.” While you are encouraged to be ambitious, the most important aspect of this project is your learning experience. Hence, you don’t want to pick something that is too easy for you, but similarly, you don’t want to choose a project that may be out of the scope of this class. The project is not graded by how exciting it is but based on whether you follow the objectives of the project proposal, project presentation, and project report. For instance, if your project ends up being unsuccessful – for example, if you choose to design a classifier and it doesn’t achieve the desired accuracy – it will not negatively affect your grade as long as you are honest, describe the potential issues well, and suggest improvements or further experiments. Again, the objective of this project is to provide you with hands-on practice and an opportunity to learn.

The project consists of 3 parts: a project proposal, a short project presentation, and a project report. The expectations for each part will be discussed in the following sections.

1) Project Proposal

Please note that you should use the proposal-latex file(s) for writing and submitting your proposal!

The main purpose of the project proposal is to receive feedback from the TAs/the instructor regarding whether your project is feasible and whether it is within the scope of this class. Also, the project proposal offers a chance to receive useful feedback and suggestions on your project.

For this project, you will be working in a team consisting of three students. You are encouraged to form groups by yourself as discussed in class. If you cannot find group members, the TA and I will randomly assign you to a group. If you have any concerns working with someone in your group, please talk to a TA or the instructor for accommodations.

Proposal Format:

The project proposal is a 1-3 page document (800-1200 words) excluding references.
You are encouraged (not required) to use 1-2 figures to illustrate technical concepts.
The proposal must be formatted and submitted as a PDF document (the submission deadline will be announced later via the schedule & email).

Introduction:

Describe what you are planning to do.
Briefly describe related work (if applicable).

Motivation:

Describe why your project is interesting. E.g., you can describe why your project could have a broader societal impact. Or, you may describe the motivation from a personal learning perspective.

Evaluation:

What would the successful outcome of your project look like? In other words, under which circumstances would you consider your project to be “successful?”
How do you measure success, specific to this project, from a technical standpoint?

Resources:

What resources are you going to use (datasets, computer hardware, computational tools, etc.)?

Contributions:

You are expected to share the workload evenly, and every group member is expected to participate in both the experiments and writing. (As a group, you only need to submit one proposal and one report, though. So you need to work together and coordinate your efforts.)

Clearly indicate what computational and writing task each member of your group will be participating in.

It is crucial that you talk to each other regularly!!! Schedule regular meetings and/or use online communication tools (e.g., Gitter, Slack, or email) to stay in touch with your group members throughout the semester regarding the process of your project.

Modifications to the Proposal. After you have received feedback from the TAs/the instructor and your project proposal has been graded, you are advised to stick to the project outline in the proposal as closely as possible. However, if there is a concept introduced in a later lecture (for instance, a machine learning algorithm that you think is more appropriate than the one you proposed), you have the option to modify your proposal, but you are not penalized if you don’t. If you wish to update your project outline, talk to a TA first.

2) Project Presentation

During the last three lectures, you will be presenting your project to the class. The presentation is “free form” but should cover the following:

introduce the topic to a general audience (your class);
summarize the main approach or method;
highlight the outcomes of your project.

The presentation should be 8-10 minutes long, with an additional 2 minutes reserved for questions. All members of the group should participate in the presentation.

To encourage attendance, we will use a random number generator in class to determine the order in which the groups will present.
Please bring your own device for the presentation (we have a VGA and an HDMI cable for this projector). Further, I will provide the following connectors: Displayport-to-HDMI, Displayport-to-VGA, USB-C-to-VGA, USB-C-to-HDMI, Lightning-to-HDMI (for iPad).
There will be 3 awards:
1. Best Oral Presentation
2. Most Creative Project
3. Best Visualizations
The awards will be determined by voting. Each student will fill out a card in class (I will provide the cards), rating each presentation on a scale from 1-10 (where 10 is best) in each of the 3 categories, and I will collect the cards at the end of the lecture.

The voting card should be filled out as follows:

Title of the Presentation, x/10, y/10, z/10
Title of the Presentation, x/10, y/10, z/10 …

where

x are the points for 1. Best Oral Presentation
y are the points for 2. Most Creative Project
z are the points for 3. Best Visualizations

The awards will be computed based on the highest number of points for each category. However, one project can only receive one of the prizes.
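The voting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the project titles and scores are made up, per-category averages are used as the "points", and the categories are resolved in the order listed (the syllabus does not say how a project leading in several categories is assigned its single prize).

```python
from statistics import mean

# Category order follows the voting card: x, y, z.
CATEGORIES = (
    "Best Oral Presentation",
    "Most Creative Project",
    "Best Visualizations",
)


def average_scores(cards):
    """Average the (x, y, z) points per presentation over all voting cards.

    `cards` is a list of dicts mapping a presentation title to its
    (x, y, z) points on one student's card.
    """
    collected = {}
    for card in cards:
        for title, points in card.items():
            collected.setdefault(title, []).append(points)
    return {
        title: tuple(mean(p[i] for p in rows) for i in range(len(CATEGORIES)))
        for title, rows in collected.items()
    }


def assign_awards(averages):
    """Pick the top project per category; each project wins at most once.

    Assumption: categories are resolved in the order of CATEGORIES.
    """
    winners = {}
    already_awarded = set()
    for i, category in enumerate(CATEGORIES):
        candidates = [t for t in averages if t not in already_awarded]
        if not candidates:
            break
        best = max(candidates, key=lambda t: averages[t][i])
        winners[category] = best
        already_awarded.add(best)
    return winners


# Hypothetical voting cards from two students.
cards = [
    {"Project A": (9, 9, 8), "Project B": (7, 8, 9), "Project C": (6, 8, 7)},
    {"Project A": (8, 9, 9), "Project B": (8, 7, 8), "Project C": (7, 9, 6)},
]
averages = average_scores(cards)
winners = assign_awards(averages)
for category, title in winners.items():
    print(f"{category}: {title}")
```

In this made-up example, Project A leads in all three categories but can only take one prize, so the remaining two awards go to the next-best projects that have not yet won.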

Each of the three cards handed in will provide 3 bonus points towards your project report grade (9 pts in total).

3) Project Report

The project report is expected to be 6-8 pages long (excluding references) and should contain the following sections:

Introduction
Related Work
Proposed Method
Experiments
Results and Discussion
Conclusions
Contributions

More details are provided in the LaTeX report template at https://github.com/rasbt/stat479-deep-learning-ss19/tree/master/report-template.

Please note that you should use the report-latex file for writing and submitting your report!

Also, you are required to submit all the code, computations, and experiments you developed and conducted for this project. Note that the quality of the code will not have any influence on your grade and will merely serve as a basis to establish that the report contains original and “real” results.

You are encouraged to share your project/final project report online after you have completed the course – for example, via GitHub or on a personal website.

Below is a list of examples from last year’s Machine Learning (not deep learning) class:

Grading

The final grade will be computed using the following weighted grading scheme:

20% Problem Sets
50% Exams:
- 20% Midterm Exam
- 30% Final Exam
30% Class Project:
- 5% Project proposal
- 10% Project presentation
- 15% Project report
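As a quick illustration of how the weights above combine, here is a minimal sketch in Python. The component names and the example scores are hypothetical, each score is assumed to be a percentage from 0 to 100, and the presentation bonus points are not modeled because the syllabus does not specify how they are applied.

```python
# Weights taken from the grading scheme above (fractions of the final grade).
WEIGHTS = {
    "problem_sets": 0.20,
    "midterm_exam": 0.20,
    "final_exam": 0.30,
    "project_proposal": 0.05,
    "project_presentation": 0.10,
    "project_report": 0.15,
}


def final_grade(scores):
    """Return the weighted final grade for per-component scores in percent."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores (percent per component).
example_scores = {
    "problem_sets": 90,
    "midterm_exam": 80,
    "final_exam": 85,
    "project_proposal": 100,
    "project_presentation": 95,
    "project_report": 88,
}
grade = final_grade(example_scores)
print(round(grade, 1))  # 87.2
```

The two exam weights add up to the stated 50%, and the three project weights add up to the stated 30%.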

Other Important Course Information

RULES, RIGHTS & RESPONSIBILITIES

See the Guide’s Rules, Rights and Responsibilities

ACADEMIC INTEGRITY

By enrolling in this course, each student assumes the responsibilities of an active participant in UW-Madison’s community of scholars in which everyone’s academic work and behavior are held to the highest academic integrity standards. Academic misconduct compromises the integrity of the university. Cheating, fabrication, plagiarism, unauthorized collaboration, and helping others commit these acts are examples of academic misconduct, which can result in disciplinary action. This includes but is not limited to failure on the assignment/course, disciplinary probation, or suspension. Substantial or repeated cases of misconduct will be forwarded to the Office of Student Conduct & Community Standards for additional review. For more information, refer to studentconduct.wiscweb.wisc.edu/academic-integrity/.

ACCOMMODATIONS FOR STUDENTS WITH DISABILITIES

McBurney Disability Resource Center syllabus statement: “The University of Wisconsin-Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW-Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities is a shared faculty and student responsibility. Students are expected to inform faculty [me] of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized. Faculty [I], will work either directly with the student [you] or in coordination with the McBurney Center to identify and provide reasonable instructional accommodations. Disability information, including instructional accommodations as part of a student’s educational record, is confidential and protected under FERPA.” http://mcburney.wisc.edu/facstaffother/faculty/syllabus.php

DIVERSITY & INCLUSION

Institutional statement on diversity: “Diversity is a source of strength, creativity, and innovation for UW-Madison. We value the contributions of each person and respect the profound ways their identity, culture, background, experience, status, abilities, and opinion enrich the university community. We commit ourselves to the pursuit of excellence in teaching, research, outreach, and diversity as inextricably linked goals.

The University of Wisconsin-Madison fulfills its public mission by creating a welcoming and inclusive community for people from every background – people who as students, faculty, and staff serve Wisconsin and the world.” https://diversity.wisc.edu/

Schedule

Note that this is a tentative schedule subject to changes.

Below is a list of topics we aim to cover. However, we will take our time, and it is more important to build a good understanding of the core concepts and the field in general rather than covering one more algorithm. Keep in mind that a good foundation will enable you to study and understand additional algorithms if the need arises.

Topics Summary

History of neural networks and what makes deep learning different from “classic machine learning”
Introduction to the concept of neural networks by connecting it to familiar concepts such as logistic regression and multinomial logistic regression (which can be seen as special cases: single-layer neural nets)
Modeling and deriving non-convex loss functions through computation graphs
Introduction to automatic differentiation and PyTorch for efficient data manipulation using GPUs
Convolutional neural networks for image analysis
1D convolutions for sequence analysis
Sequence analysis with recurrent neural networks
Generative models to sample from input distributions
- Autoencoders
- Variational autoencoders
- Generative Adversarial Networks
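
To make the connection between logistic regression and single-layer neural nets from the list above concrete, here is a minimal pure-Python sketch. It is not taken from the lecture material; the function names, weights, and inputs are hypothetical and chosen only for illustration. Logistic regression is a single unit that applies a sigmoid to a weighted sum, and multinomial logistic regression is a single layer of such weighted sums followed by a softmax.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, mapping a net input to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def net_input(x, w, b):
    """Weighted sum of the inputs plus a bias: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def logistic_neuron(x, w, b):
    """Logistic regression viewed as a single-layer net with one output unit."""
    return sigmoid(net_input(x, w, b))


def softmax(zs):
    """Softmax over a list of net inputs (shifted by the max for stability)."""
    shift = max(zs)
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_layer(x, W, b):
    """Multinomial logistic regression: one net input per class, then softmax."""
    return softmax([net_input(x, row, bi) for row, bi in zip(W, b)])


# Hypothetical example: 2 input features, 3 classes.
x = [1.0, 2.0]
p_binary = logistic_neuron(x, w=[0.5, -0.25], b=0.0)
p_classes = softmax_layer(
    x, W=[[0.1, 0.2], [0.0, -0.3], [0.4, 0.1]], b=[0.0, 0.1, -0.1]
)
print(p_binary, p_classes)
```

Stacking further layers between the input and this output layer is what turns the model into a multilayer network.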

Calendar

Date

Event

Description

Lecture Material

Announcements

Wed,
Jan 23

Day 1

● Course Overview
● L01: Intro to DL

● [L01: Intro to DL -- Slides]

Fri,
Jan 25

Day 2

L01: Intro to DL cont'd

Mon,
Jan 28

Day 3

● L02: DL history

● [L02: DL History -- Slides]

Wed,
Jan 30

Day 4

Canceled due to weather-related
campus closure

Fri,
Feb 01

Day 5

● L03: The Perceptron

● [L03: Perceptron -- Slides]

[L03: Perceptron -- Code]

● Start working on HW1
Due on Thu Feb 07 (11:59 pm)

Mon,
Feb 04

Day 6

L03: The Perceptron cont'd

Wed,
Feb 06

Day 7

● L04: Linear Algebra for Deep Learning

● [L04: Linear Algebra
for DL -- Slides]

Fri,
Feb 08

Day 8

L04: Linear Algebra for DL cont'd

Deadline for
Group Assignments

Mon,
Feb 11

Day 9

● L05: Fitting Neuron Models with
Gradient Descent

● [L05: Gradient Descent -- Slides]

[L05: Linear Regression -- Code]
[L05: ADALINE -- Code]
[L05: 2nd order partial derivatives -- Code]

Wed,
Feb 13

Day 10

L05: Fitting Neuron Models... cont'd

● Start working on HW2
Due on Thu Feb 21 (11:59 pm)

Fri,
Feb 15

Day 11

● L06: Automatic Differentiation
with PyTorch

● [L06: PyTorch -- Slides]

[L06: PyTorch Autograd -- Code]
[L06: Autograd ADALINE -- Code]
[L06: Autograd Intermediate Var. -- Code]

Mon,
Feb 18

Day 12

L06: Automatic Differentiation
with PyTorch cont'd

[OpenAI discussion links]

Wed,
Feb 20

Day 13

● L07: Cloud Computing

● [L07: Cloud Computing -- Slides]

Fri,
Feb 22

Day 14

● L08: Logistic Reg. & Multiclass

● [L08: Logistic -- Slides]

[L08: Logistic Regr. -- Code]
[L08: Cross Entropy -- Code]
[L08: Softmax from Scratch -- Code]
[L08: Softmax MNIST -- Code]

Mon,
Feb 25

Day 15

HW2 discussion
L08: Logistic Reg. & Multiclass cont'd

Wed,
Feb 27

Day 16

L08: Logistic Reg. & Multiclass cont'd

Fri,
Mar 01

Day 17

● L09: Multilayer Perceptrons

● [L09: Multilayer Perceptrons -- Slides]

[L09: MLP from Scratch -- Code]
[L09: MLP in PyTorch -- Code]
[L09: XOR Problem -- Code]
[L09: DataLoader -- Code]

● Start working on HW3
Due on Fri Mar 8 (11:59 pm)

Mon,
Mar 04

Day 18

L09: Multilayer Perceptrons cont'd

Wed,
Mar 06

Day 19

L09: Multilayer Perceptrons cont'd

Fri,
Mar 08

Day 20

● L10: Regularization

● [L10: Regularization -- Slides]

Optional DL Competition
Announcement
See Current Ranking

Mon,
Mar 11

Day 21

L10: Regularization cont'd

Wed,
Mar 13

Day 22

Midterm
Exam

Fri,
Mar 15

Day 23

● L11: Normalization and Weight Initialization

● [L11: Normalization and Weight Init. -- Slides]

Submit Project
Proposal

Mon,
Mar 18

Spring recess

Wed,
Mar 20

Spring recess

Fri,
Mar 22

Spring recess

Mon,
Mar 25

Day 24

L11: Normalization and Weight Initialization cont'd

Wed,
Mar 27

Day 25

● L12: Learning Rates and Optimization

● [L12: Learning Rates and Optimization -- Slides]

Fri,
Mar 29

Day 26

L12: Learning Rates and Optimization cont'd

Mon,
Apr 01

Day 27

● L13: Intro to ConvNets (Part 1)

● [L13: Intro to ConvNets (Part 1) -- Slides]

[L13 -- Code Examples]

Wed,
Apr 03

Day 28

L13: Intro to ConvNets (Part 1) cont'd

● Start working on HW4
Due on Fri, Apr 12 (11:59 pm)

Fri,
Apr 05

Day 29

L13: Intro to ConvNets (Part 1) cont'd

Mon,
Apr 08

Day 30

● L13: Intro to ConvNets (Part 2)

● [L13: Intro to ConvNets (Part 2) -- Slides]

[L13 -- Code Examples]

Wed,
Apr 10

Day 31

L13: Intro to ConvNets (Part 2) cont'd
● L13: Intro to ConvNets (Part 3)

● [L13: Intro to ConvNets (Part 3) -- Slides]

[L13 -- Code Examples]

Fri,
Apr 12

Day 32

L13: Intro to ConvNets (Part 3) cont'd

Mon,
Apr 15

Day 33

● L14: Intro to RNNs (Part 1)

● [L14: Intro to RNNs (Part 1) -- Slides]

Wed,
Apr 17

Day 34

● L14: Intro to RNNs (Part 2)

● [L14: Intro to RNNs (Part 2) -- Slides]

Fri,
Apr 19

Day 35

● L15: Autoencoders
~~● L16: Variational Autoencoders~~

● [L15: Autoencoders -- Slides]

[L15 -- Code Examples]

Mon,
Apr 22

Day 36

● L17: Generative Adversarial Networks

● [L17: GAN -- Slides]

[L17 -- Code Examples]

Wed,
Apr 24

Day 37

Project Presentations

Fri,
Apr 26

Day 38

Project Presentations

Mon,
Apr 29

Day 39

Project Presentations

Wed,
May 01

Day 40

Project Presentations

DL Competition
Submission Deadline
Winner: Tianyu Zeng
Winning Solution

Fri,
May 03

Day 41

Project Presentations

Mon, May 06

Day 42

Final Exam
5:05 - 7:05 pm, Room: VAN VLECK B130

Final Exam

Wed, May 08

Submit Final Project Report

A summary/gallery of some of the awesome projects that students in this class worked on.

Project Presentation Awards

Without exception, we had amazing project presentations this semester. Nonetheless, here are the top 5 project presentations for each of the 3 categories, as determined by voting among the ~65 students:

Best Oral Presentation:

Saisharan Chimbiki, Grant Dakovich, Nick Vander Heyden (Creating Tweets inspired by Deepak Chopra), average score: 8.417
Josh Duchniak, Drew Huang, Jordan Vonderwell (Predicting Blog Authors’ Age and Gender), average score: 7.663
Sam Berglin, Jiahui Jiang, Zheming Lian (CNNs for 3D Image Classification), average score: 7.595
Christina Gregis, Wengie Wang, Yezhou Li (Music Genre Classification Based on Lyrics), average score: 7.588
Ping Yu, Ke Chen, Runfeng Yong (NLP on Amazon Fine Food Reviews), average score: 7.525

Most Creative Project:

Saisharan Chimbiki, Grant Dakovich, Nick Vander Heyden (Creating Tweets inspired by Deepak Chopra), average score: 8.313
Yien Xu, Boyang Wei, Jiongyi Cao (Judging a Book by its Cover: A Modern Approach), average score: 7.952
Xueqian Zhang, Yuhan Meng, Yuchen Zeng (Handwritten Math Symbol Recognization), average score: 7.919
Jinhyung Ahn, Jiawen Chen, Lu Li (Diagnosing Plant Diseases from Images for Improving Agricultural Food Production), average score: 7.917
Poet Larsen, Reng Chiz Der, Noah Haselow (Convolutional Neural Networks for Audio Recognition), average score: 7.854

Best Visualizations:

Ping Yu, Ke Chen, Runfeng Yong (NLP on Amazon Fine Food Reviews), average score: 8.189
Xueqian Zhang, Yuhan Meng, Yuchen Zeng (Handwritten Math Symbol Recognization), average score: 8.153
Saisharan Chimbiki, Grant Dakovich, Nick Vander Heyden (Creating Tweets inspired by Deepak Chopra), average score: 7.677
Poet Larsen, Reng Chiz Der, Noah Haselow (Convolutional Neural Networks for Audio Recognition), average score: 7.656
Yien Xu, Boyang Wei, Jiongyi Cao (Judging a Book by its Cover: A Modern Approach), average score: 7.490
