Date |
Topics |
Readings |
Notes |
Tuesday, Sep 3 |
Course introduction; Administrivia; Intro to Python: data types and function definitions |
Jupyter notebook documentation (required); either A. B. Downey, Chapters 1 through 3 or Severance, Chapters 1, 2 and 4 (required) |
HW1 out; Slides; Notebook |
Thursday, Sep 5 |
Intro to Python: conditionals, iteration and recursion |
Either A. B. Downey, Chapters 5, 6 and 7 or Severance, Chapters 4 and 5 (required); Python documentation on compound statements (recommended) |
Slides; Notebook |
Tuesday, Sep 10 |
Intro to Python: Strings and Lists |
Either A. B. Downey, Chapters 8 and 10 or Severance, Chapters 6 and 8 (required); A. B. Downey, Chapter 9 (recommended); Python documentation on lists (recommended); Python documentation on sequences (recommended) |
HW2 out; Slides; Notebook |
Thursday, Sep 12 |
Intro to Python: Dictionaries and Tuples |
Either A. B. Downey, Chapters 11 and 12 or Severance, Chapters 9 and 10 (required); Python documentation on dictionaries (recommended); Python documentation on tuples (recommended); Python documentation on sets (recommended); A. B. Downey, Section B.4 (recommended); A. B. Downey, Chapter 13 (recommended) |
Slides; Notebook |
Tuesday, Sep 17 |
File I/O and Objects |
A. B. Downey, Chapter 14 or Severance, Chapter 7 (required); Python File I/O Documentation (required); Handling Errors and Exceptions (required); Python pickle module (recommended); A. B. Downey, Chapters 15 and 16 (required); Python documentation on classes (only through section 9.3) (required); D. Phillips (2015). Python 3 Object-oriented Programming, Second Edition. Packt Publishing. (recommended); M. Weisfeld (2009). The Object-Oriented Thought Process, Third Edition. Addison-Wesley. (recommended) |
HW3 out; Slides; Notebook |
Thursday, Sep 19 |
File I/O and Objects (cont'd) |
|
|
Tuesday, Sep 24 |
File I/O and Objects (cont'd) |
|
|
Thursday, Sep 26 |
No lecture due to travel. |
|
No instructor office hours on Wednesday. |
Tuesday, Oct 1 |
Functional programming: itertools and functools |
Python itertools documentation (required); Python functools documentation (required); A. M. Kuchling. Functional Programming HOWTO (required); M. R. Cook. A Practical Introduction to Functional Programming (recommended); D. Mertz. Functional Programming in Python (recommended) |
HW4 out; Slides; Notebook |
Thursday, Oct 3 |
Functional programming (cont'd) |
|
|
Tuesday, Oct 8 |
numpy, SciPy and matplotlib |
Numpy quickstart tutorial (required); Pyplot tutorial (required); SciPy tutorial (recommended); Pyplot API (recommended); E. Tufte (2001). The Visual Display of Quantitative Information. Graphics Press. (recommended); E. Tufte (1997). Visual and Statistical Thinking: Displays of Evidence for Making Decisions. Graphics Press. (recommended) |
Slides; Notebook |
Thursday, Oct 10 |
matplotlib (cont'd); Python pandas |
pandas quickstart guide (required); Basic data structures (required); Basic functionality of pandas Series and DataFrames (required); pandas cookbook (recommended) |
HW5 out; Slides; Notebook |
Tuesday, Oct 15 |
Fall study break. No lecture. |
|
HW6 out; Instructor office hours as usual. |
Thursday, Oct 17 |
Python pandas (cont'd) |
pandas group-by operations (required); Reshaping and pivoting (required); Merge, join and concatenation (recommended); Time series functionality (recommended) |
Slides; Notebook |
Tuesday, Oct 22 |
Regular expressions |
Severance Chapter 11: Regular expressions (required); Python regex documentation (recommended) |
HW7 out; Slides; Notebook |
Thursday, Oct 24 |
Markup languages; HTML, XML, JSON |
Severance Chapter 12 (HTTP, HTML) and Chapter 13 (XML, JSON) (required); BeautifulSoup documentation (just Quick Start) (required); BeautifulSoup documentation (everything up to sections about CSS) (recommended); BeautifulSoup4 tutorial (recommended) |
Slides; Notebook |
Tuesday, Oct 29 |
Interacting with Databases: SQL |
Oracle relational databases overview (and only the overview!) (required); First section of Python sqlite3 documentation (required); w3schools SQL tutorial (recommended) |
Slides; Notebook |
Thursday, Oct 31 |
UNIX/Linux command line |
Introduction to UNIX Commands (required); Survival guide for UNIX newbies (recommended); GNU/Linux Command-Line Tools Summary (recommended); M. Shelley (1818). Frankenstein; or, The Modern Prometheus (recommended) |
Slides |
Tuesday, Nov 5 |
Introduction to Hadoop and MapReduce |
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of the Sixth Symposium on Operating Systems Design and Implementation, 2004 (required); Introduction to HDFS by J. Hanson (recommended) |
HW8 out; Slides |
Thursday, Nov 7 |
MapReduce in Python: mrjob |
mrjob Fundamentals and Concepts (required); Hadoop wiki: How MapReduce operations are actually carried out (required) |
Slides; mrjob demo code |
Tuesday, Nov 12 |
mrjob (cont'd); MapReduce in Python: PySpark |
Spark programming guide (required); PySpark programming guide (required); Spark MLlib, a Spark machine learning library (recommended); Spark GraphX, a Spark library for processing graph data (recommended) |
Slides; demo code |
Thursday, Nov 14 |
MapReduce in Python: PySpark (cont'd) |
|
|
Tuesday, Nov 19 |
Algorithms, Profiling and Debugging |
A. B. Downey, Appendix B (required); Python cProfile/Profile documentation (recommended); Python unittest documentation (recommended) |
HW9 out; Slides; Notebook; Demo files |
Thursday, Nov 21 |
Command line: part 2 |
Data Science at the Command Line by J. Janssens (recommended); S. Das (2005, 2012). Your UNIX: the Ultimate Guide. McGraw-Hill. (recommended); Sed manual (recommended); GNU awk user’s guide (recommended) |
Slides |
Tuesday, Nov 26 |
scikit-learn |
sklearn quickstart tutorial (required); sklearn user-guide (recommended) |
Slides; Notebook |
Thursday, Nov 28 |
No lecture: Thanksgiving break |
Listen to Arlo Guthrie's Alice's Restaurant (recommended); Tune in to your local NPR station on Friday, November 29 to listen to the Ig Nobel Prize ceremony (recommended) |
No instructor office hours Wednesday, November 27 |
Tuesday, Dec 3 |
Google TensorFlow |
Introduction to Low-Level TensorFlow API (required); Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (required); Assorted tutorials on statistical and neural models in TensorFlow (recommended) |
HW10 out; Slides; Notebook |
Thursday, Dec 5 |
TensorFlow (cont'd) |
|
Slides; Demo: Digit Recognition with Softmax Classifier; Demo: Digit Recognition with Convolutional Neural Net |
Tuesday, Dec 10 |
TensorFlow (cont'd) |
|
|