Date |
Topics |
Readings |
Notes |
Wednesday, Jan 9 |
Course introduction; Administrivia; Intro to Python: data types and function definitions |
Jupyter notebook documentation (required); either A. B. Downey, Chapters 1 through 3 or Severance, Chapters 1, 2 and 4 (required) |
HW1 out; Slides; Notebook |
Friday, Jan 11 |
Intro to Python: conditionals, iteration and recursion |
Either A. B. Downey, Chapters 5, 6 and 7 or Severance, Chapters 4 and 5 (required); Python documentation on compound statements (recommended) |
Slides; Notebook |
Wednesday, Jan 16 |
Intro to Python: Strings and Lists |
Either A. B. Downey, Chapters 8 and 10 or Severance, Chapters 6 and 8 (required); A. B. Downey, Chapter 9 (recommended); Python documentation on lists (recommended); Python documentation on sequences (recommended) |
HW2 out; Slides; Notebook |
Friday, Jan 18 |
Intro to Python: Dictionaries and Tuples |
Either A. B. Downey, Chapters 11 and 12 or Severance, Chapters 9 and 10 (required); Python documentation on dictionaries (recommended); Python documentation on tuples (recommended); Python documentation on sets (recommended); A. B. Downey, Section B.4 (recommended); A. B. Downey, Chapter 13 (recommended) |
No office hours. Slides; Notebook |
Wednesday, Jan 23 |
File I/O and Objects |
A. B. Downey, Chapter 14 or Severance, Chapter 7 (required); Python File I/O Documentation (required); Handling Errors and Exceptions (required); Python pickle module (recommended); A. B. Downey, Chapters 15 and 16 (required); Python documentation on classes (only through section 9.3) (required); D. Phillips (2015). Python 3 Object-oriented Programming, Second Edition. Packt Publishing. (recommended); M. Weisfeld (2009). The Object-Oriented Thought Process, Third Edition. Addison-Wesley. (recommended) |
HW3 out; Slides; Notebook |
Friday, Jan 25 |
File I/O and Objects, cont'd |
A. B. Downey, Chapter 14 or Severance, Chapter 7 (required); Python File I/O Documentation (required); Handling Errors and Exceptions (required); Python pickle module (recommended); A. B. Downey, Chapters 15 and 16 (required); Python documentation on classes (only through section 9.3) (required); D. Phillips (2015). Python 3 Object-oriented Programming, Second Edition. Packt Publishing. (recommended); M. Weisfeld (2009). The Object-Oriented Thought Process, Third Edition. Addison-Wesley. (recommended) |
|
Wednesday, Jan 30 |
Classes canceled due to extreme cold weather. |
|
|
Friday, Feb 1 |
Functional programming: itertools and functools |
Python itertools documentation (required); Python functools documentation (required); A. M. Kuchling. Functional Programming HOWTO (required); M. R. Cook. A Practical Introduction to Functional Programming (recommended); D. Mertz. Functional Programming in Python (recommended) |
HW4 out; Slides; Notebook |
Wednesday, Feb 6 |
Functional programming, cont'd |
|
|
Friday, Feb 8 |
numpy, SciPy and matplotlib |
NumPy quickstart tutorial (required); Pyplot tutorial (required); SciPy tutorial (recommended); Pyplot API (recommended); E. Tufte (2001). The Visual Display of Quantitative Information. Graphics Press. (recommended); E. Tufte (1997). Visual and Statistical Thinking: Displays of Evidence for Making Decisions. Graphics Press. (recommended) |
HW5 out; Slides; Notebook |
Wednesday, Feb 13 |
Python pandas |
pandas quickstart guide (required); Basic data structures (required); Basic functionality of pandas Series and DataFrames (required); pandas cookbook (recommended) |
HW6 out; Slides; Notebook; Baseball dataset |
Friday, Feb 15 |
Python pandas, cont'd |
pandas group-by operations (required); Reshaping and pivoting (required); Merge, join and concatenation (recommended); Time series functionality (recommended) |
Slides; Notebook |
Wednesday, Feb 20 |
Regular expressions |
Severance Chapter 11: Regular expressions (required); Python regex documentation (recommended) |
HW7 out; Slides; Notebook |
Friday, Feb 22 |
Markup languages: HTML, XML, JSON |
Severance Chapter 12 (HTTP, HTML) and Chapter 13 (XML, JSON) (required); BeautifulSoup documentation (just Quick Start) (required); BeautifulSoup documentation (everything up to sections about CSS) (recommended); BeautifulSoup4 tutorial (recommended) |
Slides; Notebook |
Wednesday, Feb 27 |
Interacting with Databases: SQL |
Oracle relational databases overview (and only the overview!) (required); First section of Python sqlite3 documentation (required); w3schools SQL tutorial (recommended) |
Slides; Notebook |
Friday, March 1 |
UNIX/Linux command line |
Introduction to UNIX Commands (required); Survival guide for UNIX newbies (recommended); GNU/Linux Command-Line Tools Summary (recommended) |
Slides |
Wednesday, March 6 |
Winter break. No lecture. |
|
No office hours. |
Friday, March 8 |
Winter break. No lecture. |
|
No office hours. |
Wednesday, March 13 |
Introduction to Hadoop and MapReduce |
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of the Sixth Symposium on Operating System Design and Implementation, 2004 (required); J. Hanson. Introduction to HDFS (recommended) |
Slides |
Friday, March 15 |
MapReduce in Python: mrjob |
mrjob Fundamentals and Concepts (required); Hadoop wiki: How MapReduce operations are actually carried out (required) |
HW8 out; Slides; mrjob demo code |
Wednesday, March 20 |
MapReduce in Python: mrjob, cont'd |
|
|
Friday, March 22 |
MapReduce in Python: PySpark |
Spark programming guide (required); PySpark programming guide (required); Spark MLlib, a Spark machine learning library (recommended); Spark GraphX, a Spark library for processing graph data (recommended) |
Slides |
Wednesday, March 27 |
PySpark, cont'd; Overview of text editors |
nano overview (recommended); vim documentation (recommended); emacs documentation (recommended) |
wordcount example |
Friday, March 29 |
Command line: part 2 |
Data Science at the Command Line by J. Janssens (recommended); S. Das (2005, 2012). Your UNIX: the Ultimate Guide. McGraw-Hill. (recommended); Sed manual (recommended); GNU awk user’s guide (recommended) |
Slides |
Wednesday, April 3 |
Algorithms, Profiling and Debugging |
A. B. Downey, Appendix B (required); Python cProfile/Profile documentation (recommended); Python unittest documentation (recommended) |
HW9 out; Slides; Notebook; Demo files |
Friday, April 5 |
scikit-learn |
sklearn quickstart tutorial (required); sklearn user-guide (recommended) |
Slides; Notebook |
Wednesday, April 10 |
Google TensorFlow |
TensorFlow tutorial: Getting Started with TensorFlow (required); Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems (required); Assorted tutorials on statistical and neural models in TensorFlow (recommended) |
HW10 out; Slides; Notebook |
Friday, April 12 |
TensorFlow cont'd |
Chapter 6 of Deep Learning by Goodfellow, Bengio and Courville (recommended) |
Slides; Softmax regression demo; Multilayer CNN demo |
Wednesday, April 17 |
TensorFlow cont'd |
|
|
Friday, April 19 |
TensorFlow cont'd; APIs |
Getting started with the Python requests package (recommended); Mozilla overview of HTTP methods (recommended); RFC specifying HTTP methods (recommended) |
Slides; Notebook |