Data Science Computing Project: Tentative Schedule
(Syllabus)
DO NOT RELY ON THIS ROUGH DRAFT.
An official version will be posted by 9/4/25.
Day #: Date | Subject | Homework Due (11:59 p.m.) |
1: Th 9/4/25 |
Log in to a Linux computer; Why learn Linux? See TOP500 operating systems and Linux; Basic Linux commands |
Read introductory email |
2: Tu 9/9 |
Mention programmer virtues. Basic Linux, continued (continue with sed) |
|
3: Th 9/11 |
FYI: CHTC researcher forum We 2/26/25; Preview HPC (if you did not do this already); emacs text editor: reference sheet, demo1 (data.txt, tiny.R, sifting.txt); Solve basic Linux exercises |
|
4: Tu 9/16 |
emacs, continued: demo2; emacs regular expressions; Q01: Linux |
Q01 |
5: Th 9/18 |
(Dislike emacs? Try nano.) Mention VS Code alternative to emacs (thanks, Zhaoqing); Lyman-break galaxies; discuss HW2; HW1 help |
|
6: Tu 9/23 |
Lyman-break galaxies, continued: search ideas; HW2 help |
HW1: emacs |
7: Th 9/25 |
HW2 help; Q02: emacs |
Q02 |
8: Tu 9/30 |
git/GitHub version control system |
HW2: galaxies on public0[234] |
9: Th 10/2 |
Group1: Git Exercise (TA GitHub ID: Ming5723) |
|
10: Tu 10/7 |
Linux (bash) shell scripting |
Group1: Git Exercise |
11: Th 10/9 |
shell scripting, continued (from p. 4) |
|
12: Tu 10/14 |
Q03: git; Group2: scripting exercises |
Q03 |
13: Th 10/16 |
Group2: scripting exercises, continued |
|
14: Tu 10/21 |
discuss project; Statistics High Performance Computing Cluster (HPC) |
Group2: shell scripting |
15: Th 10/23 |
Statistics HPC, continued (from end of 4jobArray: run it!); Q04: shell scripting |
Group4(a): Project group; Q04 |
16: Tu 10/28 |
Check project groups, report troubles; discuss HW3; High Throughput Computing at CHTC: guest lecture by Amber Lim and Danny Morales (slides, documentation, help) |
|
17: Th 10/30 |
revisit seff on slurm-submit-00; CHTC commands, examples, and references: tinyExamples.tar; Slurm vs. HTCondor; HW3 help; develop project proposals |
|
18: Tu 11/4 |
parallel sd example (wget http://www.stat.wisc.edu/~jgillett/DSCP/CHTC/sd.tar; run condor_submit_dag sd.dag); HW3 help; develop project proposals |
HW3: airlines on Slurm; Group4(b): project proposal |
19: Th 11/6 |
parallel sd example, continued (run it!); Group3: parallel word counting |
HW3 (delayed to here from 3/20) |
20: Tu 11/11 |
schedule proposal feedback meetings; Using R at CHTC (wget http://www.stat.wisc.edu/~jgillett/DSCP/CHTC/calling_R_or_python.tar); Group3: parallel word counting, continued |
|
21: Th 11/13 |
discuss HW4; Group3: CHTC, continued; Group4: project development; Q05: distributed computing (Slurm and HTCondor) |
Q05 |
22: Tu 11/18 |
project proposal feedback: meet in class with teacher and TA |
proposal feedback meeting in class; Group3: CHTC |
23: Th 11/20 |
FYI: undergraduate research; Optional CHTC /staging/groups/stat_dscp/group01 ... group13 folders for large files (< 200 GB); HW4 help, project help |
|
24: Tu 11/25 |
set presentation schedule; HW4 help, project help |
HW4a: galaxies on CHTC |
[Th 11/27] |
[no class--Thanksgiving] |
|
25: Tu 12/2 |
Save files from public0[234]/slurm-submit-00/learn to laptop/email/GitHub/etc. by 12/12/25. Project help |
HW4b: more galaxies (extended to Th 4/24); Mo 4/28: presentation slides |
26: Th 12/4 |
Group4(c): first 1/2 of project presentations |
|
27: Tu 12/9 |
Group4(c): second 1/2 of project presentations |
We 12/10: Group4(d): project report |