Data Science Computing Project: Tentative Schedule
(Syllabus)
Day #: Date | Subject | Homework Due (11:59 p.m.) |
1: Tu 1/21/25 |
Log in to a Linux computer; Why learn Linux? See TOP500 operating systems and Linux; Basic Linux commands |
Read introductory email |
2: Th 1/23 |
Mention programmer virtues; Basic Linux, continued (continue with sed) |
|
3: Tu 1/28 |
FYI: CHTC researcher forum We 2/26/25; Preview HPC (if you did not do this already); emacs text editor: reference sheet, demo1 (data.txt, tiny.R, sifting.txt); Solve basic Linux exercises |
|
4: Th 1/30 |
emacs, continued: demo2; emacs regular expressions; Q01: Linux |
Q01 |
5: Tu 2/4 |
(Dislike emacs? Try nano.) Mention VS Code as an alternative to emacs (thanks, Zhaoqing); Lyman-break galaxies; discuss HW2; HW1 help |
|
6: Th 2/6 |
Lyman-break galaxies, continued: search ideas; HW2 help |
HW1: emacs |
7: Tu 2/11 |
HW2 help; Q02: emacs |
Q02 |
8: Th 2/13 |
git/GitHub version control system |
HW2: galaxies on public0[234] |
9: Tu 2/18 |
Group1: Git Exercise (TA GitHub ID: Ming5723) |
|
10: Th 2/20 |
Linux (bash) shell scripting |
Group1: Git Exercise |
11: Tu 2/25 |
shell scripting, continued (from p. 4) |
|
12: Th 2/27 |
Q03: git; Group2: scripting exercises |
Q03 |
13: Tu 3/4 |
Group2: scripting exercises, continued |
|
14: Th 3/6 |
discuss project; Statistics High Performance Computing Cluster (HPC) |
Group2: shell scripting |
15: Tu 3/11 |
Statistics HPC, continued (from end of 4jobArray: run it!) |
Group4(a): Project group |
16: Th 3/13 |
Check project groups, report troubles; discuss HW3; High Throughput Computing at CHTC: guest lecture by Amber Lim and Danny Morales (slides, documentation, help) |
|
17: Tu 3/18 |
revisit seff on slurm-submit-00; CHTC commands, examples, and references: tinyExamples.tar; Slurm vs. HTCondor; HW3 help; develop project proposals; Q04 |
Q04 |
18: Th 3/20 |
parallel sd example (wget http://www.stat.wisc.edu/~jgillett/DSCP/CHTC/sd.tar; run condor_submit_dag sd.dag); HW3 help; develop project proposals |
HW3: airlines on Slurm (delayed to Tu 4/1); Group4(b): project proposal |
[Tu 3/25, Th 3/27] |
[no class: spring break] |
|
19: Tu 4/1 |
parallel sd example, continued (run it!); Group3: parallel word counting |
HW3 (delayed to here from 3/20) |
20: Th 4/3 |
schedule proposal feedback meetings; Using R at CHTC (wget http://www.stat.wisc.edu/~jgillett/DSCP/CHTC/calling_R_or_python.tar); Group3: parallel word counting, continued |
|
21: Tu 4/8 |
discuss HW4; Group3: CHTC, continued; Group4: project development; Q05: distributed computing (Slurm and HTCondor) |
Q05 |
22: Th 4/10 |
project proposal feedback: meet in class with teacher and TA |
proposal feedback meeting in class; Group3: CHTC |
23: Tu 4/15 |
FYI: undergraduate research; Optional CHTC /staging/groups/stat_dscp/group01 ... group13 folders for large files (< 200 GB); HW4 help, project help |
|
24: Th 4/17 |
set presentation schedule; HW4 help, project help |
HW4a: galaxies on CHTC |
25: Tu 4/22 |
Save files from public0[234]/slurm-submit-00/learn to laptop/email/GitHub/etc. by 5/6/25; project help |
HW4b: more galaxies (extended to Th 4/24); Mo 4/28: presentation slides |
26: Th 4/24 |
project help |
|
27: Tu 4/29 |
Group4(c): first 1/2 of project presentations |
|
28: Th 5/1 |
Group4(c): second 1/2 of project presentations |
Group4(d): project report |