Here is a quick overview of HW3:

1. Solve the mtcars exercise:
   a. Split mtcars into 3 files. (Use the code I provided in examples/5mtcarsPractice/instructions.txt.)
   b. To process one file, use a pipeline like the one in examples/5mtcarsPractice/example.sh.
   c. To get (b)'s processing to run in parallel across the three files, start from examples/4jobArray.

2. Answer questions about air travel from data in 22 files, 1987.csv through 2008.csv. (No "pre-parallel" step is necessary.)
   a. To process one file like 1987.csv:
      - download the large file (~500 MB)
      - unzip it
      - write a pipeline to select the required subset of columns (DayOfWeek, DepDelay, Origin, Dest, Distance) and rows (Origin=MSN) and write a much smaller file (~0.1 MB) for later processing.
   b. Same idea as 1(c).
   c. Then combine the 22 smaller files into a single file (~2.5 MB). Process it twice:
      - Use a bash pipeline to answer "How far can you get from MSN in one flight?"
      - Use R to answer "What is the average departure delay for each day of the week?"
   d. Delete the large files (like 1987.csv) to save space.

I recommend starting from the same example.sh and 4jobArray examples as in (1).
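To make the column-and-row selection in 2(a) and the first question in 2(c) more concrete, here is a minimal sketch on a tiny made-up file. It is not the provided example.sh, and the file names toy.csv and toy_MSN.csv, the rows, and the reduced column set are all invented for illustration. It finds columns by header name, so it does not assume where each column sits in the real files.

```shell
# Toy stand-in for a file like 1987.csv (made-up rows, fewer columns than the real data).
cat > toy.csv <<'EOF'
Year,DayOfWeek,DepDelay,Origin,Dest,Distance
1987,3,5,MSN,ORD,109
1987,4,-2,ORD,MSN,109
1987,5,12,MSN,DEN,826
1987,1,NA,MSN,MSP,228
EOF

# Keep the five required columns (found by header name) for rows with Origin == MSN.
awk -F, -v OFS=, '
  NR == 1 {
    for (i = 1; i <= NF; i++) col[$i] = i   # map column name -> column position
    print "DayOfWeek", "DepDelay", "Origin", "Dest", "Distance"
    next
  }
  $(col["Origin"]) == "MSN" {
    print $(col["DayOfWeek"]), $(col["DepDelay"]), $(col["Origin"]), $(col["Dest"]), $(col["Distance"])
  }
' toy.csv > toy_MSN.csv

# Longest flight from MSN: drop the header, sort by Distance (field 5)
# numerically in descending order, keep the top row, and show Dest and Distance.
farthest=$(tail -n +2 toy_MSN.csv | sort -t, -k5,5nr | head -n 1 | cut -d, -f4,5)
echo "$farthest"
```

On the real data you would feed the unzipped yearly file to the same kind of awk step, and run the sort step on the combined file from 2(c). Check how missing values (such as NA) appear in the real Distance column before trusting a numeric sort.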