Hello Students:

Start by downloading HW1.ipynb from this folder. Then develop it into your solution.
Write code where you see "... your code here ..." below. (You are welcome to use more than one cell.)
I've included the output from my solution in HW1.html so you can check your work. Your output should match or be close to mine. Use 3 significant figures for floats. For example, we can print 3 figures for 𝜋/1000 with print(f'{np.pi/1000:.3}'). The pattern is print(f'{x:.precision}'), where x is the value to print and precision is the number of significant figures.
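As a small illustration of the formatting pattern above, here is a sketch that uses only the standard library. It assumes math.pi as a stand-in for np.pi (they hold the same value), and the variable names formatted, weight, and label are made up for this example.

```python
import math

# Format pi/1000 with 3 significant figures using the f'{x:.precision}' pattern.
x = math.pi / 1000
formatted = f'{x:.3}'
print(f'pi/1000 = {formatted}')

# The same pattern combined with a label, as the clean-up notes request.
weight = 20.123  # hypothetical value, for illustration only
label = f'weight={weight:.3}'
print(label)
```

Note that the precision here counts significant figures, not digits after the decimal point, so large values may be printed in scientific notation.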
If you have questions, please ask them in class or office hours. Our TA and I are very happy to help with the programming (provided you start early enough, and provided we are not helping so much that we undermine your learning).
Please clean up your code:
- Comment out unnecessary code that is useful for orienting you, like printing the data set.
- Label your output, like writing 'weight=20.1' or 'The weight is 20.1' rather than just '20.1'.
- Simplify your code if you can.
When you are done, run these Notebook commands:
- Shift-L (once, so that line numbers are visible)
- Kernel > Restart and Run All (run all cells from scratch)
- Esc S (save)
- File > Download as > HTML
Turn in:
- HW01.ipynb to Canvas's HW01.ipynb assignment
- HW01.html to Canvas's HW01.html assignment
- As a check, download your files from Canvas to a new 'junk' folder. Try 'Kernel > Restart and Run All' on the '.ipynb' file to make sure it works. Glance through the '.html' file.
Turn in partial solutions to Canvas before the deadline. For example, turn in part 1, then parts 1 and 2a, then your whole solution. That way we can award partial credit even if you miss the deadline. We will grade your last submission before the deadline.

1. Use a hard-margin SVM

to classify cars as having automatic or manual transmissions.

Read http://www.stat.wisc.edu/~jgillett/451/01/mtcars30.csv into a DataFrame. (This is the mtcars data frame from R with two of its rows removed to get linearly separable data.)
Make an X from the wt (weight in 1000s of pounds) and mpg (miles per gallon) columns. Make y from the am column (where 0=automatic or 1=manual transmission).
Train an SVM using kernel='linear' and C=1000. Print its coefficients and intercept.
Report the training accuracy. (It's given by clf.score(X, y).)
Predict the transmission for a car weighing 4000 pounds (wt=4) that gets 20 mpg.
Use five plt.plot() calls to make a figure with wt on its x-axis and mpg on its y-axis including:
- the automatic transmission cars in red
- the manual transmission cars in blue
- the decision boundary (the center line of the road)
- the lower margin boundary (the left side of the road)
- the upper margin boundary (the right side of the road)
- a reasonable title, axis labels, and legend
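The steps above can be sketched on a tiny made-up data set. This is not the mtcars30 data and not the solution; the six toy points, the test point, and all variable names are assumptions chosen for illustration. The sketch fits a linear SVM with a large C, then solves $w_1 x_1 + w_2 x_2 + b = c$ for $x_2$ with $c = 0$ (decision boundary) and $c = \pm 1$ (margin boundaries).

```python
import numpy as np
from sklearn import svm

# Hypothetical, linearly separable toy data (two features, two classes).
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C penalizes margin violations heavily, approximating a hard margin.
clf = svm.SVC(kernel='linear', C=1000)
clf.fit(X, y)

w = clf.coef_[0]        # coefficients (w_1, w_2)
b = clf.intercept_[0]   # intercept
accuracy = clf.score(X, y)
prediction = clf.predict([[3.5, 3.5]])[0]
print(f'w={w}, b={b:.3}')
print(f'training accuracy={accuracy:.3}')
print(f'predicted class for (3.5, 3.5) is {prediction}')

# Lines w_1*x1 + w_2*x2 + b = c, solved for x2 over a grid of x1 values.
x1 = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
x2_boundary = -(w[0] * x1 + b) / w[1]         # c = 0
x2_margin_neg = -(w[0] * x1 + b + 1) / w[1]   # c = -1 (class 0 side)
x2_margin_pos = -(w[0] * x1 + b - 1) / w[1]   # c = +1 (class 1 side)
```

Each of the three x2 arrays can be passed to plt.plot(x1, ...) to draw a line. Which margin line appears "lower" or "upper" on a plot depends on the signs of the fitted coefficients.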

2. Make three linear regression models.

2a: Make a simple regression model by hand.

Use the matrix formula $w = (X^T X)^{-1} X^T y$ we developed in class to fit these three points: (0, 5), (2, 1), (4, 3). (Use linear_model.LinearRegression(), if you wish, to check your work.)
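To show the mechanics of the formula, here is a sketch on three different, made-up points that lie exactly on $y = 2x + 1$ (not the points assigned above). It assumes the design matrix $X$ has a column of ones for the intercept followed by the $x$ column, so that $w = (b, \text{slope})$; check that ordering against the convention used in class.

```python
import numpy as np

# Hypothetical points on the line y = 2x + 1 (not the assigned points).
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

# Design matrix: a column of ones (intercept) next to the x column.
X = np.column_stack([np.ones_like(x), x])

# Normal-equation solution w = (X^T X)^{-1} X^T y.
w = np.linalg.inv(X.T @ X) @ X.T @ y
b, slope = w
print(f'y = {slope:.3} x + {b:.3}')
```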

... your answer here (just give the model, $y = w x + b$) ...

2b: Make a simple linear regression model from real data.

Estimate the average daily trading volume of a Dow Jones Industrial Average stock from its market capitalization. That is, use $y = $ AvgVol vs. $x =$ MarketCap.

Read http://www.stat.wisc.edu/~jgillett/451/data/DJIA.csv into a DataFrame.
Find the model. Print its equation.
Print its $R^2$ value (the proportion of variability in $y$ accounted for by $x$ via the linear model, given by model.score(X, y)).
Make a plot of the data and regression line.
Use the model to predict the volume for a company with market capitalization of 0.25e12 (a quarter-trillion dollars); add this as a red point on your plot.
Say what happens to Volume as Market Capitalization increases. (Use a Markdown cell.)
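The fit, score, and predict steps above might look like the following sketch. It uses five made-up (x, y) pairs rather than the DJIA data, and the names x_new and y_new are hypothetical. Note that scikit-learn expects X to be two-dimensional, hence the reshape.

```python
import numpy as np
from sklearn import linear_model

# Hypothetical data, roughly following y = 2x (not the DJIA data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = x.reshape(-1, 1)  # one column per feature

model = linear_model.LinearRegression()
model.fit(X, y)

slope = model.coef_[0]
intercept = model.intercept_
r_squared = model.score(X, y)
print(f'y = {slope:.3} x + {intercept:.3}')
print(f'R^2 = {r_squared:.3}')

# Predict y at a new x value; the input must also be two-dimensional.
x_new = 6.0
y_new = model.predict(np.array([[x_new]]))[0]
print(f'predicted y at x={x_new} is {y_new:.3}')
```

The sign of the slope says whether y tends to rise or fall as x increases, which is the kind of statement the last step asks for.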

2c: Make a multiple regression model.

Estimate the same volume from both market capitalization and price. That is, use $y =$ AvgVol vs. $x_1 =$ MarketCap and $x_2 =$ Price.

Find the model.
Print its equation.
Print its $R^2$ value.
Say what happens to Volume as Market Capitalization increases (while holding Price fixed) and what happens to Volume as Price increases (while holding Capitalization fixed). (Use a Markdown cell.)
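With two predictors, the same LinearRegression class returns one coefficient per column of X. The sketch below uses made-up data generated exactly from $y = 3 + 2 x_1 - x_2$ (not the DJIA data), so the fitted coefficients can be compared with known values. Each coefficient is the change in $y$ per unit increase in that predictor while the other is held fixed.

```python
import numpy as np
from sklearn import linear_model

# Hypothetical predictors and a response built from y = 3 + 2*x1 - x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 3 + 2 * x1 - x2

# Stack the predictors as columns of a two-column X.
X = np.column_stack([x1, x2])

model = linear_model.LinearRegression()
model.fit(X, y)

coef = model.coef_
intercept = model.intercept_
r_squared = model.score(X, y)
print(f'y = {coef[0]:.3} x1 + {coef[1]:.3} x2 + {intercept:.3}')
print(f'R^2 = {r_squared:.3}')
```

Here the positive coefficient on x1 means y increases with x1 when x2 is fixed, and the negative coefficient on x2 means y decreases with x2 when x1 is fixed.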

HW1: Practice with Python, hard-margin SVM, and linear regression

... your name and NetID here ...

1. Use a hard-margin SVM

2. Make three linear regression models.

2a: Make a simple regression model by hand.

2b: Make a simple linear regression model from real data.

2c: Make a multiple regression model.