import pandas as pd
df = pd.read_csv("MBA.csv")
df.head(10)
| | application_id | gender | international | gpa | major | race | gmat | work_exp | work_industry | admission |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Female | False | 3.30 | Business | Asian | 620.0 | 3.0 | Financial Services | Admit |
1 | 2 | Male | False | 3.28 | Humanities | Black | 680.0 | 5.0 | Investment Management | NaN |
2 | 3 | Female | True | 3.30 | Business | NaN | 710.0 | 5.0 | Technology | Admit |
3 | 4 | Male | False | 3.47 | STEM | Black | 690.0 | 6.0 | Technology | NaN |
4 | 5 | Male | False | 3.35 | STEM | Hispanic | 590.0 | 5.0 | Consulting | NaN |
5 | 6 | Male | False | 3.18 | Business | White | 610.0 | 6.0 | Consulting | NaN |
6 | 7 | Female | False | 2.93 | STEM | Other | 590.0 | 3.0 | Technology | Admit |
7 | 8 | Male | True | 3.02 | Business | NaN | 630.0 | 6.0 | Financial Services | NaN |
8 | 9 | Male | False | 3.24 | Business | White | 590.0 | 2.0 | Nonprofit/Gov | NaN |
9 | 10 | Male | False | 3.27 | Humanities | Asian | 690.0 | 3.0 | Consulting | NaN |
Link to the dataset on Kaggle: https://www.kaggle.com/code/mahmoudredagamail/mba-admission-dataset-class-2025/input
Can we predict MBA acceptance at the Wharton School of Business based on gender, GPA, GMAT score, work experience, and undergraduate major?
Which of these variables is most important for predicting acceptance at Wharton?
Encoding Variables
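A minimal sketch of one possible encoding for the predictors named in the question above. It is not necessarily the encoding used later in this notebook. The `sample` frame and the `encode` helper are hypothetical names introduced for illustration; `sample` holds four rows copied from the preview so the sketch runs without `MBA.csv`. It assumes that a missing `admission` value means the applicant was not admitted, which should be checked against the dataset description before relying on it.

```python
import numpy as np
import pandas as pd

# Four rows copied from the df.head(10) preview (illustrative subset only).
sample = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Male"],
    "gpa": [3.30, 3.28, 3.30, 3.47],
    "major": ["Business", "Humanities", "Business", "STEM"],
    "gmat": [620.0, 680.0, 710.0, 690.0],
    "work_exp": [3.0, 5.0, 5.0, 6.0],
    "admission": ["Admit", np.nan, "Admit", np.nan],
})


def encode(frame):
    """Return (X, y): one-hot encoded predictors and a 0/1 admission target."""
    # Assumption: NaN in `admission` is treated as "not admitted".
    # Comparing NaN to "Admit" yields False, so those rows become 0.
    y = (frame["admission"] == "Admit").astype(int)

    # Keep only the predictors named in the research question.
    features = frame[["gender", "gpa", "major", "gmat", "work_exp"]]

    # One-hot encode the categorical columns. drop_first=True drops one
    # level per variable to avoid redundant (perfectly collinear) columns.
    X = pd.get_dummies(
        features, columns=["gender", "major"], drop_first=True, dtype=int
    )
    return X, y


X, y = encode(sample)
print(X)
print(y)
```

With the full data, the same helper could be called as `encode(df)`. Dropping the first level makes `Female` and `Business` the reference categories, so each remaining dummy column is read relative to those.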
Modeling Approaches