Project Proposal

Group 28

Group Members:

Mai Tah Lee mtlee2@wisc.edu
Annie Purisch apurisch@wisc.edu
Seth Mlodzik smlodzik@wisc.edu
Tianxing Liu tliu398@wisc.edu

Write a one-page proposal (4 points) including a few lines of code to read data, descriptions of the question(s), variable(s), and methods you will use. Turn in a proposal.html (or .pdf), once per group.

Dataset link:

https://www.kaggle.com/datasets/uom190346a/mental-health-diagnosis-and-treatment-monitoring

Description of Dataset:

The dataset contains mental health diagnoses, treatment plans, and the outcomes of those treatment plans. It also includes symptoms, medications, other types of treatment, and patient demographics. Notably, the dataset is synthetic and does not contain real patient data.

Research Questions

Primary Question: Can a mental health patient's treatment outcome (improved, no change, deteriorated) be predicted from their demographics, symptom severity, mood score, and treatment type?
Secondary Question: Which factors contribute to treatment adherence, and how does adherence influence the treatment outcome?
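
As a first look at the secondary question, adherence can be summarized within each outcome group once the data is loaded. The sketch below is only an illustration, not the final analysis: the helper name `adherence_by_outcome` is hypothetical, and the column names are copied from the dataset's header.

```python
import pandas as pd

# Column names as they appear in the dataset's header.
ADHERENCE = "Adherence to Treatment (%)"
OUTCOME = "Outcome"


def adherence_by_outcome(df: pd.DataFrame) -> pd.DataFrame:
    """Return count, mean, and median adherence for each outcome group."""
    return df.groupby(OUTCOME)[ADHERENCE].agg(["count", "mean", "median"])


# Usage, assuming `df` was loaded with pd.read_csv:
# print(adherence_by_outcome(df))
```

A group-wise summary like this only describes association; it does not by itself show that adherence causes a better outcome.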

Methods:

Preprocessing: Any missing data will be handled, categorical variables will be encoded, and numeric variables will be normalized.
Data Analysis: We will analyze the relationships between treatment outcome and variables such as age, symptom severity, and treatment adherence.
Modeling: Logistic regression and decision trees will be used to classify outcomes based on patient characteristics.
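
The preprocessing and modeling steps above could be wired together roughly as follows. This is a minimal sketch using scikit-learn, under several assumptions that are not part of the proposal: the helper name `build_models` is hypothetical, the identifier and date columns are dropped rather than used as features, every remaining column is kept as a predictor, one-hot encoding and standardization stand in for "encoded" and "normalized", and the hyperparameters (`max_iter=1000`, `max_depth=4`) are placeholders.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

TARGET = "Outcome"
# Assumption: the ID and the raw date string are not used as predictors.
DROP = ["Patient ID", "Treatment Start Date"]


def build_models(df: pd.DataFrame):
    """Split df into features/target and return unfitted model pipelines."""
    X = df.drop(columns=[TARGET] + DROP, errors="ignore")
    y = df[TARGET]
    numeric = X.select_dtypes(include="number").columns.tolist()
    categorical = X.select_dtypes(exclude="number").columns.tolist()

    def preprocess():
        # Standardize numeric columns and one-hot encode categorical ones.
        return ColumnTransformer([
            ("num", StandardScaler(), numeric),
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
        ])

    models = {
        "logistic_regression": Pipeline([
            ("prep", preprocess()),
            ("clf", LogisticRegression(max_iter=1000)),
        ]),
        "decision_tree": Pipeline([
            ("prep", preprocess()),
            ("clf", DecisionTreeClassifier(max_depth=4, random_state=0)),
        ]),
    }
    return X, y, models


# Usage, assuming `df` was loaded with pd.read_csv:
# from sklearn.model_selection import cross_val_score
# X, y, models = build_models(df)
# for name, model in models.items():
#     print(name, cross_val_score(model, X, y, cv=5).mean())
```

Keeping the preprocessing inside each pipeline means the scaler and encoder are fit only on the training folds during cross-validation, which avoids leaking information from the held-out fold.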

The main goal of the analysis is to identify the key predictors of treatment success.

In [1]:

import pandas as pd

df = pd.read_csv("mental_health_diagnosis_treatment_.csv")
df.head(3)

Out[1]:

	Patient ID	Age	Gender	Diagnosis	Symptom Severity (1-10)	Mood Score (1-10)	Sleep Quality (1-10)	Physical Activity (hrs/week)	Medication	Therapy Type	Treatment Start Date	Treatment Duration (weeks)	Stress Level (1-10)	Outcome	Treatment Progress (1-10)	AI-Detected Emotional State	Adherence to Treatment (%)
0	1	43	Female	Major Depressive Disorder	10	5	8	5	Mood Stabilizers	Interpersonal Therapy	2024-01-25	11	9	Deteriorated	7	Anxious	66
1	2	40	Female	Major Depressive Disorder	9	5	4	7	Antipsychotics	Interpersonal Therapy	2024-02-27	11	7	No Change	7	Neutral	78
2	3	55	Female	Major Depressive Disorder	6	3	4	3	SSRIs	Mindfulness-Based Therapy	2024-03-20	14	7	Deteriorated	5	Happy	62

In [2]:

# Variables: column names, non-null counts, and dtypes
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 17 columns):
 #   Column                        Non-Null Count  Dtype 
---  ------                        --------------  ----- 
 0   Patient ID                    500 non-null    int64 
 1   Age                           500 non-null    int64 
 2   Gender                        500 non-null    object
 3   Diagnosis                     500 non-null    object
 4   Symptom Severity (1-10)       500 non-null    int64 
 5   Mood Score (1-10)             500 non-null    int64 
 6   Sleep Quality (1-10)          500 non-null    int64 
 7   Physical Activity (hrs/week)  500 non-null    int64 
 8   Medication                    500 non-null    object
 9   Therapy Type                  500 non-null    object
 10  Treatment Start Date          500 non-null    object
 11  Treatment Duration (weeks)    500 non-null    int64 
 12  Stress Level (1-10)           500 non-null    int64 
 13  Outcome                       500 non-null    object
 14  Treatment Progress (1-10)     500 non-null    int64 
 15  AI-Detected Emotional State   500 non-null    object
 16  Adherence to Treatment (%)    500 non-null    int64 
dtypes: int64(10), object(7)
memory usage: 66.5+ KB