Applied Machine Learning for Health Professionals

Course Description

This two-part series is designed to equip clinicians, researchers, and other healthcare professionals with practical machine-learning skills tailored to medical data. Working through real-world examples drawn from electronic health records and cohort studies, participants will gain the tools they need to prepare data, develop predictive models, and evaluate their performance in a clinical context. At Bright Health Science, this course is taught by Ali M. Shabestari and Dr. Motahare Shabestari.
 

Part I: Python Programming & Data Preprocessing

Objectives:

  • Introduce Python fundamentals, from variables and control flow to functions and modules.
  • Master the NumPy and pandas libraries for cleaning, transforming, and organizing tabular health datasets.
  • Learn best practices for handling outliers, categorical encoding, normalization, and feature engineering in medical data.
  • Apply these techniques directly to sample datasets drawn from cohort studies and electronic health records.
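
To give a flavor of the preprocessing objectives above, here is a minimal sketch using pandas. The toy table, its column names, and the helper functions clip_iqr and min_max are all hypothetical and made up for illustration; the course materials may use different datasets and techniques. The sketch clips an outlier with the interquartile-range (IQR) rule, derives a BMI feature, one-hot encodes a categorical column, and rescales numeric columns to the [0, 1] range.

```python
import pandas as pd

# Hypothetical toy records; every value here is made up for illustration.
records = pd.DataFrame({
    "age": [34, 51, 47, 62, 29, 58],
    "weight_kg": [70.0, 82.5, 64.0, 91.0, 58.5, 77.0],
    "height_cm": [172, 180, 160, 175, 165, 168],
    "systolic_bp": [118, 135, 122, 410, 110, 141],  # 410 looks like an entry error
    "smoker": ["no", "yes", "no", "yes", "no", "no"],
})


def clip_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the fence values."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


def min_max(series: pd.Series) -> pd.Series:
    """Rescale a numeric column to the [0, 1] range."""
    return (series - series.min()) / (series.max() - series.min())


clean = records.copy()

# Outlier handling: pull the implausible blood pressure back to the upper fence.
clean["systolic_bp"] = clip_iqr(clean["systolic_bp"])

# Feature engineering: body mass index from weight and height.
clean["bmi"] = clean["weight_kg"] / (clean["height_cm"] / 100) ** 2

# Categorical encoding: one-hot encode "smoker", keeping a single indicator column.
clean = pd.get_dummies(clean, columns=["smoker"], drop_first=True)

# Normalization: add min-max scaled copies of selected numeric columns.
for col in ["age", "systolic_bp", "bmi"]:
    clean[col + "_scaled"] = min_max(clean[col])

print(clean.round(2))
```

Clipping is only one way to treat outliers; in clinical data it is often worth checking whether an extreme value is a recording error or a genuine finding before changing it.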

Part II: Machine-Learning Models & Clinical Implementation

Objectives:

  • Explore core supervised-learning algorithms: classifiers (e.g., logistic regression, decision trees) and regressors (e.g., linear regression, random forest).
  • Understand model assumptions, strengths, and ideal use cases in healthcare.
  • Develop skills for training, hyperparameter tuning, and cross-validation on tabular medical datasets.
  • Learn to apply rigorous evaluation metrics to assess a model's clinical applicability.
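
As a rough illustration of the training and cross-validation skills listed above, the following sketch compares two of the named classifiers with 5-fold stratified cross-validation in scikit-learn. The data are a synthetic stand-in generated by make_classification, not real patient records, and the model settings (max_iter, max_depth, the choice of ROC AUC as the metric) are assumptions chosen for the example rather than course recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a tabular medical dataset (not real patient data).
# weights=[0.8, 0.2] mimics an outcome that occurs in roughly 20% of cases.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)

models = {
    # Scaling sits inside the pipeline so it is refit on each training fold.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Stratified folds keep the outcome rate similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_auc = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    cv_auc[name] = scores.mean()
    print(f"{name}: mean ROC AUC = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Placing the scaler inside the pipeline is a deliberate choice: it prevents information from the validation fold from leaking into the preprocessing step.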

Capstone Project

In the final module, participants will apply their new skills to a real medical dataset. Guided through the end-to-end ML workflow, they will:

  1. Prepare and preprocess raw clinical data
  2. Select and train appropriate models
  3. Tune hyperparameters for optimal performance
  4. Evaluate and interpret results with an eye toward clinical deployment
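
The four steps above could look roughly like the sketch below, which is illustrative only and not the capstone solution. It assumes scikit-learn, uses synthetic data with artificially injected missing values in place of a real clinical dataset, and picks a random forest with a small, arbitrary hyperparameter grid purely for demonstration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# 1. Prepare: synthetic stand-in data with about 5% of values set to missing.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.05] = np.nan

# Hold out a test set that is never touched during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# 2. Select and train: median imputation followed by a random forest.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("model", RandomForestClassifier(random_state=42)),
])

# 3. Tune: 5-fold cross-validated grid search over a small, arbitrary grid.
param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [3, 5]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

# 4. Evaluate: the best pipeline is refit on the full training set by default.
best_model = search.best_estimator_
y_pred = best_model.predict(X_test)
y_prob = best_model.predict_proba(X_test)[:, 1]

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
sensitivity = tp / (tp + fn)  # share of true cases that were detected
specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
test_auc = roc_auc_score(y_test, y_prob)

print("Best hyperparameters:", search.best_params_)
print(f"Sensitivity: {sensitivity:.2f}  Specificity: {specificity:.2f}  "
      f"ROC AUC: {test_auc:.2f}")
```

Sensitivity and specificity are reported alongside ROC AUC because they map directly onto the clinical questions of missed cases and false alarms.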

Who Should Enroll?

  • Physicians, nurses, and allied health professionals seeking to leverage data science in their practice
  • Clinical researchers aiming to incorporate predictive analytics into their studies
  • Data analysts in healthcare settings who want a structured, medically focused ML curriculum

By the end of this course, you will be able to independently develop and evaluate machine-learning models that address real-world challenges across medical domains, equipping you to conduct research and drive innovation in patient care.


Course Content

1. Prerequisites
  • Python Programming
2.1. Variables
2.2. Input/Output
2.3. Data Types
2.4. Strings
2.5. Operators
3.1. Control Flow (if / else)
3.2. for loop
3.3. while loop
4.1. Functions
4.2. Anonymous Functions (lambda)
4.3. Exception Handling
 
  • Pandas
5.1. Reading & Writing in Pandas
5.2. Indexing, Selecting & Assigning
5.3. Summary Functions & Map
6.1. Grouping & Aggregation
6.2. Merging & Combining
6.3. Data Types & Missing Values
 
  • Data Preprocessing
7.1. Dataset Introduction
7.2. Outlier Detection & Handling
7.3. Missing Data Imputation
7.4. Data Type Preprocessing
 
  • Exploratory Data Analysis
8.1. Introduction to EDA
8.2. Primary Steps (review of data preprocessing)
8.3. Univariate Analysis
8.4. Bivariate Analysis
 
  • NumPy
9.1. Introduction to NumPy & Arrays
9.2. Array Operations & Indexing
9.3. Statistical Analysis with NumPy
 
  • Classification Models
1.1. Classification & Regression
1.2. Baseline Model (Logistic Regression)
1.3. Prediction & Probability Threshold
1.4. Evaluation Metrics
1.5. Extra (Notebook)
 
  • Advanced Classifiers
2.1. Decision Tree
2.2. Bagging (Random Forest)
2.3. Boosting (XGBoost)
2.4. Overfitting & Model Selection
 
  • Hyperparameter Tuning
3.1. Hyperparameters
3.2. Hyperparameter Tuning
3.3. Retraining
 
  • Interpretability
4.1. Importance of Interpretability
4.2. Global Feature Importance
4.3. SHAP
 
  • Feature Engineering
5.1. Feature Engineering
5.2. Feature Creation
5.3. Feature Selection
5.4. Dimensionality Reduction
 
  • Deployment & Reproducibility
6.1. Saving & Packaging the Model
6.2. Environment Reproducibility
6.3. Introduction to Model Deployment
6.4. Conclusion & Next Steps