Patient Behavior Prediction Using Big Data Analytics and Machine Learning
Authors
Narayanan, Usha
Issue Date
2025-11
Type
Dissertation
Language
en
Keywords
AutoML; Machine Learning; XAI; Business, Engineering, Science, & Technological Innovation; Healthcare Innovation & Delivery
Abstract
Accurately predicting patient behaviors such as treatment adherence, healthcare engagement, and responses to interventions remains a persistent challenge due to the complex interaction of physiological, contextual, and behavioral factors. Traditional predictive models often overlook contextual determinants, including socioeconomic status, environmental conditions, and perioperative stress indicators, thereby limiting both accuracy and clinical applicability. The purpose of this quantitative, explanatory, quasi-experimental study was to develop and evaluate an automated machine learning (AutoML) based big data analytics system using the VitalDB dataset, licensed under the Creative Commons Attribution 4.0 International License, to predict patient adherence behaviors in perioperative care. The dataset included high-resolution physiologic signals, perioperative attributes, and electronic health record–derived outcomes from 6,388 surgical cases. The study compared AutoML frameworks with traditional machine learning models, including logistic regression, decision trees, random forests, and gradient boosting machines. Data preprocessing involved imputation, normalization, and feature engineering to address missing data and ensure model robustness. Model performance was evaluated using metrics such as the area under the receiver operating characteristic curve (ROC–AUC), precision–recall area (PR–AUC), and F1-score. Results demonstrated that ensemble and AutoML models achieved enhanced predictive performance (ROC–AUC = 0.99), while maintaining interpretability through SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). Key predictors included the Intraoperative Stress Index, ASA classification, and SpO₂ burden. 
Findings confirm that integrating contextual and physiologic data within explainable AutoML pipelines enhanced predictive accuracy and transparency, supporting the development of clinically actionable decision-support tools for personalized, data-driven healthcare.
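The three evaluation metrics named in the abstract (ROC–AUC, PR–AUC, and F1-score) can be illustrated with a minimal sketch. The abstract does not state which software was used, so the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions for illustration only, not the study's implementation or data. PR–AUC is approximated here by average precision, and the sketch assumes no tied scores.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """PR-AUC approximated as average precision: the mean of the precision
    values observed at the rank of each positive case (assumes no ties)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def f1_score(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 at a fixed threshold (0.5 is an assumed, hypothetical cut-off)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up values, not VitalDB data).
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(round(roc_auc(labels, predicted), 4))
print(round(average_precision(labels, predicted), 4))
print(round(f1_score(labels, predicted), 4))
```

Reporting a ranking metric (ROC–AUC), a metric that is sensitive to class imbalance (PR–AUC), and a thresholded metric (F1) together gives a fuller picture than any one of them alone, which is presumably why the study lists all three.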
