My training background spans both patient-oriented clinical research (in the lab) and data-oriented approaches to ambulatory (out of the lab) measurement, estimation and prediction of physical activity and other health behaviors. This puts me in a unique position to understand and evaluate these behavioral outcomes both from the perspective of the patient (e.g., intervention delivery), and from a data architecture perspective (e.g., use of appropriate trackers, extraction and summarization of mobile data). I apply these approaches to outcomes beyond physical activity, including sleep, physiological outcomes (e.g., heart rate, blood pressure), and psychological outcomes (e.g., stress, depressive symptoms).

1. Women’s health, participatory mHealth research

A central aim of my research is to re-tool patient-generated data via mHealth technology to better characterize conditions that are traditionally poorly documented and not well understood. To this end, I collaborate on Citizen Endo and the EVEN initiative, led by Professor Noemie Elhadad in the Department of Biomedical Informatics at Columbia University. Citizen Endo aims to better document, understand, and develop treatment strategies for endometriosis using a citizen science approach through its mHealth research tracking app, Phendo.

But what is endometriosis, and why study it? Endometriosis is an inflammatory disease in which tissue similar to the lining of the uterus grows outside the uterus, spreads to other organs, and causes adhesions. It is not only very painful but also poorly understood (hence often described as enigmatic) and inadequately treated, and it is the 2nd largest cause of hospitalizations for women of childbearing age, after heart attacks. Its prevalence is estimated to be ~10%, and it takes 7-10 years to diagnose. There is no cure or adequate treatment for endometriosis: even after a hysterectomy, the pain returns 2 out of 3 times. Pharmacotherapy can be helpful, though it has not been found to be efficacious in the long term.

One reason endometriosis remains clinically misunderstood and poorly managed is that it is a cyclical disease: symptoms fluctuate, and the experience of the disease varies from one person to another. As such, we first need to better understand the nature of this variability and the modifiable factors that may be associated with it. To achieve this, I utilize patient-tracked data collected via mHealth technology. Direct patient input via self-tracked data can substantially augment electronic health records (EHR), which are currently the primary source for many clinical decisions. Relying on the EHR alone is problematic because these records can be incomplete, sparse, limited in the amount and type of information they provide, and not standardized across different databases. In contrast, self-tracked data give us more individual-level detail collected frequently over time, provide more context, and increase the speed with which information travels across different components of the health care system. You can check out my latest published work in Applied Clinical Informatics on this topic.

This is just one example of how we are re-tooling the technology industry's disruption of the medical field to make new discoveries and to better characterize traditionally under-documented problems and enigmatic diseases, an approach that applies to conditions beyond endometriosis.

2. Wearables/sensors: understanding health through data

I use informatics and data-driven approaches to delineate symptom trajectories in diseases with a dynamic course (e.g., endometriosis, multiple sclerosis) and to identify self-management approaches for managing them effectively. I primarily study physical activity as a self-management approach in this context, and investigate 1) its personalization for targeting symptoms, and 2) methods for improving our measurement and estimation of physical activity based on data collected from wearables, sensors, and mobile patient tracking. To see a working paper and project in progress on this topic, you can check out my presentation at the 2020 Data Science Day at Columbia University.

My latest work in this realm investigates digital phenotyping of sleep patterns among heterogeneous samples of Latinx adults using a flexible unsupervised machine learning technique that relies on mixtures of multivariate generalized linear mixed models. I presented this work at the Annual DSI Data Science Day on April 21, 2021, and a preview of it is available online. It has most recently been published in Sleep Medicine.

3. Physical activity and health

As a trained kinesiologist, I keep physical activity and its health outcomes as a central theme across my projects. But why do we do physical activity research? Simply put, because the lack of it can kill you. The WHO has declared physical inactivity an urgent public health priority. It is the 4th leading risk factor for all-cause death and disease. This is not just a developed-world problem; it is a global problem. Overall, we are moving less and sitting more, and this is killing us. Think of how little physical effort we now put into many daily tasks: ordering everything online without having to leave the couch, or having a little computer clean the entire house. And how many children nowadays spend their summers on iPads, compared with what you used to do as a kid? We all have a relationship with physical activity, and this relationship differs from one person to another. So we really need to consider how such changes influence physical activity and how physical activity, in turn, influences health outcomes.

My current projects focus on methodology for combining actively- and passively-collected data, and include:

  • Zero-inflated negative binomial models for handling the excess zeros and high variance (overdispersion) inherent in most mHealth data, particularly physical activity and fluctuating chronic disease symptoms,
  • Investigation of how knowledge obtained from self-tracking research mHealth apps can augment evidence from electronic health records (EHR), and the extent to which these participatory research data can provide complementary information about diseases that are poorly documented in the EHR and poorly understood,
  • Extreme gradient boosting (XGB) tree models to predict future symptoms from health behaviors at prior time points obtained from EMA data, and comparison with other models for best capturing the between-person variability in symptom improvement,
  • Application of reinforcement learning algorithms to generate automated personalized exercise recommendations to reduce and prevent pain in women with endometriosis.
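
To make the first item in the list above more concrete, here is a minimal sketch of the zero-inflated negative binomial distribution itself. It is not the model fitted in these projects, which would also include covariates and an estimation step. The parameter names (mu for the count mean, alpha for the dispersion, pi for the probability of a structural zero) and the example values are assumptions chosen for illustration only.

```python
import math


def nb_log_pmf(k, mu, alpha):
    """Log-probability of count k under a negative binomial (NB2)
    with mean mu and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def zinb_pmf(k, mu, alpha, pi):
    """Probability of count k under a zero-inflated negative binomial:
    with probability pi the observation is a structural zero,
    otherwise it is drawn from the negative binomial."""
    nb = math.exp(nb_log_pmf(k, mu, alpha))
    if k == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def zinb_mean(mu, pi):
    """Mean of the zero-inflated distribution."""
    return (1.0 - pi) * mu


def zinb_var(mu, alpha, pi):
    """Variance of the zero-inflated distribution."""
    return (1.0 - pi) * mu * (1.0 + mu * (alpha + pi))


# Hypothetical example: daily activity-bout counts with many zero days.
mu, alpha, pi = 3.0, 0.5, 0.4
print("P(0 bouts):", zinb_pmf(0, mu, alpha, pi))
print("mean:", zinb_mean(mu, pi), "variance:", zinb_var(mu, alpha, pi))
```

With these illustrative values, the variance is well above the mean and the probability of a zero is much higher than a plain negative binomial with the same mu and alpha would give, which is the pattern that motivates this model for physical activity and symptom data.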