The Digital Biomarker Discovery Pipeline (DBDP) is an open source platform for the development of digital biomarkers.



The DBDP provides complete end-to-end digital biomarker development. Modules are dedicated to data pre-processing, data analysis, algorithm development, and validation.


Modular Framework

The DBDP is a modular framework, providing both complete end-to-end solutions and standalone tools and algorithms for discovery.


Open Source

The DBDP is open source under the Apache 2.0 License and is available on GitHub under a Contributor Code of Conduct.

The Purpose

Digital biomarkers are digitally collected data that are transformed into indicators of health outcomes. Developing digital biomarkers currently requires extensive domain knowledge and computing skills. The purpose of the DBDP is to make discovering digital biomarkers more accessible by providing code sets, functions, and algorithms for the entire digital biomarker discovery pipeline. From the input of wearable sensor data to the development of machine learning and deep learning algorithms, we have provided an open source software resource for the digital biomarker community. For more information, please see our recent publication on the DBDP.

DBDP Modules

The modular framework of the DBDP lets you mix and match pre-processing code, exploratory data analyses, and algorithms for discovering your own digital biomarkers. New DBDP Modules are being uploaded weekly.
View All Sections on GitHub

Wearables data can be messy and typically needs considerable pre-processing before analysis can begin. We provide pre-processing modules for a variety of mHealth and wearable sensors.

Exploratory data analysis (EDA) is a standard process in the early stages of digital biomarker development. EDA allows us to explore relationships between variables in the data, examine trends, analyze missingness of data, and begin to understand the link between the data and the physiological state we are studying.
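As one illustration of the missingness analysis mentioned above, the sketch below summarizes gaps in a series of sample timestamps. It is a minimal example and not the DBDP's own code: the function name `missingness`, the 1.5x gap threshold, and the assumption of a fixed expected sampling interval are all choices made here for illustration.

```python
def missingness(timestamps_s, expected_interval_s):
    """Summarize missing samples in a sorted list of timestamps (seconds).

    Hypothetical helper: assumes the sensor is supposed to sample at a
    fixed interval, which is not true of every wearable.
    """
    if len(timestamps_s) < 2:
        raise ValueError("need at least two timestamps")
    span = timestamps_s[-1] - timestamps_s[0]
    # Number of samples a gap-free recording would contain over this span.
    expected = round(span / expected_interval_s) + 1
    observed = len(timestamps_s)
    # Flag any step noticeably longer than the expected interval as a gap.
    gaps = [
        (a, b)
        for a, b in zip(timestamps_s, timestamps_s[1:])
        if b - a > 1.5 * expected_interval_s
    ]
    return {
        "expected": expected,
        "observed": observed,
        "missing_fraction": max(0.0, 1 - observed / expected),
        "gaps": gaps,
    }


if __name__ == "__main__":
    # A 1 Hz recording with samples missing between t=2 and t=5.
    print(missingness([0, 1, 2, 5, 6], expected_interval_s=1))
```

Reporting the gap boundaries alongside the overall fraction helps distinguish a few long dropouts (for example, the device being taken off) from many scattered single-sample losses.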

Heart rate variability (HRV) is the variation in the time interval between consecutive heartbeats, measured in milliseconds. Higher HRV has been found to be associated with reduced morbidity and mortality and with improved psychological well-being and quality of life. HRV is regulated by the autonomic nervous system and is thus an important indicator of nervous system function. HRV is a potential digital biomarker for a number of diseases and conditions.
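To make the definition concrete, the sketch below computes two widely used time-domain HRV statistics, SDNN and RMSSD, from a list of beat-to-beat (RR) intervals in milliseconds. It is a minimal illustration rather than the DBDP module itself; the function name `hrv_metrics` and its input format are assumptions made here.

```python
import math


def hrv_metrics(rr_ms):
    """Return SDNN and RMSSD (both in ms) for RR intervals given in ms.

    Hypothetical helper for illustration; assumes the intervals are
    already cleaned of artifacts and ectopic beats.
    """
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive differences between intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"sdnn": sdnn, "rmssd": rmssd}


if __name__ == "__main__":
    print(hrv_metrics([800, 810, 790, 805]))
```

SDNN reflects overall variability across the recording, while RMSSD emphasizes beat-to-beat changes, so the two are often reported together.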

Resting heart rate (RHR) characterizes several health conditions, such as type 2 diabetes and cardiovascular diseases. The DBDP provides a personalized, accessible, and transparent method for estimating RHR from data collected by photoplethysmography-based wearable devices (e.g., Fitbit).

Nutrition Information Search Engine.

Accurate sleep detection is necessary to determine circadian rhythm and to discover relationships between circadian rhythm and sleep characteristics.

Glucose variability metrics are indicative of hyperglycemia, hypoglycemia, and risk for developing prediabetes and type 2 diabetes (T2D). They are also indicators of glycemic control, which is an important metric when evaluating the health of patients with type 1 diabetes (T1D) or T2D. This module can be used with continuous glucose monitoring data to calculate 20+ glucose variability metrics.
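As a rough illustration of what such metrics look like, the sketch below computes four common summaries from a list of continuous glucose monitoring readings in mg/dL: mean, standard deviation, coefficient of variation, and time in range. It is not the DBDP module and does not reproduce its 20+ metrics; the function name `glucose_variability` and the default 70 to 180 mg/dL target range are assumptions made here.

```python
import statistics


def glucose_variability(glucose_mg_dl, low=70, high=180):
    """Compute a few basic glucose variability summaries.

    Hypothetical helper for illustration; assumes evenly spaced readings
    in mg/dL so that the share of readings approximates time in range.
    """
    if len(glucose_mg_dl) < 2:
        raise ValueError("need at least two glucose readings")
    mean = statistics.fmean(glucose_mg_dl)
    sd = statistics.stdev(glucose_mg_dl)  # sample standard deviation
    cv = 100 * sd / mean  # coefficient of variation, in percent
    in_range = sum(low <= g <= high for g in glucose_mg_dl)
    tir = 100 * in_range / len(glucose_mg_dl)  # time in range, in percent
    return {"mean": mean, "sd": sd, "cv_percent": cv, "tir_percent": tir}


if __name__ == "__main__":
    print(glucose_variability([100, 120, 140, 200]))
```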

This project is in partnership with the Rhodes Information Initiative Data+ undergraduate summer research program (Summer 2020). Come back soon for more info!

This project is currently underway. Come back soon for more info!

Contribute your digital biomarker to the DBDP!

Extensive Documentation

The DBDP provides extensive user documentation including a User Guide, instructions and documentation for contributors, and monitored issue tracking.

User Guide

The user guide was created for users, contributors, and digital medicine enthusiasts alike. Instructions for using the DBDP are located in our Wiki.

Contribution Instructions

If you would like to contribute your digital biomarker to the DBDP, please follow our Instructions for Contribution Guide, located on our GitHub.

Monitored Issue Tracking

Have an idea for a new biomarker? Need help with the current modules? Issue Tracking is monitored by the Big Ideas Lab, and we are happy to help!

The Digital Biomarker Discovery process can be challenging. We have compiled some resources that, along with the DBDP, can be used to make digital biomarker discovery more robust. We provide resources for choosing a wearable sensor, data handling, validation, and much more!

Ready to Start Discovering?

Discover more robust digital biomarkers with the DBDP. Start discovering now on GitHub! We believe that not only data, but also computational pipelines and algorithms should operate by the FAIR principles (Findable, Accessible, Interoperable, and Reusable).

The DBDP was developed and is curated by the Big Ideas Lab at Duke University. The lab is developing digital biomarkers for a range of diseases and conditions using a variety of sensors.

The DBDP has been made possible in part by grant number 2020-218599 from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation.

The DBDP has partnered with Open mHealth and MD2K Cerebral Cortex to open-source digital biomarker development.