P4@NU: Exploring the role of digital and genetic biomarkers to learn
personalized predictive models of metabolic diseases
Paolo Missier
Newcastle University - School of Computing
Newcastle University - National Innovation Centre for Ageing
Newcastle upon Tyne, United Kingdom
Medical practice is evolving away from the inefficient detect-and-cure
approach and towards a Preventive, Predictive, Personalised and
Participative (P4) vision. This vision focuses on extending each
individual's state of wellness, particularly for ageing individuals,
and is underpinned by “Big Health Data” from sources such as genomics
and personal wearables. P4 medicine has the potential not only to vastly
improve people’s quality of life, but also to significantly reduce
healthcare costs and improve the efficiency of healthcare delivery.
The P4@NU project (P4 at Newcastle University) aims to develop
predictive models of transitions from wellness to disease states using
digital and genetic biomarkers. The main challenge is to extract viable
biomarkers from raw data, including activity traces from self-monitoring
wearables, and to use them to develop models that can anticipate
disease early enough for preventive interventions to still be effective. Our
hypothesis is that signals for detecting the early onset of chronic
diseases can be found by appropriately combining multiple kinds of
UK Biobank is our initial and key data resource. It consists of a
cohort of 500,000 individuals characterised by genotype and phenotype
and, for 100,000 of them, also by free-living accelerometer
activity data. Our initial focus is on using UK Biobank data to
learn models that predict two cardiometabolic diseases: Type-2
Diabetes (T2D) and Cardiovascular Disease (CVD). These diseases are
typically associated with insufficient activity and sleep, and are
preventable through lifestyle changes.
Our initial investigation has focused on extracting high-level
features from activity traces and training traditional models
(random forests and SVMs) to predict a simple disease state. This has
given us a baseline for the more complex studies, which will include
genotyping data (selected SNPs) and will be based on deep learning
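To make the idea of "high-level features from activity traces" concrete, the following is a minimal Python sketch of how a raw accelerometer trace might be summarised before being passed to a classifier. It is an illustration under stated assumptions, not the feature set used in P4@NU: the function name `extract_activity_features`, the per-epoch input format (one mean acceleration value in g per epoch), the thresholds, and the chosen features are all hypothetical.

```python
from statistics import mean, pstdev


def extract_activity_features(trace, epoch_seconds=60,
                              sedentary_threshold=0.03,
                              active_threshold=0.10):
    """Summarise an activity trace as a small dictionary of features.

    `trace` is a sequence of mean acceleration values (in g), one per epoch.
    Both thresholds are illustrative placeholders, not validated cut-points.
    """
    if not trace:
        raise ValueError("trace must contain at least one epoch")

    n = len(trace)
    sedentary = sum(1 for a in trace if a < sedentary_threshold)
    active = sum(1 for a in trace if a >= active_threshold)

    # Longest run of consecutive low-activity epochs: a crude proxy for
    # uninterrupted rest (hypothetical feature).
    longest = current = 0
    for a in trace:
        current = current + 1 if a < sedentary_threshold else 0
        longest = max(longest, current)

    return {
        "mean_acceleration": mean(trace),
        "sd_acceleration": pstdev(trace),
        "sedentary_fraction": sedentary / n,
        "active_fraction": active / n,
        "longest_rest_minutes": longest * epoch_seconds / 60,
    }


# Toy trace of eight one-minute epochs (synthetic values, not UK Biobank data).
features = extract_activity_features(
    [0.01, 0.02, 0.01, 0.15, 0.20, 0.05, 0.01, 0.01]
)
print(features)
```

One feature dictionary per participant could then form a row of the input matrix for a random forest or SVM, with the disease state as the label.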
L. Hood and S. H. Friend, “Predictive, personalized, preventive,
participatory (P4) cancer medicine,” Nat. Rev. Clin. Oncol., vol. 8, no.
3, p. 184, 2011.
N. D. Price et al., “A wellness study of 108 individuals using
personal, dense, dynamic data clouds,” Nat. Biotechnol., vol. 35, p.
747, Jul. 2017.
C. Bycroft et al., “The UK Biobank resource with deep phenotyping
and genomic data,” Nature, vol. 562, no. 7726, pp. 203–209, 2018.
S. Cassidy, J. Y. Chau, M. Catt, et al., “Cross-sectional study of diet,
physical activity, television viewing and sleep duration in 233 110
adults from the UK Biobank; the behavioural phenotype of cardiovascular
disease and type 2 diabetes,” BMJ Open, 2016.