Goal: Impulse control disorders (ICDs) are frequent non-motor symptoms occurring during the course of Parkinson’s disease (PD). The objective of this study was to estimate how well the future occurrence of these disorders can be predicted from longitudinal data. This is the first such study to use cross-validation and replication in an independent cohort.
Methods: We used data from two longitudinal PD cohorts (training set: PPMI, Parkinson’s Progression Markers Initiative; test set: DIGPD, Drug Interaction With Genes in Parkinson’s Disease). Our analyses included 380 PD subjects from PPMI and 388 PD subjects from DIGPD, each with at least two visits and with available clinical and genetic data. We trained three logistic regressions and a recurrent neural network to predict ICDs at the next visit using clinical risk factors and genetic variants previously associated with ICDs. We quantified performance using the area under the receiver operating characteristic curve (ROC AUC) and average precision. We compared these models to a trivial model that predicts ICD status at the next visit as the status observed at the most recent visit.
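To make the trivial baseline and the two metrics concrete, the following is a minimal sketch in plain Python, not the code used in the study. The visit sequences are invented toy data, and the function names (`carry_forward_pairs`, `roc_auc`, `average_precision`) are hypothetical. It assumes ICD status is coded as 0 or 1 at each visit and that every pair of consecutive visits contributes one prediction.

```python
from typing import List, Sequence, Tuple


def carry_forward_pairs(visits: Sequence[int]) -> List[Tuple[int, int]]:
    """Trivial model: the score for visit t+1 is the ICD status at visit t.

    Returns one (score, label) pair per pair of consecutive visits.
    """
    return [(visits[t], visits[t + 1]) for t in range(len(visits) - 1)]


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision: sum over thresholds of (recall gain) * precision.

    Thresholds are the distinct score values, so tied scores are handled
    as a single operating point.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    ap = 0.0
    prev_recall = 0.0
    for thr in sorted(set(scores), reverse=True):
        predicted = sum(1 for s in scores if s >= thr)
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / predicted)
        prev_recall = recall
    return ap


# Invented toy cohort: one list of ICD statuses (0/1) per subject, in visit order.
toy_cohort = [[0, 0, 1, 1], [0, 0, 0], [1, 0, 1]]

pairs = [pair for visits in toy_cohort for pair in carry_forward_pairs(visits)]
scores = [score for score, _ in pairs]
labels = [label for _, label in pairs]

print("trivial model ROC AUC:", roc_auc(scores, labels))
print("trivial model average precision:", average_precision(scores, labels))
```

Because the trivial model outputs only 0 or 1, its ROC curve has a single interior operating point; a model that outputs graded probabilities, such as a logistic regression or a recurrent neural network, can be scored with the same two functions.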
Results: The recurrent neural network (PPMI: ROC AUC = 0.85 [0.80 – 0.90]; DIGPD: 0.802 [0.78 – 0.83]) was the only model to be significantly better than the trivial model (PPMI: ROC AUC = 0.75 [0.69 – 0.81]; DIGPD: 0.78 [0.75 – 0.80]) on both cohorts. We showed that ICDs in PD can be predicted more accurately with a recurrent neural network than with a trivial model. The improvement in ROC AUC was larger on PPMI than on DIGPD data, but was not clinically relevant in either cohort.
Conclusions: Our results indicate that machine learning methods are potentially useful for predicting ICDs, but further work is required to reach clinical relevance.