Learning to detect vocal hyperfunction from ambulatory neck-surface acceleration features: Initial results for vocal fold nodules

IEEE Transactions on Biomedical Engineering (TBME)

Marzyeh Ghassemi, Jarrad H. Van Stan, Daryush D. Mehta, Matías Zañartu, Harold A. Cheyne II, Robert E. Hillman, and John V. Guttag. Massachusetts Institute of Technology, Massachusetts General Hospital, Universidad Técnica Federico Santa María, and Cornell University. Volume 61, Issue 6, Pages 1668–1675.


Many common voice disorders are chronic or recurring conditions that are likely to result from faulty and/or abusive patterns of vocal behavior, referred to generically as vocal hyperfunction. Such behaviorally based disorders can be difficult to accurately assess in the clinical setting and could potentially be much better characterized by long-term ambulatory monitoring of vocal function as individuals engage in their typical daily activities. Our recent efforts have been aimed at developing a new, versatile, and cost-effective clinical tool for mobile voice monitoring that uses a neck-placed miniature accelerometer to sense voice production. The accelerometer is connected to a smartphone that serves as the data acquisition platform and provides a user-friendly interface for unobtrusive monitoring of voice use, daily calibration of the accelerometer sensor, and periodic alert capabilities.

In this study, we take an initial step toward characterizing hyperfunctional vocal behaviors by employing supervised machine learning techniques to aid in analyzing weeklong recordings from patients with a common manifestation of hyperfunction—vocal fold nodules—and vocally healthy speakers matched for age, gender, and occupation. We obtain average properties of a speaker’s vocal behavior every five minutes by estimating distributions of sound pressure level and fundamental frequency (including normalized features) from the neck-surface acceleration signal. Using support vector machine and logistic regression models, we show that the two groups of speakers exhibit distinct vocal behaviors that can be detected by solely using features extracted from the accelerometer signal. Correct classification of 22 out of the 24 subjects suggests that future work holds promise to noninvasively detect patients with the types of aberrant vocal behaviors that are associated with hyperfunctional voice disorders.
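To make the five-minute feature extraction more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-frame estimates of sound pressure level and fundamental frequency have already been derived from the accelerometer signal. The function names (`summarize`, `window_features`, `classify_subject`), the particular summary statistics, the `min_frames` cutoff, and the majority-vote rule for turning window-level predictions into a subject-level decision are all illustrative assumptions that the abstract does not specify.

```python
from collections import defaultdict
from statistics import mean, pstdev, quantiles

WINDOW_SECONDS = 5 * 60  # five-minute windows, as described in the text


def summarize(values):
    """Return mean, standard deviation, and 5th/95th percentiles of a list.

    The choice of statistics is an assumption made for illustration.
    """
    cuts = quantiles(values, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {"mean": mean(values), "std": pstdev(values),
            "p05": cuts[0], "p95": cuts[-1]}


def window_features(frames, window_seconds=WINDOW_SECONDS, min_frames=2):
    """Summarize voiced frames in fixed-length windows.

    frames: iterable of (time_s, spl_db, f0_hz) tuples for voiced frames.
    Windows with fewer than max(min_frames, 2) frames are skipped, since
    percentiles need at least two values.
    """
    by_window = defaultdict(list)
    for time_s, spl_db, f0_hz in frames:
        by_window[int(time_s // window_seconds)].append((spl_db, f0_hz))

    features = []
    for index in sorted(by_window):
        rows = by_window[index]
        if len(rows) < max(min_frames, 2):
            continue
        row = {"window_index": index}
        for name, column in (("spl", 0), ("f0", 1)):
            stats = summarize([r[column] for r in rows])
            row.update({f"{name}_{key}": value for key, value in stats.items()})
        features.append(row)
    return features


def classify_subject(window_predictions, threshold=0.5):
    """Hypothetical aggregation rule: flag a subject when the fraction of
    windows predicted as patient-like (1) exceeds the threshold."""
    if not window_predictions:
        raise ValueError("no window predictions")
    return sum(window_predictions) / len(window_predictions) > threshold
```

In a pipeline of this shape, each row returned by `window_features` would be one input vector for a support vector machine or logistic regression model, and `classify_subject` would combine that model's per-window outputs over a subject's weeklong recording.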

Keywords: vocal cords, nodules, machine learning, clinical detection, ambulatory voice monitoring