Detecting and analysing spontaneous oral cancer speech in the wild
Bence Mark Halpern; Rob van Son; Michiel van den Brekel; Odette Scharenborg
arXiv 2020
21
scharenborg2020detecting
Abstract
Oral cancer is a disease that affects more than half a million people
worldwide every year. Analysis of oral cancer speech has so far focused on read
speech. In this paper, we 1) present and 2) analyse a three-hour-long
spontaneous oral cancer speech dataset collected from YouTube, and 3) set
baselines for an oral cancer speech detection task on this dataset. The
analysis of these explainable machine learning baselines shows that sibilants
and stop consonants are the most important indicators for spontaneous oral
cancer speech detection.