Vocal folds motility evaluation is paramount in both the assessment of functional deficits and in the accurate staging of neoplastic disease of the glottis. Diagnostic endoscopy, and in particular videoendoscopy, is nowadays the method through which the motility is estimated. The clinical diagnosis, however, relies on the examination of the videoendoscopic frames, which is a subjective and professional-dependent task. Hence, a more rigorous, objective, reliable, and repeatable method is needed. To support clinicians, this paper proposes a machine learning (ML) approach for vocal cords motility classification. From the endoscopic videos of 186 patients with both vocal cords preserved motility and fixation, a dataset of 558 images relative to the two classes was extracted. Successively, a number of features was retrieved from the images and used to train and test four well-grounded ML classifiers. From test results, the best performance was achieved using XGBoost, with precision = 0.82, recall = 0.82, F1 score = 0.82, and accuracy = 0.82. After comparing the most relevant ML models, we believe that this approach could provide precise and reliable support to clinical evaluation.

Classifying Vocal Folds Fixation from Endoscopic Videos with Machine Learning

Moccia, Sara
2023-01-01

Abstract

Vocal folds motility evaluation is paramount in both the assessment of functional deficits and in the accurate staging of neoplastic disease of the glottis. Diagnostic endoscopy, and in particular videoendoscopy, is nowadays the method through which the motility is estimated. The clinical diagnosis, however, relies on the examination of the videoendoscopic frames, which is a subjective and professional-dependent task. Hence, a more rigorous, objective, reliable, and repeatable method is needed. To support clinicians, this paper proposes a machine learning (ML) approach for vocal cords motility classification. From the endoscopic videos of 186 patients with both vocal cords preserved motility and fixation, a dataset of 558 images relative to the two classes was extracted. Successively, a number of features was retrieved from the images and used to train and test four well-grounded ML classifiers. From test results, the best performance was achieved using XGBoost, with precision = 0.82, recall = 0.82, F1 score = 0.82, and accuracy = 0.82. After comparing the most relevant ML models, we believe that this approach could provide precise and reliable support to clinical evaluation.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11564/828909
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 2
social impact