Real‐Time Laryngeal Cancer Boundaries Delineation on White Light and Narrow‐Band Imaging Laryngoscopy with Deep Learning

Moccia, Sara;
2024-01-01

Abstract

Objective: To investigate the potential of deep learning for automatically delineating (segmenting) the superficial extent of laryngeal cancer on endoscopic images and videos.

Methods: A retrospective study was conducted, extracting and annotating white light (WL) and Narrow-Band Imaging (NBI) frames to train a segmentation model (SegMENT-Plus). Two external datasets were used for validation. The model's performance was compared with that of two otolaryngology residents. In addition, the model was tested on real intraoperative laryngoscopy videos.

Results: A total of 3933 images of laryngeal cancer from 557 patients were used. The model achieved the following median values (interquartile range): Dice Similarity Coefficient (DSC) = 0.83 (0.70-0.90), Intersection over Union (IoU) = 0.83 (0.73-0.90), Accuracy = 0.97 (0.95-0.99), Inference Speed = 25.6 (25.1-26.1) frames per second. The external testing cohorts comprised 156 and 200 images. SegMENT-Plus performed similarly on all three datasets for DSC (p = 0.05) and IoU (p = 0.07). No significant differences were observed when separately analyzing WL and NBI test images on DSC (p = 0.06) and IoU (p = 0.78), or when comparing the model with the two residents on DSC (p = 0.06) and IoU (Senior vs. SegMENT-Plus, p = 0.13; Junior vs. SegMENT-Plus, p = 1.00). The model was then tested on real intraoperative laryngoscopy videos.

Conclusion: SegMENT-Plus can accurately delineate laryngeal cancer boundaries in endoscopic images, with performance equal to that of two otolaryngology residents. The results on the two external datasets demonstrate excellent generalization capabilities. The computation speed of the model allowed its application to videolaryngoscopies, simulating real-time use. Clinical trials are needed to evaluate the role of this technology in surgical practice and resection margin improvement.

Level of Evidence: III. Laryngoscope, 2024.

A custom-made algorithm called SegMENT-Plus was trained on 3933 laryngeal carcinoma images obtained from 557 patients. The model achieved a Dice similarity coefficient of 0.827, an Intersection over Union of 0.828, an accuracy of 0.972, and an inference speed of 25.6 fps, thus reaching real-time performance. SegMENT-Plus performed similarly on two external validation datasets. The performance of the model showed no significant differences from that of two residents. The implementation of artificial intelligence during laryngoscopy can support clinicians in delineating the superficial extent of laryngeal cancer. SegMENT-Plus showed reliable results, with performance equal to that of two otolaryngology residents and with real-time computation speed.
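For readers unfamiliar with the overlap metrics named in the abstract, the following is a minimal sketch of how DSC, IoU, and pixel accuracy are commonly computed for one predicted binary mask against one annotated mask, and how per-image scores can be summarized as a median with interquartile range. It is not the authors' evaluation code: the function names, the nested-list mask representation, the convention for two empty masks, and the quartile method are all assumptions made for illustration.

```python
from statistics import median, quantiles


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two same-shaped binary masks (rows of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return DSC, IoU and pixel accuracy for one image (hypothetical helper)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    union = tp + fp + fn
    # Assumed convention: two empty masks count as perfect agreement.
    dsc = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    iou = tp / union if union else 1.0
    accuracy = (tp + tn) / total
    return {"dsc": dsc, "iou": iou, "accuracy": accuracy}


def median_iqr(values):
    """Summarize per-image scores as (median, first quartile, third quartile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Toy 4x4 example: the prediction is shifted one column from the annotation.
pred = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
truth = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
scores = segmentation_metrics(pred, truth)
summary = median_iqr([0.6, 0.7, 0.8, 0.9, 1.0])
```

In the toy example, 2 pixels overlap, 2 are predicted only, and 2 are annotated only, so DSC is 0.5, IoU is 1/3, and accuracy is 0.75. Accuracy stays high because background pixels dominate, which is one reason overlap metrics are reported alongside it.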
Files in this item:
Laryngoscope 2024Sampieri.pdf (open access; publisher's PDF; Adobe PDF, 9.72 MB)

Use this identifier to cite or link to this document: https://hdl.handle.net/11564/828997