Three-Dimensional Multi-Task Deep Learning Model to Detect Glaucomatous Optic Neuropathy and Myopic Features From Optical Coherence Tomography Scans: A Retrospective Multi-Centre Study Journal Articles uri icon

  •  
  • Overview
  •  
  • Research
  •  
  • Identity
  •  
  • Additional Document Info
  •  
  • View All
  •  

abstract

  • PurposeWe aim to develop a multi-task three-dimensional (3D) deep learning (DL) model to detect glaucomatous optic neuropathy (GON) and myopic features (MF) simultaneously from spectral-domain optical coherence tomography (SDOCT) volumetric scans.MethodsEach volumetric scan was labelled as GON according to the criteria of retinal nerve fibre layer (RNFL) thinning, with a structural defect that correlated in position with the visual field defect (i.e., reference standard). MF were graded by the SDOCT en face images, defined as presence of peripapillary atrophy (PPA), optic disc tilting, or fundus tessellation. The multi-task DL model was developed by ResNet with output of Yes/No GON and Yes/No MF. SDOCT scans were collected in a tertiary eye hospital (Hong Kong SAR, China) for training (80%), tuning (10%), and internal validation (10%). External testing was performed on five independent datasets from eye centres in Hong Kong, the United States, and Singapore, respectively. For GON detection, we compared the model to the average RNFL thickness measurement generated from the SDOCT device. To investigate whether MF can affect the model’s performance on GON detection, we conducted subgroup analyses in groups stratified by Yes/No MF. The area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and accuracy were reported.ResultsA total of 8,151 SDOCT volumetric scans from 3,609 eyes were collected. For detecting GON, in the internal validation, the proposed 3D model had significantly higher AUROC (0.949 vs. 0.913, p < 0.001) than average RNFL thickness in discriminating GON from normal. In the external testing, the two approaches had comparable performance. In the subgroup analysis, the multi-task DL model performed significantly better in the group of “no MF” (0.883 vs. 0.965, p-value < 0.001) in one external testing dataset, but no significant difference in internal validation and other external testing datasets. The multi-task DL model’s performance to detect MF was also generalizable in all datasets, with the AUROC values ranging from 0.855 to 0.896.ConclusionThe proposed multi-task 3D DL model demonstrated high generalizability in all the datasets and the presence of MF did not affect the accuracy of GON detection generally.

authors

  • Ran, An Ran
  • Wang, Xi
  • Chan, Poemen P
  • Chan Wan Kai, Noel
  • Yip, Wilson
  • Young, Alvin L
  • Wong, Mandy OM
  • Yung, Hon-Wah
  • Chang, Robert T
  • Mannil, Suria S
  • Tham, Yih Chung
  • Cheng, Ching-Yu
  • Chen, Hao
  • Li, Fei
  • Zhang, Xiulan
  • Heng, Pheng-Ann
  • Tham, Clement C
  • Cheung, Carol Y

publication date

  • 2022