artificial intelligence; convolutional neural networks; endoscopy; gastric cancer; video
Abstract
We previously constructed a VGG-16-based artificial intelligence (AI) model (image classifier [IC]) to predict the invasion depth of early gastric cancer (EGC) from static endoscopic images. However, static images cannot capture the spatio-temporal information available during real-time endoscopy, and an AI model trained on static images could not estimate invasion depth accurately and reliably. We therefore constructed a video classifier (VC) trained on videos for real-time depth prediction in EGC. We built the VC by attaching sequential layers to the last convolutional layer of IC v2 and training it on video clips. To assess consistency, we computed the standard deviation (SD) of the output probabilities within each video clip and the sensitivity on a per-frame basis. For static images, the sensitivity, specificity, and accuracy of IC v2 were 82.5%, 82.9%, and 82.7%, respectively. For video clips, however, the sensitivity, specificity, and accuracy of IC v2 were 33.6%, 85.5%, and 56.6%, respectively. The VC analyzed the videos more accurately, with a sensitivity of 82.3%, a specificity of 85.8%, and an accuracy of 83.7%. Furthermore, the mean SD was lower for the VC than for IC v2 (0.096 vs. 0.289). An AI model trained on videos can predict invasion depth in EGC more precisely and consistently than image-trained models and is more appropriate for real-world use.
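The consistency measures described above (the SD of output probabilities within a clip, and sensitivity counted per frame) can be illustrated with a minimal sketch. This is not the study's code: the function names, the 0.5 decision threshold, the use of the population SD, the coding of the positive class as 1, and the example probabilities are all assumptions made for illustration.

```python
from statistics import pstdev


def clip_probability_sd(frame_probs):
    """SD of per-frame output probabilities within one video clip.

    Assumption: population SD; the source does not state which variant was used.
    A lower value means the model's output fluctuates less across frames.
    """
    return pstdev(frame_probs)


def frame_level_metrics(frame_probs, frame_labels, threshold=0.5):
    """Sensitivity, specificity, and accuracy counted per frame.

    Assumptions: label 1 is the positive class, and a frame is predicted
    positive when its probability is at or above `threshold` (hypothetical).
    """
    tp = fp = tn = fn = 0
    for prob, label in zip(frame_probs, frame_labels):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / total if total else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical per-frame probabilities for two clips of the same lesion.
stable_clip = [0.80, 0.82, 0.78, 0.81]      # output steady across frames
flickering_clip = [0.90, 0.20, 0.85, 0.30]  # output jumps between frames

sd_stable = clip_probability_sd(stable_clip)
sd_flickering = clip_probability_sd(flickering_clip)
```

In this sketch, a clip whose predictions flicker from frame to frame yields a larger SD than a clip with steady predictions, which is the sense in which a lower mean SD indicates more consistent output.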