Colorectal
Hannah Williams, MD
Research Fellow
Memorial Sloan Kettering Cancer Center, New York, United States
Aneesh Rangnekar, PhD
Postdoctoral Fellow
Memorial Sloan Kettering Cancer Center, Department of Medical Physics, United States
Hannah M. Thompson, MD, MPH
Research Fellow
Memorial Sloan Kettering Cancer Center, Department of Surgery, United States
Christina Lee, MD
Research Fellow
Memorial Sloan Kettering Cancer Center, Department of Surgery, United States
Jorge T. Gomez, BSc
Master's student
Memorial Sloan Kettering Cancer Center, Department of Medical Physics, United States
Harini Veeraraghavan, PhD
Associate Member
Memorial Sloan Kettering Cancer Center, Department of Medical Physics, United States
Julio Garcia-Aguilar, MD, PhD
Attending Surgeon
Memorial Sloan Kettering Cancer Center, Department of Surgery
New York, NY, United States
Background:
The safety of a watch-and-wait (WW) approach for rectal cancer requires that surgeons correctly identify residual disease in post-treatment tumors and recognize the subtle mucosal changes that characterize local regrowth. Applying convolutional neural networks (CNNs) to endoscopic images may help providers interpret endoluminal response. To date, CNNs have shown only moderate accuracy in detecting residual tumor. We developed a highly accurate CNN model and compared its performance with that of surgeons at multiple stages of training.
Methods:
Patients with stage II/III rectal cancer treated with total neoadjuvant therapy (TNT) from 2012 to 2020 were retrospectively reviewed, and their endoscopic images were collected before, during and after treatment. A CNN model based on the ResNet-50 architecture was built to predict the presence of tumor. The model’s diagnostic performance was analyzed during training and on two independent test sets. The main test set included images from patients with a) residual tumor after TNT or b) a sustained clinical complete response (cCR) for ≥2 years on WW. The second test set contained images from patients with a) local regrowth or b) a sustained cCR. Surgeons and surgical trainees at our institution completed an online survey of 119 deidentified images and were asked to determine whether each image contained tumor. Group averages and Fleiss’ kappa were calculated by respondent experience level, and the results were compared with the CNN model’s performance.
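As an illustration of the agreement statistic named above, the following is a minimal sketch of Fleiss' kappa for a tumor / no-tumor rating task. It is not the authors' analysis code. The function name `fleiss_kappa`, the table layout and the toy ratings are assumptions made for this example only.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned image i to category j (e.g. tumor, no tumor)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every image must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each image, averaged over images.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Agreement expected by chance.
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical toy data: 5 images rated by 4 raters, columns = [tumor, no tumor].
toy_ratings = [[4, 0], [0, 4], [3, 1], [4, 0], [1, 3]]
toy_kappa = fleiss_kappa(toy_ratings)
```

In a survey like the one described, one such table would be built per experience group (residents, fellows, attendings), with one row per image.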
Results:
A total of 2717 images from 288 patients were included, with 785 (28.9%) and 147 (5.4%) used in the main and local regrowth test sets, respectively. The CNN identified tumor with an accuracy of 95%, 91% and 78% for the training, main test and local regrowth test sets, respectively. Sixteen participants (4 residents, 7 colorectal fellows, 5 colorectal attendings) completed the survey. The model performed on par with respondents of all experience levels on the main test set (Table 1A), for which interobserver agreement was good (κ=0.707-0.809). All groups outperformed the model in identifying tumor from images of local regrowth, with interobserver agreement ranging from fair to moderate (κ=0.235-0.518) (Table 1B).
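The reported dataset proportions can be reproduced with simple arithmetic, sketched below together with the standard definition of accuracy. The helper names and the confusion-matrix counts are hypothetical and are not taken from the study. Treating the remaining images as the model-development pool is likewise an assumption, since the abstract does not state how they were split.

```python
TOTAL_IMAGES = 2717
MAIN_TEST_IMAGES = 785
REGROWTH_TEST_IMAGES = 147


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of images classified correctly (tumor vs. no tumor)."""
    return (tp + tn) / (tp + tn + fp + fn)


main_share = percent(MAIN_TEST_IMAGES, TOTAL_IMAGES)
regrowth_share = percent(REGROWTH_TEST_IMAGES, TOTAL_IMAGES)

# Assumption: images outside the two test sets were used for model development.
remaining_images = TOTAL_IMAGES - MAIN_TEST_IMAGES - REGROWTH_TEST_IMAGES

# Hypothetical confusion-matrix counts, for illustration only.
example_accuracy = accuracy(tp=60, tn=31, fp=4, fn=5)
```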
Conclusions:
A highly accurate CNN matched the performance of experienced colorectal surgeons in identifying tumor from images of rectal cancers treated with TNT. Participants outperformed the model in detecting local regrowth, suggesting that improving the model's performance may require larger image databases and the incorporation of a time component into the analysis.