Quality Improvement/Clinical Outcomes
Andres A. Abreu, MD (he/him/his)
Postdoctoral Researcher
Department of Surgery, University of Texas Southwestern Medical Center
Dallas, Texas, United States
Gilbert Z. Murimwa, MD
Resident
Department of Surgery, University of Texas Southwestern Medical Center
Dallas, Texas, United States
Emile Farah, MD
Postdoctoral Researcher
Department of Surgery, University of Texas Southwestern Medical Center
Dallas, Texas, United States
Lucia Zhang, BS
Medical Student
University of Texas Southwestern Medical Center, United States
Jonathan Rodriguez, BS
Medical Student
University of Texas Southwestern Medical Center, United States
John Sweetenham, MD
Professor
University of Texas Southwestern Medical Center, United States
Herbert J. Zeh, III, MD
Professor and Chair
Department of Surgery, University of Texas Southwestern Medical Center
Dallas, Texas, United States
Patricio M. Polanco, MD
Associate Professor
Department of Surgery, University of Texas Southwestern Medical Center
Dallas, Texas, United States
Background:
Internet-based health education is increasingly vital to patient care. However, the readability of online health information often exceeds the average reading level of the U.S. population, limiting accessibility and comprehension. This study investigates whether chatbot artificial intelligence (AI) can improve the readability of patient-facing cancer content.
Methods:
We used ChatGPT 4.0 to rewrite content about breast, colon, lung, prostate, and pancreatic cancer from 34 National Comprehensive Cancer Network (NCCN) websites. The readability of the AI-generated output was analyzed using the Fry Readability Score, Flesch-Kincaid Grade Level, Gunning-Fog Index, and Simple Measure of Gobbledygook (SMOG). The primary outcome was the mean readability score of the original and AI-generated content. Additionally, we assessed the quality of a random sample of 50 AI outputs using section two of the DISCERN instrument, a validated questionnaire for assessing the quality of written, online, patient-facing content.
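The abstract does not specify whether the ChatGPT web interface or an API was used, nor how the readability formulas were computed. The sketch below shows one plausible reproduction of the pipeline, assuming the OpenAI Python SDK and the open-source textstat package; the model name, prompt wording, and file name are illustrative assumptions, not the study's protocol. The Fry score is read from a graph and is omitted here.

# Hedged sketch: rewrite a page with a GPT-4-class model, then score both
# versions with three of the four readability formulas used in the study.
# Assumptions: OpenAI Python SDK (v1) and the textstat package; the prompt
# and model name are illustrative, not the authors' actual protocol.
from openai import OpenAI
import textstat

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def simplify(text: str) -> str:
    """Ask the model to rewrite patient-facing text at a lower grade level."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed stand-in for "ChatGPT 4.0"
        messages=[{
            "role": "user",
            "content": ("Rewrite the following cancer patient education text "
                        "at a 6th-grade reading level without changing its "
                        "medical content:\n\n" + text),
        }],
    )
    return response.choices[0].message.content

def readability(text: str) -> dict:
    """Grade-level estimates; Fry is graph-based and not computed here."""
    return {
        "Flesch-Kincaid": textstat.flesch_kincaid_grade(text),
        "Gunning-Fog": textstat.gunning_fog(text),
        "SMOG": textstat.smog_index(text),
    }

original = open("nccn_page.txt").read()  # hypothetical input file
rewritten = simplify(original)
for label, text in (("original", original), ("AI-rewritten", rewritten)):
    scores = readability(text)
    mean = sum(scores.values()) / len(scores)
    print(f"{label}: {scores} (mean grade {mean:.1f})")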
Results:
The mean readability of content across the 34 NCCN websites was equivalent to a university freshman reading level (grade 13±1.5). After revision by ChatGPT, the AI-generated outputs had a mean readability score equivalent to a high school freshman reading level (grade 9±0.8). The improved readability was attributable to the use of simpler words and shorter sentences. In addition, the mean DISCERN score of the random sample of AI-generated content was equivalent to "Good" (28.5±5), with no significant difference from the corresponding original content.
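The link between shorter sentences, simpler words, and lower grade estimates is mechanical: the Flesch-Kincaid Grade Level, for example, is a weighted sum of words per sentence and syllables per word, so reducing either term lowers the grade. The sentence statistics below are invented for illustration, not study data.

# Flesch-Kincaid Grade Level from its published formula:
#   FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
def fkgl(words: int, sentences: int, syllables: int) -> float:
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Invented example: one long, polysyllabic sentence vs. two short, plain ones.
print(fkgl(words=25, sentences=1, syllables=45))  # ~15.4: college level
print(fkgl(words=20, sentences=2, syllables=26))  # ~3.7: grade school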
Conclusions:
Our study demonstrates the potential of AI chatbots to improve the readability of patient-facing content while maintaining its quality. The lower literacy level required after AI revision highlights the potential of this technology to reduce healthcare disparities caused by a mismatch between the resources available to patients and their health literacy.