Quality
Ko Un Park, MD (she/her/hers)
Associate Surgeon
Brigham and Women's Hospital, Dana-Farber Cancer Institute
Quincy, Massachusetts, United States
Stuart Lipsitz, ScD
Director of Biostatistics, Center for Surgery and Public Health
Brigham and Women's Hospital, United States
Laura Dominici, MD
Associate Chief of Surgery
Brigham and Women's Faulkner Hospital; Associate Surgeon, Brigham and Women's Hospital, Dana-Farber Cancer Institute
Scituate, Massachusetts, United States
Filipa Lynce, MD
Director, Inflammatory Breast Center; Senior Physician
Dana-Farber Cancer Institute, United States
Christina A. Minami, MD, MS (she/her/hers)
Associate Surgeon
Brigham and Women's Hospital, Dana-Farber Cancer Institute
Boston, Massachusetts, United States
Faina Nakhlis, MD
Breast Surgeon
Brigham and Women's Hospital, Dana-Farber Cancer Institute
Boston, Massachusetts, United States
Adrienne G. G. Waks, MD
Associate Director, Breast Oncology Clinical Research; Physician
Dana-Farber Cancer Institute
Boston, Massachusetts, United States
Laura Warren, MD
Associate Network Clinical Director, Radiation Oncology
Brigham and Women's Hospital, Dana-Farber Cancer Institute, United States
Nadine Eidman
Patient Advocate
United States
Jeannie Frazier
Patient Advocate
United States
Lourdes Hernandez
Patient Advocate
United States
Carla Leslie
Patient Advocate
United States
Susan Rafte
Patient Advocate
United States
Delia Stroud
Patient Advocate
United States
Joel S. Weissman, PhD
Deputy Director and Chief Scientific Officer of the Center for Surgery and Public Health
Brigham and Women's Hospital, United States
Tari A. King, MD (she/her/hers)
Chief, Division of Breast Surgery
Brigham and Women's Hospital, Dana-Farber Cancer Institute
Boston, Massachusetts, United States
Elizabeth A. Mittendorf, MD, PhD, MHCM (she/her/hers)
Professor of Surgery
Brigham and Women's Hospital, Dana-Farber Cancer Institute
Boston, Massachusetts, United States
The internet is increasingly used by patients as a source of medical information, and ChatGPT (OpenAI), a chatbot interface built on generative pretrained transformer (GPT) models, can provide humanlike responses to patient questions. ChatGPT was estimated to have over 100 million users as of June 2023. It is unclear whether GPT-3.5 responses can be trusted as an accurate source of medical information for patients. This study sought to evaluate the accuracy and clinical concordance of GPT-3.5 responses to breast cancer questions.
Methods:
Through a series of focus groups with 6 breast cancer advocates, major themes in breast cancer care that patients are likely to ask about were identified, and 20 questions corresponding to these themes were developed (Table). Questions were posed to GPT-3.5 in July 2023 and repeated 3 times over the course of a week. Responses were graded by 6 breast oncology specialists (3 surgical oncologists, 1 radiation oncologist, and 2 medical oncologists) in 2 domains: accuracy (4-point Likert scale: 1 = comprehensive, 2 = correct but inadequate, 3 = some correct, some incorrect, 4 = completely incorrect) and clinical concordance (i.e., information is clinically similar to the specialist's response; 5-point Likert scale: 1 = completely similar, 5 = not similar at all). Means and 95% confidence intervals (CI) were calculated using ANOVA.
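For readers who want to reproduce the summary statistics, the sketch below shows one way to compute a mean grade with a t-based 95% CI in Python. The array contents are toy values, not study data, and this simple approach ignores the repeated-measures structure (the same graders rate every question) that the study's ANOVA-based CIs account for.

```python
import numpy as np
from scipy import stats

# Toy accuracy grades (1-4 Likert), one value per
# (question, grader, repetition) evaluation -- hypothetical data.
grades = np.array([1, 2, 2, 1, 3, 2, 1, 2])

mean = grades.mean()
sem = stats.sem(grades)  # standard error of the mean (ddof=1)
ci_lo, ci_hi = stats.t.interval(0.95, df=len(grades) - 1,
                                loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = ({ci_lo:.2f}, {ci_hi:.2f})")
```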
Results:
There were 360 evaluations per domain (20 questions × 6 physician graders × 3 repetitions). The overall mean was 1.88 for accuracy (range 1-3; 95% CI 1.42-1.94) and 2.79 for clinical concordance (range 1-5; 95% CI 1.94-3.64). For accuracy, 24% of responses (n=87; 95% CI 10-47.8%) were graded as 'some correct, some incorrect'; no responses were graded as completely incorrect. For clinical concordance, 7.8% (n=28; 95% CI 2.4-22.6%) were graded as not similar at all to the information a clinician would provide if asked the same question. Grouped thematically, workup questions fared best for accuracy and chemotherapy questions for clinical concordance. The question with the worst accuracy score concerned lymphedema after axillary surgery (question 6; mean 2.67), and the question with the worst clinical concordance score concerned immunotherapy (question 11; mean 3.5).
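The per-question averages that identify the best- and worst-scoring items can be obtained with a simple group-by. A minimal sketch, assuming the evaluations are held in a long-format table; the column names and rows here are hypothetical:

```python
import pandas as pd

# Hypothetical long-format table: one row per evaluation
# (question 1-20, grader 1-6, repetition 1-3).
df = pd.DataFrame({
    "question": [6, 6, 11, 11],
    "domain":   ["accuracy", "accuracy", "concordance", "concordance"],
    "grade":    [3, 2, 4, 3],
})  # toy rows, not study data

per_question = (df.groupby(["domain", "question"])["grade"]
                  .mean()
                  .sort_values(ascending=False))
print(per_question)  # highest (worst) average grades listed first
```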
Conclusions:
Although generative AI, specifically ChatGPT, shows potential to provide breast cancer patients with accurate and clinically concordant information, it occasionally provided inaccurate and clinically discordant answers. As such, patients should not rely on ChatGPT as a sole source of medical information until future studies and technology refinements establish its reliability.