Artificial Intelligence in Oncology: Evaluating Clinical Decision Support in Lung Cancer


Clinical Summary: 

  • Design/Population: A study evaluated the concordance of multiple large language models and guideline-based artificial intelligence tools with expert thoracic oncologist recommendations for clinical decision-making in EGFR-mutated stage IV non–small cell lung cancer, across first-line and second-line treatment scenarios.
  • Key Outcomes: Concordance with expert opinion was relatively strong in the first-line setting, particularly for guideline-supported artificial intelligence tools, but was substantially lower in the second-line setting, where treatment selection is more complex, rapidly evolving, and dependent on resistance mechanisms and patient-specific factors.
  • Clinical Relevance: Artificial intelligence tools may support clinical decision-making in guideline-driven first-line settings, but second-line and later treatment recommendations should be interpreted cautiously and used as discussion aids rather than replacements for expert oncology input.

Chinmay Jani, MBBS, University of Miami Sylvester Comprehensive Cancer Center, Miami, Florida, discusses a study evaluating the performance of commercially available large language models and oncology-specific artificial intelligence (AI) platforms in clinical decision-making for EGFR-mutated advanced non–small cell lung cancer (NSCLC). The analysis compared AI-generated treatment recommendations with expert opinions from thoracic oncologists across multiple academic institutions.

Results demonstrated strong performance for guideline-based AI tools in the first-line setting, particularly for ASCO AI, while more complex second-line scenarios revealed substantial variability and lower concordance with expert recommendations. These findings suggest that AI can serve as a valuable adjunct for education and clinical decision support but should be used alongside physician expertise, particularly when treatment decisions require detailed molecular and clinical context.

Dr Jani presented these results at the 2026 ASCO Annual Meeting in Chicago, Illinois. 

Transcript: 

Hi everyone, I’m Chinmay Jani. I’m chief fellow at the University of Miami Sylvester Comprehensive Cancer Center and incoming faculty this summer in thoracic oncology and phase 1 trials.

I’m really excited to be here at ASCO 2026. I’m presenting 3 different posters and abstracts, and one of the projects I’m particularly excited about focuses on artificial intelligence.

As I’ve been saying for the past few months, artificial intelligence is no longer a tool of the future—it’s a tool of the present. In fact, it’s become more than a tool. It’s increasingly part of our daily routine, not just for patients, but for physicians as well. That naturally raises an important question: how reliable is artificial intelligence?

In the past, when I was in medical school or residency, we used to talk about “Dr. Google” giving recommendations. Then came what people jokingly referred to as Facebook University, WhatsApp University, and so on. Now we have what some might call ChatGPT University. All of these information sources continue to evolve.

We wanted to evaluate how different large language models—including ChatGPT, Gemini, Claude, and Grok—as well as 2 guideline-based AI tools, the ASCO AI tool and OpenEvidence, perform in clinical decision-making.

This is important because these tools allow you to enter a clinical scenario (for example, a 72-year-old patient with stage IV EGFR-mutated non–small cell lung cancer, along with details about prior therapies, molecular findings, and comorbidities), and the model will suggest potential next treatment steps. That can be useful for physicians and patients alike.

We evaluated 4 LLMs and 2 guideline-based tools. The first thing we wanted to assess was concordance with expert opinion.

For the expert panel, we included 5 thoracic oncologists from 3 institutions: Sylvester Comprehensive Cancer Center, MD Anderson Cancer Center, and the University of Alabama.

We developed 6 first-line and 6 second-line clinical cases involving EGFR-mutated stage IV non–small cell lung cancer. The prompts were structured so that the models were essentially asked: if this case were presented to 100 thoracic oncologists in the United States with 4 possible treatment options, which option would most likely be selected? The models then ranked the available options, and we assessed both concordance with expert opinion and divergence among the different systems.
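
As an illustration only, a comparison of this kind could be scored along the lines of the Python sketch below. It assumes that concordance means the fraction of cases in which a system's top-ranked option matches the expert choice, and that divergence means how often two systems pick different top options. The scoring method actually used in the study is not described here, and the case labels, option letters, system names, and function names are all hypothetical, made-up values.

```python
from itertools import combinations


def top1_concordance(model_choices, expert_choices):
    """Fraction of cases where the model's top option equals the expert choice."""
    if not expert_choices:
        raise ValueError("no cases to score")
    matches = sum(
        1 for case, expert in expert_choices.items()
        if model_choices.get(case) == expert
    )
    return matches / len(expert_choices)


def pairwise_agreement(choices_by_system):
    """Fraction of shared cases on which each pair of systems picks the same option."""
    result = {}
    for a, b in combinations(sorted(choices_by_system), 2):
        shared = choices_by_system[a].keys() & choices_by_system[b].keys()
        same = sum(choices_by_system[a][c] == choices_by_system[b][c] for c in shared)
        result[(a, b)] = same / len(shared) if shared else float("nan")
    return result


# Hypothetical data: 6 cases, each with 4 options labeled A to D (not study data).
expert = {"case1": "A", "case2": "C", "case3": "B",
          "case4": "A", "case5": "D", "case6": "B"}
systems = {
    "tool_x": {"case1": "A", "case2": "C", "case3": "B",
               "case4": "A", "case5": "D", "case6": "C"},
    "tool_y": {"case1": "A", "case2": "B", "case3": "B",
               "case4": "C", "case5": "D", "case6": "C"},
}

scores = {name: top1_concordance(choices, expert)
          for name, choices in systems.items()}
agreement = pairwise_agreement(systems)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Top-choice matching is only one possible definition. Because the models ranked all of the options, a rank-aware measure could also be used, and that choice would change the numbers.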

Looking at the results, one of the most exciting findings was that concordance was quite good in the first-line setting. The ASCO AI tool performed particularly well, achieving approximately 80% concordance with expert opinion. Of course, we would always like to see 100%, but these systems continue to evolve, and I think performance will continue to improve over time. The second-best performer was ChatGPT. However, in the second-line setting, performance was not as strong.

The highest concordance rate we observed was approximately 0.49, or about 50%. In simple terms, that’s essentially the equivalent of a coin flip, which is concerning. There are several possible reasons for this.

First, neither the LLMs nor the guideline-based tools are always fully up to date. There is often a lag between publication of new data, incorporation into guidelines, and eventual integration into AI systems.

Second-line treatment decisions are also inherently more complicated. You have to consider what was given in the first line, what resistance mechanisms were identified on liquid biopsy, molecular testing results, comorbidities, and many other variables.

Interestingly, Gemini alone did not perform as well as the ASCO AI tool, despite being the model that underlies it. This suggests that the additional guideline layer incorporated into ASCO AI adds meaningful value. A similar concept applies to OpenEvidence.

This was one of our key findings—that combining LLM technology with guideline-based frameworks may provide the most clinically useful results.

The factors contributing to discordance in the second-line setting are largely related to how quickly the field evolves. New data emerge not only at ASCO, but also at meetings such as the World Conference on Lung Cancer and ESMO.

In addition, second- and third-line decisions are often highly individualized, which is where precision medicine becomes important. Even though we included detailed information in our prompts, there are still many nuances that AI systems may not fully capture.

It’s also worth noting that our analyses used model versions available up until January, prior to abstract submission. Even now, several months later, performance may already be better.

The major takeaway is that AI continues to evolve and is becoming an increasingly valuable tool, both for research and for clinical decision-making.

In the first-line setting, these tools appear quite reliable. For trainees, fellows, residents, and even practicing physicians looking for guidance or educational support, they can be very useful.

In second- and third-line settings, however, the outputs should be interpreted more cautiously. They should be reviewed critically and discussed with disease-specific experts before being used to guide treatment decisions.

Another important finding is that guideline-based tools appear particularly helpful and reliable.

Among the LLMs, we evaluated ChatGPT, Gemini, Claude, and Grok. Performance varied, although ChatGPT performed well overall, which is somewhat reassuring given how commonly patients use it.

For patients, my message would be this: it’s unrealistic to expect people not to search for information online. Use these tools, take notes, and bring that information to your oncologist. Use AI-generated information as a starting point for discussion, but do not rely on it exclusively for medical decisions.

Looking ahead, AI will increasingly become a source of guidelines, literature review, clinical decision support, and research support.

In fact, OpenEvidence and ASCO now have a collaboration that allows ASCO guidelines to be incorporated directly into OpenEvidence, which is a very exciting development.

Artificial intelligence will also play a role in clinical research, biomarker discovery, healthcare utilization, and even identifying optimal clinical trial opportunities for individual patients.

So AI is likely to have a role across virtually every aspect of oncology, from patient care to research and drug development.

There is much more to come, and I’m looking forward to seeing where the field is by ASCO 2027.


Source: 

Jani C, Pérez-Granado J, Kalucha A, et al. Evaluating AI decision support in a rapidly evolving therapeutic landscape: EGFR-mutant metastatic NSCLC. Presented at the ASCO Annual Meeting. May 29 - June 2, 2026. Chicago, Illinois. Abstract 1630.

© 2026 HMP Global. All Rights Reserved.
Any views and opinions expressed are those of the author(s) and/or participants and do not necessarily reflect the views, policy, or position of Oncology Learning Network or HMP Global, their employees, and affiliates.