Artificial Intelligence in Dermatology: From Proof of Concept to Real-World Accountability
Ryan W. Stidham, MD, MS, explored how artificial intelligence (AI), large language models (LLMs), and multimodal systems are reshaping dermatology—and where caution is still warranted—during his Masterclasses in Dermatology session, “Artificial Intelligence Update.” His framing was both optimistic and measured: “We’ve come a long way, but we have a long way to go.”
Dr Stidham revisited the early era of computer vision in dermatology. In 2017, convolutional neural networks for melanoma classification, trained on 129,450 clinical images, demonstrated that AI “could match/exceed human dermatologists in visual diagnosis,” with model accuracy of 71.2% compared with 53.5% to 65.6% for dermatologists. These proof-of-concept studies helped catalyze US Food and Drug Administration-approved skin lesion detection devices and broader AI integration.
But real-world data exposed critical gaps. In analyses using real-world clinical photos rather than curated datasets, there was a 20% drop in skin cancer accuracy and a 46% drop in sensitivity in darker vs lighter skin tones. Notably, 47.1% of nonmalignant lesions were misclassified as malignant. Lighting, photo quality, atypical presentations, and insufficient training on varied skin tones were identified as limitations.
Population deployment also revealed unintended consequences. In a Danish study of a patient-driven AI skin cancer app, suspicious lesion reporting increased by 32%, and detection of premalignant or malignant lesions rose modestly (6.0% vs 4.6%). However, benign lesion evaluations increased by 300%, with higher per-user costs and increased clinic demands. The message: Scaling AI without systems planning can strain access.
Beyond detection, AI is advancing psoriasis care. Automated Psoriasis Area and Severity Index scoring, dermatologic image segmentation, spatial registration, and cumulative body surface area quantification are enabling more precise longitudinal assessment. Automated 3D image capture platforms now offer complete disease screening and telehealth-ready monitoring.
LLMs represent the next frontier. GPT-4 achieved 75% accuracy on 250 dermatology board-style questions across 5 subspecialties. Multimodal systems, such as GPT-4V and SkinGPT-4, merge text and images, with more than 75% of diagnoses and plans deemed acceptable in mobile image–based testing. AI also required only 0.02 minutes (about 1 second) to generate an opinion, compared with 16 minutes for dermatologists.
Yet limitations remain. Dr Stidham cautioned that today’s LLMs “do not know the stakes of medical decisions” and “can’t load all patient information.” Lessons from gastroenterology also show that clinicians' unassisted detection performance declined after exposure to AI, raising concerns about AI-enabled deskilling.
Looking ahead, Dr Stidham described a future that includes AI pathology, chatbot-driven intake, LLM-based medical history summaries, and automated diagnosis and follow-up with built-in triggers for human referral.
Reference
Stidham R. AI update in 2026. Presented at: Masterclasses in Dermatology; February 19–22, 2026; Sarasota, FL.


