Who Is the Surgeon Now: Human Hands or Machine Minds? Artificial Intelligence in Orthopedics from Diagnosis to Follow-Up—A Structured Narrative Review

Yapıcı, Furkan

doi:10.3390/jcm15062165

Yapıcı F.

Journal of Clinical Medicine, vol. 15, no. 6, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Review
Volume: 15 Issue: 6
Publication Date: 2026
DOI: 10.3390/jcm15062165
Journal Name: Journal of Clinical Medicine
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, EMBASE
Keywords: artificial intelligence (AI), deep learning (DL), fracture detection, generative AI (GAI), machine learning (ML), musculoskeletal imaging, natural language processing (NLP), orthopedics, perioperative risk prediction, radiomics
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Erzincan Binali Yıldırım University: Yes

Abstract

Background: Artificial intelligence (AI) is transitioning from proof-of-concept prototypes to clinically utilized tools in orthopedics. The key translational question is whether AI will replace surgeons or, more realistically, augment human expertise.

Methods: A structured narrative review was conducted using PubMed/MEDLINE, Web of Science, and Google Scholar (completed 31 January 2026). Peer-reviewed English-language studies that utilized AI for orthopedic clinical problems were eligible. To synthesize the 73 included papers without forced quantitative pooling, evidence was qualitatively charted and organized using a four-axis framework: clinical task, data modality, validation maturity, and intended user/setting.

Results: The evidence base was dominated by retrospective, imaging-centered AI studies (predominantly LOE III). Radiograph-based fracture detection and automated measurements were frequently reported to achieve high discrimination, though performance degraded in complex or “edge” cases. Predictive models for arthroplasty and spine outcomes demonstrated variable actionability and inconsistent reporting of calibration. Common translational barriers across subspecialties included limited external validation, dataset shift, and a scarcity of prospective impact studies.

Conclusions: Current evidence supports an augmentation paradigm rather than a replacement paradigm. AI acts as a “co-surgeon,” improving triage and standardizing quantification. However, safe clinical translation requires representative external validation, rigorous failure analysis, and human-in-the-loop workflows where surgeons retain ultimate accountability.