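Medicine
Kappa
Intraclass correlation
Ultrasound
Reliability (semiconductor)
Cohen's kappa
Image quality
Ultrasound imaging
Radiology
Nuclear medicine
Artificial intelligence
Machine learning
Computer science
Image (mathematics)
Mathematics
Physics
Clinical psychology
Quantum mechanics
Power (physics)
Psychometrics
Geometry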
Authors
Siyavash Ghasseminia,Andrew Kean Seng Lim,Nathan David P. Concepcion,David Kirschner,Yi Ming Teo,Sukhdeep Dulai,Myles Mabee,Sara Kernick,Cain Brockley,Siska D Muljadi,Pavel Singh,Abhilash Rakkunedeth Hareendranathan,Jeevesh Kapur,Dornoosh Zonoobi,Kumaradevan Punithakumar,Jacob L. Jaremko
Source
Journal: Journal of Pediatric Orthopaedics
[Ovid Technologies (Wolters Kluwer)]
Date: 2022-02-03
Volume (issue): 42 (4): e315-e323
Citations: 8
Identifier
DOI: 10.1097/bpo.0000000000002065
Abstract
Background: Ultrasound for developmental dysplasia of the hip (DDH) is challenging for nonexperts to perform and interpret. Recording “sweep” images allows more complete hip assessment, suitable for automation by artificial intelligence (AI), but reliability has not been established. We assessed agreement between readers of varying experience and a commercial AI algorithm in DDH detection from infant hip ultrasound sweeps. Methods: We selected a full spectrum of poor-to-excellent quality images and normal to severe dysplasia, in 240 hips (120 single 2-dimensional images, 120 sweeps). For 12 readers (radiologists, sonographers, clinicians and researchers; 3 were DDH subspecialists) and an FDA-cleared ultrasound AI software package (Medo Hip), we calculated interobserver reliability for alpha angle measurements by intraclass correlation coefficient, ICC(2,1), and for DDH classification by Randolph kappa. Results: Alpha angle reliability was high for AI versus subspecialists (ICC=0.87 for sweeps, 0.90 for single images). For DDH diagnosis from sweeps, agreement was high between subspecialists (kappa=0.72), and moderate for nonsubspecialists (0.54) and AI (0.47). Agreement was higher for single images (kappa=0.80, 0.66, 0.49). AI reliability deteriorated more than that of human readers for the poorest-quality images. The agreement of radiologists and clinicians with the accepted standard, while still high, was significantly poorer for sweeps than 2D images (P<0.05). Conclusions: In a challenging exercise representing the wide spectrum of image quality and reader experience seen in real-world hip ultrasound, agreement on DDH diagnosis from easily obtained sweeps was only slightly lower than from single images, likely because of the additional step of selecting the best image. AI performed similarly to a nonsubspecialist human reader but was more affected by low-quality images.
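The two agreement statistics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names `randolph_kappa` and `icc_2_1` are hypothetical, the toy inputs are invented for illustration, and the formulas are the standard textbook definitions (Randolph's free-marginal multirater kappa, and the two-way random-effects, absolute-agreement, single-measurement ICC), assuming every case is rated by the same number of readers.

```python
from collections import Counter


def randolph_kappa(ratings, num_categories):
    """Free-marginal multirater kappa (hypothetical helper, illustration only).

    ratings: list of cases; each case is a list of category labels,
             one label per reader, same number of readers for every case.
    num_categories: number of categories readers could choose from.
    """
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    # Observed agreement: share of agreeing reader pairs, averaged over cases.
    pair_total = n_raters * (n_raters - 1)
    p_obs = sum(
        sum(c * (c - 1) for c in Counter(case).values()) / pair_total
        for case in ratings
    ) / len(ratings)
    # Chance agreement is fixed at 1/k because the marginals are free.
    p_exp = 1.0 / num_categories
    return (p_obs - p_exp) / (1.0 - p_exp)


def icc_2_1(scores):
    """ICC(2,1) from a subjects-by-readers table (hypothetical helper).

    scores: list of subjects; each subject is a list of measurements
            (for example alpha angles), one per reader.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, readers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Invented toy data: 4 hips, 3 readers, 2 categories.
    labels = [
        ["normal", "normal", "normal"],
        ["ddh", "ddh", "normal"],
        ["ddh", "ddh", "ddh"],
        ["normal", "ddh", "normal"],
    ]
    print(round(randolph_kappa(labels, num_categories=2), 3))
    # Invented toy alpha angles (degrees): 4 hips, 2 readers.
    angles = [[62.0, 60.0], [48.0, 51.0], [55.0, 54.0], [70.0, 68.0]]
    print(round(icc_2_1(angles), 3))
```

Fixing chance agreement at 1/k is what distinguishes the free-marginal kappa from Cohen's or Fleiss' kappa, which estimate chance agreement from the observed category frequencies.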