Keywords
Computer science; Artificial intelligence; Colonoscopy; Coding (set theory); Deep learning; Proportion (ratio); Gold standard (test); Feature extraction; Computer vision; Machine learning; Colorectal cancer; Cancer; Radiology; Medicine; Physics; Set (abstract data type); Quantum mechanics; Internal medicine; Programming language
Authors
Vanshali Sharma, Pradipta Sasmal, M. K. Bhuyan, Pradip K. Das, Yuji Iwahori, Kunio Kasugai
Source
Journal: IEEE Transactions on Automation Science and Engineering
Publisher: Institute of Electrical and Electronics Engineers
Date: 2023-10-02
Volume/Issue: 1-14
Citations: 3
Identifier
DOI: 10.1109/tase.2023.3315518
Abstract
Colonoscopy video acquisition has increased tremendously for retrospective analysis, comprehensive inspection, and polyp detection in the diagnosis of colorectal cancer (CRC). However, extracting meaningful clinical information from colonoscopy videos requires an enormous amount of reviewing time, which places a considerable burden on surgeons. To reduce this manual effort, we propose the first end-to-end automated multi-stage deep learning framework to extract an adequate number of clinically significant frames, i.e., keyframes, from colonoscopy videos. The proposed framework comprises multiple stages that employ different deep learning models to select keyframes, which are high-quality, non-redundant polyp frames that capture multiple views of polyps. In one stage of the framework, we also propose a novel multi-scale attention-based model, YcOLOn, for polyp localization, which generates ROIs and prediction scores crucial for obtaining keyframes. We further designed a GUI application for navigating the different stages. Extensive evaluation in real-world scenarios involving patient-wise and cross-dataset validations shows the efficacy of the proposed approach. The framework removes 96.3% and 94.02% of frames, reduces detection processing time by 38.28% and 59.99%, and increases mAP by 2% and 5% on the SUN database and CVC-VideoClinicDB, respectively. The source code is available at https://github.com/Vanshali/KeyframeExtraction

Note to Practitioners: The widespread acceptance of colonoscopy as a gold standard for CRC screening is constrained by the massive amount of data recorded during the procedure, which must be reviewed manually. Such manual review is burdensome and induces human diagnostic errors. This article proposes an automated framework to extract keyframes (important frames) from colonoscopy videos that efficiently represent the clinically relevant information captured in the video streams. This is achieved by automatically removing uninformative and highly correlated frames that do not add to the clinical findings. The approach ensures diversity among keyframes and provides clinicians with multiple views of polyps for easier resection. In addition, the proposed multi-scale attention-based model improves polyp localization performance, which further refines the keyframe selection process. Comprehensive experimental results corroborate that discarding insignificant frames can enhance polyp detection and localization performance while reducing computational requirements. The study estimates a 30% to 60% time saving for clinicians during video screening. In clinical practice, the proposed automated framework and the accompanying GUI would enable surgeons to better visualize the essential data with minimal manual intervention and would assist in precise polyp resection.
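The abstract does not detail the individual stages, but the core keyframe idea it describes (keep high-quality, non-redundant frames and discard near-duplicates) can be sketched. Below is a minimal, hypothetical Python sketch of the redundancy-removal step using histogram correlation between a candidate frame and the last kept frame; the function names, the 64-bin grayscale histogram, and the 0.95 similarity threshold are all illustrative assumptions, not the authors' actual pipeline (see the linked repository for the real implementation).

```python
# Hypothetical sketch of redundancy removal: drop frames that are highly
# correlated with the most recently kept frame. Not the authors' code.
import cv2
import numpy as np

def frame_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Correlation between 64-bin grayscale histograms of two BGR frames."""
    ha = cv2.calcHist([cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)], [0], None, [64], [0, 256])
    hb = cv2.calcHist([cv2.cvtColor(b, cv2.COLOR_BGR2GRAY)], [0], None, [64], [0, 256])
    cv2.normalize(ha, ha)
    cv2.normalize(hb, hb)
    return float(cv2.compareHist(ha, hb, cv2.HISTCMP_CORREL))

def select_keyframes(video_path: str, sim_threshold: float = 0.95) -> list[int]:
    """Return indices of frames that are not near-duplicates of the previous keyframe."""
    cap = cv2.VideoCapture(video_path)
    keyframes, last_kept, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Keep the frame only if it differs enough from the last kept one.
        if last_kept is None or frame_similarity(frame, last_kept) < sim_threshold:
            keyframes.append(idx)
            last_kept = frame
        idx += 1
    cap.release()
    return keyframes
```

In the paper's multi-stage pipeline, a step like this would sit alongside quality filtering and the YcOLOn polyp-localization scoring; it stands alone here only to make the near-duplicate-removal idea concrete.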