Keywords: Computer science; Artificial intelligence; Term (time); Convolutional neural network; Frame (networking); Transformer; Pattern recognition (psychology); Projection (relational algebra); Computer vision; Class (philosophy); Algorithm; Quantum mechanics; Telecommunications; Physics; Voltage
Authors
Ruipeng Zhang, Binjie Qin, Song Ding, Yueqi Zhu, Xu Chen, Yisong Lv
Identifier
DOI:10.36227/techrxiv.21864027.v1
Abstract
<p>Locating the start, apex, and end keyframes of moving contrast agents for keyframe counting during X-ray coronary angiography (XCA) is very important in the diagnosis and treatment of cardiovascular diseases. To locate these keyframes from the class-imbalanced and boundary-agnostic foreground vessel actions that overlap complex backgrounds, we propose long short-term spatiotemporal attention by integrating a convolutional long short-term memory (CLSTM) network into a multiscale Transformer to learn the segment- and sequence-level dependencies in the consecutive-frame-based deep features. Image-to-patch contrastive learning is further embedded between the CLSTM-based long-term spatiotemporal attention and Transformer-based short-term attention modules. The imagewise contrastive module reuses the long-term attention to contrast the image-level foreground/background of an XCA sequence, while the patchwise contrastive projection selects random background patches as convolution kernels to project foreground/background frames into different latent spaces. A new XCA video dataset is collected to evaluate the proposed neural network. The experimental results show that the proposed method achieves a mAP of 70.51% and an F1-score of 0.8188, considerably outperforming the state-of-the-art methods. The source code and dataset are available at https://github.com/Binjie-Qin/STA-IPCon.</p>
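The patchwise contrastive projection described above can be loosely illustrated as follows. This is a toy NumPy sketch under stated assumptions, not the authors' implementation (see the linked repository for that): the function name `patchwise_projection`, the zero-mean normalization of the patches, and all parameter choices are hypothetical.

```python
import numpy as np

def patchwise_projection(frame, background, num_patches=4, patch_size=5, seed=0):
    """Toy sketch: sample random patches from a background frame and use them
    as convolution kernels to project an input frame into a latent feature
    stack. Hypothetical simplification of the paper's patchwise projection."""
    rng = np.random.default_rng(seed)
    h, w = background.shape
    kernels = []
    for _ in range(num_patches):
        y = rng.integers(0, h - patch_size + 1)
        x = rng.integers(0, w - patch_size + 1)
        patch = background[y:y + patch_size, x:x + patch_size].astype(float)
        kernels.append(patch - patch.mean())  # zero-mean kernel (assumption)

    # 'Valid' cross-correlation of the frame with each background-derived kernel;
    # foreground frames (vessels) and background frames respond differently.
    fh, fw = frame.shape
    oh, ow = fh - patch_size + 1, fw - patch_size + 1
    out = np.empty((num_patches, oh, ow))
    for k, ker in enumerate(kernels):
        for i in range(oh):
            for j in range(ow):
                out[k, i, j] = np.sum(frame[i:i + patch_size, j:j + patch_size] * ker)
    return out
```

A frame convolved with patches drawn from a purely uniform background yields a near-zero response, while structured foreground content produces distinct activations, which is the intuition behind separating the two into different latent spaces.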