Global-Shared Text Representation Based Multi-Stage Fusion Transformer Network for Multi-Modal Dense Video Captioning

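Keywords: Closed captioning; Computer science; Encoder; Transformer; Modal verb; Artificial intelligence; Natural language processing; Speech recognition; Image (mathematics); Quantum mechanics; Operating system; Physics; Voltage; Chemistry; Polymer chemistry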

Authors

Yulai Xie, Jingjing Niu, Yang Zhang, Fang Ren

Source

Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Date: 2024-01-01; Volume: 26; Pages: 3164-3179

Identifier

DOI: 10.1109/tmm.2023.3307972

Abstract

Dense video captioning aims to detect all events of an uncropped video and generate corresponding textual captions for each event. Multi-modal information is essential to improve the performance of this task, but the existing methods mainly rely on the single visual or dual audio-visual modal input, while completely ignoring the text modal input (subtitle). Since the text data has a similar data representation as video caption words, it is conducive to the performance improvement of video captioning. In this paper, we propose a novel framework, called the multi-stage fusion transformer network (MS-FTN), to realize multi-modal dense video captioning by fusing the text, the audio, and the visual features in stages. We present a multi-stage feature fusion encoder that first fuses audio and visual modalities at a lower level and then fuses them with a global-shared text representation at a higher level to generate a set of multi-modal complementary context features. In addition, an anchor-free event proposal module is proposed to efficiently generate a set of event proposals without the complex anchor calculation. Extensive experiments on the subsets of the ActivityNet Captions dataset show that our proposed MS-FTN achieves superior performance and efficient computation. Moreover, the ablation studies demonstrate that the global-shared text representation is more suitable for multi-modal dense video captioning. Our code and data are available at https://github.com/xieyulai/GS-MS-FTN .
