Enhanced Multi-Scale Cross-Attention for Person Image Generation

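Scale (ratio), Image (mathematics), Computer science, Artificial intelligence, Computer vision, Psychology, Geography, Cartography
Authors
Hao Tang, Ling Shao, Nicu Sebe, Luc Van Gool
Source
Journal: Cornell University - arXiv
Identifier
DOI: 10.48550/arxiv.2501.08900
Abstract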

In this paper, we propose a novel cross-attention-based generative adversarial network (GAN) for the challenging person image generation task. Cross-attention is a novel and intuitive multi-modal fusion method in which an attention/correlation matrix is calculated between two feature maps of different modalities. Specifically, we propose the novel XingGAN (or CrossingGAN), which consists of two generation branches that capture the person's appearance and shape, respectively. Moreover, we propose two novel cross-attention blocks to effectively transfer and update the person's shape and appearance embeddings for mutual improvement. This has not been considered by any other existing GAN-based image generation work. To further learn the long-range correlations between different person poses at different scales and sub-regions, we propose two novel multi-scale cross-attention blocks. To tackle the issue of independent correlation computations within the cross-attention mechanism leading to noisy and ambiguous attention weights, which hinder performance improvements, we propose a module called enhanced attention (EA). Lastly, we introduce a novel densely connected co-attention module to fuse appearance and shape features at different stages effectively. Extensive experiments on two public datasets demonstrate that the proposed method outperforms current GAN-based methods and performs on par with diffusion-based methods. However, our method is significantly faster than diffusion-based methods in both training and inference.
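
The abstract describes cross-attention as computing an attention/correlation matrix between two feature maps of different modalities. The following is a minimal NumPy sketch of that general idea only, not the XingGAN implementation: the function name `cross_attention`, the scaled dot-product similarity, the softmax normalization, the residual update, and the toy tensor shapes are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feat, context_feat):
    """Update `query_feat` with information from `context_feat`.

    Both inputs are feature maps of shape (C, H, W) from two different
    modalities (for example, appearance and shape). This is a hypothetical
    helper, not the paper's block.
    """
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, h * w).T    # (HW, C): one row per query position
    k = context_feat.reshape(c, h * w)    # (C, HW): one column per context position
    v = context_feat.reshape(c, h * w).T  # (HW, C): values taken from the context

    # Correlation between every query position and every context position,
    # normalized so that each row is a distribution over context positions.
    attn = softmax(q @ k / np.sqrt(c), axis=-1)  # (HW, HW)

    # Aggregate context values and add them back to the query features
    # (residual update, assumed here).
    out = (attn @ v).T.reshape(c, h, w)
    return query_feat + out, attn


# Toy usage: random stand-ins for appearance and shape feature maps.
rng = np.random.default_rng(0)
appearance = rng.standard_normal((8, 4, 4))
shape_feat = rng.standard_normal((8, 4, 4))
updated_appearance, attn = cross_attention(appearance, shape_feat)
updated_shape, _ = cross_attention(shape_feat, appearance)
```

Calling the function in both directions, as in the last two lines, loosely mirrors the abstract's notion of appearance and shape embeddings updating each other; how the actual blocks are wired, scaled, or enhanced is not specified in the abstract.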
