Computer science
Domain adaptation
Domain (mathematical analysis)
Artificial intelligence
Adaptation (eye)
Domain analysis
Pattern recognition (psychology)
Mathematics
Mathematical analysis
Software construction
Physics
Software
Software system
Classifier (UML)
Optics
Programming language
Authors
Xiaoqian Liu, Peng-Fei Zhang, Xin Luo, Zi Huang, Xin-Shun Xu
Identifier
DOI: 10.1109/tmm.2024.3400669
Abstract
Text recognition remains challenging, primarily due to the scarcity of annotated real data and the heavy labor required to annotate large-scale real data. Most existing solutions rely on synthetic training data, where synthetic-to-real domain gaps limit model performance on real data. To address this, unsupervised domain adaptation (UDA) methods have been proposed, aiming to obtain domain-invariant representations. However, they commonly focus on domain-level alignment, neglecting fine-grained character features and thus producing indistinguishable characters. In this paper, we propose a simple yet effective self-supervised UDA framework tailored for cross-domain text recognition, named TextAdapter, which integrates contrastive learning and consistency regularization to mitigate domain gaps. Specifically, a fine-grained feature alignment module based on character contrastive learning is designed to learn domain-invariant character representations through category-level alignment. Additionally, to address the task-agnostic problem in contrastive learning, i.e., its neglect of sequence semantics, an instance consistency matching module is proposed to perceive contextual semantics by matching prediction consistency among different augmented views of target data. Experimental results on cross-domain benchmarks demonstrate the effectiveness of our method. Furthermore, TextAdapter can be embedded into most off-the-shelf text recognition models and achieves new state-of-the-art performance, which illustrates the generality of our framework.
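The abstract describes two training signals: a character-level (category-level) contrastive loss to learn domain-invariant character representations, and a consistency objective that matches predictions across augmented views of unlabeled target data. The sketch below is a minimal PyTorch illustration of these two generic ideas, not the authors' released implementation; the function names, tensor shapes, and the confidence threshold are assumptions made for the example.

```python
# Minimal sketch of (1) a character-class contrastive loss and
# (2) an augmented-view consistency loss. Shapes and names are assumptions,
# not taken from the TextAdapter paper.
import torch
import torch.nn.functional as F


def character_contrastive_loss(char_feats, char_labels, temperature=0.1):
    """Category-level contrastive alignment over character features.

    char_feats:  (N, D) character/frame features pooled from source and target.
    char_labels: (N,)   character class per feature (labels on source,
                        pseudo-labels on target) -- an assumption here.
    """
    feats = F.normalize(char_feats, dim=1)
    sim = feats @ feats.t() / temperature                     # (N, N)
    self_mask = torch.eye(len(feats), dtype=torch.bool, device=feats.device)
    sim.masked_fill_(self_mask, float("-inf"))                # drop self pairs
    # Positives: pairs that share the same character class.
    pos_mask = (char_labels.unsqueeze(0) == char_labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)
    pos_count = pos_mask.sum(dim=1)
    valid = pos_count > 0                                     # anchors with positives
    loss = -pos_log_prob.sum(dim=1)[valid] / pos_count[valid]
    return loss.mean()


def instance_consistency_loss(logits_weak, logits_strong, conf_thresh=0.9):
    """Prediction consistency between two augmented views of target images.

    logits_weak / logits_strong: (B, T, C) per-time-step character logits
    from a weakly and a strongly augmented view of the same target batch.
    Confident time steps of the weak view supervise the strong view.
    """
    with torch.no_grad():
        probs = logits_weak.softmax(dim=-1)
        conf, pseudo = probs.max(dim=-1)                      # (B, T)
        mask = conf > conf_thresh
    if mask.sum() == 0:
        return logits_strong.new_zeros(())
    return F.cross_entropy(logits_strong[mask], pseudo[mask])
```

In a typical setup these two terms would simply be added, with weighting factors, to the supervised recognition loss computed on labeled synthetic data; the exact weighting and pseudo-labeling strategy used by the paper is not specified in the abstract.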