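Computer science
Logo (programming language)
Artificial intelligence
Metric (unit)
Set (abstract data type)
Task (project management)
Logos Bible Software
Scale (ratio)
Information retrieval
Cognitive neuroscience of visual object recognition
Deep learning
Similarity (geometry)
Object (grammar)
Pattern recognition (psychology)
Natural language processing
Machine learning
Image (mathematics)
Operations management
Physics
Management
Quantum mechanics
Economics
Programming language
Operating system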
Authors
Chenge Li, István Fehérvári, Xiaonan Zhao, Ives Macêdo, Srikar Appalaraju
Identifier
DOI:10.1109/wacv51458.2022.00066
Abstract
Recent advances in deep learning and computer vision have set a new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and, more recently, as an open-set retrieval problem. Current approaches struggle to distinguish visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address this problem, we propose a multi-task learning architecture that combines deep metric learning with scene text recognition. We use brand names as weak labels and train the model to simultaneously extract distinctive visual features and predict the brand name text. To this end, we collected a dataset of 3 million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the text recognition task during training boosts the model’s retrieval performance both on our PL8K dataset and on five other public logo datasets.
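The multi-task idea in the abstract can be illustrated with a minimal sketch of a combined training objective. The abstract does not state which losses the authors use, so the triplet loss and the per-character cross-entropy below are stand-in assumptions, and every name (`triplet_loss`, `char_cross_entropy`, `multitask_loss`, `text_weight`, `margin`) is hypothetical rather than taken from the paper.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared Euclidean distances (assumed metric loss)."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)

def char_cross_entropy(logits_per_char, target_ids):
    """Mean cross-entropy over the characters of the brand name (assumed text loss)."""
    total = 0.0
    for logits, target in zip(logits_per_char, target_ids):
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[target]
    return total / len(target_ids)

def multitask_loss(anchor, positive, negative, logits_per_char, target_ids,
                   text_weight=0.5, margin=0.2):
    """Weighted sum of the metric-learning term and the text-recognition term."""
    metric = triplet_loss(anchor, positive, negative, margin)
    text = char_cross_entropy(logits_per_char, target_ids)
    return metric + text_weight * text

# Toy inputs: 2-D embeddings and a one-character "brand name" over a 2-symbol alphabet.
loss = multitask_loss([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [[0.0, 0.0]], [0])
```

In this sketch the text term only shapes training; at retrieval time one would compare embeddings alone, which matches the abstract's description of text recognition as an auxiliary task added during training.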