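Computer science
Relevance (law)
Hash function
Component (thermodynamics)
Information retrieval
Hierarchy
Similarity (geometry)
Binary number
Artificial intelligence
Natural language processing
Mathematics
Computer security
Market economy
Arithmetic
Thermodynamics
Image (mathematics)
Physics
Economics
Political science
Law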
Authors
Jia-Nan Guo,Xian-Ling Mao,Wei Wei,Heyan Huang
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume/Issue: 1-1
Citations: 1
Identifier
DOI:10.1109/tkde.2022.3161807
Abstract
Document hashing is a powerful paradigm for document retrieval: it maps high-dimensional documents to compact hash codes while preserving the similarity of the original data. Although fairly successful, existing document hashing methods consider neither the relevance relationships among different documents within a category nor the hierarchical relationships among categories. Intuitively, intra-category relevance connects related concepts across different documents and can therefore supplement the information omitted from each document; meanwhile, the category hierarchy can help identify whether a mistake occurs at the leaf-category level or the parent-category level, which makes it possible to reduce parent-level mistakes, which are often the more serious ones. Inspired by these intuitions, we propose a novel Intra-category aware Hierarchical supervised Document Hashing method, called IHDH. Specifically, IHDH is a binary autoencoder architecture equipped with two novel components: an intra-category component and a hierarchy component. The intra-category component exploits the differences among the latent semantic representations of different documents from the same category to supplement the information omitted from each document. The hierarchy component uses the hierarchical structure to transform the probabilities of leaf categories into the probabilities of parent categories through a union operation, and then applies an additional parent-level penalty to reduce the mistakes occurring in parent categories.
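The hierarchy component described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the union operation or the penalty is computed, so everything below is an assumption rather than the authors' implementation: leaf probabilities are taken from a softmax (so leaves are mutually exclusive and the union reduces to a sum), the parent-level penalty is a negative log-likelihood, and the names `parent_of`, `parent_weight`, and `hierarchical_loss` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into leaf-category probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def parent_probabilities(leaf_probs, parent_of):
    """Assumed union operation: since softmax leaves are mutually
    exclusive, a parent's probability is the sum over its leaves."""
    parents = {}
    for leaf, p in enumerate(leaf_probs):
        parent = parent_of[leaf]
        parents[parent] = parents.get(parent, 0.0) + p
    return parents


def hierarchical_loss(logits, true_leaf, parent_of, parent_weight=1.0):
    """Leaf-level loss plus an assumed parent-level penalty
    (both negative log-likelihoods; the weighting is hypothetical)."""
    leaf_probs = softmax(logits)
    parent_probs = parent_probabilities(leaf_probs, parent_of)
    leaf_loss = -math.log(leaf_probs[true_leaf])
    parent_loss = -math.log(parent_probs[parent_of[true_leaf]])
    return leaf_loss + parent_weight * parent_loss


# Hypothetical two-level hierarchy: leaves 0 and 1 share parent 0,
# leaves 2 and 3 share parent 1.
parent_of = [0, 0, 1, 1]

# The model favors leaf 0; leaves 1 and 2 receive identical scores,
# so their leaf-level losses are equal.
logits = [2.0, 0.5, 0.5, 0.1]

parent_probs = parent_probabilities(softmax(logits), parent_of)
sibling_mistake = hierarchical_loss(logits, true_leaf=1, parent_of=parent_of)
cross_parent_mistake = hierarchical_loss(logits, true_leaf=2, parent_of=parent_of)
```

Under these assumptions, a prediction that lands on the wrong leaf but the right parent (`sibling_mistake`) is charged less than one that also lands under the wrong parent (`cross_parent_mistake`), which matches the abstract's stated goal of penalizing parent-level mistakes more heavily.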