Computer science
Information retrieval
Service (business)
Self-service
World Wide Web
Database
Computer security
Economy
Authors
Chenxu Niu, Wensheng Zhang, Suren Byna, Yong Chen
Identifier
DOI:10.1109/bigdata59044.2023.10386205
Abstract
Finding relevant datasets can be a time-consuming and challenging task, especially for self-describing file formats. Current solutions use either exact or partial keyword matching approaches to extract and process metadata queries, but they fail to capture semantic relationships between the metadata content and query keywords. To address this challenge, we introduce PSQS, a novel parallel semantic search method for self-describing files. The method leverages parallel processing and kv2vec semantic similarity measures to retrieve semantically relevant data efficiently. Our evaluation against existing metadata search solutions shows that PSQS offers a new, efficient and effective semantic search functionality for various fields where large self-describing files are used, such as scientific data management, leading to more accurate and efficient data retrieval.
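To make the contrast between keyword matching and semantic matching concrete, here is a minimal sketch under stated assumptions. It is not the PSQS implementation and does not use kv2vec: the embedding table TOY_VECTORS, the file names, and the functions score_file and semantic_search are all hypothetical, and a thread pool stands in for whatever parallel runtime the authors use. Each file is scored by the highest cosine similarity between the query vector and the vectors of its metadata keys, and files are scored concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt

# Hypothetical toy embeddings (assumption); a real system would load trained vectors.
TOY_VECTORS = {
    "temperature": (0.9, 0.1, 0.0),
    "heat": (0.8, 0.2, 0.1),
    "pressure": (0.1, 0.9, 0.0),
    "velocity": (0.0, 0.1, 0.9),
}

# Hypothetical files, each listed with the metadata keys it contains (assumption).
FILES = [
    ("run_a.h5", ["pressure"]),
    ("run_b.h5", ["heat", "velocity"]),
    ("run_c.h5", ["velocity"]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_file(query, file_entry):
    """Return (file name, best similarity between the query and any metadata key)."""
    name, keys = file_entry
    q = TOY_VECTORS[query]
    best = max(
        (cosine(q, TOY_VECTORS[k]) for k in keys if k in TOY_VECTORS),
        default=0.0,
    )
    return name, best


def semantic_search(query, files, workers=4):
    """Score all files concurrently, then rank them by descending similarity."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(lambda f: score_file(query, f), files))
    return sorted(scored, key=lambda item: item[1], reverse=True)


results = semantic_search("temperature", FILES)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In this toy data no file contains the key "temperature", so exact keyword matching would return nothing, while the similarity ranking places run_b.h5 first because of its "heat" key. Threads are used only to keep the sketch short; CPU-bound scoring over many large files would more plausibly use processes or a distributed runtime.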