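Generalization
Computer science
Search engine indexing
Information retrieval
Embedding
Task (project management)
Data mining
Theoretical computer science
Mathematics
Artificial intelligence
Mathematical analysis
Economics
Management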
Authors
Pankaj Dadure, Partha Pakray, Sivaji Bandyopadhyay
Source
Journal: Advances in intelligent systems and computing
Date: 2022-01-01
Pages: 1085-1100
Identifier
DOI:10.1007/978-981-16-6890-6_81
Abstract
The web is a rich repository of mathematical information, but finding relevant documents in such a collection is laborious. Although multiple approaches have been proposed to retrieve relevant documents for a queried formula, their poor scores on standard evaluation measures expose the limitations of existing systems. To improve the performance of these systems, this paper proposes a novel approach to formula indexing that employs formula embedding and generalization techniques. The formula embedding and generalization modules of the proposed system transform formulas into fixed-size vectors by counting the occurrences of different entities in each formula. Subsequently, the formula vectors are indexed by an indexer. Documents retrieved by both modules are given higher priority than those retrieved by only one. The obtained results have been compared with existing state-of-the-art approaches, and the comparison shows that the proposed approach gives better retrieval accuracy, with P_5 = 0.522, P_10 = 0.478, P_15 = 0.352, and P_20 = 0.289.
Keywords
Mathematical information retrieval; Formula embedding; Formula generalization; Bit position information table; Canonicalization tokenization; Structural unification
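The pipeline described in the abstract (count-based fixed-size formula vectors, two retrieval modules, priority for documents returned by both, and P_k evaluation) can be illustrated with a small Python sketch. This is not the authors' implementation. The toy tokenizer, the two vocabularies, the use of cosine similarity, and every function name below are assumptions made for illustration. The paper's bit position information table, canonicalization, and structural unification steps are not modeled.

```python
import math
import re
import string
from collections import Counter

# Hypothetical vocabularies; the paper's actual entity set is not given in the abstract.
OPERATORS = list("+-*/^=()")
EMBED_VOCAB = OPERATORS + list(string.ascii_lowercase) + ["NUM"]  # keeps variable names
GENERAL_VOCAB = OPERATORS + ["VAR", "NUM"]                        # variables generalized

# Toy tokenizer: numbers, single letters, and a few operators.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[a-z]|[-+*/^=()]")


def count_vector(formula, vocab, generalize):
    """Map a formula string to a fixed-size vector of entity counts."""
    counts = Counter()
    for tok in TOKEN_RE.findall(formula.lower()):
        if tok[0].isdigit():
            tok = "NUM"
        elif tok.isalpha() and generalize:
            tok = "VAR"  # generalization: every variable becomes one placeholder
        counts[tok] += 1
    return [counts[entry] for entry in vocab]


def cosine(u, v):
    """Cosine similarity (an assumed scoring function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def search(query, corpus, vocab, generalize, top_k):
    """Rank document ids in corpus (id -> formula) against the query formula."""
    q = count_vector(query, vocab, generalize)
    scored = [(cosine(q, count_vector(f, vocab, generalize)), doc_id)
              for doc_id, f in corpus.items()]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [d for _, d in scored[:top_k]]


def merge(embed_hits, general_hits):
    """Put documents returned by both modules ahead of those returned by one."""
    both = [d for d in embed_hits if d in general_hits]
    rest = [d for d in embed_hits + general_hits if d not in both]
    return list(dict.fromkeys(both + rest))  # drop duplicates, keep order


def precision_at_k(ranked, relevant, k):
    """P_k: fraction of the top k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k


corpus = {
    "d1": "x^2 + y^2 = z^2",
    "d2": "a^2 + b^2 = c^2",
    "d3": "e = m*c^2",
    "d4": "1 + 1 = 2",
}
query = "x^2 + y^2 = z^2"
embed_hits = search(query, corpus, EMBED_VOCAB, generalize=False, top_k=2)
general_hits = search(query, corpus, GENERAL_VOCAB, generalize=True, top_k=3)
ranked = merge(embed_hits, general_hits)
```

In this toy corpus the generalized vectors for d1 and d2 are identical, since they differ only in variable names. The embedding vectors still tell them apart, which is one way to see why combining the two result lists could be useful.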