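Treebank
Computer science
Parsing
Artificial intelligence
Independence (probability theory)
Grammar
Natural language processing
Simplicity (philosophy)
State (computer science)
Algorithm
Mathematics
Linguistics
Statistics
Epistemology
Philosophy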
Authors
Dan Klein, Christopher D. Manning
Identifier
DOI:10.3115/1075096.1075150
Abstract
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
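To make the idea of a "state split" concrete, the sketch below shows parent annotation, one familiar kind of split, in which each phrasal category is relabeled with the category of its parent. It is only an illustration under stated assumptions, not the authors' implementation: the tuple-based tree encoding, the function name `parent_annotate`, the `^` separator, and the choice to leave part-of-speech tags unsplit are all hypothetical choices made here.

```python
# A tree is a (label, children) tuple; a leaf word is a plain string.
# This encoding is an assumption made for the sketch.

def parent_annotate(tree, parent=None):
    """Return a copy of `tree` with each phrasal label split by its parent label."""
    if isinstance(tree, str):
        return tree  # words are left untouched
    label, children = tree
    # A preterminal is a tag sitting directly above a single word.
    # This sketch leaves preterminals unsplit for simplicity.
    is_preterminal = len(children) == 1 and isinstance(children[0], str)
    if parent is None or is_preterminal:
        new_label = label
    else:
        new_label = f"{label}^{parent}"
    # Children are annotated with the ORIGINAL label of this node.
    return (new_label, [parent_annotate(child, label) for child in children])


# (S (NP (PRP She)) (VP (VBD saw) (NP (DT the) (NN dog))))
tree = ("S", [
    ("NP", [("PRP", ["She"])]),
    ("VP", [
        ("VBD", ["saw"]),
        ("NP", [("DT", ["the"]), ("NN", ["dog"])]),
    ]),
])

annotated = parent_annotate(tree)
subject_np = annotated[1][0][0]        # NP directly under S
object_np = annotated[1][1][1][1][0]   # NP directly under VP
print(subject_np, object_np)
```

In a vanilla treebank grammar both noun phrases share the single symbol NP, so the grammar assumes they expand identically. After the split, the subject NP and the object NP become distinct symbols and can receive different rule probabilities, which is the sense in which a split breaks a false independence assumption.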