Topics
Principle of compositionality
Computer science
Softmax function
Word (group theory)
Natural language processing
Artificial intelligence
Simplicity (philosophy)
Quality (philosophy)
Semantics (computer science)
Speedup
Linguistics
Artificial neural network
Philosophy
Epistemology
Programming language
Operating system
Authors
Tomáš Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeff Dean
Abstract
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling.
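The two training tricks named above can be illustrated with a short sketch. This is not the authors' released word2vec code; the function names are hypothetical, and the sketch assumes the formulas given in the paper: each occurrence of a word w is discarded with probability 1 - sqrt(t / f(w)) with threshold t around 1e-5, noise words for negative sampling are drawn from the unigram distribution raised to the 3/4 power, and the objective for a training pair replaces the softmax with a handful of logistic terms.

```python
import math
import random
from collections import Counter

import numpy as np

def subsample(tokens, t=1e-5):
    """Discard each occurrence of word w with probability
    1 - sqrt(t / f(w)), where f(w) is w's relative corpus frequency.
    Very frequent words ("the", "a", ...) are thinned aggressively;
    rare words are almost always kept."""
    counts = Counter(tokens)
    total = len(tokens)
    kept = []
    for w in tokens:
        f = counts[w] / total
        p_discard = max(0.0, 1.0 - math.sqrt(t / f))
        if random.random() >= p_discard:
            kept.append(w)
    return kept

def noise_sampler(tokens, power=0.75):
    """Return a function draw(k) that samples k noise words from the
    unigram distribution raised to the 3/4 power, as in the paper."""
    counts = Counter(tokens)
    words = list(counts)
    weights = np.array([counts[w] ** power for w in words], dtype=float)
    probs = weights / weights.sum()
    return lambda k: list(np.random.choice(words, size=k, p=probs))

def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_objective(v_in, u_pos, u_negs):
    """Quantity maximized for one (input word, context word) pair:
    log sigma(u_pos . v_in) + sum_k log sigma(-u_k . v_in),
    which stands in for the full softmax normalization."""
    val = np.log(_sigmoid(np.dot(u_pos, v_in)))
    for u in u_negs:
        val += np.log(_sigmoid(-np.dot(u, v_in)))
    return float(val)
```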
An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
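The phrase-finding method the abstract refers to scores each adjacent word pair as (count(a b) - delta) / (count(a) * count(b)), where delta discounts pairs built from very rare words, and merges pairs scoring above a threshold into a single token; the paper runs a few such passes with decreasing thresholds to form longer phrases. Below is a minimal single-pass sketch; the function name and the threshold value are illustrative, since the useful threshold is corpus-dependent.

```python
from collections import Counter

def find_phrases(tokens, delta=5, threshold=1e-4):
    """One merging pass: score each adjacent pair (a, b) as
    (count(a b) - delta) / (count(a) * count(b)) and join pairs
    above the threshold into one token "a_b". Repeating the pass
    on the output builds phrases longer than two words."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    out, i = [], 0
    while i < len(tokens) - 1:
        a, b = tokens[i], tokens[i + 1]
        score = (bigrams[(a, b)] - delta) / (unigrams[a] * unigrams[b])
        if score > threshold:
            out.append(a + "_" + b)  # merge, e.g. "air_canada"
            i += 2
        else:
            out.append(a)
            i += 1
    if i == len(tokens) - 1:
        out.append(tokens[-1])  # trailing unmerged token
    return out
```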