Bayes error rate
Naive Bayes classifier
Bayes classifier
Artificial intelligence
Machine learning
Bayes' theorem
Bayes' rule
Bayesian programming
Computer science
Probabilistic classification
Classifier (UML)
Pattern recognition (psychology)
Entropy (arrow of time)
Mathematics
Bayes factor
Support vector machine
Bayesian probability
Physics
Quantum mechanics
Abstract
The naive Bayes classifier greatly simplifies learning by assuming that features are independent given the class. Although independence is generally a poor assumption, in practice naive Bayes often competes well with more sophisticated classifiers. Our broad goal is to understand the data characteristics that affect the performance of naive Bayes. Our approach uses Monte Carlo simulations that allow a systematic study of classification accuracy for several classes of randomly generated problems. We analyze the impact of the distribution entropy on the classification error, showing that low-entropy feature distributions yield good performance of naive Bayes. We also demonstrate that naive Bayes works well for certain nearly-functional feature dependencies, thus reaching its best performance in two opposite cases: completely independent features (as expected) and functionally dependent features (which is surprising). Another surprising result is that the accuracy of naive Bayes is not directly correlated with the degree of feature dependence measured as the class-conditional mutual information between the features. Instead, a better predictor of naive Bayes accuracy is the amount of information about the class that is lost because of the independence assumption.
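The surprising point about functionally dependent features can be illustrated with a minimal sketch (this is not the paper's simulation code; the data and function names are hypothetical). A discrete naive Bayes model is trained on data where the second feature is an exact copy of the first, so the independence assumption is maximally violated, yet the classifier still ranks the classes correctly:

```python
import math
from collections import Counter

def train_nb(X, y, alpha=1.0):
    """Fit a discrete naive Bayes model: log P(c) and smoothed counts for P(x_i | c)."""
    n = len(y)
    classes = sorted(set(y))
    prior = {c: math.log(sum(1 for t in y if t == c) / n) for c in classes}
    counts = {c: [Counter() for _ in X[0]] for c in classes}
    for xs, c in zip(X, y):
        for i, v in enumerate(xs):
            counts[c][i][v] += 1
    return classes, prior, counts, alpha

def predict_nb(model, xs):
    """Return the class maximizing log P(c) + sum_i log P(x_i | c)."""
    classes, prior, counts, alpha = model
    best, best_lp = None, -math.inf
    for c in classes:
        lp = prior[c]
        n_c = sum(counts[c][0].values())  # number of training examples in class c
        for i, v in enumerate(xs):
            # Laplace-smoothed estimate over binary feature values {0, 1}
            lp += math.log((counts[c][i][v] + alpha) / (n_c + 2 * alpha))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Functionally dependent features: the second feature duplicates the first,
# so the independence assumption fails completely, but classification does not.
X = [(0, 0), (0, 0), (1, 1), (1, 1)]
y = [0, 0, 1, 1]
model = train_nb(X, y)
assert predict_nb(model, (0, 0)) == 0
assert predict_nb(model, (1, 1)) == 1
```

Duplicating a feature doubles its log-likelihood contribution, which distorts the estimated posterior probabilities but leaves the argmax (and hence the predicted label) unchanged here, consistent with the abstract's observation that the information about the class lost to the independence assumption, not the dependence itself, is what governs accuracy.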