Boosting (machine learning)
Computer science
Scalability
Machine learning
Artificial intelligence
Sketch
Cache
Tree (set theory)
Data mining
Parallel computing
Database
Algorithm
Mathematics
Mathematical analysis
Authors
Tianqi Chen, Carlos Guestrin
Source
Journal: Cornell University - arXiv
Date: 2016-08-08
Citations: 6816
Identifier
DOI: 10.1145/2939672.2939785
Abstract
Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.