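Overfitting
Artificial intelligence
Support vector machine
Decision tree
Machine learning
Online analytical processing
Computer science
Association rule learning
Preprocessor
Classifier (UML)
Data science
Data mining
Artificial neural network
Data warehouse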
Authors
Pang-Ning Tan,Michael M. Steinbach,Vipin Kumar
Source
Journal: Routledge eBooks
[Informa]
Date: 2008-06-20
Pages: 151-206
Citations: 2880
Identifiers
DOI:10.4324/9780080878096-12
Abstract
1 Introduction 1.1 What is Data Mining? 1.2 Motivating Challenges 1.3 The Origins of Data Mining 1.4 Data Mining Tasks 1.5 Scope and Organization of the Book 1.6 Bibliographic Notes 1.7 Exercises
2 Data 2.1 Types of Data 2.2 Data Quality 2.3 Data Preprocessing 2.4 Measures of Similarity and Dissimilarity 2.5 Bibliographic Notes 2.6 Exercises
3 Exploring Data 3.1 The Iris Data Set 3.2 Summary Statistics 3.3 Visualization 3.4 OLAP and Multidimensional Data Analysis 3.5 Bibliographic Notes 3.6 Exercises
4 Classification: Basic Concepts, Decision Trees, and Model Evaluation 4.1 Preliminaries 4.2 General Approach to Solving a Classification Problem 4.3 Decision Tree Induction 4.4 Model Overfitting 4.5 Evaluating the Performance of a Classifier 4.6 Methods for Comparing Classifiers 4.7 Bibliographic Notes 4.8 Exercises
5 Classification: Alternative Techniques 5.1 Rule-Based Classifier 5.2 Nearest-Neighbor Classifiers 5.3 Bayesian Classifiers 5.4 Artificial Neural Network (ANN) 5.5 Support Vector Machine (SVM) 5.6 Ensemble Methods 5.7 Class Imbalance Problem 5.8 Multiclass Problem 5.9 Bibliographic Notes 5.10 Exercises
6 Association Analysis: Basic Concepts and Algorithms 6.1 Problem Definition 6.2 Frequent Itemset Generation 6.3 Rule Generation 6.4 Compact Representation of Frequent Itemsets 6.5 Alternative Methods for Generating Frequent Itemsets 6.6 FP-Growth Algorithm 6.7 Evaluation of Association Patterns 6.8 Effect of Skewed Support Distribution 6.9 Bibliographic Notes 6.10 Exercises
7 Association Analysis: Advanced Concepts 7.1 Handling Categorical Attributes 7.2 Handling Continuous Attributes 7.3 Handling a Concept Hierarchy 7.4 Sequential Patterns 7.5 Subgraph Patterns 7.6 Infrequent Patterns 7.7 Bibliographic Notes 7.8 Exercises
8 Cluster Analysis: Basic Concepts and Algorithms 8.1 Overview 8.2 K-means 8.3 Agglomerative Hierarchical Clustering 8.4 DBSCAN 8.5 Cluster Evaluation 8.6 Bibliographic Notes 8.7 Exercises
9 Cluster Analysis: Additional Issues and Algorithms 9.1 Characteristics of Data, Clusters, and Clustering Algorithms 9.2 Prototype-Based Clustering 9.3 Density-Based Clustering 9.4 Graph-Based Clustering 9.5 Scalable Clustering Algorithms 9.6 Which Clustering Algorithm? 9.7 Bibliographic Notes 9.8 Exercises
10 Anomaly Detection 10.1 Preliminaries 10.2 Statistical Approaches 10.3 Proximity-Based Outlier Detection 10.4 Density-Based Outlier Detection 10.5 Clustering-Based Techniques 10.6 Bibliographic Notes 10.7 Exercises
Appendix A Linear Algebra
Appendix B Dimensionality Reduction
Appendix C Probability and Statistics
Appendix D Regression
Appendix E Optimization
Author Index
Subject Index