Data Mining: Concepts and Techniques

3rd Edition - June 9, 2011
Authors: Jiawei Han, Micheline Kamber, Jian Pei
Language: English
eBook ISBN: 978-0-12-381480-7

Data Mining: Concepts and Techniques presents the concepts and techniques used to process gathered data or information for use in various applications. Specifically, it explains data mining and the tools used to discover knowledge from collected data, a process also referred to as knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains methods for getting to know, preprocessing, processing, and warehousing data. It then presents data warehouses, online analytical processing (OLAP), and data cube technology, followed by methods for mining frequent patterns, associations, and correlations in large data sets. The book details methods for data classification and introduces the concepts and methods of data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining.

This book is intended for computer science students, application developers, business professionals, and researchers who seek information on data mining.

Dedication
Foreword
Foreword to Second Edition
Preface
Organization of the Book
To the Instructor
To the Student
To the Professional
Book Web Sites with Resources
Acknowledgments
Third Edition of the Book
Second Edition of the Book
First Edition of the Book
About the Authors
1. Introduction

Publisher Summary
1.1 Why Data Mining?
1.2 What Is Data Mining?
1.3 What Kinds of Data Can Be Mined?
1.4 What Kinds of Patterns Can Be Mined?
1.5 Which Technologies Are Used?
1.6 Which Kinds of Applications Are Targeted?
1.7 Major Issues in Data Mining
1.8 Summary
1.9 Exercises
1.10 Bibliographic Notes

2. Getting to Know Your Data

Publisher Summary
2.1 Data Objects and Attribute Types
2.2 Basic Statistical Descriptions of Data
2.3 Data Visualization
2.4 Measuring Data Similarity and Dissimilarity
2.5 Summary
2.6 Exercises
2.7 Bibliographic Notes

3. Data Preprocessing

Publisher Summary
3.1 Data Preprocessing: An Overview
3.2 Data Cleaning
3.3 Data Integration
3.4 Data Reduction
3.5 Data Transformation and Data Discretization
3.6 Summary
3.7 Exercises
3.8 Bibliographic Notes

4. Data Warehousing and Online Analytical Processing

Publisher Summary
4.1 Data Warehouse: Basic Concepts
4.2 Data Warehouse Modeling: Data Cube and OLAP
4.3 Data Warehouse Design and Usage
4.4 Data Warehouse Implementation
4.5 Data Generalization by Attribute-Oriented Induction
4.6 Summary
4.7 Exercises
4.8 Bibliographic Notes

5. Data Cube Technology

Publisher Summary
5.1 Data Cube Computation: Preliminary Concepts
5.2 Data Cube Computation Methods
5.3 Processing Advanced Kinds of Queries by Exploring Cube Technology
5.4 Multidimensional Data Analysis in Cube Space
5.5 Summary
5.6 Exercises
5.7 Bibliographic Notes

6. Mining Frequent Patterns, Associations, and Correlations: Basic Concepts and Methods

Publisher Summary
6.1 Basic Concepts
6.2 Frequent Itemset Mining Methods
6.3 Which Patterns Are Interesting?—Pattern Evaluation Methods
6.4 Summary
6.5 Exercises
6.6 Bibliographic Notes

7. Advanced Pattern Mining

Publisher Summary
7.1 Pattern Mining: A Road Map
7.2 Pattern Mining in Multilevel, Multidimensional Space
7.3 Constraint-Based Frequent Pattern Mining
7.4 Mining High-Dimensional Data and Colossal Patterns
7.5 Mining Compressed or Approximate Patterns
7.6 Pattern Exploration and Application
7.7 Summary
7.8 Exercises
7.9 Bibliographic Notes

8. Classification: Basic Concepts

Publisher Summary
8.1 Basic Concepts
8.2 Decision Tree Induction
8.3 Bayes Classification Methods
8.4 Rule-Based Classification
8.5 Model Evaluation and Selection
8.6 Techniques to Improve Classification Accuracy
8.7 Summary
8.8 Exercises
8.9 Bibliographic Notes

9. Classification: Advanced Methods

Publisher Summary
9.1 Bayesian Belief Networks
9.2 Classification by Backpropagation
9.3 Support Vector Machines
9.4 Classification Using Frequent Patterns
9.5 Lazy Learners (or Learning from Your Neighbors)
9.6 Other Classification Methods
9.7 Additional Topics Regarding Classification
9.8 Summary
9.9 Exercises
9.10 Bibliographic Notes

10. Cluster Analysis: Basic Concepts and Methods

Publisher Summary
10.1 Cluster Analysis
10.2 Partitioning Methods
10.3 Hierarchical Methods
10.4 Density-Based Methods
10.5 Grid-Based Methods
10.6 Evaluation of Clustering
10.7 Summary
10.8 Exercises
10.9 Bibliographic Notes

11. Advanced Cluster Analysis

Publisher Summary
11.1 Probabilistic Model-Based Clustering
11.2 Clustering High-Dimensional Data
11.3 Clustering Graph and Network Data
11.4 Clustering with Constraints
11.5 Summary
11.6 Exercises
11.7 Bibliographic Notes

12. Outlier Detection

Publisher Summary
12.1 Outliers and Outlier Analysis
12.2 Outlier Detection Methods
12.3 Statistical Approaches
12.4 Proximity-Based Approaches
12.5 Clustering-Based Approaches
12.6 Classification-Based Approaches
12.7 Mining Contextual and Collective Outliers
12.8 Outlier Detection in High-Dimensional Data
12.9 Summary
12.10 Exercises
12.11 Bibliographic Notes

13. Data Mining Trends and Research Frontiers

Publisher Summary
13.1 Mining Complex Data Types
13.2 Other Methodologies of Data Mining
13.3 Data Mining Applications
13.4 Data Mining and Society
13.5 Data Mining Trends
13.6 Summary
13.7 Exercises
13.8 Bibliographic Notes

Bibliography
Index

Jiawei Han

Jiawei Han is a Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Well known for his research in data mining and database systems, he has received many awards for his contributions to the field, including the 2004 ACM SIGKDD Innovations Award. He has served as Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data and on the editorial boards of several journals, including IEEE Transactions on Knowledge and Data Engineering and Data Mining and Knowledge Discovery.

Affiliations and expertise

Professor, Department of Computer Science, University of Illinois, Urbana-Champaign, USA

Micheline Kamber

Micheline Kamber is a researcher with a passion for writing in easy-to-understand terms. She has a master's degree in computer science (specializing in artificial intelligence) from Concordia University, Canada.

Affiliations and expertise

Simon Fraser University, Burnaby, Canada

Jian Pei

Jian Pei is currently a Canada Research Chair (Tier 1) in Big Data Science and a Professor in the School of Computing Science at Simon Fraser University. He is also an associate member of the Department of Statistics and Actuarial Science. He is a leading researcher in the general areas of data science, big data, data mining, and database systems. His expertise is in developing effective and efficient data analysis techniques for novel data-intensive applications. He is recognized as a Fellow of the Association for Computing Machinery (ACM) for his “contributions to the foundation, methodology and applications of data mining” and as a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for his “contributions to data mining and knowledge discovery”. He is the editor-in-chief of the IEEE Transactions on Knowledge and Data Engineering (TKDE), a director of the Special Interest Group on Knowledge Discovery in Data (SIGKDD) of the Association for Computing Machinery (ACM), and a general co-chair or program committee co-chair of many premier conferences.

Affiliations and expertise

Simon Fraser University, Burnaby, Canada