Data Mining: Know It All (eBook)
480 pages
Elsevier Science (publisher)
978-0-08-087788-4 (ISBN)
This book expertly combines the finest data mining material from the Morgan Kaufmann portfolio. Individual chapters are derived from a select group of MK books authored by the best and brightest in the field. These chapters are combined into one comprehensive volume that can be used as a reference work by those interested in new and developing aspects of data mining.
This book represents a quick and efficient way to unite valuable content from leading data mining experts, thereby creating a definitive, one-stop-shopping opportunity for customers to receive the information they would otherwise need to round up from separate sources.
- Chapters contributed by various recognized experts in the field let the reader remain up to date and fully informed from multiple viewpoints.
- Presents multiple methods of analysis and algorithmic problem-solving techniques, enhancing the reader's technical expertise and ability to implement practical solutions.
- Coverage of both theory and practice brings all of the elements of data mining together in a single volume, saving the reader the time and expense of making multiple purchases.
Soumen Chakrabarti is an assistant professor in Computer Science and Engineering at the Indian Institute of Technology, Bombay. Prior to joining IIT, he worked on hypertext databases and data mining at IBM Almaden Research Center. He has developed three systems and holds five patents in this area. Chakrabarti has served as a vice-chair and program committee member for many conferences, including WWW, SIGIR, ICDE, and KDD, and as a guest editor of the IEEE TKDE special issue on mining and searching the Web. His work on focused crawling received the Best Paper award at the 8th International World Wide Web Conference (1999). He holds a Ph.D. from the University of California, Berkeley.
This book brings all of the elements of data mining together in a single volume, saving the reader the time and expense of making multiple purchases. It consolidates both introductory and advanced topics, covering the gamut of data mining and machine learning tactics, from data integration and preprocessing, to fundamental algorithms, to optimization techniques and web mining methodology.
Front cover
Data Mining: Know It All
Copyright page
Table of contents
About This Book
Contributing Authors
Chapter 1 What’s It All About?
1.1 DATA MINING AND MACHINE LEARNING
1.2 SIMPLE EXAMPLES: THE WEATHER PROBLEM AND OTHERS
1.3 FIELDED APPLICATIONS
1.4 MACHINE LEARNING AND STATISTICS
1.5 GENERALIZATION AS SEARCH
1.6 DATA MINING AND ETHICS
1.7 RESOURCES
Chapter 2 Data Acquisition and Integration
2.1 INTRODUCTION
2.2 SOURCES OF DATA
2.3 VARIABLE TYPES
2.4 DATA ROLLUP
2.5 ROLLUP WITH SUMS, AVERAGES, AND COUNTS
2.6 CALCULATION OF THE MODE
2.7 DATA INTEGRATION
Chapter 3 Data Preprocessing
3.1 WHY PREPROCESS THE DATA?
3.2 DESCRIPTIVE DATA SUMMARIZATION
3.3 DATA CLEANING
3.4 DATA INTEGRATION AND TRANSFORMATION
3.5 DATA REDUCTION
3.6 DATA DISCRETIZATION AND CONCEPT HIERARCHY GENERATION
3.7 SUMMARY
3.8 RESOURCES
Chapter 4 Physical Design for Decision Support, Warehousing, and OLAP
4.1 WHAT IS ONLINE ANALYTICAL PROCESSING?
4.2 DIMENSION HIERARCHIES
4.3 STAR AND SNOWFLAKE SCHEMAS
4.4 WAREHOUSES AND MARTS
4.5 SCALING UP THE SYSTEM
4.6 DSS, WAREHOUSING, AND OLAP DESIGN CONSIDERATIONS
4.7 USAGE SYNTAX AND EXAMPLES FOR MAJOR DATABASE SERVERS
4.8 SUMMARY
4.9 LITERATURE SUMMARY
RESOURCES
Chapter 5 Algorithms: The Basic Methods
5.1 INFERRING RUDIMENTARY RULES
5.2 STATISTICAL MODELING
5.3 DIVIDE AND CONQUER: CONSTRUCTING DECISION TREES
5.4 COVERING ALGORITHMS: CONSTRUCTING RULES
5.5 MINING ASSOCIATION RULES
5.6 LINEAR MODELS
5.7 INSTANCE-BASED LEARNING
5.8 CLUSTERING
5.9 RESOURCES
Chapter 6 Further Techniques in Decision Analysis
6.1 MODELING RISK PREFERENCES
6.2 ANALYZING RISK DIRECTLY
6.3 DOMINANCE
6.4 SENSITIVITY ANALYSIS
6.5 VALUE OF INFORMATION
6.6 NORMATIVE DECISION ANALYSIS
Chapter 7 Fundamental Concepts of Genetic Algorithms
7.1 THE VOCABULARY OF GENETIC ALGORITHMS
7.2 OVERVIEW
7.3 THE ARCHITECTURE OF A GENETIC ALGORITHM
7.4 PRACTICAL ISSUES IN USING A GENETIC ALGORITHM
7.5 REVIEW
7.6 RESOURCES
Chapter 8 Data Structures and Algorithms for Moving Objects Types
8.1 DATA STRUCTURES
8.2 ALGORITHMS FOR OPERATIONS ON TEMPORAL DATA TYPES
8.3 ALGORITHMS FOR LIFTED OPERATIONS
8.4 RESOURCES
Chapter 9 Improving the Model
9.1 LEARNING FROM ERRORS
9.2 IMPROVING MODEL QUALITY, SOLVING PROBLEMS
9.3 SUMMARY
Chapter 10 Social Network Analysis
10.1 SOCIAL SCIENCES AND BIBLIOMETRY
10.2 PAGERANK AND HYPERLINK-INDUCED TOPIC SEARCH
10.3 SHORTCOMINGS OF THE COARSE-GRAINED GRAPH MODEL
10.4 ENHANCED MODELS AND TECHNIQUES
10.5 EVALUATION OF TOPIC DISTILLATION
10.6 MEASURING AND MODELING THE WEB
10.7 RESOURCES
Index
Publication date (per publisher) | 31 October 2008 |
---|---|
Language | English |
Subject area | Nonfiction / Guides |
Computer Science ► Databases ► Data Warehouse / Data Mining | |
Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics | |
Social Sciences ► Communication / Media ► Book Trade / Library Science | |
ISBN-10 | 0-08-087788-5 / 0080877885 |
ISBN-13 | 978-0-08-087788-4 / 9780080877884 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.
File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and nonfiction. The body text adapts dynamically to the display and font size, so EPUB is also a good fit for mobile reading devices.
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.