Principles and Theory for Data Mining and Machine Learning (eBook)
XII, 786 pages
Springer New York (publisher)
978-0-387-98135-2 (ISBN)
Extensive treatment of the most up-to-date topics
Provides the theory and concepts behind popular and emerging methods
Range of topics drawn from Statistics, Computer Science, and Electrical Engineering
The idea for this book came from the time the authors spent at the Statistics and Applied Mathematical Sciences Institute (SAMSI) in Research Triangle Park in North Carolina starting in fall 2003. The first author was there for a total of two years, the first year as a Duke/SAMSI Research Fellow. The second author was there for a year as a Post-Doctoral Scholar. The third author has the great fortune to be in RTP permanently. SAMSI was - and remains - an incredibly rich intellectual environment with a general atmosphere of free-wheeling inquiry that cuts across established fields. SAMSI encourages creativity: It is the kind of place where researchers can be found at work in the small hours of the morning - computing, interpreting computations, and developing methodology. Visiting SAMSI is a unique and wonderful experience. The people most responsible for making SAMSI the great success it is include Jim Berger, Alan Karr, and Steve Marron. We would also like to express our gratitude to Dalene Stangl and all the others from Duke, UNC-Chapel Hill, and NC State, as well as to the visitors (short and long term) who were involved in the SAMSI programs. It was a magical time we remember with ongoing appreciation.
Preface 1
Variability, Information, and Prediction 16
The Curse of Dimensionality 18
The Two Extremes 19
Perspectives on the Curse 20
Sparsity 21
Exploding Numbers of Models 23
Multicollinearity and Concurvity 24
The Effect of Noise 25
Coping with the Curse 26
Selecting Design Points 26
Local Dimension 27
Parsimony 32
Two Techniques 33
The Bootstrap 33
Cross-Validation 42
Optimization and Search 47
Univariate Search 47
Multivariate Search 48
General Searches 49
Constraint Satisfaction and Combinatorial Search 50
Notes 53
Hammersley Points 53
Edgeworth Expansions for the Mean 54
Bootstrap Asymptotics for the Studentized Mean 56
Exercises 58
Local Smoothers 68
Early Smoothers 70
Transition to Classical Smoothers 74
Global Versus Local Approximations 75
LOESS 79
Kernel Smoothers 82
Statistical Function Approximation 83
The Concept of Kernel Methods and the Discrete Case 88
Kernels and Stochastic Designs: Density Estimation 93
Stochastic Designs: Asymptotics for Kernel Smoothers 96
Convergence Theorems and Rates for Kernel Smoothers 101
Kernel and Bandwidth Selection 105
Linear Smoothers 110
Nearest Neighbors 111
Applications of Kernel Regression 115
A Simulated Example 115
Ethanol Data 117
Exercises 122
Spline Smoothing 132
Interpolating Splines 132
Natural Cubic Splines 138
Smoothing Splines for Regression 141
Model Selection for Spline Smoothing 144
Spline Smoothing Meets Kernel Smoothing 145
Asymptotic Bias, Variance, and MISE for Spline Smoothers 146
Ethanol Data Example -- Continued 148
Splines Redux: Hilbert Space Formulation 151
Reproducing Kernels 153
Constructing an RKHS 156
Direct Sum Construction for Splines 161
Explicit Forms 164
Nonparametrics in Data Mining and Machine Learning 167
Simulated Comparisons 169
What Happens with Dependent Noise Models? 172
Higher Dimensions and the Curse of Dimensionality 174
Notes 178
Sobolev Spaces: Definition 178
Exercises 179
New Wave Nonparametrics 186
Additive Models 187
The Backfitting Algorithm 188
Concurvity and Inference 192
Nonparametric Optimality 195
Generalized Additive Models 196
Projection Pursuit Regression 199
Neural Networks 204
Backpropagation and Inference 207
Barron's Result and the Curse 212
Approximation Properties 213
Barron's Theorem: Formal Statement 215
Recursive Partitioning Regression 217
Growing Trees 219
Pruning and Selection 222
Regression 223
Bayesian Additive Regression Trees: BART 225
MARS 225
Sliced Inverse Regression 230
ACE and AVAS 233
Notes 235
Proof of Barron's Theorem 235
Exercises 239
Supervised Learning: Partition Methods 246
Multiclass Learning 248
Discriminant Analysis 250
Distance-Based Discriminant Analysis 251
Bayes Rules 256
Probability-Based Discriminant Analysis 260
Tree-Based Classifiers 264
Splitting Rules 264
Logic Trees 268
Random Forests 269
Support Vector Machines 277
Margins and Distances 277
Binary Classification and Risk 280
Prediction Bounds for Function Classes 283
Constructing SVM Classifiers 286
SVM Classification for Nonlinearly Separable Populations 294
SVMs in the General Nonlinear Case 297
Some Kernels Used in SVM Classification 303
Kernel Choice, SVMs and Model Selection 304
Support Vector Regression 305
Multiclass Support Vector Machines 308
Neural Networks 309
Notes 311
Hoeffding's Inequality 311
VC Dimension 312
Exercises 315
Alternative Nonparametrics 322
Ensemble Methods 323
Bayes Model Averaging 325
Bagging 327
Stacking 331
Boosting 333
Other Averaging Methods 341
Oracle Inequalities 343
Bayes Nonparametrics 349
Dirichlet Process Priors 349
Polya Tree Priors 351
Gaussian Process Priors 353
The Relevance Vector Machine 359
RVM Regression: Formal Description 360
RVM Classification 364
Hidden Markov Models -- Sequential Classification 367
Notes 369
Proof of Yang's Oracle Inequality 369
Proof of Lecue's Oracle Inequality 372
Exercises 374
Computational Comparisons 379
Computational Results: Classification 380
Comparison on Fisher's Iris Data 380
Comparison on Ripley's Data 383
Computational Results: Regression 390
Vapnik's sinc Function 391
Friedman's Function 403
Conclusions 406
Systematic Simulation Study 411
No Free Lunch 414
Exercises 416
Unsupervised Learning: Clustering 419
Centroid-Based Clustering 422
K-Means Clustering 423
Variants 426
Hierarchical Clustering 427
Agglomerative Hierarchical Clustering 428
Divisive Hierarchical Clustering 436
Theory for Hierarchical Clustering 440
Partitional Clustering 444
Model-Based Clustering 446
Graph-Theoretic Clustering 461
Spectral Clustering 466
Bayesian Clustering 472
Probabilistic Clustering 472
Hypothesis Testing 475
Computed Examples 477
Ripley's Data 479
Iris Data 489
Cluster Validation 494
Notes 498
Derivatives of Functions of a Matrix 498
Kruskal's Algorithm: Proof 498
Prim's Algorithm: Proof 499
Exercises 499
Learning in High Dimensions 506
Principal Components 508
Main Theorem 509
Key Properties 511
Extensions 513
Factor Analysis 515
Finding Λ and Ψ 517
Finding K 519
Estimating Factor Scores 520
Projection Pursuit 521
Independent Components Analysis 524
Main Definitions 524
Key Results 526
Computational Approach 528
Nonlinear PCs and ICA 529
Nonlinear PCs 530
Nonlinear ICA 531
Geometric Summarization 531
Measuring Distances to an Algebraic Shape 532
Principal Curves and Surfaces 533
Supervised Dimension Reduction: Partial Least Squares 536
Simple PLS 536
PLS Procedures 537
Properties of PLS 539
Supervised Dimension Reduction: Sufficient Dimensions in Regression 540
Visualization I: Basic Plots 544
Elementary Visualization 547
Projections 554
Time Dependence 556
Visualization II: Transformations 559
Chernoff Faces 559
Multidimensional Scaling 560
Self-Organizing Maps 566
Exercises 573
Variable Selection 582
Concepts from Linear Regression 583
Subset Selection 585
Variable Ranking 588
Overview 590
Traditional Criteria 591
Akaike Information Criterion (AIC) 593
Bayesian Information Criterion (BIC) 596
Choices of Information Criteria 598
Cross Validation 600
Shrinkage Methods 612
Shrinkage Methods for Linear Models 614
Grouping in Variable Selection 628
Least Angle Regression 630
Shrinkage Methods for Model Classes 633
Cautionary Notes 644
Bayes Variable Selection 645
Prior Specification 648
Posterior Calculation and Exploration 656
Evaluating Evidence 660
Connections Between Bayesian and Frequentist Methods 663
Computational Comparisons 666
The n > p Case
When p > n
Notes 680
Code for Generating Data in Section 10.5 680
Exercises 684
Multiple Testing 692
Analyzing the Hypothesis Testing Problem 694
A Paradigmatic Setting 694
Counts for Multiple Tests 697
Measures of Error in Multiple Testing 698
Aspects of Error Control 700
Controlling the Familywise Error Rate 703
One-Step Adjustments 703
Stepwise p-Value Adjustments 706
PCER and PFER 708
Null Domination 709
Two Procedures 710
Controlling the Type I Error Rate 715
Adjusted p-Values for PFER/PCER 719
Controlling the False Discovery Rate 720
FDR and other Measures of Error 722
The Benjamini-Hochberg Procedure 723
A BH Theorem for a Dependent Setting 724
Variations on BH 726
Controlling the Positive False Discovery Rate 732
Bayesian Interpretations 732
Aspects of Implementation 736
Bayesian Multiple Testing 740
Fully Bayes: Hierarchical 741
Fully Bayes: Decision Theory 744
Notes 749
Proof of the Benjamini-Hochberg Theorem 749
Proof of the Benjamini-Yekutieli Theorem 752
References 756
Index 785
Publication date (per publisher) | 21.7.2009
Series | Springer Series in Statistics
Additional information | XII, 786 p.
Place of publication | New York
Language | English
Subject areas | Computer Science ► Databases ► Data Warehousing / Data Mining
 | Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
 | Mathematics / Computer Science ► Mathematics ► Statistics
 | Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics
 | Engineering
Keywords | classification • Clustering • Data Mining • high dimensional and complex data • linear regression • machine learning • Model uncertainty • nonlinear methods • pattern recognition • regularization methods • supervised learning • Unsupervised Learning
ISBN-10 | 0-387-98135-7 / 0387981357
ISBN-13 | 978-0-387-98135-2 / 9780387981352
Size: 10.1 MB
DRM: digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is passed on to third parties without authorization, it can be traced back to its source.
File format: PDF (Portable Document Format)
With its fixed page layout, the PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited use on small displays (smartphones, eReaders).