Chemometrics with R (eBook)

Multivariate Data Analysis in the Natural Sciences and Life Sciences

Ron Wehrens (Author)

eBook Download: PDF
2011
XIV, 286 pages
Springer Berlin (Publisher)
978-3-642-17841-2 (ISBN)

'Chemometrics with R' offers readers an accessible introduction to the world of multivariate statistics in the life sciences, providing a complete description of the general data analysis paradigm, from exploratory analysis to modeling to validation. Several more specific topics from the area of chemometrics are included in a special section. The corresponding R code is provided for all the examples in the book; scripts, functions and data are available in a separate, publicly available R package. For researchers working in the life sciences, the book can also serve as an easy-to-use primer on R.

Ron Wehrens (born 1966) obtained a PhD in Chemometrics at the Radboud University Nijmegen, The Netherlands. He was a lecturer in Analytical Chemistry at the University of Twente, and later an associate professor at the Radboud University Nijmegen. Since January 2010, he has been group leader in Biostatistics and Data Analysis at the Fondazione Edmund Mach in San Michele all'Adige, Italy.

Chemometrics with R
Preface
Contents
1 Introduction
Part I Preliminaries
2 Data
3 Preprocessing
3.1 Dealing with Noise
3.2 Baseline Removal
3.3 Aligning Peaks – Warping
3.3.1 Parametric Time Warping
3.3.2 Dynamic Time Warping
3.3.3 Practicalities
3.4 Peak Picking
3.5 Scaling
3.6 Missing Data
3.7 Conclusion
Part II Exploratory Analysis
4 Principal Component Analysis
4.1 The Machinery
4.2 Doing It Yourself
4.3 Choosing the Number of PCs
4.3.1 Statistical Tests
4.4 Projections
4.5 R Functions for PCA
4.6 Related Methods
4.6.1 Multidimensional Scaling
4.6.2 Independent Component Analysis and Projection Pursuit
4.6.3 Factor Analysis
4.6.4 Discussion
5 Self-Organizing Maps
5.1 Training SOMs
5.2 Visualization
5.3 Application
5.4 R Packages for SOMs
5.5 Discussion
6 Clustering
6.1 Hierarchical Clustering
6.2 Partitional Clustering
6.2.1 K-Means
6.2.2 K-Medoids
6.3 Probabilistic Clustering
6.4 Comparing Clusterings
6.5 Discussion
Part III Modelling
7 Classification
7.1 Discriminant Analysis
7.1.1 Linear Discriminant Analysis
7.1.2 Crossvalidation
7.1.3 Fisher LDA
7.1.4 Quadratic Discriminant Analysis
7.1.5 Model-Based Discriminant Analysis
7.1.6 Regularized Forms of Discriminant Analysis
Diagonal Discriminant Analysis
Shrunken Centroid Discriminant Analysis
7.2 Nearest-Neighbour Approaches
7.3 Tree-Based Approaches
7.3.1 Recursive Partitioning and Regression Trees
Constructing the Tree
7.3.2 Discussion
7.4 More Complicated Techniques
7.4.1 Support Vector Machines
Extensions to More than Two Classes
Finding the Right Parameters
7.4.2 Artificial Neural Networks
8 Multivariate Regression
8.1 Multiple Regression
8.1.1 Limits of Multiple Regression
8.2 PCR
8.2.1 The Algorithm
8.2.2 Selecting the Optimal Number of Components
8.3 Partial Least Squares (PLS) Regression
8.3.1 The Algorithm(s)
8.3.2 Interpretation
PLS Packages for R
8.4 Ridge Regression
8.5 Continuum Methods
8.6 Some Non-Linear Regression Techniques
8.6.1 SVMs for Regression
8.6.2 ANNs for Regression
8.7 Classification as a Regression Problem
8.7.1 Regression for LDA
8.7.2 Discussion
Part IV Model Inspection
9 Validation
9.1 Representativity and Independence
9.2 Error Measures
9.3 Model Selection
9.4 Crossvalidation Revisited
9.4.1 LOO Crossvalidation
9.4.2 Leave-Multiple-Out Crossvalidation
9.4.3 Double Crossvalidation
9.5 The Jackknife
9.6 The Bootstrap
9.6.1 Error Estimation with the Bootstrap
9.6.2 Confidence Intervals for Regression Coefficients
9.6.3 Other R Packages for Bootstrapping
9.7 Integrated Modelling and Validation
9.7.1 Bagging
9.7.2 Random Forests
9.7.3 Boosting
10 Variable Selection
10.1 Tests for Coefficient Significance
10.1.1 Confidence Intervals for Individual Coefficients
10.1.2 Tests Based on Overall Error Contributions
10.2 Explicit Coefficient Penalization
10.3 Global Optimization Methods
10.3.1 Simulated Annealing
10.3.2 Genetic Algorithms
10.3.3 Discussion
Part V Applications
11 Chemometric Applications
11.1 Outlier Detection with Robust PCA
11.1.1 Robust PCA
11.1.2 Discussion
11.2 Orthogonal Signal Correction and OPLS
11.3 Discrimination with Fat Data Matrices
11.3.1 PCDA
11.3.2 PLSDA
A Word of Warning
11.4 Calibration Transfer
11.5 Multivariate Curve Resolution
11.5.1 Theory
11.5.2 Finding Suitable Initial Estimates
Evolving Factor Analysis
OPA – the Orthogonal Projection Approach
11.5.3 Applying MCR
11.5.4 Constraints
11.5.5 Combining Data Sets
Part VI Appendices
A R Packages Used in this Book
References
Index

Publication date (per publisher): 20 January 2011
Series: Use R!
Additional info: XIV, 286 p., 99 illus.
Place of publication: Berlin
Language: English
Subject areas: Mathematics / Computer Science > Mathematics > Statistics
Medicine / Pharmacy > General / Encyclopedias
Natural Sciences > Biology
Natural Sciences > Chemistry
Technology
Keywords: Bioinformatics • Chemometrics • Multivariate Statistics • R software
ISBN-10: 3-642-17841-3 / 3642178413
ISBN-13: 978-3-642-17841-2 / 9783642178412
PDF (watermark)
Size: 6.0 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables, and figures. A PDF can be displayed on almost all devices but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.
