Hands-On Machine Learning with R
CRC Press (publisher)
978-1-138-49568-5 (ISBN)
Hands-On Machine Learning with R provides a practical and applied approach to learning and developing intuition for today’s most popular machine learning methods. This book serves as a practitioner’s guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory.
Throughout this book, the reader will be exposed to the entire machine learning process including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will be exposed to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more! By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R’s machine learning stack and be able to implement a systematic approach for producing high-quality modeling results.
Features:
· Offers a practical and applied introduction to the most popular machine learning methods.
· Topics covered include feature engineering, resampling, deep learning and more.
· Uses a hands-on approach and real-world data.
Brad Boehmke is a data scientist at 84.51° where he wears both software developer and machine learning engineer hats. He is an Adjunct Professor at the University of Cincinnati, author of Data Wrangling with R, and creator of multiple public and private enterprise R packages. Brandon Greenwell is a data scientist at 84.51° where he works on a diverse team to enable, empower, and encourage others to successfully apply machine learning to solve real business problems. He’s part of the Adjunct Graduate Faculty at Wright State University, an Adjunct Instructor at the University of Cincinnati, and the author of several R packages available on CRAN.
I FUNDAMENTALS
1. Introduction to Machine Learning 1.1 Supervised learning 1.1.1 Regression problems 1.1.2 Classification problems 1.2 Unsupervised learning 1.3 Roadmap 1.4 The data sets
2. Modeling Process 2.1 Prerequisites 2.2 Data splitting 2.2.1 Simple random sampling 2.2.2 Stratified sampling 2.2.3 Class imbalances 2.3 Creating models in R 2.3.1 Many formula interfaces 2.3.2 Many engines 2.4 Resampling methods 2.4.1 k-fold cross validation 2.4.2 Bootstrapping 2.4.3 Alternatives 2.5 Bias variance trade-off 2.5.1 Bias 2.5.2 Variance 2.5.3 Hyperparameter tuning 2.6 Model evaluation 2.6.1 Regression models 2.6.2 Classification models 2.7 Putting the processes together
3. Feature & Target Engineering 3.1 Prerequisites 3.2 Target engineering 3.3 Dealing with missingness 3.3.1 Visualizing missing values 3.3.2 Imputation 3.4 Feature filtering 3.5 Numeric feature engineering 3.5.1 Skewness 3.5.2 Standardization 3.6 Categorical feature engineering 3.6.1 Lumping 3.6.2 One-hot & dummy encoding 3.6.3 Label encoding 3.6.4 Alternatives 3.7 Dimension reduction 3.8 Proper implementation 3.8.1 Sequential steps 3.8.2 Data leakage 3.8.3 Putting the process together
II SUPERVISED LEARNING
4. Linear Regression 4.1 Prerequisites 4.2 Simple linear regression 4.2.1 Estimation 4.2.2 Inference 4.3 Multiple linear regression 4.4 Assessing model accuracy 4.5 Model concerns 4.6 Principal component regression 4.7 Partial least squares 4.8 Feature interpretation 4.9 Final thoughts
5. Logistic Regression 5.1 Prerequisites 5.2 Why logistic regression 5.3 Simple logistic regression 5.4 Multiple logistic regression 5.5 Assessing model accuracy 5.6 Model concerns 5.7 Feature interpretation 5.8 Final thoughts
6. Regularized Regression 6.1 Prerequisites 6.2 Why regularize? 6.2.1 Ridge penalty 6.2.2 Lasso penalty 6.2.3 Elastic nets 6.3 Implementation 6.4 Tuning 6.5 Feature interpretation 6.6 Attrition data 6.7 Final thoughts
7. Multivariate Adaptive Regression Splines 7.1 Prerequisites 7.2 The basic idea 7.2.1 Multivariate regression splines 7.3 Fitting a basic MARS model 7.4 Tuning 7.5 Feature interpretation 7.6 Attrition data 7.7 Final thoughts
8. K-Nearest Neighbors 8.1 Prerequisites 8.2 Measuring similarity 8.2.1 Distance measures 8.2.2 Pre-processing 8.3 Choosing k 8.4 MNIST example 8.5 Final thoughts
9. Decision Trees 9.1 Prerequisites 9.2 Structure 9.3 Partitioning 9.4 How deep? 9.4.1 Early stopping 9.4.2 Pruning 9.5 Ames housing example 9.6 Feature interpretation 9.7 Final thoughts
10. Bagging 10.1 Prerequisites 10.2 Why and when bagging works 10.3 Implementation 10.4 Easily parallelize 10.5 Feature interpretation 10.6 Final thoughts
11. Random Forests 11.1 Prerequisites 11.2 Extending bagging 11.3 Out-of-the-box performance 11.4 Hyperparameters 11.4.1 Number of trees 11.4.2 mtry 11.4.3 Tree complexity 11.4.4 Sampling scheme 11.4.5 Split rule 11.5 Tuning strategies 11.6 Feature interpretation 11.7 Final thoughts
12. Gradient Boosting 12.1 Prerequisites 12.2 How boosting works 12.2.1 A sequential ensemble approach 12.2.2 Gradient descent 12.3 Basic GBM 12.3.1 Hyperparameters 12.3.2 Implementation 12.3.3 General tuning strategy 12.4 Stochastic GBMs 12.4.1 Stochastic hyperparameters 12.4.2 Implementation 12.5 XGBoost 12.5.1 XGBoost hyperparameters 12.5.2 Tuning strategy 12.6 Feature interpretation 12.7 Final thoughts
13. Deep Learning 13.1 Prerequisites 13.2 Why deep learning 13.3 Feedforward DNNs 13.4 Network architecture 13.4.1 Layers and nodes 13.4.2 Activation 13.5 Backpropagation 13.6 Model training 13.7 Model tuning 13.7.1 Model capacity 13.7.2 Batch normalization 13.7.3 Regularization 13.7.4 Adjust learning rate 13.8 Grid Search 13.9 Final thoughts
14. Support Vector Machines 14.1 Prerequisites 14.2 Optimal separating hyperplanes 14.2.1 The hard margin classifier 14.2.2 The soft margin classifier 14.3 The support vector machine 14.3.1 More than two classes 14.3.2 Support vector regression 14.4 Job attrition example 14.4.1 Class weights 14.4.2 Class probabilities 14.5 Feature interpretation 14.6 Final thoughts
15. Stacked Models 15.1 Prerequisites 15.2 The Idea 15.2.1 Common ensemble methods 15.2.2 Super learner algorithm 15.2.3 Available packages 15.3 Stacking existing models 15.4 Stacking a grid search 15.5 Automated machine learning 15.6 Final thoughts
16. Interpretable Machine Learning 16.1 Prerequisites 16.2 The idea 16.2.1 Global interpretation 16.2.2 Local interpretation 16.2.3 Model-specific vs. model-agnostic 16.3 Permutation-based feature importance 16.3.1 Concept 16.3.2 Implementation 16.4 Partial dependence 16.4.1 Concept 16.4.2 Implementation 16.4.3 Alternative uses 16.5 Individual conditional expectation 16.5.1 Concept 16.5.2 Implementation 16.6 Feature interactions 16.6.1 Concept 16.6.2 Implementation 16.6.3 Alternatives 16.7 Local interpretable model-agnostic explanations 16.7.1 Concept 16.7.2 Implementation 16.7.3 Tuning 16.7.4 Alternative uses 16.8 Shapley values 16.8.1 Concept 16.8.2 Implementation 16.8.3 XGBoost and built-in Shapley values 16.9 Localized step-wise procedure 16.9.1 Concept 16.9.2 Implementation 16.10 Final thoughts
III DIMENSION REDUCTION
17. Principal Components Analysis 17.1 Prerequisites 17.2 The idea 17.3 Finding principal components 17.4 Performing PCA in R 17.5 Selecting the number of principal components 17.5.1 Eigenvalue criterion 17.5.2 Proportion of variance explained criterion 17.5.3 Scree plot criterion 17.6 Final thoughts
18. Generalized Low Rank Models 18.1 Prerequisites 18.2 The idea 18.3 Finding the lower ranks 18.3.1 Alternating minimization 18.3.2 Loss functions 18.3.3 Regularization 18.3.4 Selecting k 18.4 Fitting GLRMs in R 18.4.1 Basic GLRM model 18.4.2 Tuning to optimize for unseen data 18.5 Final thoughts
19. Autoencoders 19.1 Prerequisites 19.2 Undercomplete autoencoders 19.2.1 Comparing PCA to an autoencoder 19.2.2 Stacked autoencoders 19.2.3 Visualizing the reconstruction 19.3 Sparse autoencoders 19.4 Denoising autoencoders 19.5 Anomaly detection 19.6 Final thoughts
IV CLUSTERING
20. K-means Clustering 20.1 Prerequisites 20.2 Distance measures 20.3 Defining clusters 20.4 k-means algorithm 20.5 Clustering digits 20.6 How many clusters? 20.7 Clustering with mixed data 20.8 Alternative partitioning methods 20.9 Final thoughts
21. Hierarchical Clustering 21.1 Prerequisites 21.2 Hierarchical clustering algorithms 21.3 Hierarchical clustering in R 21.3.1 Agglomerative hierarchical clustering 21.3.2 Divisive hierarchical clustering 21.4 Determining optimal clusters 21.5 Working with dendrograms 21.6 Final thoughts
22. Model-based Clustering 22.1 Prerequisites 22.2 Measuring probability and uncertainty 22.3 Covariance types 22.4 Model selection 22.5 My basket example 22.6 Final thoughts
Bibliography
Index
Publication date | 20 November 2019 |
---|---|
Series | Chapman & Hall/CRC The R Series |
Place of publication | London |
Language | English |
Dimensions | 156 x 234 mm |
Weight | 928 g |
Subject areas | Mathematics / Computer Science ► Computer Science ► Databases; Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics; Mathematics / Computer Science ► Mathematics ► Computer Programs / Computer Algebra; Business & Economics ► Economics ► Econometrics |
ISBN-10 | 1-138-49568-9 / 1138495689 |
ISBN-13 | 978-1-138-49568-5 / 9781138495685 |
Condition | New |