Machine Learning and Artificial Intelligence (eBook)
XXII, 261 pages
Springer International Publishing (publisher)
978-3-030-26622-6 (ISBN)
This book provides comprehensive coverage of combined Artificial Intelligence (AI) and Machine Learning (ML) theory and applications. Rather than looking at the field from only a theoretical or only a practical perspective, it unifies both perspectives to give a holistic understanding. The first part introduces the concepts of AI and ML, their origins, and their current state. The second and third parts delve into the conceptual and theoretical aspects of static and dynamic ML techniques. The fourth part describes practical applications where the presented techniques can be applied. The fifth part introduces the reader to some of the implementation strategies for solving real-life ML problems.
The book is appropriate for students in graduate and upper-undergraduate courses, as well as for researchers and professionals. It makes minimal use of mathematics to keep the topics intuitive and accessible.
- Presents a full reference to artificial intelligence and machine learning techniques - in theory and application;
- Provides a guide to AI and ML with minimal use of mathematics to make the topics more intuitive and accessible;
- Connects all ML and AI techniques to applications and introduces implementations.
Dr. Ameet Joshi received his PhD from Michigan State University in 2006. He has over 15 years of experience in developing machine learning algorithms in a variety of industrial settings, including pipeline inspection, home energy disaggregation, Microsoft Cortana Intelligence, and business intelligence in CRM. He is currently a Data Science Manager at Microsoft. Previously, he worked as a Machine Learning Specialist at Belkin International and as Director of Research at Microline Technology Corp. He is a member of several technical committees, has published numerous conference and journal papers, and has contributed to edited books. He also holds two patents and has received several industry awards, including IEEE Senior Membership (which only 8% of members achieve).
Foreword 6
Preface 9
Acknowledgments 11
Contents 12
Part I Introduction 20
1 Introduction to AI and ML 21
1.1 Introduction 21
1.2 What Is AI 22
1.3 What Is ML 22
1.4 Organization of the Book 23
1.4.1 Introduction 23
1.4.2 Machine Learning 23
1.4.3 Building End to End Pipelines 24
1.4.4 Artificial Intelligence 24
1.4.5 Implementations 24
1.4.6 Conclusion 25
2 Essential Concepts in Artificial Intelligence and Machine Learning 26
2.1 Introduction 26
2.2 Big Data and Not-So-Big Data 26
2.2.1 What Is Big Data 26
2.2.2 Why Should We Treat Big Data Differently? 27
2.3 Types of Learning 27
2.3.1 Supervised Learning 27
2.3.2 Unsupervised Learning 28
2.3.3 Reinforcement Learning 28
2.4 Machine Learning Methods Based on Time 28
2.4.1 Static Learning 28
2.4.2 Dynamic Learning 29
2.5 Dimensionality 29
2.5.1 Curse of Dimensionality 30
2.6 Linearity and Nonlinearity 30
2.7 Occam's Razor 35
2.8 No Free Lunch Theorem 35
2.9 Law of Diminishing Returns 36
2.10 Early Trends in Machine Learning 36
2.10.1 Expert Systems 36
2.11 Conclusion 37
3 Data Understanding, Representation, and Visualization 38
3.1 Introduction 38
3.2 Understanding the Data 38
3.2.1 Understanding Entities 39
3.2.2 Understanding Attributes 39
3.2.3 Understanding Data Types 41
3.3 Representation and Visualization of the Data 41
3.3.1 Principal Component Analysis 41
3.3.2 Linear Discriminant Analysis 44
3.4 Conclusion 46
Part II Machine Learning 47
4 Linear Methods 48
4.1 Introduction 48
4.2 Linear and Generalized Linear Models 49
4.3 Linear Regression 49
4.3.1 Defining the Problem 49
4.3.2 Solving the Problem 50
4.4 Regularized Linear Regression 51
4.4.1 Regularization 51
4.4.2 Ridge Regression 51
4.4.3 Lasso Regression 52
4.5 Generalized Linear Models (GLM) 52
4.5.1 Logistic Regression 52
4.6 k-Nearest Neighbor (KNN) Algorithm 53
4.6.1 Definition of KNN 53
4.6.2 Classification and Regression 55
4.6.3 Other Variations of KNN 55
4.7 Conclusion 56
5 Perceptron and Neural Networks 57
5.1 Introduction 57
5.2 Perceptron 57
5.3 Multilayered Perceptron or Artificial Neural Network 58
5.3.1 Feedforward Operation 58
5.3.2 Nonlinear MLP or Nonlinear ANN 59
5.3.2.1 Activation Functions 59
5.3.3 Training MLP 59
5.3.3.1 Online or Stochastic Learning 61
5.3.3.2 Batch Learning 61
5.3.4 Hidden Layers 62
5.4 Radial Basis Function Networks 62
5.4.1 Interpretation of RBF Networks 63
5.5 Overfitting and Regularization 64
5.5.1 L1 and L2 Regularization 64
5.5.2 Dropout Regularization 65
5.6 Conclusion 65
6 Decision Trees 66
6.1 Introduction 66
6.2 Why Decision Trees? 67
6.2.1 Types of Decision Trees 67
6.3 Algorithms for Building Decision Trees 67
6.4 Regression Tree 68
6.5 Classification Tree 70
6.6 Decision Metrics 70
6.6.1 Misclassification Error 70
6.6.2 Gini Index 70
6.6.3 Cross-Entropy or Deviance 71
6.7 CHAID 71
6.7.1 CHAID Algorithm 72
6.8 Training Decision Tree 72
6.8.1 Steps 72
6.9 Ensemble Decision Trees 73
6.10 Bagging Ensemble Trees 73
6.11 Random Forest Trees 74
6.11.1 Decision Jungles 74
6.12 Boosted Ensemble Trees 75
6.12.1 AdaBoost 75
6.12.2 Gradient Boosting 75
6.13 Conclusion 76
7 Support Vector Machines 77
7.1 Introduction 77
7.2 Motivation and Scope 77
7.2.1 Extension to Multi-Class Classification 78
7.2.2 Extension for Nonlinear Case 78
7.3 Theory of SVM 79
7.4 Separability and Margins 81
7.4.1 Regularization and Soft Margin SVM 81
7.4.2 Use of Slack Variables 81
7.5 Nonlinearity and Use of Kernels 82
7.5.1 Radial Basis Function 82
7.5.2 Polynomial 83
7.5.3 Sigmoid 83
7.6 Risk Minimization 83
7.7 Conclusion 83
8 Probabilistic Models 84
8.1 Introduction 84
8.2 Discriminative Models 85
8.2.1 Maximum Likelihood Estimation 85
8.2.2 Bayesian Approach 85
8.2.3 Comparison of MLE and Bayesian Approach 87
8.2.3.1 Solution Using MLE 87
8.2.3.2 Solution Using Bayes's Approach 88
8.3 Generative Models 89
8.3.1 Mixture Methods 90
8.3.2 Bayesian Networks 90
8.4 Some Useful Probability Distributions 90
8.4.1 Normal or Gaussian Distribution 91
8.4.2 Bernoulli Distribution 92
8.4.3 Binomial Distribution 95
8.4.4 Gamma Distribution 95
8.4.5 Poisson Distribution 96
8.5 Conclusion 100
9 Dynamic Programming and Reinforcement Learning 101
9.1 Introduction 101
9.2 Fundamental Equation of Dynamic Programming 101
9.3 Classes of Problems Under Dynamic Programming 103
9.4 Reinforcement Learning 103
9.4.1 Characteristics of Reinforcement Learning 103
9.4.2 Framework and Algorithm 104
9.5 Exploration and Exploitation 105
9.6 Examples of Reinforcement Learning Applications 105
9.7 Theory of Reinforcement Learning 106
9.7.1 Variations in Learning 107
9.7.1.1 Q-Learning 107
9.7.1.2 SARSA 108
9.8 Conclusion 108
10 Evolutionary Algorithms 109
10.1 Introduction 109
10.2 Bottleneck with Traditional Methods 109
10.3 Darwin's Theory of Evolution 110
10.4 Genetic Programming 112
10.5 Swarm Intelligence 114
10.6 Ant Colony Optimization 115
10.7 Simulated Annealing 116
10.8 Conclusion 116
11 Time Series Models 117
11.1 Introduction 117
11.2 Stationarity 118
11.3 Autoregressive and Moving Average Models 119
11.3.1 Autoregressive, or AR Process 120
11.3.2 Moving Average, or MA Process 120
11.3.3 Autoregressive Moving Average ARMA Process 121
11.4 Autoregressive Integrated Moving Average (ARIMA) Models 121
11.5 Hidden Markov Models (HMM) 122
11.5.1 Applications 124
11.6 Conditional Random Fields (CRF) 124
11.7 Conclusion 125
12 Deep Learning 126
12.1 Introduction 126
12.2 Origin of Modern Deep Learning 127
12.3 Convolutional Neural Networks (CNNs) 128
12.3.1 1D Convolution 128
12.3.2 2D Convolution 129
12.3.3 Architecture of CNN 129
12.3.3.1 Convolution Layer 130
12.3.3.2 Rectified Linear Unit (ReLU) 130
12.3.3.3 Pooling 131
12.3.3.4 Fully Connected Layer 131
12.3.4 Training CNN 132
12.4 Recurrent Neural Networks (RNN) 132
12.4.1 Limitation of RNN 133
12.4.2 Long Short-Term Memory RNN 133
12.4.2.1 Forget Gate 134
12.4.2.2 Input Gate 134
12.4.2.3 Output Gate 135
12.4.3 Advantages of LSTM 135
12.4.4 Current State of LSTM-RNN 135
12.5 Conclusion 135
13 Emerging Trends in Machine Learning 136
13.1 Introduction 136
13.2 Transfer Learning 136
13.3 Generative Adversarial Networks (GANs) 137
13.4 Quantum Computation 137
13.4.1 Quantum Theory 138
13.4.2 Quantum Entanglement 139
13.4.3 Quantum Superposition 140
13.4.4 Computation with Quantum Particles 140
13.5 AutoML 140
13.6 Conclusion 141
14 Unsupervised Learning 142
14.1 Introduction 142
14.2 Clustering 143
14.2.1 k-Means Clustering 143
14.2.2 Improvements to k-Means Clustering 145
14.2.2.1 Hierarchical k-Means Clustering 146
14.2.2.2 Fuzzy k-Means Clustering 146
14.3 Component Analysis 146
14.3.1 Independent Component Analysis (ICA) 146
14.4 Self Organizing Maps (SOM) 147
14.5 Autoencoding Neural Networks 147
14.6 Conclusion 149
Part III Building End to End Pipelines 150
15 Featurization 151
15.1 Introduction 151
15.2 UCI: Adult Salary Predictor 151
15.3 Identifying the Raw Data, Separating Information from Noise 152
15.3.1 Correlation and Causality 153
15.4 Building Feature Set 154
15.4.1 Standard Options of Feature Building 154
15.4.1.1 Numerical Features 155
15.4.1.2 Categorical Features 155
15.4.1.3 String Features 156
15.4.1.4 Datetime Features 158
15.4.2 Custom Options of Feature Building 158
15.5 Handling Missing Values 159
15.6 Visualizing the Features 160
15.6.1 Numeric Features 160
15.6.2 Categorical Features 161
15.6.2.1 Feature: Workclass 164
15.6.2.2 Feature: Education 165
15.6.2.3 Other Features 166
15.7 Conclusion 166
16 Designing and Tuning Model Pipelines 167
16.1 Introduction 167
16.2 Choosing the Technique or Algorithm 167
16.2.1 Choosing Technique for Adult Salary Classification 168
16.3 Splitting the Data 168
16.3.1 Stratified Sampling 170
16.4 Training 171
16.4.1 Tuning the Hyperparameters 171
16.5 Accuracy Measurement 172
16.6 Explainability of Features 172
16.7 Practical Considerations 172
16.7.1 Data Leakage 173
16.7.2 Coincidence and Causality 174
16.7.3 Unknown Categories 175
16.8 Conclusion 175
17 Performance Measurement 176
17.1 Introduction 176
17.2 Metrics Based on Numerical Error 177
17.2.1 Mean Absolute Error 177
17.2.2 Mean Squared Error 177
17.2.3 Root Mean Squared Error 177
17.2.4 Normalized Error 178
17.3 Metrics Based on Categorical Error 178
17.3.1 Accuracy 178
17.3.2 Precision and Recall 179
17.3.2.1 F-Score 179
17.3.2.2 Confusion Matrix 179
17.3.3 Receiver Operating Characteristics (ROC) Curve Analysis 180
17.4 Hypothesis Testing 181
17.4.1 Background 181
17.4.2 Steps in Hypothesis Testing 182
17.4.3 A/B Testing 183
17.5 Conclusion 183
Part IV Artificial Intelligence 184
18 Classification 185
18.1 Introduction 185
18.2 Examples of Real World Problems in Classification 185
18.3 Spam Email Detection 186
18.3.1 Defining Scope 186
18.3.2 Assumptions 187
18.3.2.1 Assumptions About the Spam Emails 187
18.3.2.2 Assumptions About the Genuine Emails 187
18.3.2.3 Assumptions About Precision and Recall Tradeoff 187
18.3.3 Skew in the Data 188
18.3.4 Supervised Learning 188
18.3.5 Feature Engineering 189
18.3.6 Model Training 189
18.3.7 Iterating the Process for Optimization 189
18.4 Conclusion 190
19 Regression 191
19.1 Introduction 191
19.2 Predicting Real Estate Prices 191
19.2.1 Defining Regression Specific Problem 191
19.2.2 Gather Labelled Data 192
19.2.2.1 Splitting the Data 193
19.2.3 Feature Engineering 193
19.2.4 Model Selection 196
19.2.5 Model Performance 196
19.3 Other Applications of Regression 197
19.4 Conclusion 197
20 Ranking 198
20.1 Introduction 198
20.2 Measuring Ranking Performance 199
20.3 Ranking Search Results and Google's PageRank 201
20.4 Techniques Used in Ranking Systems 201
20.4.1 Keyword Identification/Extraction 201
20.5 Conclusion 203
21 Recommendations Systems 204
21.1 Introduction 204
21.2 Collaborative Filtering 205
21.2.1 Solution Approaches 206
21.3 Amazon's Personal Shopping Experience 207
21.3.1 Context Based Recommendation 207
21.3.2 Personalization Based Recommendation 208
21.4 Netflix's Streaming Video Recommendations 208
21.5 Conclusion 209
Part V Implementations 210
22 Azure Machine Learning 211
22.1 Introduction 211
22.2 Azure Machine Learning Studio 211
22.2.1 How to Start? 212
22.3 Building ML Pipeline Using AML Studio 214
22.3.1 Get the Data 214
22.3.2 Data Preprocessing 216
22.3.3 Training the Classifier Model 218
22.4 Scoring and Performance Metrics 219
22.4.1 Comparing Two Models 221
22.5 Conclusion 223
23 Open Source Machine Learning Libraries 225
23.1 Introduction 225
23.2 Options of Machine Learning Libraries 226
23.3 Scikit-Learn Library 227
23.3.1 Development Environment 227
23.3.2 Importing Data 228
23.3.3 Data Preprocessing 229
23.3.4 Splitting the Data Using Stratified Sampling 230
23.3.5 Training a Multiclass Classification Model 230
23.3.6 Computing Metrics 231
23.3.7 Using Alternate Model 231
23.4 Model Tuning and Optimization 232
23.4.1 Generalization 233
23.5 Comparison Between AML Studio and Scikit-Learn 233
23.6 Conclusion 236
24 Amazon's Machine Learning Toolkit: Sagemaker 237
24.1 Introduction 237
24.2 Setting Up Sagemaker 237
24.3 Uploading Data to S3 Storage 238
24.4 Writing the Machine Learning Pipeline Using Python 239
24.5 Conclusion 240
Part VI Conclusion 248
25 Conclusion and Next Steps 249
25.1 Overview 249
25.2 What's Next 250
References 251
Index 254
Publication date (per publisher) | 24.9.2019 |
---|---|
Additional information | XXII, 261 p. 98 illus., 94 illus. in color. |
Language | English |
Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
Computer Science ► Theory / Study ► Artificial Intelligence / Robotics | |
Technology ► Electrical Engineering / Power Engineering | |
Economics | |
Keywords | AI and Machine Learning • AI reference • Artificial Intelligence • ML reference • ML Techniques • Modern perspective on AI and ML |
ISBN-10 | 3-030-26622-2 / 3030266222 |
ISBN-13 | 978-3-030-26622-6 / 9783030266226 |
Size: 6.4 MB
DRM: digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to its source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited use on small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.
Additional feature: online reading
In addition to downloading this eBook, you can also read it online in your web browser.
Buying eBooks from abroad
For tax law reasons, we can only sell eBooks within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.