Statistical Analysis with R For Dummies
For Dummies (publisher)
978-1-119-33706-5 (ISBN)
Understanding the world of R programming and analysis has never been easier. Most guides to R, whether books or online, focus on R functions and procedures. But now, thanks to Statistical Analysis with R For Dummies, you have access to a trusted, easy-to-follow guide that focuses on the foundational statistical concepts that R addresses, as well as step-by-step guidance that shows you exactly how to implement them using R programming.
People are becoming more aware of R every day as major institutions are adopting it as a standard. Part of its appeal is that it's a free tool that's taking the place of costly statistical software packages that sometimes take an inordinate amount of time to learn. Plus, R enables a user to carry out complex statistical analyses by simply entering a few commands, making sophisticated analyses available and understandable to a wide audience. Statistical Analysis with R For Dummies enables you to perform these analyses and to fully understand their implications and results.
Gets you up to speed on the #1 analytics/data science software tool
Demonstrates how to easily find, download, and use cutting-edge community-reviewed methods in statistics and predictive modeling
Shows you how R offers intel from leading researchers in data science, free of charge
Provides information on using RStudio to work with R
Get ready to use R to crunch and analyze your data—the fast and easy way!
Joseph Schmuller, PhD, has taught undergraduate and graduate statistics, and has 25 years of IT experience. The author of four editions of Statistical Analysis with Excel For Dummies and three editions of Teach Yourself UML in 24 Hours (SAMS), he has created online coursework for Lynda.com and is a former Editor in Chief of PC AI magazine. He is a Research Scholar at the University of North Florida.
Introduction
About This Book
Similarity with This Other For Dummies Book
What You Can Safely Skip
Foolish Assumptions
How This Book Is Organized
Part 1: Getting Started with Statistical Analysis with R
Part 2: Describing Data
Part 3: Drawing Conclusions from Data
Part 4: Working with Probability
Part 5: The Part of Tens
Online Appendix A: More on Probability
Online Appendix B: Non-Parametric Statistics
Online Appendix C: Ten Topics That Just Didn’t Fit in Any Other Chapter
Icons Used in This Book
Where to Go from Here
Part 1: Getting Started with Statistical Analysis with R
Chapter 1: Data, Statistics, and Decisions
The Statistical (and Related) Notions You Just Have to Know
Samples and populations
Variables: Dependent and independent
Types of data
A little probability
Inferential Statistics: Testing Hypotheses
Null and alternative hypotheses
Two types of error
Chapter 2: R: What It Does and How It Does It
Downloading R and RStudio
A Session with R
The working directory
So let’s get started, already
Missing data
R Functions
User-Defined Functions
Comments
R Structures
Vectors
Numerical vectors
Matrices
Factors
Lists
Lists and statistics
Data frames
Packages
More Packages
R Formulas
Reading and Writing
Spreadsheets
CSV files
Text files
Part 2: Describing Data
Chapter 3: Getting Graphic
Finding Patterns
Graphing a distribution
Bar-hopping
Slicing the pie
The plot of scatter
Of boxes and whiskers
Base R Graphics
Histograms
Adding graph features
Bar plots
Pie graphs
Dot charts
Bar plots revisited
Scatter plots
Box plots
Graduating to ggplot2
Histograms
Bar plots
Dot charts
Bar plots re-revisited
Scatter plots
Box plots
Wrapping Up
Chapter 4: Finding Your Center
Means: The Lure of Averages
The Average in R: mean()
What’s your condition?
Eliminate $-signs forth with()
Exploring the data
Outliers: The flaw of averages
Other means to an end
Medians: Caught in the Middle
The Median in R: median()
Statistics à la Mode
The Mode in R
Chapter 5: Deviating from the Average
Measuring Variation
Averaging squared deviations: Variance and how to calculate it
Sample variance
Variance in R
Back to the Roots: Standard Deviation
Population standard deviation
Sample standard deviation
Standard Deviation in R
Conditions, Conditions, Conditions
Chapter 6: Meeting Standards and Standings
Catching Some Z’s
Characteristics of z-scores
Bonds versus the Bambino
Exam scores
Standard Scores in R
Where Do You Stand?
Ranking in R
Tied scores
Nth smallest, Nth largest
Percentiles
Percent ranks
Summarizing
Chapter 7: Summarizing It All
How Many?
The High and the Low
Living in the Moments
A teachable moment
Back to descriptives
Skewness
Kurtosis
Tuning in the Frequency
Nominal variables: table() et al
Numerical variables: hist()
Numerical variables: stem()
Summarizing a Data Frame
Chapter 8: What’s Normal?
Hitting the Curve
Digging deeper
Parameters of a normal distribution
Working with Normal Distributions
Distributions in R
Normal density function
Cumulative density function
Quantiles of normal distributions
Random sampling
A Distinguished Member of the Family
Part 3: Drawing Conclusions from Data
Chapter 9: The Confidence Game: Estimation
Understanding Sampling Distributions
An EXTREMELY Important Idea: The Central Limit Theorem
(Approximately) Simulating the central limit theorem
Predictions of the central limit theorem
Confidence: It Has Its Limits!
Finding confidence limits for a mean
Fit to a t
Chapter 10: One-Sample Hypothesis Testing
Hypotheses, Tests, and Errors
Hypothesis Tests and Sampling Distributions
Catching Some Z’s Again
Z Testing in R
t for One
t Testing in R
Working with t-Distributions
Visualizing t-Distributions
Plotting t in base R graphics
Plotting t in ggplot2
One more thing about ggplot2
Testing a Variance
Testing in R
Working with Chi-Square Distributions
Visualizing Chi-Square Distributions
Plotting chi-square in base R graphics
Plotting chi-square in ggplot2
Chapter 11: Two-Sample Hypothesis Testing
Hypotheses Built for Two
Sampling Distributions Revisited
Applying the central limit theorem
Z’s once more
Z-testing for two samples in R
t for Two
Like Peas in a Pod: Equal Variances
t-Testing in R
Working with two vectors
Working with a data frame and a formula
Visualizing the results
Like p’s and q’s: Unequal variances
A Matched Set: Hypothesis Testing for Paired Samples
Paired Sample t-testing in R
Testing Two Variances
F-testing in R
F in conjunction with t
Working with F-Distributions
Visualizing F-Distributions
Chapter 12: Testing More than Two Samples
Testing More Than Two
A thorny problem
A solution
Meaningful relationships
ANOVA in R
Visualizing the results
After the ANOVA
Contrasts in R
Unplanned comparisons
Another Kind of Hypothesis, Another Kind of Test
Working with repeated measures ANOVA
Repeated measures ANOVA in R
Visualizing the results
Getting Trendy
Trend Analysis in R
Chapter 13: More Complicated Testing
Cracking the Combinations
Interactions
The analysis
Two-Way ANOVA in R
Visualizing the two-way results
Two Kinds of Variables at Once
Mixed ANOVA in R
Visualizing the Mixed ANOVA results
After the Analysis
Multivariate Analysis of Variance
MANOVA in R
Visualizing the MANOVA results
After the analysis
Chapter 14: Regression: Linear, Multiple, and the General Linear Model
The Plot of Scatter
Graphing Lines
Regression: What a Line!
Using regression for forecasting
Variation around the regression line
Testing hypotheses about regression
Linear Regression in R
Features of the linear model
Making predictions
Visualizing the scatter plot and regression line
Plotting the residuals
Juggling Many Relationships at Once: Multiple Regression
Multiple regression in R
Making predictions
Visualizing the 3D scatter plot and regression plane
ANOVA: Another Look
Analysis of Covariance: The Final Component of the GLM
But wait, there’s more
Chapter 15: Correlation: The Rise and Fall of Relationships
Scatter Plots Again
Understanding Correlation
Correlation and Regression
Testing Hypotheses About Correlation
Is a correlation coefficient greater than zero?
Do two correlation coefficients differ?
Correlation in R
Calculating a correlation coefficient
Testing a correlation coefficient
Testing the difference between two correlation coefficients
Calculating a correlation matrix
Visualizing correlation matrices
Multiple Correlation
Multiple correlation in R
Adjusting R-squared
Partial Correlation
Partial Correlation in R
Semipartial Correlation
Semipartial Correlation in R
Chapter 16: Curvilinear Regression: When Relationships Get Complicated
What Is a Logarithm?
What Is e?
Power Regression
Exponential Regression
Logarithmic Regression
Polynomial Regression: A Higher Power
Which Model Should You Use?
Part 4: Working with Probability
Chapter 17: Introducing Probability
What Is Probability?
Experiments, trials, events, and sample spaces
Sample spaces and probability
Compound Events
Union and intersection
Intersection again
Conditional Probability
Working with the probabilities
The foundation of hypothesis testing
Large Sample Spaces
Permutations
Combinations
R Functions for Counting Rules
Random Variables: Discrete and Continuous
Probability Distributions and Density Functions
The Binomial Distribution
The Binomial and Negative Binomial in R
Binomial distribution
Negative binomial distribution
Hypothesis Testing with the Binomial Distribution
More on Hypothesis Testing: R versus Tradition
Chapter 18: Introducing Modeling
Modeling a Distribution
Plunging into the Poisson distribution
Modeling with the Poisson distribution
Testing the model’s fit
A word about chisq.test()
Playing ball with a model
A Simulating Discussion
Taking a chance: The Monte Carlo method
Loading the dice
Simulating the central limit theorem
Part 5: The Part of Tens
Chapter 19: Ten Tips for Excel Emigrés
Defining a Vector in R Is Like Naming a Range in Excel
Operating on Vectors Is Like Operating on Named Ranges
Sometimes Statistical Functions Work the Same Way
And Sometimes They Don’t
Contrast: Excel and R Work with Different Data Formats
Distribution Functions Are (Somewhat) Similar
A Data Frame Is (Something) Like a Multicolumn Named Range
The sapply() Function Is Like Dragging
Using edit() Is (Almost) Like Editing a Spreadsheet
Use the Clipboard to Import a Table from Excel into R
Chapter 20: Ten Valuable Online R Resources
Websites for R Users
R-bloggers
Microsoft R Application Network
Quick-R
RStudio Online Learning
Stack Overflow
Online Books and Documentation
R manuals
R documentation
RDocumentation
YOU CANanalytics
The R Journal
Index
Publication date | 18 May 2017 |
---|---|
Language | English |
Dimensions | 185 x 234 mm |
Weight | 635 g |
Subject area | Mathematics / Computer Science ► Mathematics ► Statistics |
Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics | |
ISBN-10 | 1-119-33706-2 / 1119337062 |
ISBN-13 | 978-1-119-33706-5 / 9781119337065 |
Condition | New |