R for SAS and SPSS Users (eBook)
XXVIII, 707 pages
Springer New York (publisher)
978-1-4614-0685-3 (ISBN)
R is a powerful and free software system for data analysis and graphics, with over 5,000 add-on packages available. This book introduces R using SAS and SPSS terms with which you are already familiar. It demonstrates which of the add-on packages are most like SAS and SPSS and compares them to R's built-in functions. It steps through over 30 programs written in all three packages, comparing and contrasting the packages' differing approaches. The programs and practice datasets are available for download.
The glossary defines over 50 R terms using SAS/SPSS jargon and again using R jargon. The table of contents and the index allow you to find equivalent R functions by looking up both SAS statements and SPSS commands. When finished, you will be able to import data, manage and transform it, create publication-quality graphics, and perform basic statistical analyses.
This new edition has updated programming, an expanded index, and even more statistical methods covered in over 25 new sections.
Robert A. Muenchen is the manager of the Statistical Consulting Center at the University of Tennessee and has 28 years of experience as a consulting statistician. He has served on the advisory boards of SPSS Inc. and the Statistical Graphics Corporation.
Preface 6
Contents 12
List of Tables 24
List of Figures 26
1 Introduction 30
1.1 Overview 30
1.2 Why Learn R? 31
1.3 Is R Accurate? 32
1.4 What About Tech Support? 33
1.5 Getting Started Quickly 34
1.6 The Five Main Parts of SAS and SPSS 34
1.7 Our Practice Data Sets 36
1.8 Programming Conventions 37
1.9 Typographic Conventions 38
2 Installing and Updating R 39
2.1 Installing Add-on Packages 39
2.2 Loading an Add-on Package 41
2.3 Updating Your Installation 43
2.4 Uninstalling R 45
2.5 Uninstalling a Package 45
2.6 Choosing Repositories 46
2.7 Accessing Data in Packages 46
3 Running R 49
3.1 Running R Interactively on Windows 49
3.2 Running R Interactively on Macintosh 52
3.3 Running R Interactively on Linux or UNIX 54
3.4 Running Programs That Include Other Programs 56
3.5 Running R in Batch Mode 57
3.6 Running R in SAS and WPS 58
3.6.1 SAS/IML Studio 58
3.6.2 A Bridge to R 59
3.6.3 The SAS X Command 59
3.6.4 Running SAS and R Sequentially 60
3.6.5 Example Program Running R from Within SAS 60
3.7 Running R in SPSS 61
3.7.1 Example Program Running R from Within SPSS 65
3.8 Running R in Excel 65
3.9 Running R from Within Text Editors 67
3.10 Integrated Development Environments 68
3.10.1 Eclipse 68
3.10.2 JGR 69
3.10.3 RStudio 70
3.11 Graphical User Interfaces 70
3.11.1 Deducer 71
3.11.2 R Commander 74
3.11.3 rattle 76
3.11.4 Red-R 79
4 Help and Documentation 81
4.1 Starting Help 81
4.2 Examples in Help Files 83
4.3 Help for Functions That Call Other Functions 85
4.4 Help for Packages 85
4.5 Help for Data Sets 86
4.6 Books and Manuals 86
4.7 E-mail Lists 86
4.8 Searching the Web 87
4.9 Vignettes 88
4.10 Demonstrations 88
5 Programming Language Basics 89
5.1 Introduction 89
5.2 Simple Calculations 90
5.3 Data Structures 91
5.3.1 Vectors 91
5.3.2 Factors 96
5.3.3 Data Frames 102
5.3.4 Matrices 106
5.3.5 Arrays 110
5.3.6 Lists 111
5.4 Saving Your Work 116
5.5 Comments to Document Your Programs 118
5.6 Comments to Document Your Objects 119
5.7 Controlling Functions (Procedures) 120
5.7.1 Controlling Functions with Arguments 120
5.7.2 Controlling Functions with Objects 123
5.7.3 Controlling Functions with Formulas 124
5.7.4 Controlling Functions with an Object's Class 124
5.7.5 Controlling Functions with Extractor Functions 127
5.8 How Much Output Is There? 128
5.9 Writing Your Own Functions (Macros) 133
5.10 Controlling Program Flow 135
5.11 R Program Demonstrating Programming Basics 136
6 Data Acquisition 143
6.1 Manual Data Entry Using the R Data Editor 143
6.2 Reading Delimited Text Files 145
6.2.1 Reading Comma-Delimited Text Files 146
6.2.2 Reading Tab-Delimited Text Files 148
6.2.3 Reading Text from a Web Site 149
6.2.4 Reading Text from the Clipboard 150
6.2.5 Missing Values for Character Variables 150
6.2.6 Trouble with Tabs 152
6.2.7 Skipping Variables in Delimited Text Files 153
6.2.8 Reading Character Strings 154
6.2.9 Example Programs for Reading Delimited Text Files 154
6.3 Reading Text Data Within a Program 157
6.3.1 The Easy Approach 158
6.3.2 The More General Approach 159
6.3.3 Example Programs for Reading Text Data Within a Program 160
6.4 Reading Multiple Observations per Line 162
6.4.1 Example Programs for Reading Multiple Observations per Line 164
6.5 Reading Data from the Keyboard 166
6.6 Reading Fixed-Width Text Files, One Record per Case 166
6.6.1 Reading Data Using Macro Substitution 169
6.6.2 Example Programs for Reading Fixed-Width Text Files, One Record per Case 170
6.7 Reading Fixed-Width Text Files, Two or More Records per Case 171
6.7.1 Example Programs to Read Fixed-Width Text Files with Two Records per Case 173
6.8 Reading Excel Files 174
6.8.1 Example Programs for Reading Excel Files 175
6.9 Reading from Relational Databases 176
6.10 Reading Data from SAS 177
6.10.1 Example Programs to Write Data from SAS and Read It into R 178
6.11 Reading Data from SPSS 179
6.11.1 Example Programs for Reading Data from SPSS 180
6.12 Writing Delimited Text Files 181
6.12.1 Example Programs for Writing Delimited Text Files 182
6.13 Viewing a Text File 184
6.14 Writing Excel Files 184
6.14.1 Example Programs for Writing Excel Files 185
6.15 Writing to Relational Databases 186
6.16 Writing Data to SAS and SPSS 186
6.16.1 Example Programs to Write Data to SAS and SPSS 187
7 Selecting Variables 189
7.1 Selecting Variables in SAS and SPSS 189
7.2 Subscripting 190
7.3 Selecting Variables by Index Number 191
7.4 Selecting Variables by Column Name 194
7.5 Selecting Variables Using Logic 195
7.6 Selecting Variables by String Search (varname: or varname1-varnameN) 197
7.7 Selecting Variables Using $ Notation 200
7.8 Selecting Variables by Simple Name 200
7.8.1 The attach Function 201
7.8.2 The with Function 202
7.8.3 Using Short Variable Names in Formulas 202
7.9 Selecting Variables with the subset Function 203
7.10 Selecting Variables by List Subscript 204
7.11 Generating Indices A to Z from Two Variable Names 204
7.11.1 Selecting Numeric or Character Variables 205
7.12 Saving Selected Variables to a New Data Set 208
7.13 Example Programs for Variable Selection 208
7.13.1 SAS Program to Select Variables 209
7.13.2 SPSS Program to Select Variables 209
7.13.3 R Program to Select Variables 210
8 Selecting Observations 215
8.1 Selecting Observations in SAS and SPSS 215
8.2 Selecting All Observations 216
8.3 Selecting Observations by Index Number 217
8.4 Selecting Observations Using Random Sampling 219
8.5 Selecting Observations by Row Name 221
8.6 Selecting Observations Using Logic 222
8.7 Selecting Observations by String Search 226
8.8 Selecting Observations with the subset Function 228
8.9 Generating Indices A to Z from Two Row Names 228
8.10 Variable Selection Methods with No Counterpart for Selecting Observations 229
8.11 Saving Selected Observations to a New Data Frame 229
8.12 Example Programs for Selecting Observations 230
8.12.1 SAS Program to Select Observations 230
8.12.2 SPSS Program to Select Observations 231
8.12.3 R Program to Select Observations 231
9 Selecting Variables and Observations 237
9.1 The subset Function 237
9.2 Subscripting with Logical Selections and Variable Names 239
9.3 Using Names to Select Both Observations and Variables 240
9.4 Using Numeric Index Values to Select Both Observations and Variables 241
9.5 Using Logic to Select Both Observations and Variables 241
9.6 Saving and Loading Subsets 242
9.7 Example Programs for Selecting Variables and Observations 243
9.7.1 SAS Program for Selecting Variables and Observations 243
9.7.2 SPSS Program for Selecting Variables and Observations 243
9.7.3 R Program for Selecting Variables and Observations 244
10 Data Management 246
10.1 Transforming Variables 246
10.1.1 Example Programs for Transforming Variables 250
10.2 Procedures or Functions? The apply Function Decides 252
10.2.1 Applying the mean Function 252
10.2.2 Finding N or NVALID 256
10.2.3 Standardizing and Ranking Variables 258
10.2.4 Applying Your Own Functions 260
10.2.5 Example Programs for Applying Statistical Functions 261
10.3 Conditional Transformations 264
10.3.1 The ifelse Function 264
10.3.2 Cutting Functions 268
10.3.3 Example Programs for Conditional Transformations 269
10.4 Multiple Conditional Transformations 273
10.4.1 Example Programs for Multiple Conditional Transformations 275
10.5 Missing Values 277
10.5.1 Substituting Means for Missing Values 279
10.5.2 Finding Complete Observations 280
10.5.3 When "99" Has Meaning 281
10.5.4 Example Programs to Assign Missing Values 282
10.6 Renaming Variables (and Observations) 285
10.6.1 Advanced Renaming Examples 287
10.6.2 Renaming by Index 288
10.6.3 Renaming by Column Name 289
10.6.4 Renaming Many Sequentially Numbered Variable Names 290
10.6.5 Renaming Observations 291
10.6.6 Example Programs for Renaming Variables 291
10.7 Recoding Variables 295
10.7.1 Recoding a Few Variables 296
10.7.2 Recoding Many Variables 296
10.7.3 Example Programs for Recoding Variables 299
10.8 Indicator or Dummy Variables 301
10.8.1 Example Programs for Indicator or Dummy Variables 304
10.9 Keeping and Dropping Variables 306
10.9.1 Example Programs for Keeping and Dropping Variables 307
10.10 Stacking/Concatenating/Adding Data Sets 308
10.10.1 Example Programs for Stacking/Concatenating/Adding Data Sets 310
10.11 Joining/Merging Data Sets 312
10.11.1 Example Programs for Joining/Merging Data Sets 315
10.12 Creating Summarized or Aggregated Data Sets 317
10.12.1 The aggregate Function 317
10.12.2 The tapply Function 319
10.12.3 Merging Aggregates with Original Data 321
10.12.4 Tabular Aggregation 323
10.12.5 The plyr and reshape2 Packages 325
10.12.6 Comparing Summarization Methods 325
10.12.7 Example Programs for Aggregating/Summarizing Data 326
10.13 By or Split-File Processing 329
10.13.1 Example Programs for By or Split-File Processing 333
10.14 Removing Duplicate Observations 335
10.14.1 Completely Duplicate Observations 335
10.14.2 Duplicate Keys 338
10.14.3 Example Programs for Removing Duplicates 338
10.15 Selecting First or Last Observations per Group 341
10.15.1 Example Programs for Selecting Last Observation per Group 344
10.16 Transposing or Flipping Data Sets 346
10.16.1 Example Programs for Transposing or Flipping Data Sets 349
10.17 Reshaping Variables to Observations and Back 351
10.17.1 Summarizing/Aggregating Data Using reshape2 355
10.17.2 Example Programs for Reshaping Variables to Observations and Back 357
10.18 Sorting Data Frames 360
10.18.1 Example Programs for Sorting Data Sets 363
10.19 Converting Data Structures 365
10.19.1 Converting from Logical to Numeric Index and Back 368
10.20 Character String Manipulations 369
10.20.1 Example Programs for Character String Manipulation 376
10.21 Dates and Times 381
10.21.1 Calculating Durations 385
10.21.2 Adding Durations to Date-Time Variables 389
10.21.3 Accessing Date-Time Elements 389
10.21.4 Creating Date-Time Variables from Elements 390
10.21.5 Logical Comparisons with Date-Time Variables 391
10.21.6 Formatting Date-Time Output 391
10.21.7 Two-Digit Years 392
10.21.8 Date-Time Conclusion 393
10.21.9 Example Programs for Dates and Times 393
11 Enhancing Your Output 401
11.1 Value Labels or Formats (and Measurement Level) 401
11.1.1 Character Factors 402
11.1.2 Numeric Factors 404
11.1.3 Making Factors of Many Variables 406
11.1.4 Converting Factors to Numeric or Character Variables 409
11.1.5 Dropping Factor Levels 410
11.1.6 Example Programs for Value Labels 411
11.1.7 R Program to Assign Value Labels and Factor Status 412
11.2 Variable Labels 415
11.2.1 Other Packages That Support Variable Labels 419
11.2.2 Example Programs for Variable Labels 419
11.3 Output for Word Processing and Web Pages 421
11.3.1 The xtable Package 422
11.3.2 Other Options for Formatting Output 424
11.3.3 Example Program for Formatting Output 424
12 Generating Data 426
12.1 Generating Numeric Sequences 427
12.2 Generating Factors 428
12.3 Generating Repetitious Patterns (Not Factors) 429
12.4 Generating Values for Reading Fixed-Width Files 430
12.5 Generating Integer Measures 431
12.6 Generating Continuous Measures 433
12.7 Generating a Data Frame 434
12.8 Example Programs for Generating Data 436
12.8.1 SAS Program for Generating Data 436
12.8.2 SPSS Program for Generating Data 437
12.8.3 R Program for Generating Data 438
13 Managing Your Files and Workspace 442
13.1 Loading and Listing Objects 442
13.2 Understanding Your Search Path 446
13.3 Attaching Data Frames 447
13.4 Loading Packages 449
13.5 Attaching Files 451
13.6 Removing Objects from Your Workspace 452
13.7 Minimizing Your Workspace 455
13.8 Setting Your Working Directory 455
13.9 Saving Your Workspace 456
13.9.1 Saving Your Workspace Manually 456
13.9.2 Saving Your Workspace Automatically 456
13.9.3 Getting Operating Systems to Show You .RData Files 457
13.9.4 Organizing Projects with Windows Shortcuts 457
13.10 Saving Your Programs and Output 458
13.11 Saving Your History 458
13.12 Large Data Set Considerations 460
13.13 Example R Program for Managing Files and Workspace 460
14 Graphics Overview 465
14.1 Dynamic Visualization 465
14.2 SAS/GRAPH 466
14.3 SPSS Graphics 466
14.4 R Graphics 467
14.5 The Grammar of Graphics 468
14.6 Other Graphics Packages 469
14.7 Graphics Archives 469
14.8 Graphics Demonstrations 469
14.9 Graphics Procedures and Graphics Systems 471
14.10 Graphics Devices 472
15 Traditional Graphics 475
15.1 The plot Function 475
15.2 Bar Plots 477
15.2.1 Bar Plots of Counts 477
15.2.2 Bar Plots for Subgroups of Counts 481
15.2.3 Bar Plots of Means 482
15.3 Adding Titles, Labels, Colors, and Legends 483
15.4 Graphics Parameters and Multiple Plots on a Page 486
15.5 Pie Charts 489
15.6 Dot Charts 490
15.7 Histograms 490
15.7.1 Basic Histograms 491
15.7.2 Histograms Stacked 493
15.7.3 Histograms Overlaid 494
15.8 Normal QQ Plots 499
15.9 Strip Charts 500
15.10 Scatter and Line Plots 504
15.10.1 Scatter Plots with Jitter 507
15.10.2 Scatter Plots with Large Data Sets 507
15.10.3 Scatter Plots with Lines 510
15.10.4 Scatter Plots with Linear Fit by Group 511
15.10.5 Scatter Plots by Group or Level (Coplots) 513
15.10.6 Scatter Plots with Confidence Ellipse 513
15.10.7 Scatter Plots with Confidence and Prediction Intervals 514
15.10.8 Plotting Labels Instead of Points 520
15.10.9 Scatter Plot Matrices 522
15.11 Dual-Axis Plots 524
15.12 Box Plots 526
15.13 Error Bar Plots 529
15.14 Interaction Plots 529
15.15 Adding Equations and Symbols to Graphs 529
15.16 Summary of Graphics Elements and Parameters 531
15.17 Plot Demonstrating Many Modifications 531
15.18 Example Traditional Graphics Programs 532
15.18.1 SAS Program for Traditional Graphics 534
15.18.2 SPSS Program for Traditional Graphics 534
15.18.3 R Program for Traditional Graphics 535
16 Graphics with ggplot2 545
16.1 Introduction 545
16.1.1 Overview of qplot and ggplot 546
16.1.2 Missing Values 548
16.1.3 Typographic Conventions 549
16.2 Bar Plots 550
16.3 Pie Charts 552
16.4 Bar Plots for Groups 554
16.5 Plots by Group or Level 555
16.6 Presummarized Data 556
16.7 Dot Charts 558
16.8 Adding Titles and Labels 559
16.9 Histograms and Density Plots 560
16.9.1 Histograms 560
16.9.2 Density Plots 561
16.9.3 Histograms with Density Overlaid 562
16.9.4 Histograms for Groups, Stacked 563
16.9.5 Histograms for Groups, Overlaid 564
16.10 Normal QQ Plots 564
16.11 Strip Plots 565
16.12 Scatter Plots and Line Plots 568
16.12.1 Scatter Plots with Jitter 571
16.12.2 Scatter Plots for Large Data Sets 572
16.12.3 Scatter Plots with Fit Lines 577
16.12.4 Scatter Plots with Reference Lines 579
16.12.5 Scatter Plots with Labels Instead of Points 581
16.12.6 Changing Plot Symbols 583
16.12.7 Scatter Plot with Linear Fits by Group 584
16.12.8 Scatter Plots Faceted by Groups 585
16.12.9 Scatter Plot Matrix 586
16.13 Box Plots 588
16.14 Error Bar Plots 591
16.15 Geographic Maps 592
16.15.1 Finding and Converting Maps 597
16.16 Logarithmic Axes 598
16.17 Aspect Ratio 599
16.18 Multiple Plots on a Page 599
16.19 Saving ggplot2 Graphs to a File 601
16.20 An Example Specifying All Defaults 602
16.21 Summary of Graphics Elements and Parameters 603
16.22 Example Programs for Grammar of Graphics 604
16.22.1 SPSS Program for Graphics Production Language 604
16.22.2 R Program for ggplot2 607
17 Statistics 623
17.1 Scientific Notation 623
17.2 Descriptive Statistics 624
17.2.1 The Deducer frequencies Function 624
17.2.2 The Hmisc describe Function 625
17.2.3 The summary Function 627
17.2.4 The table Function and Its Relatives 628
17.2.5 The mean Function and Its Relatives 630
17.3 Cross-Tabulation 631
17.3.1 The CrossTable Function 631
17.3.2 The table and chisq.test Functions 632
17.4 Correlation 636
17.4.1 The cor Function 638
17.5 Linear Regression 640
17.5.1 Plotting Diagnostics 644
17.5.2 Comparing Models 645
17.5.3 Making Predictions with New Data 646
17.6 t-Test: Independent Groups 646
17.7 Equality of Variance 648
17.8 t-Test: Paired or Repeated Measures 649
17.9 Wilcoxon–Mann–Whitney Rank Sum: Independent Groups 650
17.10 Wilcoxon Signed-Rank Test: Paired Groups 651
17.11 Sign Test: Paired Groups 652
17.12 Analysis of Variance 654
17.13 Sums of Squares 657
17.14 The Kruskal–Wallis Test 659
17.15 Example Programs for Statistical Tests 661
17.15.1 SAS Program for Statistical Tests 661
17.15.2 SPSS Program for Statistical Tests 663
17.15.3 R Program for Statistical Tests 665
18 Conclusion 670
References 685
Index 690
Publication date (per publisher) | 27 August 2011 |
---|---|
Language | English |
Subject areas | Humanities ► Psychology |
 | Computer Science ► Databases ► Data Warehouse / Data Mining |
 | Mathematics / Computer Science ► Computer Science ► Software Development |
 | Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics |
 | Mathematics / Computer Science ► Mathematics ► Applied Mathematics |
 | Mathematics / Computer Science ► Mathematics ► Computer Programs / Computer Algebra |
 | Mathematics / Computer Science ► Mathematics ► Statistics |
 | Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics |
 | Social Sciences ► Sociology ► Empirical Social Research |
 | Technology |
Keywords | R |
ISBN-10 | 1-4614-0685-4 / 1461406854 |
ISBN-13 | 978-1-4614-0685-3 / 9781461406853 |
Size: 7.7 MB
DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices but is only of limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer for this, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read with (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer for this, e.g. the free Adobe Digital Editions app.
Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.
Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.