Nicht aus der Schweiz? Besuchen Sie lehmanns.de
The Microsoft Data Warehouse Toolkit - Joy Mundy, Warren Thornthwaite

The Microsoft Data Warehouse Toolkit

With SQL Server 2008 R2 and the Microsoft Business Intelligence Toolset
Buch | Softcover
704 Seiten
2011 | 2nd edition
John Wiley & Sons Inc (Verlag)
978-0-470-64038-8 (ISBN)
CHF 64,10 inkl. MwSt
The techniques pioneered by the Kimball Group have become the industry standard for data warehouse design, development, and management. In this new edition of the Microsoft Data Warehouse Toolkit, the authors share best practices for using these techniques in SQL Server 2008 R2 and Office 2010.
Best practices and invaluable advice from world-renowned data warehouse experts In this book, leading data warehouse experts from the Kimball Group share best practices for using the upcoming “Business Intelligence release” of SQL Server, referred to as SQL Server 2008 R2. In this new edition, the authors explain how SQL Server 2008 R2 provides a collection of powerful new tools that extend the power of its BI toolset to Excel and SharePoint users and they show how to use SQL Server to build a successful data warehouse that supports the business intelligence requirements that are common to most organizations. Covering the complete suite of data warehousing and BI tools that are part of SQL Server 2008 R2, as well as Microsoft Office, the authors walk you through a full project lifecycle, including design, development, deployment and maintenance.



Features more than 50 percent new and revised material that covers the rich new feature set of the SQL Server 2008 R2 release, as well as the Office 2010 release
Includes brand new content that focuses on PowerPivot for Excel and SharePoint, Master Data Services, and discusses updated capabilities of SQL Server Analysis, Integration, and Reporting Services
Shares detailed case examples that clearly illustrate how to best apply the techniques described in the book
The accompanying Web site contains all code samples as well as the sample database used throughout the case studies

The Microsoft Data Warehouse Toolkit, Second Edition provides you with the knowledge of how and when to use BI tools such as Analysis Services and Integration Services to accomplish your most essential data warehousing tasks.

Joy Mundy is a member of the Kimball Group and has been focusing on data warehousing and business intelligence since the early 1990s. Warren Thornthwaite has been building decision support and data warehousing systems since 1980 and is a member of the Kimball Group. Ralph Kimball, PhD, is known worldwide as an innovator, writer, educator, speaker, and consultant in the field of data warehousing. He is the founder of the Kimball Group (www.kimballgroup.com), which provides data warehouse consulting and education.

Foreword xxvii

Introduction xxix

Part 1 Requirements, Realities, and Architecture 1

Chapter 1 Defining Business Requirements 3

The Most Important Determinant of Long-Term Success 5

Adventure Works Cycles Introduction 6

Uncovering Business Value 6

Obtaining Sponsorship 7

Defining Enterprise-Level Business Requirements 8

Prioritizing the Business Requirements 22

Revisiting the Project Planning 25

Gathering Project-Level Requirements 26

Summary 28

Chapter 2 Designing the Business Process Dimensional Model 29

Dimensional Modeling Concepts and Terminology 30

Facts 31

Dimensions 33

Bringing Facts and Dimensions Together 34

The Bus Matrix, Conformed Dimensions, and Drill Across 36

Additional Design Concepts and Techniques 38

Surrogate Keys 38

Slowly Changing Dimensions 39

Dates 42

Degenerate Dimensions 43

Snowflaking 43

Many-to-Many or Multivalued Dimensions 44

Hierarchies 47

Aggregate Dimensions 49

Junk Dimensions 51

The Three Fact Table Types 52

Aggregates 53

The Dimensional Modeling Process 54

Preparation 55

Data Profiling and Research 60

Building Dimensional Models 63

Developing the Detailed Dimensional Model 66

Testing and Refining the Model 68

Reviewing and Validating the Model 68

Case Study: The Adventure Works Cycles Orders Dimensional Model 69

The Orders Fact Table 69

The Dimensions 69

Identifying Dimension Attributes and Facts for the Orders Business Process 72

The Final Draft of the Initial Orders Model 74

Detailed Orders Dimensional Model Development 75

Final Dimensional Model 77

Summary 77

Chapter 3 The Toolset 79

The Microsoft DW/BI Toolset 80

Why Use the Microsoft Toolset? 82

Architecture of a Microsoft DW/BI System 83

Why Analysis Services? 84

Why a Relational Store? 86

ETL Is Not Optional 86

The Role of Master Data Services 88

Delivering BI Applications 88

Overview of the Microsoft Tools 89

Which Products Do You Need? 90

SQL Server Development and Management Tools 92

Summary 97

Chapter 4 System Setup 99

System Sizing Considerations 100

Calculating Data Volumes 101

Determining Usage Complexity 102

Estimating Simultaneous Users 104

Assessing System Availability Requirements 105

How Big Will It Be? 105

System Configuration Considerations 105

Memory 106

Monolithic or Distributed? 106

Storage System Considerations 110

Processors 113

Setting Up for High Availability 114

Software Installation and Configuration 115

Development Environment Software Requirements 116

Test and Production Software Requirements 120

Operating Systems 122

SQL Server Relational Database Setup 122

Analysis Services Setup 126

Integration Services Setup 129

Reporting Services Setup 130

Summary 131

Part 2 Building and Populating the Databases 133

Chapter 5 Creating the Relational Data Warehouse 135

Getting Started 136

Complete the Physical Design 137

Surrogate Keys 138

String Columns 138

To Null, or Not to Null? 140

Housekeeping Columns 140

Table and Column Extended Properties 142

Define Storage and Create Constraints and Supporting Objects 142

Create Files and Filegroups 142

Data Compression 144

Entity and Referential Integrity Constraints 145

Initial Indexing and Database Statistics 147

Aggregate Tables 150

Create Table Views 151

Insert an Unknown Member Row 152

Example CREATE TABLE Statement 152

Partitioned Tables 153

Finishing Up 163

Staging Tables 163

Metadata Setup 163

Summary 164

Chapter 6 Master Data Management 165

Managing Master Reference Data 166

Incomplete Attributes 167

Data Integration 168

Systems Integration 170

Master Data Management Systems and the Data Warehouse 171

Introducing SQL Server Master Data Services 171

Model Definition Features 172

Data Management Features 174

User Interface: Exploring and Managing the Master Data 174

Importing and Updating Data 176

Exporting Data 177

Full Versioning of All Attributes 179

Creating a Simple Application 179

Summary 186

Chapter 7 Designing and Developing the ETL System 187

Round Up the Requirements 188

Develop the ETL Plan 191

Introducing SQL Server Integration Services 192

Control Flow and Data Flow 194

SSIS Package Architecture 197

The Major Subsystems of ETL 198

Extracting Data 199

Subsystem 1: Data Profiling 199

Subsystem 2: Change Data Capture System 200

Subsystem 3: Extract System 202

Cleaning and Conforming Data 206

Subsystem 4: Data Cleaning System 206

Subsystem 5: Error Event Schema 214

Subsystem 6: Audit Dimension Assembler 215

Subsystem 7: Deduplication System 216

Subsystem 8: Conforming System 217

Delivering Data for Presentation 218

Subsystem 9: Slowly Changing Dimension Manager 218

Subsystem 10: Surrogate Key Generator 223

Subsystem 11: Hierarchy Manager 223

Subsystem 12: Special Dimensions Manager 224

Subsystem 13: Fact Table Builders 225

Subsystem 14: Surrogate Key Pipeline 229

Subsystem 15: Multi-Valued Dimension Bridge Table Builder 235

Subsystem 16: Late Arriving Data Handler 235

Subsystem 17: Dimension Manager 238

Subsystem 18: Fact Provider System 238

Subsystem 19: Aggregate Builder 239

Subsystem 20: OLAP Cube Builder 239

Subsystem 21: Data Propagation Manager 240

Managing the ETL Environment 240

Summary 243

Chapter 8 The Core Analysis Services OLAP Database 245

Overview of Analysis Services OLAP 247

Why Use Analysis Services? 247

Why Not Analysis Services? 249

Designing the OLAP Structure 250

Planning 251

Getting Started 253

Create a Project and a Data Source View 255

Dimension Designs 257

Creating and Editing Dimensions 261

Creating and Editing the Cube 274

Physical Design Considerations 291

Understanding Storage Modes 293

Developing the Partitioning Plan 294

Designing Performance Aggregations 296

Planning for Deployment 298

Processing the Full Cube 299

Developing the Incremental Processing Plan 299

Summary 304

Chapter 9 Design Requirements for Real-Time BI 305

Real-Time Triage 306

What Does Real-Time Mean? 306

Who Needs Real Time? 307

Real-Time Tradeoffs 308

Scenarios and Solutions 311

Executing Reports in Real Time 313

Serving Reports from a Cache 313

Creating an ODS with Mirrors and Snapshots 314

Creating an ODS with Replication 314

Building a BizTalk Application 315

Building a Real-Time Relational Partition 315

Querying Real-Time Data in the Relational Database 317

Using Analysis Services to Query Real-Time Data 318

Summary 319

Part 3 Developing the BI Applications 321

Chapter 10 Building BI Applications in Reporting Services 323

A Brief Overview of BI Applications 324

Types of BI Applications 325

The Value of Business Intelligence Applications 326

A High-Level Architecture for Reporting 328

Reviewing Business Requirements for Reporting 328

Examining the Reporting Services Architecture 330

Using Reporting Services as a Standard Reporting Tool 332

Reporting Services Assessment 339

The Reporting System Design and Development Process 340

Reporting System Design 341

Reporting System Development 348

Building and Delivering Reports 351

Planning and Preparation 351

Creating Reports 354

Reporting Operations 368

Ad Hoc Reporting Options 369

The Report Model 370

Shared Datasets 371

Report Parts 371

Summary 372

Chapter 11 PowerPivot and Excel 375

Using Excel for Analysis and Reporting 376

The PowerPivot Architecture: Excel on Steroids 378

Creating and Using PowerPivot Databases 380

Getting Started 381

PowerPivot Table Design 381

Creating Analytics with PowerPivot 385

Observations and Guidelines on PowerPivot for Excel 392

PowerPivot for SharePoint 394

The PowerPivot SharePoint User Experience 394

Server-Level Resources 397

PowerPivot Monitoring and Management 397

PowerPivot’s Role in a Managed DW/BI Environment 400

Summary 401

Chapter 12 The BI Portal and SharePoint 403

The BI Portal 404

Planning the BI Portal 405

Impact on Design 406

Business Process Categories 407

Additional Functions 408

Building the BI Portal 409

Using SharePoint as the BI Portal 411

Architecture and Concepts 412

Setting Up SharePoint 417

Summary 426

Chapter 13 Incorporating Data Mining 429

Defining Data Mining 430

Basic Data Mining Terminology 432

Business Uses of Data Mining 433

Roles and Responsibilities 440

SQL Server Data Mining Architecture Overview 440

The Data Mining Design Environment 442

Build, Deploy, and Process 442

Accessing the Mining Models 443

Integration Services and Data Mining 443

Additional Features 444

Architecture Summary 445

Microsoft Data Mining Algorithms 445

Decision Trees 446

Naïve Bayes 447

Clustering 448

Sequence Clustering 448

Time Series 449

Association 449

Neural Network 449

The Data Mining Process 450

The Business Phase 451

The Data Mining Phase 453

The Operations Phase 460

Metadata 462

Data Mining Examples 463

Case Study: Categorizing Cities 463

Case Study: Product Recommendations 472

Summary 488

Part 4 Deploying and Managing the DW/BI System 491

Chapter 14 Designing and Implementing Security 493

Identifying the Security Manager 494

Securing the Hardware and Operating System 495

Securing the Operating System 495

Using Windows Integrated Security 496

Securing the Development Environment 497

Securing the Data 498

Providing Open Access for Internal Users 498

Itemizing Sensitive Data 500

Securing Various Types of Data Access 500

Securing the Components of the DW/BI System 502

Reporting Services Security 502

Analysis Services Security 505

Relational DW Security 514

Integration Services Security 520

Usage Monitoring 521

Summary 521

Chapter 15 Metadata Plan 523

Metadata Basics 524

The Purpose of Metadata 524

Metadata Categories 525

The Metadata Repository 526

Metadata Standards 526

SQL Server 2008 R2 Metadata 527

Cross-Tool Components 528

Relational Engine Metadata 532

Analysis Services 532

Integration Services 533

Reporting Services 533

Master Data Services 534

SharePoint 534

External Metadata Sources 534

Looking to the Future 535

A Practical Metadata Approach 535

Creating the Metadata Strategy 536

Business Metadata Reporting 538

 

Process Metadata Reporting 541

Technical Metadata Reporting 542

Ongoing Metadata Management 543

Summary 543

Chapter 16 Deployment 545

Setting Up the Environments 546

Testing 550

Development Testing 551

System Testing 555

Data Quality Assurance Testing 557

Performance Testing 559

Usability Testing 562

Testing Summary 563

Deploying to Production 564

Relational Database Deployment 565

Integration Services Package Deployment 567

Analysis Services Database Deployment 568

Reporting Services Report Deployment 571

Master Data Services Deployment 572

Data Warehouse and BI Documentation 573

Core Descriptions 573

Additional Documentation 575

User Training 576

User Support 579

Desktop Readiness and Configuration 580

Summary 581

Chapter 17 Operations and Maintenance 583

Providing User Support 584

Maintaining the BI Portal 585

Extending the BI Applications 586

System Management 587

Governing the DW/BI System 588

Performance Monitoring 593

Usage Monitoring 600

Managing Disk Space 602

Service and Availability Management 603

Performance Tuning the DW/BI System 604

Backup and Recovery 606

Executing the ETL Packages 611

Summary 611

Chapter 18 Present Imperatives and Future Outlook 613

Growing the DW/BI System 613

Lifecycle Review with Common Problems 615

Phase I — ​Requirements, Realities, Plans, and Designs 616

Phase II — ​Developing the Databases 616

Phase III — ​Developing the BI Applications and Portal Environment 617

Phase IV — ​Deploying and Managing the DW/BI System 618

Iteration and Growth 618

What We Like in the Microsoft BI Toolset 619

Future Directions: Room for Improvement 620

Conclusion 623

Index 625

Erscheint lt. Verlag 5.4.2011
Co-Autor Ralph Kimball
Verlagsort New York
Sprache englisch
Maße 188 x 229 mm
Gewicht 1043 g
Themenwelt Informatik Datenbanken Data Warehouse / Data Mining
Informatik Weitere Themen Hardware
ISBN-10 0-470-64038-3 / 0470640383
ISBN-13 978-0-470-64038-8 / 9780470640388
Zustand Neuware
Haben Sie eine Frage zum Produkt?
Mehr entdecken
aus dem Bereich
Datenanalyse für Künstliche Intelligenz

von Jürgen Cleve; Uwe Lämmel

Buch | Softcover (2024)
De Gruyter Oldenbourg (Verlag)
CHF 104,90
Auswertung von Daten mit pandas, NumPy und IPython

von Wes McKinney

Buch | Softcover (2023)
O'Reilly (Verlag)
CHF 62,85