Database Modeling and Design (eBook)
296 Seiten
Elsevier Science (Verlag)
978-0-08-047077-1 (ISBN)
Database Modeling and Design, Fourth Edition, the extensively revised edition of the classic logical database design reference, explains how you can model and design your database application in consideration of new technology or new business needs. It is an ideal text for a stand-alone data management course focused on logical database design, or a supplement to an introductory text for introductory database management.
This book features clear explanations, lots of terrific examples and an illustrative case, and practical advice, with design rules that are applicable to any SQL-based system. The common examples are based on real-life experiences and have been thoroughly class-tested. The text takes a detailed look at the Unified Modeling Language (UML-2) as well as the entity-relationship (ER) approach for data requirements specification and conceptual modeling - complemented with examples for both approaches. It also discusses the use of data modeling concepts in logical database design; the transformation of the conceptual model to the relational model and to SQL syntax; the fundamentals of database normalization through the fifth normal form; and the major issues in business intelligence such as data warehousing, OLAP for decision support systems, and data mining. There are examples for how to use the most popular CASE tools to handle complex data modeling problems, along with exercises that test understanding of all material, plus solutions for many exercises. Lecture notes and a solutions manual are also available.
This edition will appeal to professional data modelers and database design professionals, including database application designers, and database administrators (DBAs); new/novice data management professionals, such as those working on object oriented database design; and students in second courses in database focusing on design.
+ a detailed look at the Unified Modeling Language (UML-2) as well as the entity-relationship (ER) approach for data requirements specification and conceptual modeling--with examples throughout the book in both approaches!+ the details and examples of how to use data modeling concepts in logical database design, and the transformation of the conceptual model to the relational model and to SQL syntax;
+ the fundamentals of database normalization through the fifth normal form;
+ practical coverage of the major issues in business intelligence--data warehousing, OLAP for decision support systems, and data mining;
+ examples for how to use the most popular CASE tools to handle complex data modeling problems.
+ Exercises that test understanding of all material, plus solutions for many exercises.
Toby J. Teorey is a professor in the Electrical Engineering and Computer Science Department at the University of Michigan, Ann Arbor. He received his B.S. and M.S. degrees in electrical engineering from the University of Arizona, Tucson, and a Ph.D. in computer sciences from the University of Wisconsin, Madison. He was general chair of the 1981 ACM SIGMOD Conference and program chair for the 1991 Entity-Relationship Conference. Professor Teorey's current research focuses on database design and data warehousing, OLAP, advanced database systems, and performance of computer networks. He is a member of the ACM and the IEEE Computer Society.
Database Modeling and Design, Fourth Edition, the extensively revised edition of the classic logical database design reference, explains how you can model and design your database application in consideration of new technology or new business needs. It is an ideal text for a stand-alone data management course focused on logical database design, or a supplement to an introductory text for introductory database management. This book features clear explanations, lots of terrific examples and an illustrative case, and practical advice, with design rules that are applicable to any SQL-based system. The common examples are based on real-life experiences and have been thoroughly class-tested. The text takes a detailed look at the Unified Modeling Language (UML-2) as well as the entity-relationship (ER) approach for data requirements specification and conceptual modeling - complemented with examples for both approaches. It also discusses the use of data modeling concepts in logical database design; the transformation of the conceptual model to the relational model and to SQL syntax; the fundamentals of database normalization through the fifth normal form; and the major issues in business intelligence such as data warehousing, OLAP for decision support systems, and data mining. There are examples for how to use the most popular CASE tools to handle complex data modeling problems, along with exercises that test understanding of all material, plus solutions for many exercises. Lecture notes and a solutions manual are also available. This edition will appeal to professional data modelers and database design professionals, including database application designers, and database administrators (DBAs); new/novice data management professionals, such as those working on object oriented database design; and students in second courses in database focusing on design. + a detailed look at the Unified Modeling Language (UML-2) as well as the entity-relationship (ER) approach for data requirements specification and conceptual modeling--with examples throughout the book in both approaches! + the details and examples of how to use data modeling concepts in logical database design, and the transformation of the conceptual model to the relational model and to SQL syntax; + the fundamentals of database normalization through the fifth normal form;+ practical coverage of the major issues in business intelligence--data warehousing, OLAP for decision support systems, and data mining; + examples for how to use the most popular CASE tools to handle complex data modeling problems. + Exercises that test understanding of all material, plus solutions for many exercises.
front cover 1
copyright 7
table of contents 10
front matter 16
Preface 16
Organization 17
Typographical Conventions 18
Acknowledgments 19
Solutions Manual 19
body 20
1 Introduction 20
1.1 Data and Database Management 21
1.2 The Database Life Cycle 22
1.3 Conceptual Data Modeling 27
1.4 Summary 30
1.5 Literature Summary 30
2 The Entity-Relationship Model 32
2.1 Fundamental ER Constructs 32
2.1.1 Basic Objects: Entities, Relationships, Attributes 32
2.1.2 Degree of a Relationship 35
2.1.3 Connectivity of a Relationship 37
2.1.4 Attributes of a Relationship 38
2.1.5 Existence of an Entity in a Relationship 38
2.1.6 Alternative Conceptual Data Modeling Notations 39
2.2 Advanced ER Constructs 42
2.2.1 Generalization: Supertypes and Subtypes 42
2.2.2 Aggregation 44
2.2.3 Ternary Relationships 44
2.2.4 General n-ary Relationships 47
2.2.5 Exclusion Constraint 48
2.2.6 Referential Integrity 49
2.3 Summary 49
2.4 Literature Summary 50
3 The Unified Modeling Language (UML) 52
3.1 Class Diagrams 53
3.1.1 Basic Class Diagram Notation 54
3.1.2 Class Diagrams for Database Design 56
3.1.3 Example from the Music Industry 62
3.2 Activity Diagrams 65
3.2.1 Activity Diagram Notation Description 65
3.2.2 Activity Diagrams for Workflow 67
3.3 Rules of Thumb for UML Usage 69
3.4 Summary 70
3.5 Literature Summary 70
4 Requirements Analysis and Conceptual Data Modeling 72
4.1 Introduction 72
4.2 Requirements Analysis 73
4.3 Conceptual Data Modeling 74
4.3.1 Classify Entities and Attributes 75
4.3.2 Identify the Generalization Hierarchies 76
4.3.3 Define Relationships 77
4.3.4 Example of Data Modeling: Company Personnel and Project Database 80
4.4 View Integration 85
4.4.1 Preintegration Analysis 86
4.4.2 Comparison of Schemas 87
4.4.3 Conformation of Schemas 87
4.4.4 Merging and Restructuring of Schemas 88
4.4.5 Example of View Integration 88
4.5 Entity Clustering for ER Models 93
4.5.1 Clustering Concepts 94
4.5.2 Grouping Operations 95
4.5.3 Clustering Technique 97
4.6 Summary 100
4.7 Literature Summary 101
5 Transforming the Conceptual Data Model to SQL 102
5.1 Transformation Rules and SQL Constructs 102
5.1.1 Binary Relationships 104
5.1.2 Binary Recursive Relationships 109
5.1.3 Ternary and n-ary Relationships 111
5.1.4 Generalization and Aggregation 120
5.1.5 Multiple Relationships 122
5.1.6 Weak Entities 122
5.2 Transformation Steps 122
5.2.1 Entity Transformation 123
5.2.2 Many-to-Many Binary Relationship Transformation 123
5.2.3 Ternary Relationship Transformation 124
5.2.4 Example of ER-to-SQL Transformation 124
5.3 Summary 125
5.4 Literature Summary 125
6 Normalization 126
6.1 Fundamentals of Normalization 126
6.1.1 First Normal Form 128
6.1.2 Superkeys, Candidate Keys, and Primary Keys 128
6.1.3 Second Normal Form 130
6.1.4 Third Normal Form 132
6.1.5 Boyce-Codd Normal Form 134
6.2 The Design of Normalized Tables: A Simple Example 135
6.3 Normalization of Candidate Tables Derived from ER Diagrams 137
6.4 Determining the Minimum Set of 3NF Tables 141
6.5 Fourth and Fifth Normal Forms 146
6.5.1 Multivalued Dependencies 146
6.5.2 Fourth Normal Form 148
6.5.3 Decomposing Tables to 4NF 151
6.5.4 Fifth Normal Form 152
6.6 Summary 156
6.7 Literature Summary 157
7 An Example of Logical Database Design 158
7.1 Requirements Specification 158
7.1.1 Design Problems 159
7.2 Logical Design 160
7.3 Summary 164
8 Business Intelligence 166
8.1 Data Warehousing 167
8.1.1 Overview of Data Warehousing 167
8.1.2 Logical Design 171
8.2 Online Analytical Processing (OLAP) 185
8.2.1 The Exponential Explosion of Views 186
8.2.2 Overview of OLAP 188
8.2.3 View Size Estimation 189
8.2.4 Selection of Materialized Views 192
8.2.5 View Maintenance 195
8.2.6 Query Optimization 196
8.3 Data Mining 197
8.3.1 Forecasting 198
8.3.2 Text Mining 200
8.4 Summary 204
8.5 Literature Summary 205
9 CASE Tools for Logical Database Design 206
9.1 Introduction to the CASE Tools 207
9.2 Key Capabilities to Watch For 210
9.3 The Basics 211
9.4 Generating a Database from a Design 215
9.5 Database Support 218
9.6 Collaborative Support 219
9.7 Distributed Development 220
9.8 Application Life Cycle Tooling Integration 221
9.9 Design Compliance Checking 223
9.10 Reporting 225
9.11 Modeling a Data Warehouse 226
9.12 Semi-Structured Data, XML 228
9.13 Summary 230
9.14 Literature Summary 230
back matter 232
Appendix: The Basics of SQL 232
A.1 SQL Names and Operators 233
A.2 Data Definition Language (DDL) 234
A.3 Data Manipulation Language (DML) 237
A.3.1 SQL Select Command 238
A.3.2 SQL Update Commands 245
A.3.3 Referential Integrity 246
A.3.4 SQL Views 247
A.4 References 248
Glossary 250
References 258
Exercises 268
ER and UML Conceptual Data Modeling 268
Problem 2-1 268
Problem 2-2 268
Problem 3-1 269
Problem 3-2 269
Conceptual Data Modeling and Integration 270
Problem 4-1 270
Transformation of the Conceptual Model to SQL 271
Problem 5-1 271
Normalization and Minimum Set of Tables 271
Problem 6-1 271
Problem 6-2 271
Problem 6-3 272
Problem 6-4 272
Problem 6-5 273
Problem 6-6 274
Problem 6-7 274
Problem 6-8 274
Problem 6-9 275
Problem 6-10 275
Logical Database Design (Generic Problem) 276
Problem 7-1 276
OLAP 277
Problem 8-1 277
Solutions to Selected Exercises 278
Problem 2-2 278
Problem 4-1 279
Problem 6-1 279
Problem 6-3 279
Problem 6-5 279
Problem 6-7 280
About the Authors 282
index 284
1 Introduction
Database technology has evolved rapidly in the three decades since the rise and eventual dominance of relational database systems. While many specialized database systems (object-oriented, spatial, multimedia, etc.) have found substantial user communities in the science and engineering fields, relational systems remain the dominant database technology for business enterprises.
Relational database design has evolved from an art to a science that has been made partially implementable as a set of software design aids. Many of these design aids have appeared as the database component of computer-aided software engineering (CASE) tools, and many of them offer interactive modeling capability using a simplified data modeling approach. Logical design—that is, the structure of basic data relationships and their definition in a particular database system—is largely the domain of application designers. These designers can work effectively with tools such as ERwin Data Modeler or Rational Rose with UML, as well as with a purely manual approach. Physical design, the creation of efficient data storage and retrieval mechanisms on the computing platform being used, is typically the domain of the database administrator (DBA). Today’s DBAs have a variety of vendor-supplied tools available to help design the most efficient databases. This book is devoted to the logical design methodologies and tools most popular for relational databases today. Physical design methodologies and tools are covered in a separate book.
In this chapter, we review the basic concepts of database management and introduce the role of data modeling and database design in the database life cycle.
1.1 Data and Database Management
The basic component of a file in a file system is a data item, which is the smallest named unit of data that has meaning in the real world—for example, last name, first name, street address, ID number, or political party. A group of related data items treated as a single unit by an application is called a record. Examples of types of records are order, salesperson, customer, product, and department. A file is a collection of records of a single type. Database systems have built upon and expanded these definitions: In a relational database, a data item is called a column or attribute; a record is called a row or tuple; and a file is called a table.
A database is a more complex object; it is a collection of interrelated stored data that serves the needs of multiple users within one or more organizations, that is, interrelated collections of many different types of tables. The motivations for using databases rather than files include greater availability to a diverse set of users, integration of data for easier access to and updating of complex transactions, and less redundancy of data.
A database management system (DBMS) is a generalized software system for manipulating databases. A DBMS supports a logical view (schema, subschema); physical view (access methods, data clustering); data definition language; data manipulation language; and important utilities, such as transaction management and concurrency control, data integrity, crash recovery, and security. Relational database systems, the dominant type of systems for well-formatted business databases, also provide a greater degree of data independence than the earlier hierarchical and network (CODASYL) database management systems. Data independence is the ability to make changes in either the logical or physical structure of the database without requiring reprogramming of application programs. It also makes database conversion and reorganization much easier. Relational DBMSs provide a much higher degree of data independence than previous systems; they are the focus of our discussion on data modeling.
1.2 The Database Life Cycle
The database life cycle incorporates the basic steps involved in designing a global schema of the logical database, allocating data across a computer network, and defining local DBMS-specific schemas. Once the design is completed, the life cycle continues with database implementation and maintenance. This chapter contains an overview of the database life cycle, as shown in Figure 1.1. In succeeding chapters, we will focus on the database design process from the modeling of requirements through logical design (steps I and II below). The result of each step of the life cycle is illustrated with a series of diagrams in Figure 1.2. Each diagram shows a possible form of the output of each step, so the reader can see the progression of the design process from an idea to actual database implementation. These forms are discussed in much more detail in Chapters 2 through 6.
I. Requirements analysis. The database requirements are determined by interviewing both the producers and users of data and using the information to produce a formal requirements specification. That specification includes the data required for processing, the natural data relationships, and the software platform for the database implementation. As an example, Figure 1.2 (step I) shows the concepts of products, customers, salespersons, and orders being formulated in the mind of the end user during the interview process.
II. Logical design. The global schema, a conceptual data model diagram that shows all the data and their relationships, is developed using techniques such as ER or UML. The data model constructs must ultimately be transformed into normalized (global) relations, or tables. The global schema development methodology is the same for either a distributed or centralized database.
a. Conceptual data modeling. The data requirements are analyzed and modeled using an ER or UML diagram that includes, for example, semantics for optional relationships, ternary relationships, supertypes, and subtypes (categories). Processing requirements are typically specified using natural language expressions or SQL commands, along with the frequency of occurrence. Figure 1.2 [step II(a)] shows a possible ER model representation of the product/customer database in the mind of the end user.
Figure 1.1 The database life cycle
Figure 1.2 Life cycle results, step-by-step
b. View integration. Usually, when the design is large and more than one person is involved in requirements analysis, multiple views of data and relationships result. To eliminate redundancy and inconsistency from the model, these views must eventually be “rationalized” (resolving inconsistencies due to variance in taxonomy, context, or perception) and then consolidated into a single global view. View integration requires the use of ER semantic tools such as identification of synonyms, aggregation, and generalization. In Figure 1.2 [step II(b)], two possible views of the product/customer database are merged into a single global view based on common data for customer and order. View integration is also important for application integration.
c. Transformation of the conceptual data model to SQL tables. Based on a categorization of data modeling constructs and a set of mapping rules, each relationship and its associated entities are transformed into a set of DBMS-specific candidate relational tables. We will show these transformations in standard SQL in Chapter 5. Redundant tables are eliminated as part of this process. In our example, the tables in step II(c) of Figure 1.2 are the result of transformation of the integrated ER model in step II(b).
d. Normalization of tables. Functional dependencies (FDs) are derived from the conceptual data model diagram and the semantics of data relationships in the requirements analysis. They represent the dependencies among data elements that are unique identifiers (keys) of entities. Additional FDs that represent the dependencies among key and nonkey attributes within entities can be derived from the requirements specification. Candidate relational tables associated with all derived FDs are normalized (i.e., modified by decomposing or splitting tables into smaller tables) using standard techniques. Finally, redundancies in the data in normalized candidate tables are analyzed further for possible elimination, with the constraint that data integrity must be preserved. An example of normalization of the Salesperson table into the new Salesperson and SalesVacations tables is shown in Figure 1.2 from step II(c) to step II(d).
III. Physical design. The physical design step involves the selection of indexes (access methods), partitioning, and clustering of data. The logical design methodology in step II simplifies the approach to designing large relational databases by reducing the number of data dependencies that need to be analyzed. This is accomplished by inserting conceptual data modeling and integration steps [steps II(a) and II(b) of Figure 1.2] into the traditional relational design approach....
Erscheint lt. Verlag | 5.8.2010 |
---|---|
Sprache | englisch |
Themenwelt | Sachbuch/Ratgeber |
Mathematik / Informatik ► Informatik ► Datenbanken | |
Sozialwissenschaften ► Kommunikation / Medien ► Buchhandel / Bibliothekswesen | |
ISBN-10 | 0-08-047077-7 / 0080470777 |
ISBN-13 | 978-0-08-047077-1 / 9780080470771 |
Haben Sie eine Frage zum Produkt? |
Kopierschutz: Adobe-DRM
Adobe-DRM ist ein Kopierschutz, der das eBook vor Mißbrauch schützen soll. Dabei wird das eBook bereits beim Download auf Ihre persönliche Adobe-ID autorisiert. Lesen können Sie das eBook dann nur auf den Geräten, welche ebenfalls auf Ihre Adobe-ID registriert sind.
Details zum Adobe-DRM
Dateiformat: EPUB (Electronic Publication)
EPUB ist ein offener Standard für eBooks und eignet sich besonders zur Darstellung von Belletristik und Sachbüchern. Der Fließtext wird dynamisch an die Display- und Schriftgröße angepasst. Auch für mobile Lesegeräte ist EPUB daher gut geeignet.
Systemvoraussetzungen:
PC/Mac: Mit einem PC oder Mac können Sie dieses eBook lesen. Sie benötigen eine
eReader: Dieses eBook kann mit (fast) allen eBook-Readern gelesen werden. Mit dem amazon-Kindle ist es aber nicht kompatibel.
Smartphone/Tablet: Egal ob Apple oder Android, dieses eBook können Sie lesen. Sie benötigen eine
Geräteliste und zusätzliche Hinweise
Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.
aus dem Bereich