Information Modeling and Relational Databases (eBook)
976 pages
Elsevier Science (publisher)
978-0-08-056873-7 (ISBN)
Information Modeling and Relational Databases, Second Edition, provides an introduction to ORM (Object-Role Modeling) and much more. In fact, it is the only book to go beyond introductory coverage and provide all of the in-depth instruction you need to transform knowledge from domain experts into a sound database design. This book is intended for anyone with a stake in the accuracy and efficacy of databases: systems analysts, information modelers, database designers and administrators, and programmers.
Terry Halpin, a pioneer in the development of ORM, blends conceptual information with practical instruction that will let you begin using ORM effectively as soon as possible. Supported by examples, exercises, and useful background information, his step-by-step approach teaches you to develop a natural-language-based ORM model, and then, where needed, abstract ER and UML models from it. This book will quickly make you proficient in the modeling technique that is proving vital to the development of accurate and efficient databases that best meet real business objectives.
* Presents the most in-depth coverage of Object-Role Modeling available anywhere, including a thorough update of the book for ORM2, as well as UML2 and E-R (Entity-Relationship) modeling.
* Includes clear coverage of relational database concepts, and the latest developments in SQL and XML, including a new chapter on the impact of XML on information modeling, exchange and transformation.
* New and improved case studies and exercises are provided for many topics.
* The book's associated web site provides answers to exercises, appendices, advanced SQL queries, and links to downloadable ORM tools.
Front Cover
Information Modeling and Relational Databases
Copyright Page
Contents
Foreword by John Zachman
Foreword by Sjir Nijssen
Foreword by Gordon Everest
Preface
Chapter 1 Introduction
1.1 Information Modeling
1.2 Modeling Approaches
1.3 Some Historical Background
1.4 The Relevant Skills
1.5 Summary
Chapter 2 Information Levels and Frameworks
2.1 Four Information Levels
2.2 The Conceptual Level
2.3 Database Design Example
2.4 Development Frameworks
2.5 Summary
Chapter 3 Conceptual Modeling: First Steps
3.1 Conceptual Modeling Language Criteria
3.2 Conceptual Schema Design Procedure
3.3 CSDP Step 1: From Examples to Elementary Facts
3.4 CSDP Step 2: Draw Fact Types and Populate
3.5 CSDP Step 3: Trim Schema; Note Basic Derivations
3.6 Summary
Chapter 4 Uniqueness Constraints
4.1 Introduction to CSDP Step 4
4.2 Uniqueness Constraints on Unaries and Binaries
4.3 Uniqueness Constraints on Longer Fact Types
4.4 External Uniqueness Constraints
4.5 Key Length Check
4.6 Projections and Joins
4.7 Summary
Chapter 5 Mandatory Roles
5.1 Introduction to CSDP Step 5
5.2 Mandatory and Optional Roles
5.3 Reference Schemes
5.4 Case Study: A Compact Disc Retailer
5.5 Logical Derivation Check
5.6 Summary
Chapter 6 Value, Set-Comparison, and Subtype Constraints
6.1 Introduction to CSDP Step 6
6.2 Basic Set Theory
6.3 Value Constraints and Independent Types
6.4 Subset, Equality, and Exclusion Constraints
6.5 Subtyping
6.6 Generalization of Object Types
6.7 Summary
Chapter 7 Other Constraints and Final Checks
7.1 Introduction to CSDP Step 7
7.2 Occurrence Frequencies
7.3 Ring Constraints
7.4 Other Constraints and Rules
7.5 Final Checks
7.6 Summary
Chapter 8 Entity Relationship Modeling
8.1 Overview of ER
8.2 Barker notation
8.3 Information Engineering notation
8.4 IDEF1X
8.5 Mapping from ORM to ER
8.6 Summary
Chapter 9 Data Modeling in UML
9.1 Introduction
9.2 Object-Orientation
9.3 Attributes
9.4 Associations
9.5 Set-Comparison Constraints
9.6 Subtyping
9.7 Other Constraints and Derivation Rules
9.8 Mapping from ORM to UML
9.9 Summary
Chapter 10 Advanced Modeling Issues
10.1 Join Constraints
10.2 Deontic Rules
10.3 Temporality
10.4 Collection Types
10.5 Nominalization and Objectification
10.6 Open/Closed World Semantics
10.7 Higher-Order Types
10.8 Summary
Chapter 11 Relational Mapping
11.1 Implementing a Conceptual Schema
11.2 Relational Schemas
11.3 Relational Mapping Procedure
11.4 Advanced Mapping Aspects
11.5 Summary
Chapter 12 Data Manipulation with Relational Languages
12.1 Relational Algebra
12.2 Relational Database Systems
12.3 SQL: Historical and Structural Overview
12.4 SQL: Identifiers and Data Types
12.5 SQL: Choosing Columns, Rows, and Order
12.6 SQL: Joins
12.7 SQL: In, Between, Like, and Null Operators
12.8 SQL: Union and Simple Subqueries
12.9 SQL: Scalar Operators and Bag Functions
12.10 SQL: Grouping
12.11 SQL: Correlated and Existential Subqueries
12.12 SQL: Recursive Queries
12.13 SQL: Updating Table Populations
12.14 Summary
Chapter 13 Using Other Database Objects
13.1 SQL: The Bigger Picture
13.2 SQL: Defining Tables
13.3 SQL: Views
13.4 SQL: Triggers
13.5 SQL: Routines
13.6 SQL: More Database Objects
13.7 Transactions and Concurrency
13.8 Security and Meta-Data
13.9 Exploiting XML
13.10 Summary
Chapter 14 Schema Transformations
14.1 Schema Equivalence and Optimization
14.2 Predicate Specialization and Generalization
14.3 Nesting, Coreferencing, and Flattening
14.4 Other Transformations
14.5 Conceptual Schema Optimization
14.6 Normalization
14.7 Denormalization and Low Level Optimization
14.8 Reengineering
14.9 Data Migration and Query Transformation
14.10 Summary
Chapter 15 Process and State Modeling
15.1 Introduction/Modeling Dynamic Behavior
15.2 Processes and Workflow
15.3 State Models
15.4 Foundations for Process Theory
15.5 Modeling Information Dynamics in UML
15.6 Business Process Standards Initiatives
15.7 Standard Process Patterns
15.8 Summary
Chapter 16 Other Modeling Aspects and Trends
16.1 Introduction
16.2 Data Warehousing and OLAP
16.3 Conceptual Query Languages
16.4 Schema Abstraction Mechanisms
16.5 Further Design Aspects
16.6 Ontologies and the Semantic Web
16.7 Postrelational Databases
16.8 Metamodeling
16.9 Summary
ORM glossary
ER glossary
UML glossary
Useful Web Sites
Bibliography
Index
About the Authors
1 Introduction
Information Modeling
It’s an unfortunate fact of life that names and numbers can sometimes be misinterpreted. This can prove costly, as experienced by senior citizens who had their social security benefits cut off when government agencies incorrectly pronounced them dead because of misreading “DOD” on hospital forms as “date of death” rather than the intended “date of discharge”.
A more costly incident occurred in 1999 when NASA’s $125 million Mars Climate Orbiter burnt up in the Martian atmosphere. Apparently, errors in its course settings arose from a failure to make a simple unit conversion. One team worked in U.S. customary units and sent its data to a second team working in metric, but no conversion was made. If a man weighs 180, does he need to go on a drastic diet? No if his mass is 180 lb, but yes if it’s 180 kg. Data by itself is not enough. What we really need is information, the meaning or semantics of the data. Since computers lack common sense, we need to pay special attention to semantics when we use computers to model some aspect of reality.
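The weight example can be made concrete in code. The following is a minimal Python sketch, not taken from the book; the `Mass` class, its fields, and the `to_kg` method are hypothetical names chosen for illustration. It shows how storing a unit alongside a number turns bare data into information that a program can interpret without guessing.

```python
from dataclasses import dataclass

# Conversion factors to kilograms (the pound is defined as exactly 0.45359237 kg).
KG_PER_UNIT = {"kg": 1.0, "lb": 0.45359237}


@dataclass(frozen=True)
class Mass:
    """A number together with its unit (hypothetical illustration class)."""
    value: float
    unit: str  # expected to be "kg" or "lb"

    def to_kg(self) -> float:
        # An unknown unit raises KeyError instead of being silently guessed.
        return self.value * KG_PER_UNIT[self.unit]


# The bare number 180 is ambiguous; the unit supplies the meaning.
in_pounds = Mass(180, "lb")
in_kilograms = Mass(180, "kg")

print(round(in_pounds.to_kg(), 1))     # 81.6
print(round(in_kilograms.to_kg(), 1))  # 180.0
```

Both objects hold the same number, yet they describe very different masses once the unit is taken into account.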
This book provides a modern introduction to database systems, with the emphasis on information modeling. At its heart is a very high level semantic approach that is fact-oriented in nature. If you model databases using either traditional or object-oriented approaches, you’ll find that fact orientation lifts your thinking to a higher level, illuminating your current way of doing things. Even if you’re a programmer rather than a database modeler, this semantic approach provides a natural and powerful way to design your data structures.
A database is basically a collection of related data (e.g., a company’s personnel records). When interpreted by humans, a database may be viewed as a set of related facts—an information base. In the context of our semantic approach, we’ll often use the popular term “database” instead of the more technical “information base”. Discovering the kinds of facts that underlie a business domain, and the rules that apply to the facts, is interesting and revealing. The quality of the database design used to capture these facts and rules is critical. Just as a house built from a good architectural plan is more likely to be safe and convenient for living, a well-designed database simplifies the task of ensuring that its facts are correct and easy to access. Let’s review some basic ideas about database systems, and then see how things can go wrong if they are poorly designed.
Each database models a business domain—we use this term to describe any area of interest, typically a part of the real world. Consider a library database. As changes occur in the library (e.g., a book is borrowed) the database is updated to reflect these changes. This task could be performed manually using a card catalog, or be automated with an online catalog, or both. Our focus is on automated databases. Sometimes these are implemented by means of special-purpose computer programs, coded in a general-purpose programming language (e.g., C#). More often, database applications are developed using a database management system (DBMS). This is a software system for maintaining databases and answering queries about them (e.g., DB2, Oracle, SQL Server). The same DBMS may handle many different databases.
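As a rough illustration of a DBMS maintaining a database and answering queries about it, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. It is not from the book: the `Book` table, its columns, the placeholder title, and the helper functions `borrow` and `titles_on_loan` are all assumptions made for this example.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Book ("
    "bookNr INTEGER PRIMARY KEY, "
    "title TEXT NOT NULL, "
    "onLoan INTEGER NOT NULL DEFAULT 0)"
)
# 'Sample Title' is placeholder data, not a real catalog entry.
conn.execute("INSERT INTO Book (bookNr, title) VALUES (1, 'Sample Title')")


def borrow(connection, book_nr):
    """Update the database to reflect that a book has been borrowed."""
    connection.execute("UPDATE Book SET onLoan = 1 WHERE bookNr = ?", (book_nr,))
    connection.commit()


def titles_on_loan(connection):
    """Answer the query: which titles are currently on loan?"""
    cursor = connection.execute(
        "SELECT title FROM Book WHERE onLoan = 1 ORDER BY title"
    )
    return [row[0] for row in cursor.fetchall()]


before = titles_on_loan(conn)
borrow(conn, 1)  # a change occurs in the library
after = titles_on_loan(conn)
print(before, after)
```

The update mirrors a change in the business domain, and the query reads the current state back out of the database.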
Typical applications use a database to house the persistent data, an in-memory object model to hold transient data, and a friendly user interface for users to enter and access data. All these structures deal with information and are best derived from an information model that clearly reveals the underlying semantics of the domain. Some tools can use information models to automatically generate not just databases, but also object models and user interfaces.
If an application requires maintenance and retrieval of lots of data, a DBMS offers many advantages over manual record keeping. Data may be conveniently captured via electronic interfaces (e.g., screen forms), then quickly processed and stored compactly on disk. Many data errors can be detected automatically, and access rights to data can be enforced by the system. People can spend more time on creative design rather than on routine tasks more suited to computers. Finally, developing and documenting the application software can be facilitated by use of computer-assisted software engineering (CASE) tool support.
In terms of the dominant employment group, the Agricultural Age was supplanted late in the 19th century by the Industrial Age, which is now replaced by the Information Age. With the ongoing information explosion and mechanization of industry, the proportion of information workers is steadily rising. Most businesses achieve significant productivity gains by exploiting information technology. Imagine how long a newspaper firm would last if it returned to the methods used before word processing and computerized typesetting. Apart from its enabling employment opportunities, the ability to interact efficiently with information systems empowers us to exploit their information content.
Although most employees need to be familiar with information technology, there are vast differences in the amount and complexity of information management tasks required of these workers. Originally, most technical computer work was performed by computer specialists such as programmers and systems analysts. However, the advent of user-friendly software and powerful, inexpensive personal computers led to a redistribution of computing power. End users now commonly perform many information management tasks, such as spreadsheeting, with minimal reliance on professional computer experts.
This trend toward more users “driving” their own computer systems rather than relying on expert “chauffeurs” does not eliminate the need for computer specialists. There is still a need for programming in languages such as C# and Java. However, there is an increasing demand for high level skills such as modeling complex information systems.
The area of information systems engineering includes subdisciplines such as requirements analysis, database design, user interface design, and report writing. In one way or another, all these subareas deal with information. Since the database design phase selects the underlying structures to capture the relevant information, it is of central importance.
To highlight the need for good database design, let’s consider the task of designing a database to store movie details such as those shown in Table 1.1. The header of this table is shaded to help distinguish it from the rows of data. Even if the header is not shaded, we do not count it as a table row. The first row of data is fictitious.
Table 1.1 An output report about some motion pictures.
Different movies may have the same title (e.g., The Secret Garden). Hence movie numbers are used to provide a simple identifier. We interpret the data in terms of facts. For example, movie 5 has the title The DaVinci Code, was released in 2006, was directed by Ron Howard, and starred Tom Hanks, Ian McKellen, and Audrey Tautou. Movie 1, titled Cosmology, had no stars (it is a documentary). This table is an output report. It provides one way to view the data. This might not be the same as how the data is actually stored in a database.
In Table 1.1 each cell (row-column slot) may contain many values. For example, Movie 3 has two stars recorded in the row 3, column 5 cell. Some databases allow a cell to contain many values like this, but in a relational database each table cell may hold at most one value. Since relational database systems are dominant in the industry, our implementation discussion focuses on them. How can we design a relational database to store these facts?
Suppose we use the structure shown in Table 1.2. This has one entry in each cell. Here, “?” denotes a null (no star is recorded for Cosmology). Some DBMSs display nulls differently (e.g., “<NULL>” or a blank space). To help distinguish the rows, we’ve included lines between them. But from now on, we’ll omit lines between rows.
Table 1.2 A badly-designed relational database table.
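The step from a report with multi-valued cells to a table with one value per cell can be sketched in Python. This is an illustration only, assuming that the single-table design repeats a movie's details once per star; it uses just the facts quoted above for movies 1 and 5, keeps only three columns, and the `flatten` function and dictionary keys are hypothetical names.

```python
# Report-style rows: the "stars" cell may hold many values.
# Only the columns needed for the illustration are kept.
report = [
    {"movieNr": 1, "title": "Cosmology", "stars": []},
    {
        "movieNr": 5,
        "title": "The DaVinci Code",
        "stars": ["Tom Hanks", "Ian McKellen", "Audrey Tautou"],
    },
]


def flatten(report_rows):
    """Return (movieNr, title, star) tuples with at most one value per cell.

    None stands in for a null when no star is recorded.
    """
    flat = []
    for row in report_rows:
        stars = row["stars"] or [None]  # a movie without stars still gets one row
        for star in stars:
            flat.append((row["movieNr"], row["title"], star))
    return flat


flat_rows = flatten(report)
for flat_row in flat_rows:
    print(flat_row)
```

In the flattened rows the title of movie 5 appears three times, once per star. That kind of repetition is a typical symptom of a single-table design for facts of this shape.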
Each relational table must be named. Here we called the...
Publication date (per publisher) | 27 July 2010 |
---|---|
Language | English |
Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
 | Mathematics / Computer Science ► Computer Science ► Programming Languages / Tools |
 | Mathematics / Computer Science ► Computer Science ► Software Development |
ISBN-10 | 0-08-056873-4 / 0080568734 |
ISBN-13 | 978-0-08-056873-7 / 9780080568737 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.
File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction books. The body text adapts dynamically to the display and font size, so EPUB also works well on mobile reading devices.
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/tablet: Whether Apple or Android, you can read this eBook. You need a
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.