normalization
(computer science) Breaking down of complex data structures into flat files.
|
Results for database normalization
|
On this page:
|
(computer science) Breaking down of complex data structures into flat files.
The act of creating a lifestyle as similar as possible to that of a nonchallenged person for a person with disabilities.
The noun has one meaning:
Meaning #1:
the imposition of standards or regulations
Synonyms: standardization, standardisation, normalisation
Database normalization is a technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems, namely data anomalies. For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of data integrity. A table that is sufficiently normalized is less vulnerable to problems of this kind, because its structure reflects the basic assumptions for when multiple instances of the same information should be represented by a single instance only.
Higher degrees of normalization typically involve more tables and create the need for a larger number of joins, which can reduce performance. Accordingly, more highly normalized tables are typically used in database applications involving many isolated transactions (e.g. an Automated teller machine), while less normalized tables tend to be used in database applications that do not need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).
Database theory describes a table's degree of normalization in terms of normal forms of successively higher degrees of strictness. A table in third normal form (3NF), for example, is consequently in second normal form (2NF) as well; but the reverse is not always the case.
Although the normal forms are often defined informally in terms of the characteristics of tables, rigorous definitions of the normal forms are concerned with the characteristics of mathematical constructs known as relations. Whenever information is represented relationally, it is meaningful to consider the extent to which the representation is normalized.
A table that is not sufficiently normalized can suffer from logical inconsistencies of various types, and from anomalies involving data operations. In such a table:
Ideally, a relational database table should be designed in such a way as to exclude the possibility of update, insertion, and deletion anomalies. The normal forms of relational database theory provide guidelines for deciding whether a particular design will be vulnerable to such anomalies. It is possible to correct an unnormalized design so as to make it adhere to the demands of the normal forms: this is called normalization.
Normalization typically involves decomposing an unnormalized table into two or more tables that, were they to be combined (joined), would convey exactly the same information as the original table.
Edgar F. Codd first proposed the process of normalization and what came to be known as the
1st normal form:
There is, in fact, a very simple elimination[1] procedure which we shall call normalization. Through decomposition non-simple domains are replaced by "domains whose elements are atomic (non-decomposable) values."
—Edgar F. Codd, A Relational Model of Data for Large Shared Data Banks[2]
In his paper, Edgar F. Codd used the term "non-simple" domains to describe a heterogeneous data structure, but later researchers would refer to such a structure as an abstract data type.
The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is to such inconsistencies and anomalies. Each table has a "highest normal form" (HNF): by definition, a table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the requirements of any normal form higher than its HNF.
The normal forms are applicable to individual tables; to say that an entire database is in normal form n is to say that all of its tables are in normal form n.
Newcomers to database design sometimes suppose that normalization proceeds in an iterative fashion, i.e. a 1NF design is first normalized to 2NF, then to 3NF, and so on. This is not an accurate description of how normalization typically works. A sensibly designed table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF, it is overwhelmingly likely to have an HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually require an extra expenditure of effort on the part of the designer, because 3NF tables usually need no modification to meet the requirements of these higher normal forms.
Edgar F. Codd originally defined the first three normal forms (1NF, 2NF, and 3NF). These normal forms have been summarized as requiring that all non-key attributes be dependent on "the key, the whole key and nothing but the key". The fourth and fifth normal forms (4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many relationships among attributes. Sixth normal form (6NF) incorporates considerations relevant to temporal databases.
A table is in first normal form (1NF) if and only if it faithfully represents a relation.[3] Given that database tables embody a relation-like form, the defining characteristic of one in first normal form is that it does not allow nulls or duplicate rows. Simply put, a table with a unique key and without any nullable columns is in 1NF.
A misconception looms from the common assertion that a 1NF table may not have repeating groups[4]. While that statement itself is axiomatic, experts disagree about what qualifies as a "repeating group", thus the precise definition of 1NF is the subject of some controversy. Notwithstanding, this theoretical uncertainty applies to relations, not tables. Table manifestations are intrinsically free of variable repeating groups because they are structurally constrained to the same number of columns in all rows.
See the first normal form article for a fuller discussion of the nuances of 1NF.
The criteria for second normal form (2NF) are:
The criteria for third normal form (3NF) are:
A table is in Boyce-Codd normal form (BCNF) if and only if, for every one of its non-trivial functional dependencies X → Y, X is a superkey—that is, X is either a candidate key or a superset thereof.[6]
A table is in fourth normal form (4NF) if and only if, for every one of its non-trivial multivalued dependencies X →→ Y, X is a superkey—that is, X is either a candidate key or a superset thereof.[7]
The criteria for fifth normal form (5NF and also PJ/NF) are:
Domain/key normal form (or DKNF) requires that a table not be subject to any constraints other than domain constraints and key constraints.
A table is in sixth normal form (6NF) if and only if it satisfies no non-trivial join dependencies at all.[8] This obviously means that the fifth normal form is also satisfied. The sixth normal form was only defined when extending the relational model to take into account the temporal dimension. Unfortunately, most current SQL technologies as of 2005 do not take into account this work, and most temporal extensions to SQL are not relational. See work by Date, Darwen and Lorentzos[9] for a relational temporal extension, Zimyani[10] for further discussion on Temporal Aggregation in SQL, or TSQL2 for a non-relational approach.
Databases intended for Online Transaction Processing (OLTP) are typically more normalized than databases intended for Online Analytical Processing (OLAP). OLTP Applications are characterized by a high volume of small transactions such as updating a sales record at a super market checkout counter. The expectation is that each transaction will leave the database in a consistent state. By contrast, databases intended for OLAP operations are primarily "read mostly" databases. OLAP applications tend to extract historical data that has accumulated over a long period of time. For such databases, redundant or "denormalized" data may facilitate Business Intelligence applications. Specifically, dimensional tables in a star schema often contain denormalized data. The denormalized or redundant data must be carefully controlled during ETL processing, and users should not be permitted to see the data until it is in a consistent state. The normalized alternative to the star schema is the snowflake schema. It has never been proven that this denormalization itself provides any increase in performance, or if the concurrent removal of data constraints is what increases the performance. The need for denormalization has waned as computers and RDBMS software have become more powerful.
Denormalization is also used to improve performance on smaller computers as in computerized cash-registers and mobile devices, since these may use the data for look-up only (e.g. price lookups). Denormalization may also be used when no RDBMS exists for a platform (such as Palm), or no changes are to be made to the data and a swift response is crucial.
In recognition that denormalization can be deliberate and useful, the non-first normal form is a definition of database designs which do not conform to the first normal form, by allowing "sets and sets of sets to be attribute domains" (Schek 1982). This extension is a (non-optimal) way of implementing hierarchies in relations. Some theoreticians have dubbed this practitioner developed method, "First Ab-normal Form", Codd defined a relational database as using relations, so any table not in 1NF could not be considered to be relational.
Consider the following table:
| Person | Favorite Colors |
|---|---|
| Bob | blue, red |
| Jane | green, yellow, red |
Assume a person has several favorite colors. Obviously, favorite colors consist of a set of colors modeled by the given table.
To transform this NF² table into a 1NF an "unnest" operator is required which extends the relational algebra of the higher normal forms. The reverse operator is called "nest" which is not always the mathematical inverse of "unnest", although "unnest" is the mathematical inverse to "nest". Another constraint required is for the operators to be bijective, which is covered by the Partitioned Normal Form (PNF).
| Database management systems | |
|---|---|
| Database
models · Database normalization · Database storage · Distributed DBMS · Referential integrity · Relational algebra · Relational calculus · Relational database · Relational DBMS · Relational model · |
|
| Concepts | Database · ACID · Null · Candidate key · Foreign key · Primary key · Superkey · Surrogate key |
| Objects | Trigger · View · Table · Cursor · Log · Transaction · Index · Stored procedure · Partition |
| SQL | Select · Insert · Update · Merge · Delete · Join · Union · Create · Drop · Begin work · Commit · Rollback · Truncate · Alter |
| Implementations | Relational · Flat file · Deductive · Dimensional · Hierarchical · Object-oriented · Object-relational · Temporal · XML data stores |
| Components | Concurrency control · Query language · Query optimizer · Query plan · ODBC · JDBC |
| Database products: Object-oriented (comparison) · Relational (comparison) | |
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)
Join the WikiAnswers Q&A community. Post a question or answer questions about "database normalization" at WikiAnswers.
Copyrights:
![]() | Sci-Tech Dictionary. McGraw-Hill Dictionary of Scientific and Technical Terms. Copyright © 2003, 1994, 1989, 1984, 1978, 1976, 1974 by McGraw-Hill Companies, Inc. All rights reserved. Read more | |
![]() | Dental Dictionary. Mosby's Dental Dictionary. Copyright © 2004 by Elsevier, Inc. All rights reserved. Read more | |
![]() | WordNet. WordNet 1.7.1 Copyright © 2001 by Princeton University. All rights reserved. Read more | |
![]() | Wikipedia. This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Database normalization". Read more |
Mentioned In: