When you design a database, you first want to normalize it. The main purpose is to avoid data duplication, because duplicate data takes up unnecessary space and is harder to maintain. (For the full set of normalisation rules, see e.g. http://en.wikipedia.org/wiki/Database_normalization)
E.g. suppose you want to store information about your customers. You want to store their address to send them promotional material, and you also want to store which products they have bought so far. If you put all of that in one table, you would repeat the customer's address for every article they bought. When a customer moves, you then have to remember to update the address on every one of their records, or the data becomes inconsistent.
So you normalize this bit and create, for example, a table with customer number + customer name + customer street + zip code/postal code, a second table with zip code + city, a third table with customer number + product number, a fourth table with product number + product description + vendor number, and so on.
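For what it's worth, here is a minimal sketch of that normalized design in SQLite from Python; the table and column names are invented for this example.

    import sqlite3

    # Illustrative only: hypothetical table and column names for the
    # normalized design described above.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE zip (
        zip_code TEXT PRIMARY KEY,
        city     TEXT NOT NULL
    );
    CREATE TABLE customer (
        customer_no   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        street        TEXT,
        zip_code      TEXT REFERENCES zip(zip_code)
    );
    CREATE TABLE product (
        product_no  INTEGER PRIMARY KEY,
        description TEXT,
        vendor_no   INTEGER
    );
    CREATE TABLE purchase (
        customer_no INTEGER REFERENCES customer(customer_no),
        product_no  INTEGER REFERENCES product(product_no)
    );
    """)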
Now look at the I/O involved in getting at that data. When all the data sits in one table, reading it normally involves fewer I/O operations, and is therefore faster, than reading data spread over multiple tables, which requires jumping back and forth between indexes and data records. And although I/O performance has improved tremendously since the early days, it is still the slowest component in a computer.
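To make the difference concrete, here is a rough sketch (assuming the hypothetical schema above, plus an invented customer_purchases table for the one-table case) of the two kinds of read.

    import sqlite3

    # Assuming the hypothetical schema sketched above: reading one customer's
    # purchases together with the full address takes a four-way join ...
    QUERY_NORMALIZED = """
    SELECT c.customer_name, c.street, z.city, pr.description
    FROM customer c
    JOIN zip z       ON z.zip_code     = c.zip_code
    JOIN purchase pu ON pu.customer_no = c.customer_no
    JOIN product pr  ON pr.product_no  = pu.product_no
    WHERE c.customer_no = ?
    """

    # ... whereas with everything in one wide (hypothetical) table the same
    # read is a single scan of that table.
    QUERY_ONE_TABLE = """
    SELECT customer_name, street, city, description
    FROM customer_purchases
    WHERE customer_no = ?
    """

    def fetch_purchases(conn: sqlite3.Connection, customer_no: int, query: str):
        return conn.execute(query, (customer_no,)).fetchall()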
Online Analytical Processing (OLAP) databases usually do batch updates followed by many reads, and they often gain performance from denormalisation, i.e. moving back from complete normalisation towards a design that requires fewer tables.
In the above example, putting both the zip code and the city in the customer address table would make sense, especially since the relation between zip code and city is not volatile (i.e. does not normally change).
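A minimal sketch of that particular denormalisation, again with invented names: the city is simply carried along in the customer table, so reading an address no longer touches the zip table.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    # Hypothetical denormalised customer table: the city is stored with each
    # customer instead of being looked up via the zip table.
    conn.execute("""
    CREATE TABLE customer (
        customer_no   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        street        TEXT,
        zip_code      TEXT,
        city          TEXT  -- duplicated on purpose; zip->city rarely changes
    )
    """)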
Computers with slow I/O subsystems may also benefit from denormalisation.
Denormalisation is basically the process of finding the right balance between avoiding data duplication and ensuring database performance.
Normalizing data means eliminating redundant information from a table and organizing the data so that future changes to the table are easier. Denormalization means allowing redundancy in a table. The main benefit of denormalization is improved performance with simplified data retrieval and manipulation.
Denormalization is done to increase the read performance of a database by reducing the number of joins needed to retrieve data. It involves duplicating data across tables to minimize the need for complex joins, which can result in faster query processing. However, denormalization can lead to data redundancy and potential data inconsistency risks.
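As a rough sketch of the inconsistency risk, assuming a hypothetical orders table that duplicates the customer's city for faster reads: a change of city has to reach every copy, and any row the update misses is silently wrong.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- Hypothetical denormalized orders table: the customer's city is copied
    -- into every order row so that reports need no join.
    CREATE TABLE orders (
        order_no    INTEGER PRIMARY KEY,
        customer_no INTEGER,
        city        TEXT
    );
    INSERT INTO orders VALUES (1, 42, 'Utrecht'), (2, 42, 'Utrecht');
    """)

    # When the customer moves, every duplicated copy must be updated together;
    # any copy the update misses leaves the table contradicting itself.
    with conn:
        conn.execute("UPDATE orders SET city = 'Leiden' WHERE customer_no = 42")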
None; it uses denormalization.
Generally, to optimize the performance of SELECT queries and to minimize the joins used in them.
Denormalization can improve database performance by reducing the need for complex joins, which can speed up query response times. It simplifies data retrieval by consolidating related information into fewer tables, making it easier for applications to access the data they need. Additionally, denormalization can enhance read-heavy workloads, as it often results in fewer disk I/O operations. However, it may lead to data redundancy and increased complexity in ensuring data consistency.
compromises that include denormalization
In a relational database, a constraint between two sets of attributes is known as a functional dependency. Determining functional dependencies is vital for normalization, denormalization and the relational model in general.
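For example, zip code -> city is a functional dependency: each zip code determines exactly one city. A quick, purely illustrative way to check that property over a set of rows:

    # Hypothetical illustration: zip_code -> city is a functional dependency,
    # i.e. each zip code value determines exactly one city value.
    def functionally_determines(rows, x, y):
        seen = {}
        for row in rows:
            if row[x] in seen and seen[row[x]] != row[y]:
                return False  # same x value maps to two different y values
            seen[row[x]] = row[y]
        return True

    rows = [
        {"zip_code": "1012", "city": "Amsterdam"},
        {"zip_code": "3011", "city": "Rotterdam"},
        {"zip_code": "1012", "city": "Amsterdam"},
    ]
    print(functionally_determines(rows, "zip_code", "city"))  # True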
Schema programming involves defining the structure and relationships of data in a database. Key concepts include defining data types, relationships between tables, and constraints to ensure data integrity. Principles include normalization to reduce redundancy and improve efficiency, and denormalization for performance optimization.
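A minimal sketch of those ideas in SQLite, with invented table and column names: data types, a foreign-key relationship between tables, and a CHECK constraint guarding integrity.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript("""
    CREATE TABLE vendor (
        vendor_no INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE product (
        product_no  INTEGER PRIMARY KEY,              -- data type + key
        description TEXT NOT NULL,
        price       REAL CHECK (price >= 0),          -- integrity constraint
        vendor_no   INTEGER NOT NULL
                    REFERENCES vendor(vendor_no)      -- relationship between tables
    );
    """)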
In a datamart, data is typically denormalized rather than normalized. This approach is used to optimize query performance and simplify data retrieval for analytical purposes. Denormalization combines data from multiple tables into fewer tables, which can improve read efficiency and speed up reporting. However, the specific design may vary based on the requirements and use cases of the datamart.
If you meant the disadvantages of normalization, these are the main ones.
More tables to join: by spreading your data over more tables, you increase the need to join tables.
Tables contain codes instead of real data: repeated data is stored as codes rather than meaningful values, so there is always a need to go to a lookup table for the actual value.
The data model is difficult to query against: the data model is optimized for applications, not for ad hoc querying.
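A small, made-up illustration of the "codes plus lookup table" point: the orders table stores only a status code, so showing a human-readable status always needs a join to the lookup table.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE order_status (code TEXT PRIMARY KEY, label TEXT NOT NULL);
    INSERT INTO order_status VALUES ('N', 'New'), ('S', 'Shipped'), ('C', 'Cancelled');
    CREATE TABLE orders (order_no INTEGER PRIMARY KEY, status_code TEXT);
    INSERT INTO orders VALUES (1, 'S'), (2, 'N');
    """)

    # The order rows only carry the codes; showing a readable status always
    # means a trip to the lookup table.
    for row in conn.execute("""
        SELECT o.order_no, s.label
        FROM orders o
        JOIN order_status s ON s.code = o.status_code
    """):
        print(row)  # (1, 'Shipped') and (2, 'New')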
Systematic data duplication, i.e. denormalization, costs extra storage for the redundant copies, extra processing to keep those copies up to date, and risks inconsistency when denormalized data is changed in one place but not the other.
Denormalization is the process of taking data points from many different tables and combining them into larger, single table(s).

Example:

Table: CustomerContact
  Column: ContactID
  Column: FirstName
  Column: LastName

Table: CustomerAddresses
  Column: AddressID
  Column: House Num
  Column: Street
  Column: Suffix
  Column: Prefix
  Column: FK_ContactID

... could be combined into:

Table: Customers
  Column: CustomerID
  Column: FirstName
  Column: LastName
  Column: House Num
  Column: Street
  Column: Suffix
  Column: Prefix

This is often done to increase the read efficiency that is sometimes lacking in a relational database. It is most common in reporting and data warehouse environments, where writes, deletes, updates and deadlocks aren't usually an issue but it is important that reports run quickly, with fewer expensive join operations.

The drawback is the inherent data redundancy. In the example above, a customer with more than one address would have their name repeated in the Customers table once for every address, whereas in the normalized design their name appears only once in CustomerContact and they simply have multiple CustomerAddresses records. At large scale, this kind of architecture can have a significant disk-space impact. Also, suppose the customer's name changes: in the denormalized version this means changing the name on many Customers records instead of changing the single CustomerContact record once.

Note: this is a very simple explanation. Much more detail is needed to fully understand the pros and cons of normalization vs. denormalization and the reasons for adopting either architecture. It is recommended to first understand the following concepts:
1. Relational database design
2. Foreign keys, primary keys, uniqueness
3. Normalization and normal forms (first normal form, second normal form, third normal form, etc.)
4. Summing, grouping, aggregation
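A rough sketch of that combination in SQLite (names taken from the example above, column list trimmed for brevity): the wide Customers table is built straight from a join of the two normalized tables.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE CustomerContact (
        ContactID INTEGER PRIMARY KEY,
        FirstName TEXT,
        LastName  TEXT
    );
    CREATE TABLE CustomerAddresses (
        AddressID    INTEGER PRIMARY KEY,
        Street       TEXT,
        FK_ContactID INTEGER REFERENCES CustomerContact(ContactID)
    );

    -- Denormalize: one wide row per (contact, address) pair.
    CREATE TABLE Customers AS
    SELECT c.ContactID AS CustomerID, c.FirstName, c.LastName, a.Street
    FROM CustomerContact c
    JOIN CustomerAddresses a ON a.FK_ContactID = c.ContactID;
    """)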