Share on Facebook Share on Twitter Email
Answers.com

Query optimization

 
Wikipedia: Query optimization

Query optimization is a function of many relational database management systems in which multiple query plans for satisfying a query are examined and a good query plan is identified. This may or not be the absolute best strategy because there are many ways of doing plans. There is a trade-off between the amount of time spent figuring out the best plan and the amount running the plan. Different qualities of database management systems have different ways of balancing these two. Cost based query optimizers evaluate the resource footprint of various query plans and use this as the basis for plan selection.

Typically the resources which are costed are CPU path length, amount of disk buffer space, disk storage service time, and interconnect usage between units of parallelism. The set of query plans examined is formed by examining possible access paths (e.g., primary index access, secondary index access, full file scan) and various relational table join techniques (e.g., merge join, hash join, product join). The search space can become quite large depending on the complexity of the SQL query. There are two types of optimization. These consist of logical optimization which generates a sequence of relational algebra to solve the query. In addition there is physical optimization which is used to determine the means of carrying out each operation.

The goal of query optimization

The goal is to eliminate as many unneeded tuples, or rows as possible. The following is a look at relational algebra as it eliminates unneeded tuples.

The project operator is straightforward to implement if <attribute list> contains a key to relation R. If it does not include a key of R, it must be eliminated. This must be done by sorting (see sort methods below) and eliminating duplicates. This method can also use hashing to eliminate duplicates Hash table.

SQL command distinct: this does not change the actual data. This just eliminates the duplicates from the results.

Set operations: Database management heavily relies on the mathematical principles of set theory which is key in comprehending these operations.

Union: This displays all that appear in both sets, each listed once. These must be union compatible. This means all sequences of selected columns must designate the same number of columns. The data types of the corresponding columns must thereby comply with the conditions valid for comparability. Each data type can be compared to itself. Columns of data type CHAR with the different ASCII and EBCDIC code attributes can be compared to each other, whereby they are implicitly adapted. Columns with the ASCII code attribute can be compared to date, time, and timestamp specifications. All numbers can be compared to each other. In ANSI SQL mode, it is not enough for the data types and lengths of the specified columns to be compatible: they must match. Moreover, only column specifications or * may be specified in the selected columns of the QUERY specifications linked with UNION. Literals may not be specified (wisc).

Intersection: This only lists items whose keys appear in both lists. This too must be union compatible. See above for definition.

Set difference: This lists all items whose keys appear in the first list but not the second. This too must be union compatible.

Cartesian Product: Cartesian products (R * S) takes a lot of memory because its result contains a record for each combination of records from R and S.


Search unanswered questions...
Enter a question here...
Search: All sources Community Q&A Reference topics
 
 

 

Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Query optimization" Read more