Once the alternative access paths for a relational algebra expression have been derived, the optimal access path is determined. This chapter focuses on query optimization in a centralized system.
In a centralized system, query processing is done with the following aims:
- Minimize the response time of a query.
- Maximize the system throughput.
- Reduce the amount of memory and storage required for processing.
- Increase parallelism.
Steps for Query Optimization
Query optimization involves three steps: query tree generation, query plan generation, and code generation.
Step 1 − Query Tree Generation
A query tree is a tree data structure that represents a relational algebra expression. The leaf nodes represent the tables of the query, the internal nodes represent the relational algebra operations, and the root represents the query as a whole.
During execution, an internal node is executed as soon as its operand tables are available. The node is then replaced by its result table. This process continues until the root node has been executed and replaced by the result table.
Example 1
The query considered is as follows:
πEmpID(σEName="ArunKumar"(EMPLOYEE))
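The query tree appears as follows, with the projection at the root, the selection as the internal node below it, and the EMPLOYEE table as the leaf:

    πEmpID
      |
    σEName="ArunKumar"
      |
    EMPLOYEE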
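The query tree for Example 1 and its bottom-up evaluation can be sketched in Python. This is a minimal illustration, not how a real DBMS stores trees: the `Node` class, the `evaluate` function, and the sample EMPLOYEE rows are all hypothetical, and only unary operators are supported.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Table = List[Row]


@dataclass
class Node:
    """One node of a query tree: a leaf (table) or an internal operation."""
    label: str
    child: Optional["Node"] = None                  # unary operators only in this sketch
    op: Optional[Callable[[Table], Table]] = None   # None for leaf nodes
    table: Optional[Table] = None                   # set only for leaf nodes


def evaluate(node: Node) -> Table:
    """Evaluate bottom-up: a node runs once its operand table is available."""
    if node.child is None:
        return node.table            # leaf: the stored table itself
    operand = evaluate(node.child)   # the operand table becomes available
    return node.op(operand)          # the result table replaces this node


# Hypothetical sample data for the EMPLOYEE table.
EMPLOYEE = [
    {"EmpID": 1, "EName": "ArunKumar", "Dept": "Sales"},
    {"EmpID": 2, "EName": "Meera", "Dept": "HR"},
    {"EmpID": 3, "EName": "ArunKumar", "Dept": "IT"},
]

leaf = Node("EMPLOYEE", table=EMPLOYEE)
select = Node('σ EName="ArunKumar"', child=leaf,
              op=lambda t: [r for r in t if r["EName"] == "ArunKumar"])
# Projection without duplicate elimination (EmpID values are unique here).
root = Node("π EmpID", child=select,
            op=lambda t: [{"EmpID": r["EmpID"]} for r in t])

result = evaluate(root)
print(result)  # [{'EmpID': 1}, {'EmpID': 3}]
```

Evaluation starts at the root but recurses down to the leaf first, so the selection runs before the projection, matching the order described in Step 1.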
Step 2 − Query Plan Generation
Once the query tree has been generated, a query plan is prepared. A query plan is an extended query tree that attaches an access path to every operation in the tree. An access path specifies how the relational operation is to be performed. For instance, the access path for a selection operation may state that a B+ tree index is used to locate the matching tuples.
A query plan also specifies how intermediate tables are passed from one operator to the next, how temporary tables are used, and how operations are combined or pipelined.
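As a rough sketch of what a query plan records, the Python below annotates the two operations of Example 1 with an access path and a note on how the intermediate result is passed on. The `PlanStep` class, the `choose_selection_path` rule, and the idea of passing in a set of indexed attributes are assumptions made for illustration. A real optimizer would choose access paths using cost estimates.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class PlanStep:
    operation: str     # relational algebra operation from the query tree
    access_path: str   # how the operation is carried out
    output: str        # how the intermediate result is passed on


def choose_selection_path(attribute: str, indexed: Set[str]) -> str:
    """Pick an access path for a selection (hypothetical rule of thumb)."""
    if attribute in indexed:
        return f"B+ tree index lookup on {attribute}"
    return "full table scan"


def build_plan(indexed: Set[str]) -> List[PlanStep]:
    """Annotate the Example 1 operations, listed bottom-up, with access paths."""
    return [
        PlanStep('σ EName="ArunKumar"(EMPLOYEE)',
                 choose_selection_path("EName", indexed),
                 "pipelined to the projection (no temporary table)"),
        PlanStep("π EmpID",
                 "applied to each incoming tuple",
                 "returned as the result table"),
    ]


plan_with_index = build_plan({"EName"})
plan_without_index = build_plan(set())

for step in plan_with_index:
    print(f"{step.operation}: {step.access_path}; {step.output}")
```

The two plans describe the same query tree. They differ only in the access path chosen for the selection, which is the kind of alternative the optimizer compares.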
Step 3 − Code Generation
Code generation is the third and final step of query optimization. The generated code is the executable form of the query, and its form depends on the type of the underlying operating system. Once the query code has been generated, the Execution Manager runs it and produces the results.
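To make the idea concrete, the sketch below turns a small plan for Example 1 into an executable Python function and hands it to a stand-in for the Execution Manager. Everything here is an assumption for illustration: the plan format (a list of dictionaries), the `generate_code` and `execution_manager` functions, and the sample rows. A real system would emit code in a form suited to its platform rather than Python closures.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Stage = Callable[[Iterable[Row]], Iterator[Row]]


def make_select(attr: str, value: object) -> Stage:
    """Build a stage that keeps only rows where row[attr] == value."""
    def run(rows: Iterable[Row]) -> Iterator[Row]:
        return (r for r in rows if r[attr] == value)
    return run


def make_project(attrs: List[str]) -> Stage:
    """Build a stage that keeps only the listed attributes of each row."""
    def run(rows: Iterable[Row]) -> Iterator[Row]:
        return ({a: r[a] for a in attrs} for r in rows)
    return run


def generate_code(plan: List[dict]) -> Stage:
    """Translate plan steps (hypothetical format) into one executable function."""
    stages: List[Stage] = []
    for step in plan:
        if step["op"] == "select":
            stages.append(make_select(step["attr"], step["value"]))
        elif step["op"] == "project":
            stages.append(make_project(step["attrs"]))
        else:
            raise ValueError(f"unsupported operation: {step['op']}")

    def executable(rows: Iterable[Row]) -> Iterator[Row]:
        for stage in stages:      # chain the stages in plan order
            rows = stage(rows)
        return iter(rows)
    return executable


def execution_manager(executable: Stage, table: List[Row]) -> List[Row]:
    """Stand-in for the Execution Manager: run the code and collect the results."""
    return list(executable(table))


# Hypothetical sample data and plan for Example 1.
EMPLOYEE = [
    {"EmpID": 1, "EName": "ArunKumar"},
    {"EmpID": 2, "EName": "Meera"},
    {"EmpID": 3, "EName": "ArunKumar"},
]
plan = [
    {"op": "select", "attr": "EName", "value": "ArunKumar"},
    {"op": "project", "attrs": ["EmpID"]},
]

query_code = generate_code(plan)
results = execution_manager(query_code, EMPLOYEE)
print(results)  # [{'EmpID': 1}, {'EmpID': 3}]
```

Separating `generate_code` from `execution_manager` mirrors the split in the text: the code is generated once, and the Execution Manager then runs it to produce the results.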