Reference no: EM13336589
Design and tuning considerations
Question 1
Consider the following BCNF relational schema for a portion of a company database (type information is not relevant to this question and is omitted):
Project (pno, proj_name, proj_base_dept, proj_mgr, topic, budget)
Manager (mid, mgr_name, mgr_dept, salary, age, sex)
Note that each project is based in some department, each manager is employed in some department, and the manager of a project need not be employed in the department in which the project is based. Suppose you know that the following five queries are the most common in the workload for this company, and that all five are roughly equal in frequency and importance:
• List the names, ages, and salaries of managers of a user-specified sex (male or female) working in a given department. You can assume that, while there are many departments, each department contains very few project managers.
• List the names of all projects with managers whose ages are in a user-specified range (e.g., younger than 30).
• If a department has a manager who manages a project based in that department, list the department name as output. (Exclude from the output any department whose managers manage only projects based in other departments, or manage no projects at all.)
• List the name of the project with the lowest budget.
• For a given project, list the names of all managers in the department in which the project is based. Note: a department may have more than one manager, and each of them may manage projects that are or are not based in that department, as described in the question above.
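To make the workload concrete, the five queries could be written in SQL roughly as follows. This is only a sketch: it uses Python's sqlite3 module as a stand-in for the unspecified DBMS, and the column types, the 'M'/'F' encoding of sex, the sample rows, and the run helper are all assumptions made for illustration. No indexes are created, since choosing them is the point of the question.

```python
import sqlite3

# In-memory database; column types and sample rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manager (mid INTEGER PRIMARY KEY, mgr_name TEXT, mgr_dept TEXT,
                      salary REAL, age INTEGER, sex TEXT);
CREATE TABLE Project (pno INTEGER PRIMARY KEY, proj_name TEXT, proj_base_dept TEXT,
                      proj_mgr INTEGER REFERENCES Manager(mid), topic TEXT, budget REAL);
INSERT INTO Manager VALUES
  (1, 'Ann', 'Sales', 90000, 28, 'F'),
  (2, 'Bob', 'Sales', 85000, 45, 'M'),
  (3, 'Cy',  'R&D',   95000, 39, 'M');
INSERT INTO Project VALUES
  (10, 'Atlas',  'Sales', 1, 'crm', 50000),
  (11, 'Borea',  'R&D',   2, 'ml',  20000),
  (12, 'Cirrus', 'Sales', 3, 'web', 70000);
""")

def run(sql, params=()):
    """Hypothetical helper: execute a parameterized query and return all rows."""
    return conn.execute(sql, params).fetchall()

# Query 1: managers of a given sex working in a given department.
q1 = run("SELECT mgr_name, age, salary FROM Manager "
         "WHERE sex = ? AND mgr_dept = ?", ("M", "Sales"))

# Query 2: projects whose managers fall in a user-specified age range.
q2 = run("SELECT P.proj_name FROM Project P JOIN Manager M ON P.proj_mgr = M.mid "
         "WHERE M.age >= ? AND M.age <= ?", (0, 29))

# Query 3: departments with a manager who manages a project based in that department.
q3 = run("SELECT DISTINCT M.mgr_dept FROM Manager M JOIN Project P "
         "ON P.proj_mgr = M.mid WHERE P.proj_base_dept = M.mgr_dept")

# Query 4: name of the project with the lowest budget.
q4 = run("SELECT proj_name FROM Project "
         "WHERE budget = (SELECT MIN(budget) FROM Project)")

# Query 5: all managers employed in the department where a given project is based.
q5 = run("SELECT M.mgr_name FROM Manager M JOIN Project P "
         "ON M.mgr_dept = P.proj_base_dept WHERE P.pno = ? "
         "ORDER BY M.mgr_name", (10,))

for label, rows in [("q1", q1), ("q2", q2), ("q3", q3), ("q4", q4), ("q5", q5)]:
    print(label, rows)
```

Writing the queries out this way shows which attributes appear in equality selections, range selections, and join conditions, which is the information the index choices depend on.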
These queries occur much more frequently than updates, so you should build whatever indexes you need to speed up these queries. However, you should not build any unnecessary indexes, as updates will occur (and would be slowed down by unnecessary indexes). Given this information, design a physical schema for the company database that will give good performance for the expected workload. In particular, decide which attributes should be indexed and whether each index should be a clustered or an unclustered index. Assume that both B+ trees and hashed indexes are supported by the DBMS, and that both single- and multiple-attribute index keys are permitted.
1. Specify your physical design by identifying the attributes you recommend indexing on, indicating whether each index should be clustered or unclustered and whether it should be a B+ tree or a hashed index.
2. Assume that this workload is to be tuned with an automatic index-tuning wizard. Outline the main steps in the algorithm and the set of candidate configurations considered.
3. Redesign the physical schema assuming the set of important queries is changed to be the following:
• Find the total of the budgets for projects managed by each manager; that is, list proj_mgr and the total of the budgets of projects managed by that manager, for all values of proj_mgr.
• Find the total of the budgets for projects managed by each manager but only for managers who are in a user-specified age range.
• Find the number of male managers.
• Find the average age of managers.
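The revised workload consists entirely of aggregate queries, which could be expressed along the following lines. As before, this is a sketch under assumptions: sqlite3 stands in for the unspecified DBMS, and the column types, the 'M' encoding for male, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database; column types and sample rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manager (mid INTEGER PRIMARY KEY, mgr_name TEXT, mgr_dept TEXT,
                      salary REAL, age INTEGER, sex TEXT);
CREATE TABLE Project (pno INTEGER PRIMARY KEY, proj_name TEXT, proj_base_dept TEXT,
                      proj_mgr INTEGER REFERENCES Manager(mid), topic TEXT, budget REAL);
INSERT INTO Manager VALUES
  (1, 'Ann', 'Sales', 90000, 28, 'F'),
  (2, 'Bob', 'Sales', 85000, 45, 'M'),
  (3, 'Cy',  'R&D',   95000, 38, 'M');
INSERT INTO Project VALUES
  (10, 'Atlas',  'Sales', 1, 'crm', 50000),
  (11, 'Borea',  'R&D',   2, 'ml',  20000),
  (12, 'Cirrus', 'Sales', 1, 'web', 70000);
""")

# Total budget per manager, for every proj_mgr value that appears in Project.
totals = conn.execute(
    "SELECT proj_mgr, SUM(budget) FROM Project "
    "GROUP BY proj_mgr ORDER BY proj_mgr").fetchall()

# Same totals, restricted to managers in a user-specified age range.
totals_in_range = conn.execute(
    "SELECT P.proj_mgr, SUM(P.budget) FROM Project P "
    "JOIN Manager M ON P.proj_mgr = M.mid "
    "WHERE M.age BETWEEN ? AND ? "
    "GROUP BY P.proj_mgr ORDER BY P.proj_mgr", (40, 50)).fetchall()

# Number of male managers.
male_count = conn.execute(
    "SELECT COUNT(*) FROM Manager WHERE sex = 'M'").fetchone()[0]

# Average age of managers.
avg_age = conn.execute("SELECT AVG(age) FROM Manager").fetchone()[0]

print(totals, totals_in_range, male_count, avg_age)
```

Each of these queries reads only a small number of columns, which is worth keeping in mind when deciding whether an index could answer a query without touching the underlying rows.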
Question 2
For each of the following queries, identify one possible reason why an optimizer might not find a good plan. Rewrite the query so that a good plan is likely to be found. Any available indexes or known constraints are listed before each query; assume that the relation schemas are consistent with the attributes referred to in the query.
Employee (eno, ename, dno, age, sex, sal)
Project (pno, pname, dno, proj_mgr, topic, budget)
Department (dno, dname, mgr_name, address)
1. An index is available on the age attribute:
SELECT E.dno
FROM Employee E
WHERE E.age = 20 OR E.age = 10
2. A B+ tree index is available on the age attribute:
SELECT E.dno
FROM Employee E
WHERE E.age < 20 AND E.age > 10
3. An index is available on the age attribute:
SELECT E.dno
FROM Employee E
WHERE 2 * E.age < 20
4. No index is available:
SELECT DISTINCT *
FROM Employee E
5. No index is available:
SELECT AVG(E.sal)
FROM Employee E
GROUP BY E.dno
HAVING E.dno = 22
6. The dno in Employee is a foreign key that refers to Department:
SELECT D.dno
FROM Department D, Employee E
WHERE D.dno = E.dno
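One practical way to explore these questions is to ask a real optimizer which plan it chooses for a query and for a candidate rewrite. The sketch below does this with SQLite's EXPLAIN QUERY PLAN statement via Python's sqlite3 module. SQLite is only a stand-in here, and its optimizer may behave differently from the DBMS the question has in mind; the column types, the index name emp_age, and the plan helper are assumptions made for illustration.

```python
import sqlite3

# Empty Employee table with an index on age; types and index name are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (eno INTEGER PRIMARY KEY, ename TEXT, dno INTEGER,
                       age INTEGER, sex TEXT, sal REAL);
CREATE INDEX emp_age ON Employee(age);
""")

def plan(sql):
    """Hypothetical helper: return SQLite's plan description lines for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each plan row.
    return [row[-1] for row in rows]

# Query 3 from the list above, exactly as given.
original = "SELECT E.dno FROM Employee E WHERE 2 * E.age < 20"

for step in plan(original):
    print(step)
```

Calling plan on a rewritten version of the same query and comparing the two descriptions shows whether the rewrite changes the access path the optimizer selects.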