Knowledge of key concepts of data warehousing

Assignment Help PL-SQL Programming
Reference no: EM132755320

KL7011 Advanced Databases - Northumbria University

LO1: Demonstrate deep knowledge of key concepts of data warehousing, data analytics, data standards, and data quality

Appraise, analyse, design, develop and evaluate data warehousing and data analytics solutions using Oracle database system

Develop critical awareness of the responsibilities of database developer with respect to professional, legal, security and ethical issues individually or as part of a team

Assessment Background and Scenario

This assessment is based on two scenarios as follows:

- The Sales History (SH) Data Warehouse scenario. SH is a sample database schema provided by Oracle, which has been extensively used in the Oracle's Data Warehousing Guide (Lane, 2013). The details of this scenario are provided in Appendix 1.

- UniTel's Customer Churn dataset described in Appendix 2.

Assignment Questions

Part 1: Data Warehousing Tasks

This part is based on the Sales History scenario as described in Appendix 1.

You must submit all the SQL queries and any other code that you wrote in answering any of the tasks / questions (e.g., the use of Explain Plan statements for the queries and their outputs using Spooling or other suitable means).

- Study the index definitions in sh_idx.sql. Discuss in detail (using cost-based analysis) why these indexes (choose three different ones) are useful for answering queries over the SH2 and DWU versions of the database. You should not run the sh_idx.sql script at all.

- Identify three new indexes and justify why they could be useful. Write the SQL code for creating these indexes under your DWU account. Give example queries with cost-based analysis for both DWU account (which will have the new indexes) and SH2 shared schema (which will NOT have any of your new indexes). Alternatively, you may choose to run the same queries on your DWU account before and after creating your proposed three indexes.

- Given the two materialized views (MVs) defined in sh_cremv.sql, discuss in detail why these MVs are useful for users of the SH database. You should provide detailed examples of cost based analysis, e.g., using Explain Plan for running sample queries on both SH2 and DWU to illustrate your answer. You should not run the sh_cremv.sql script at all.

- Identify three new MVs based on the base tables in the SH schema under your DWU account and justify why they would be useful for the users of your data warehouse. Write the SQL code for creating these MVs. Moreover, run sample queries on both SH2 and DWU to ensure that queries running on DWU will be re-written by Oracle to use your proposed three MVs instead of the base tables used in the sample queries. Note that you must not query your MVs directly in the FROM clause; let the Oracle Query Optimizer re-write the queries and answer them using your proposed MVs.

- Prior to the introduction of the special aggregation function CUBE, there was no possibility to express an aggregation over different levels within a single SQL statement without using the set operation UNION ALL. Every different aggregation level needed its own SQL aggregation expression, operating on the exact same data set n times, once for each of the n different aggregation levels. With the introduction of CUBE in the recent database systems, Oracle provided a single SQL command for handling the aggregation over different levels within a single SQL statement, not only improving the runtime of this operation but also reducing the number of internal operations necessary to run the query and reducing the workload on the system.

- Using CUBE, write an SQL query over the SH schema under your DWU account involving one fact table (SALES or COSTS) and at least two dimension tables and at least 3 grouping attributes. Provide output of successful execution of your query. Provide reasons why your query may be useful for users of the SH data warehouse.

- Using set operation UNION ALL (and not CUBE), write an SQL query that produces the same result as the query in (a) above. Provide output of successful execution of your query.

- Using EXPLAIN PLAN, provide a detailed discussion analysing costs of evaluating the above queries (i.e. with and without ROLLUP).

Part 2: Data Mining Tasks

This part is based on the UniTel scenario as described in Appendix 2. Moreover, you must use the DMUn Oracle Data Mining Account (where 1 <= n <= 75, e.g., DMU1, DMU2) allocated to your group.

Jessica is the customers relation manager at UniTel. She wants to know the possibility of potential churn of the company's customers based on previous experience, so she may be able take some actions accordingly to retain their customers.

To help Jessica in doing her analysis, we need to investigate what could be a suitable algorithm for solving her problem. The data from last year are used as the training data and the data of February of this year are taken as the testing data to verify the model accuracy. Data of all the columns are used to set up the model. To meet the requirement, many algorithms can be selected.

Oracle Data Mining (ODM) provides the following algorithms for classification:

- Decision Tree
- Naive Bayes
- Generalized Linear Models (GLM)
- Support Vector Machines (SVM)

You are required to perform the following tasks:

- Using PL/SQL API, SQL-Developer's Data Miner Workflows or RODM (R package for Interfacing ODM), develop at least TWO models based on the above algorithms for the dataset accessible as CUSTOMERCHURN table
- Evaluate capabilities of the models you have developed

- Present and discuss your findings and make recommendations for Jessica

Part 3

Critically evaluate the SH data warehouse and the UniTel's Customer Churn data in relation to theory and best practices of data quality and standards.

The report should be concise and comprehensive and in the region of 900-1000 words. You should use Harvard style of citation and referencing by following the guidelines in Pears and Shields (2008).

Appendix 1

The Sales History (SH) Data Warehouse

Reference no: EM132755320

Questions Cloud

Facilities operations : Lean operations create more value for customers while using fewer resources than traditional operations.
What effect does it have on stockholders equity : A corporation reacquires 60,000 shares of its own $10 par common stock for $3,000,000, recording it at cost. What effect does it have on stockholders' equity
Find what journal entry to record the bond issuance on dec : Find What is the journal entry to record the bond issuance on Dec 31, 20X0? Use 3 decimal point time value of money factors in calculating answers
Discuss the characteristics of the industry and company : Discuss the characteristics of the industry and company (and or product life cycle) that would be the most suitable for each of the budgeting approaches
Knowledge of key concepts of data warehousing : Demonstrate deep knowledge of key concepts of data warehousing, data analytics, data standards, and data quality - Develop critical awareness
What is the effect of the stock dividend : An owner of 2,500 shares of Simmons Company common stock receives a stock dividend of 50 shares. What is the effect of the stock dividend
New manufacturing facility : Analyze the national differences in political economy between these three countries
Make a statement of stockholders equity for the month : Common Stock, $50,000; Additional Paid-In Capital, $2,000,000; Make a statement of stockholders' equity for the month ended June 30
Describe the functional currency for lucky trader ltd : Describe the functional currency for Lucky Trader Ltd. The stock was shipped FOB on the 10th December 2019 when the exchange rate was $1: NAD13.00

Reviews

Write a Review

PL-SQL Programming Questions & Answers

  Create a database model

Create a database model and Submit the table creation statements for the Database Model.

  Write pl-sql procedures and functions

Write PL/SQL procedures and functions to populate and query that database

  Sql questions

Write a query to display using the employees table the EMPLOYEE_ID, FIRST_NAME, LAST_NAME and HIRE_DATE of every employee who was hired after to 1 January, 1995.

  Run the lab_03_01.sql script

Run the lab_03_01.sql script in the attached file to create the SAL_HISTORY table. Display the structure of the SAL_HISTORY table.

  Write sql queries

Write a query to display the last name, department number, and salary of any employee whose department number and salary both match the department number and salary of any employee who earns a commission.

  Explaining sql insert statement to insert new row in cds

Write down a SQL insert statement to insert new row in "CDS" table.

  Write down name of actors in ascending order

Write down actors (or actress, your choice, but not both) who have won at least two (2) Academy Awards for best actor/actress. Provide the actor name, movie title & year. Order the result by actor name."

  What is an sql injection attack

What is an SQL injection attack? Explain how it works, and what precautions must be taken to prevent SQL injection attacks.What are two advantages of encrypting data stored in the database?

  Determine resonant frequency in series rlc resonant circuit

Given the series RLC resonant circuit in the figure, operating at variable frequency, determine: The resonant frequency ω o ,  The circuit’s quality factor Q , The cut-off frequencies, f 1  & f 2  and the bandwidth BW

  Query that uses cube operator to return lineitemsum

Write summary query which uses CUBE operator to return LineItemSum (which is the sum of InvoiceLineItemAmount) group by Account(an alias for AccountDesciption).

  Query to show customers were missing for existing orders

As DBA, your manager called a meeting and asked why there are so many orders for customers that don't exist in the customer table. Write query which would shows which customers were missing for existing orders. Use a join or a subquery.

  Sql query into a relational algebra statement

Turn this SQL query into a relational algebra statement? SELECT Request.reqfor, Ordering.invamt, Ordering.invnbr, Ordering.invdat

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd