Reference no: EM132836351
Purpose of Assignment
• Understand how to analyse OLTP databases
• Understand how to design an OLAP Database Schema
• Develop SQL script using appropriate techniques to perform ETL process
Specifications and Instructions
In year 2012, Models Sales Company had MySQL database management systems (OLTP database,salemodels) to store their transactional data related to their customers, products, and sales transactions. Later they bought company that run a same business in Pacific region. That company has MSSQL databasesaleAU_NZ.
Now, the Sales department manager wants to integrate sales data stored in the twodatabases into a single database to help them have a central access to all data and perform data analysis. Therefore, they have decided to implement a data warehouse for thispurpose. Assume that you have been employed as a database analyst for the company and you are assigned to carry out the following tasks to design and populate a data warehouse for the company.
Please note that saleAU_NZ was implemented on MSSQL management studio and salemodels was implemented using MySQL
Fortunately, the same schema is designedfor both salemodels and classicmodels, as shown in Figure 1y, only slight different is in the tables name. Please check.
The Profitability(P) of a product can be defined as follows:
P = [Order Details].PriceEach-Products.buyPrice
If P>0, the product makes profit; If P<0, the product makes loss.
Important Totals for Fact table:
[TotalPrice]=(od.QuantityOrdered* od.PriceEach)
[TotalProfit]= (od.QuantityOrdered * (od.PriceEach - p.buyPrice))
[TotalPossibleProfit] =(od.QuantityOrdered * (p.MSRP - p.buyPrice))
(p- Product table, o- OrderTable, od -Order Detail Table)
The manufacturer's suggested retail price (MSRP) is the price that a product's manufacturer recommends it be sold for at point of sale.
Please note that same product sold price is different in different order details, as different discounts for the same product was run in the different time frames.
Please be mindful that total sales of a product is equal to the total Quantity from all of the orders. The total number of products sold/bought refers to how many different ProductIDs are sold/bought.
Figure 1: OLTP Schema for sale models (MySQL)
Tasks T1 OLAP Design
The sales department has requested you to design a data warehouse so that reports can be generated quickly. A list of sales reports that are required by the department are:
1. Monthly sales and profit/loss report for each product line. For each product line, display its name, the number of products in that category, the total number of sales of all the products in that category, and the total profits/loss generated so far. Rank the line based on the total profits/loss generated in descending order.
2. Monthly report on the most profitable employees. For each employee, display its name, city, country,the total number of customers he/shesupported, the total number of payments from his/her customers processed. Rank the employees based on the number of profits generated in descending order.
3. Summary report on the total sales based on City. For each city, display its name, country, the total number of products sold, the total number of product category sold, the total number of customers who live there and the total sales of product sold. Rank the city based on the total sales of products sold.
4. Monthly report on the customers who have bought the most. For each customer, display the customer's name, the total number of products bought, the total number of product categories bought, the total sales of products bought and the total profits/loss generated so far. Rank the customer based on the total profits/loss generated in descending order.
The company is very cautious so they want you to propose a DW design using each of the DW schemas (star, snowflake and fact constellation). Select two out of 3, and for each schema,
1. draw the detailed OLAP ERD design
2. list a sales report, not mentioned above, that you think is useful for the company and that highlights the advantages of using this schema
3. provide details on the tables and attributes that are ignored and explain why
4. write the advantages and disadvantages of your proposed design with respect to meeting the requirements.
T2 Data Warehouse selection
Choose one of the proposed schemas from the previous task providing a clear rationale on why you chose it. You must present a clear discussion on evaluation and selection criteria. This may include problem specific suitability, drawbacks with other schemas for this problem etc.
T3 ETLprocess.
Create your OLAP database based on your chosen schema then extract, transform and load data from both databases to your OLAP database.
1. record differences between two DBs and record whichsettings you need to do to export DB. Record all you changes and corrections for sql file that you make
2. Use MSSQL for both DB.
All data in the OLAP database must meet the following criteria.
i. All data from alltables of the source databases must correctly appear in your OLAP database.
ii. All PKs and FKs must be valid.
You may use "DB_Assign1.zip" from Moodle as a reference script. You may need to rewrite and modify some parts based on your design.
Describe the ETL and evaluation process that you have done to ensure that your proposed design has met all of the requirements above. You may include the relevant results to highlight how your proposed design has met the requirements, e.g. the time it took to generate each report in OLAP vs. OLTP, the total number of data in OLAP vs. OLTP, etc.
Write SQL queries to produce all of the reports that the client wants and the report that you have proposed for Task 1 of the chosen schema.
Attachment:- Assignment Semester.rar