Reference no: EM133182375
DBW624 Introduction to Datawarehousing - Seneca College
ETL Application
Now we are going to focus on the reference tables and get a feel for what is involved with moving and cleansing data.
For each of the sources below, you need to create a script which will take the data from the government site and load it into your reference tables within our data warehouse.
Once in the warehouse, you need to clean the data, where necessary.
You can use the approach of creating a staging table which is the target of the IMPORT from the .csv files below - then - clean the data and move over to the final reference tables.
Here are the three reference tables we will use.
1. Names (male and female) in Ontario
b. There are two separate files to down load. One for males and one for females
c. Just search on "male baby names" and "female baby names" to find the source of data
d. File name is ontariopbabynames_male_1917-2010_english.csv
e. File name is ontariopbabynames_female_1917-2010_english.csv
f. You can merge the two into a single reference table - or - keep them separate.
2. Population Table
b. Look for file: demo05a-eng.csv
c. It is under the summary tables area
3. Average expected life span (male and female)
All of this assignment can be done with SQL, however, you are free to use any programming language you like.
DBW624 Business Analytic Application - Seneca College
Now that we have a complete data warehouse, loaded with data, we are going to build our Business Intelligence (BI) application which will actually answer the business questions below, which were highlighted in Assignment 1.
Here are the business analytics we will want to perform.
1. Sales Volumes Analysis by fiscal quarter
a. By Store
b. By Product
c. By Product Group (Age Group)
2. Sales Revenue Analysis by fiscal quarter
a. By Store,
b. By Product
c. By Product Group (Age Group)
3. Sales Profit Analysis by fiscal quarter
a. By Store,
b. By Product
c. By Product Group (Age Group)
4. Product Line Analysis by fiscal quarter, measured by revenue and profit
a. Which products have been the most / least successful
b. Which product groups have been the most / least successful
c. What is the product trends (growth or declining)?
5. Store Analysis by fiscal quarter, measured by revenue and profit
a. Which stores are the most / least successful
b. What is the growth trends for each store (growth or declining)?
6. Additional Analysis
a. Which names have been most successful by volume
b. Which gender has been most successful by volume
c. Who was the top sales person for the quarter?
d. What percentage of sales are cash versus credit card?
e. What percentage of sales were using a marketing campaign?
7. Analytics Against Reference Tables
a. Which ten cities should we open stores in, based on population?
b. Which names should we expect will be the most popular for our personalized products?
For each of the business questions above, write a SQL query which provides the business answer.
What you need to hand in is a copy of your BI application showing both the SQL used to run the required analytic query and the result set.
Attachment:- ETL Application.rar