Reference no: EM133182391
DBW624 Introduction to Datawarehousing - Seneca College
ETL Application
Now we are going to focus on the reference tables and get a feel for what is involved with moving and cleansing data.
For each of the sources below, you need to create a script which will take the data from the government site and load it into your reference tables within our data warehouse.
Once in the warehouse, you need to clean the data, where necessary.
You can use the approach of creating a staging table which is the target of the IMPORT from the .csv files below - then - clean the data and move over to the final reference tables.
Here are the three reference tables we will use.
1. Names (male and female) in Ontario
b. There are two separate files to down load. One for males and one for females
c. Just search on "male baby names" and "female baby names" to find the source of data
d. File name is ontariopbabynames_male_1917-2010_english.csv
e. File name is ontariopbabynames_female_1917-2010_english.csv
f. You can merge the two into a single reference table - or - keep them separate.
2. Population Table
b. Look for file: demo05a-eng.csv
c. It is under the summary tables area
3. Average expected life span (male and female)
All of this assignment can be done with SQL, however, you are free to use any programming language you like.
Attachment:- ETL Application.rar