Project - Data Warehouse Design
The overall objectives of this project are to build a data warehouse from real-world datasets and to carry out a basic data mining activity, in this case, association rule mining.
Datasets and Problem Domain
Prescribed datasets: the source data to design and populate the data warehouse in this project is based on the Olympic Dataset.
The Olympic Games represent the sole global, multi-disciplinary sports event, celebrated worldwide. Featuring participation from over 200 nations in more than 400 events spanning both the Summer and Winter Games, the Olympics serve as a platform for global competition, inspiration, and unity.
Requirements for CITS5504 and CITS3401
For CITS3401 students, it is required that you identify at least ONE (1) client.
For CITS5504 students, you must identify a minimum of TWO (2) clients. The examples below illustrate potential scenarios:
Clients may wish to query and analyze a common concept, but from different countries; for instance, Client A focuses on the USA and Client B on Australia.
Clients might be interested in querying and analyzing different concepts; for example, Client A could be exploring the relationships between the economy and the Olympic Games, while Client B is interested in understanding the connections between mental/physical health and the Olympic Games.
Both CITS5504 and CITS3401 students must explain why the identified client(s) are important.
Data Warehousing Design and Implementation
Following the four steps of dimensional modelling listed below (i.e. Kimball's four steps), design a data warehouse for the dataset(s).
Identify the process being modelled.
Determine the grain at which facts can be stored.
Choose the dimensions.
Identify the numeric measures for the facts.
To realise the four steps, we can start by drawing and refining a StarNet with the above four steps in mind.
Think about a few business questions that your data warehouse could help answer.
Draw a StarNet to identify the dimensions and concept hierarchies for each dimension.
This should be based on the lowest level of information you have access to.
Use the StarNet footprints to illustrate how the business queries can be answered with your design. Refine the StarNet if the desired queries cannot be answered, for example, by adding more dimensions or concept hierarchies.
Implement a star or snowflake schema using SQL Server Management Studio (SSMS), PostgreSQL, or other software. For the fact table and dimension tables, clearly state which columns are measures and which are dimensions, and indicate the dimension references.
Use Atoti to build a multi-dimensional analysis service solution, with a cube designed to answer your business queries. Make sure the concept hierarchies match your StarNet design.
Use Power BI/Atoti to visualise the data returned from your business queries.
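As a rough illustration of the schema and query steps above, the following minimal sketch builds a tiny star schema in an in-memory SQLite database and runs one roll-up query. It is not the prescribed design: SQLite stands in for SSMS or PostgreSQL, and every table name, column name, and row (dim_country, dim_games, dim_sport, fact_medal, "Country A", and so on) is a made-up placeholder rather than part of the Olympic Dataset.

```python
import sqlite3

# Hypothetical star schema: one fact table at the grain
# "medals won by one country in one event at one Games".
# All names and rows below are illustrative placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_country (
    country_key  INTEGER PRIMARY KEY,
    country_name TEXT,
    continent    TEXT      -- concept hierarchy: country -> continent
);
CREATE TABLE dim_games (
    games_key INTEGER PRIMARY KEY,
    year      INTEGER,
    season    TEXT
);
CREATE TABLE dim_sport (
    sport_key  INTEGER PRIMARY KEY,
    event_name TEXT,
    sport_name TEXT        -- concept hierarchy: event -> sport
);
CREATE TABLE fact_medal (
    country_key INTEGER REFERENCES dim_country(country_key),
    games_key   INTEGER REFERENCES dim_games(games_key),
    sport_key   INTEGER REFERENCES dim_sport(sport_key),
    medal_count INTEGER    -- numeric measure
);
""")

# Toy rows with invented values, only to make the query runnable.
conn.executemany("INSERT INTO dim_country VALUES (?, ?, ?)", [
    (1, "Country A", "Continent X"),
    (2, "Country B", "Continent X"),
    (3, "Country C", "Continent Y"),
])
conn.executemany("INSERT INTO dim_games VALUES (?, ?, ?)", [
    (1, 2016, "Summer"),
    (2, 2018, "Winter"),
])
conn.executemany("INSERT INTO dim_sport VALUES (?, ?, ?)", [
    (1, "Event 1", "Sport P"),
    (2, "Event 2", "Sport Q"),
])
conn.executemany("INSERT INTO fact_medal VALUES (?, ?, ?, ?)", [
    (1, 1, 1, 2),
    (2, 1, 1, 1),
    (3, 1, 2, 4),
    (1, 2, 2, 3),
])

# Roll-up along the country -> continent hierarchy:
# total medals per continent across all Games and sports.
rollup = conn.execute("""
    SELECT c.continent, SUM(f.medal_count)
    FROM fact_medal AS f
    JOIN dim_country AS c ON f.country_key = c.country_key
    GROUP BY c.continent
    ORDER BY c.continent
""").fetchall()
print(rollup)
```

Each business query should map to a join between the fact table and one or more dimension tables like this one; if a query needs a level that no dimension provides, that is a sign the StarNet needs another dimension or hierarchy level.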
Make sure you complete the relevant lab before attempting this task. The lab content may be helpful for you in completing this part.
Your objective is to help a client identify significant patterns in the Olympic Games dataset. Apply association rule mining to this dataset to demonstrate the process, and present the findings to the client.
Explain the top k rules (where k >= 1), ranked by a suitable metric, that have suitable columns on the right-hand side.
Share insights derived from the mining results. If no meaningful rules are discovered, explore potential reasons for this outcome.
Give the client at least THREE (3) suggestions on commerce based on the obtained results.
Answer the following question in your report.
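To make the rule metrics concrete, here is a minimal pure-Python sketch of how support, confidence, and lift could be computed and used to rank rules whose right-hand side is a chosen column. It is an illustration under stated assumptions, not the required method: the transactions are invented toy data, the helper names (support, rule_metrics, top_k_rules) are hypothetical, only single-item rules are considered, and lift is just one possible ranking metric.

```python
# Toy transactions with made-up items. In the project, each transaction
# would be derived from rows of the data warehouse (an assumption).
transactions = [
    {"season=Summer", "sport=P", "medal=Gold"},
    {"season=Summer", "sport=P", "medal=Gold"},
    {"season=Summer", "sport=Q", "medal=Silver"},
    {"season=Winter", "sport=R", "medal=Gold"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(lhs, rhs):
    """Return (support, confidence, lift) for the rule lhs -> rhs."""
    s_both = support(lhs | rhs)
    confidence = s_both / support(lhs)
    lift = confidence / support(rhs)
    return s_both, confidence, lift

def top_k_rules(rhs_prefix, k, min_support=0.5):
    """Rank single-item rules whose right-hand side starts with rhs_prefix."""
    items = sorted(set().union(*transactions))
    rules = []
    for a in items:
        for b in items:
            if a == b or not b.startswith(rhs_prefix):
                continue
            s, c, l = rule_metrics(frozenset([a]), frozenset([b]))
            if s >= min_support:
                rules.append({"lhs": a, "rhs": b, "support": s,
                              "confidence": c, "lift": l})
    # Lift is used as the ranking metric here; confidence would also work.
    rules.sort(key=lambda r: r["lift"], reverse=True)
    return rules[:k]

top = top_k_rules("medal=", k=1)
for rule in top:
    print(rule["lhs"], "->", rule["rhs"],
          f"support={rule['support']:.2f}",
          f"confidence={rule['confidence']:.2f}",
          f"lift={rule['lift']:.2f}")
```

A lift above 1 suggests the left-hand side makes the right-hand side more likely than its baseline rate, while a lift near or below 1 indicates the rule carries little information, which is one possible reason for finding no meaningful rules.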
For example, in a 2023 article, Albert Wong argued that "database cubes were popular in the early days of data warehousing, but they have largely been replaced by other technologies."
Do you agree or disagree with this point?
If you disagree with this point, discuss your reason.
If you agree with this point, discuss your reasons and explain at least one technology that can replace the data cube.
Requirements:
Write a short argumentative essay to answer this question in your submitted PDF file.
Use various forms of evidence, such as data, experience, facts, or literature to support your points.
Word limit: 400 to 500 words.