Reference no: EM133525464, Length: 2,500 words
Data Analytics Project
Overview
A data analytics project starts with collecting the data and ends with communicating the results drawn from it. Several steps must be followed in between, and data preprocessing is one of the most important. Data preprocessing itself consists of multiple steps, depending on the nature, type, and values of the data.
Data visualisation, in turn, uses visual representations such as charts, graphs, and illustrations to explore, make sense of, and communicate data. Many large companies are now moving towards visualisation.
Assessment Details
For this assignment, students are required to register their group with the tutor. You will write a 2,500-word report on a specific case study and explain the use and applications of data preprocessing and data visualisation techniques on a selected data set. Students can choose any suitable data set that is publicly available on the internet.
Students are not required to use any data set to answer Question #3.
In week 12, students are required to submit their report on Moodle. Students are expected to work individually and conduct their own research without collaborating with any other student. The report should be comprehensive and apply the student's knowledge of data preprocessing and visualisation to the given case study.
Students are required to select a data set for classification tasks and answer the following questions:
- What is the purpose of the data set, and what kind of insights can be extracted from the chosen data set?
- Have you applied any data cleaning approaches (e.g., missing value handling, noisy data handling) to the chosen data set? Explain in your own words which data cleaning approaches you performed, or why none were required.
- Have you applied any data transformation techniques (normalisation, attribute creation, discretisation, etc.) to the chosen data set? Explain in your own words which data transformation techniques you performed, or why no transformation was required.
- Have you applied any data reduction techniques (reducing dimensions, reducing volume, balancing data)? If yes, describe the data reduction technique(s) you followed; otherwise, explain why no reduction techniques were required.
- Determine and justify the appropriate data mining task and method for the selected data set.
- Build and evaluate multiple models for the selected data set.
- Design an interactive dashboard using 3-4 charts/graphs/illustrations to represent the data.
Note: Use KNIME software.
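The assignment itself is to be completed in KNIME. Purely as a conceptual illustration, the short Python sketch below shows what three of the preprocessing operations named in the questions above (missing value handling, normalisation, and discretisation) compute. The function names and the `ages` values are hypothetical and do not come from any real data set; mean imputation, min-max scaling, and equal-width binning are only one possible choice for each step.

```python
from statistics import mean


def impute_mean(values):
    """Data cleaning: replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalise(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretise_equal_width(values, n_bins):
    """Data transformation: map each value to a 0-based equal-width bin index."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical attribute with one missing value (illustrative only).
ages = [20, None, 40, 60]

cleaned = impute_mean(ages)
scaled = min_max_normalise(cleaned)
bins = discretise_equal_width(cleaned, 2)

print(cleaned)
print(scaled)
print(bins)
```

In KNIME, the same operations would normally be configured through preprocessing nodes in the workflow rather than written as code; the sketch is only meant to make explicit what each step does to an attribute.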