Reference no: EM133129444
COMP 30044 Mining and Predictive Analytics - Middle East College
Learning Outcome 1: Analyze and apply the concepts of data mining
Learning Outcome 2: Evaluate data preprocessing techniques
Assignment Objective
The objective of this assignment is to enable the student to gain an understanding of the concepts of data mining, the various phases involved in data mining, and the need for data preprocessing.
Assignment Tasks
Task 1. Identify a problem that involves a data mining task. Discuss and analyze the Cross-Industry Standard Process for Data Mining (CRISP-DM) as it applies to the task you identified.
Task 2. Much of the data collected for a data mining task will be raw (not yet preprocessed), incomplete, and noisy. This significantly affects the outcome of the data mining task. Analyze the dataset, identify at least five issues in it, and discuss solutions to these issues.
Task 3. Apply the data preprocessing techniques discussed in Task 2, using any software of your choice, to the dataset collected for the data mining task identified in Task 1. You must include screenshots of every step, each with an explanation.
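The kind of preprocessing that Tasks 2 and 3 ask for can be sketched roughly as follows. This is only an illustration under stated assumptions, not a required method or a model answer: the records, the column names (id, age, city, income), the valid age range, and all function names are hypothetical, and median imputation and min-max scaling are just two of several possible solutions. It uses only the Python standard library and touches five common issues: duplicate records, inconsistent labels, implausible (noisy) values, missing values, and differing scales.

```python
from statistics import median

# Hypothetical raw records; columns and values are invented for illustration.
raw_rows = [
    {"id": 1, "age": 34, "city": "Muscat ", "income": 1200.0},
    {"id": 2, "age": None, "city": "muscat", "income": 1500.0},
    {"id": 2, "age": None, "city": "muscat", "income": 1500.0},  # exact duplicate
    {"id": 3, "age": 29, "city": "SALALAH", "income": None},     # missing income
    {"id": 4, "age": 410, "city": "Sohar", "income": 1350.0},    # implausible age
    {"id": 5, "age": 41, "city": "sohar", "income": 2100.0},
]

def drop_duplicates(rows):
    """Issue 1: keep only the first copy of each fully identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    return unique

def normalize_labels(rows, column):
    """Issue 2: make text labels consistent (trim spaces, unify letter case)."""
    return [{**r, column: r[column].strip().title()} for r in rows]

def blank_out_of_range(rows, column, low, high):
    """Issue 3: treat values outside an assumed valid range as missing."""
    cleaned = []
    for r in rows:
        value = r[column]
        valid = value is None or low <= value <= high
        cleaned.append({**r, column: value if valid else None})
    return cleaned

def impute_median(rows, column):
    """Issue 4: fill missing values with the median of the observed values."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def min_max_scale(rows, column):
    """Issue 5: rescale a numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lowest, highest = min(values), max(values)
    span = (highest - lowest) or 1.0  # avoid division by zero on constant columns
    return [{**r, column: (r[column] - lowest) / span} for r in rows]

def preprocess(rows):
    rows = drop_duplicates(rows)
    rows = normalize_labels(rows, "city")
    rows = blank_out_of_range(rows, "age", 0, 120)  # assumed plausible age range
    rows = impute_median(rows, "age")
    rows = impute_median(rows, "income")
    rows = min_max_scale(rows, "income")
    return rows

clean_rows = preprocess(raw_rows)
for row in clean_rows:
    print(row)
```

The order of the steps matters in this sketch: implausible values are blanked before imputation so that they do not distort the median, and scaling comes last so that it operates on a complete column.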
Note: The problem can be a business problem from any area of research of your choice. Choose it carefully, because the subsequent phases of the Cross-Industry Standard Process must be carried forward to fulfil the requirements of Assignment 2. Seek the module instructor's approval of the problem and its dataset by week 3. You may collect a freely available dataset that fits the chosen problem, or prepare a dataset of your own that is aligned to the problem, for use in Task 3.
• Explain with suitable diagrams wherever required. Diagrams must be drawn using suitable software or by hand in pencil.
• Each student has to do the assignment individually / Students have to do the assignment collaboratively, and each student should write a brief reflection on their contribution and on what they learned from the group work.
• You may refer to books in the E-Library or use internet resources, but you must not cut and paste material from the internet or submit photocopied material from books. Write the assignment answers in your own words after understanding the material from these resources.
Attachment:- Mining and Predictive Analytics.rar