ITEC325 Applied Data Mining and Big Data Assignment

Reference no: EM133163320

ITEC325 Applied Data Mining and Big Data - Australian Catholic University

Assessment Artefact: RapidMiner File

The primary purpose of this assessment is to give students an opportunity to develop data mining and data analysis skills for finding human-interpretable patterns that describe the data.

Context

Heart disease is one of the leading causes of death for people of most races in the world. According to the CDC, about half of all Americans (47%) have at least one of three key risk factors for heart disease: high blood pressure, high cholesterol, and smoking. Other key indicators include diabetic status, obesity (high BMI), insufficient physical activity, and excessive alcohol consumption. Detecting and preventing the factors that have the greatest impact on heart disease is very important in healthcare.

Instructions

Task 0 Download the data set from LEO.

Task 1 Conduct an exploratory data analysis of the data set using RapidMiner to understand the characteristics of each variable and its relationship to the other variables in the data set. Summarise your findings in a table that describes the key characteristics of each variable, such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values, together with relationships to other variables where relevant.

Hint: The Statistics tab and Charts tab in RapidMiner provide a lot of descriptive statistical information and useful charts such as bar charts and scatterplots. You might also like to run some correlations and chi-square tests. Indicate in the Task 1 table which variables contribute the most to determining the risk rating of heart disease.

Briefly discuss the key results of your exploratory data analysis and justify your selection of the top five variables for predicting the risk of heart disease, based on those results and on a review of the relevant literature about assessing the risk of heart disease (About 250 words).
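The summary statistics and chi-square test named in Task 1 can also be checked by hand outside RapidMiner. The following is a minimal Python sketch under stated assumptions, not part of the assessment itself: the column names (`bmi`, `smoker`, `disease`) and all values are hypothetical toy data, not taken from the LEO data set, and `None` is assumed to mark a missing value.

```python
import statistics
from collections import Counter


def summarise(values):
    """Summary statistics for one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),
        "missing": len(values) - len(present),
    }


def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical columns of equal length."""
    n = len(xs)
    observed = Counter(zip(xs, ys))   # cell counts of the contingency table
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


# Hypothetical toy values, NOT taken from the assignment data set.
bmi = [22.5, 31.0, None, 27.4, 31.0, 24.1]
smoker = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
disease = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]

bmi_summary = summarise(bmi)
chi2 = chi_square(smoker, disease)
print(bmi_summary)
print(round(chi2, 3))
```

A larger chi-square statistic suggests a stronger association between a categorical variable and the heart disease label. Judging significance also requires the degrees of freedom and a p-value, which this sketch does not compute.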

Task 2 Build and evaluate two predictive models in RapidMiner for determining the risk rating of heart disease, using two appropriate data mining methods you learned in this unit.

Briefly explain your predictive modelling process, justify your choice of data mining methods, and discuss the results of each predictive model, drawing on the key outputs. This discussion should be based on the contribution of each of the top five variables to the Final Decision Tree Model and on relevant supporting literature (at least 3 credible sources) on the interpretation of the selected data mining models (About 250 words).

Task 3 Discuss and compare the accuracy of the two data mining models (methods). Use a table here to compare the key results of the confusion matrix (About 250 words).
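The comparison in Task 3 rests on a few standard measures derived from each confusion matrix. The following is a minimal Python sketch of those calculations. The model names and all counts are hypothetical placeholders rather than results from the assignment data set, and the positive class is assumed to be "has heart disease".

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive standard measures from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for two models, NOT real results.
models = {
    "Model A": dict(tp=40, fp=10, fn=20, tn=130),
    "Model B": dict(tp=30, fp=5, fn=30, tn=135),
}

results = {name: confusion_metrics(**cells) for name, cells in models.items()}

# Print a small comparison table, one row per model.
print(f"{'Model':<10}{'Accuracy':>10}{'Precision':>11}{'Recall':>8}{'F1':>8}")
for name, m in results.items():
    print(f"{name:<10}{m['accuracy']:>10.3f}{m['precision']:>11.3f}"
          f"{m['recall']:>8.3f}{m['f1']:>8.3f}")
```

In a medical screening context, recall (the share of true heart disease cases that the model detects) often matters more than overall accuracy, so the comparison table is more informative when it reports these measures side by side.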

Note: The important outputs from your data mining analyses in RapidMiner should be included in your Assignment 3 report to support your conclusions for each analysis. Export these outputs from RapidMiner as JPG image files and insert the images in the relevant parts of your Assignment 3 report.

Task 4 Based on relevant supporting literature (at least 3 credible sources), briefly discuss the ethical perspectives in data mining and identify the possible ethical issues in the context of this case study (250 words).

Task 5 Use Zoom to record a short video presentation (4-5 minutes). In your presentation, turn on your webcam (so that your face is included) and share your screen to show your predictive processes and models in RapidMiner. Briefly explain the steps you followed to create the RapidMiner processes, run them, and present the results.

Attachment: Applied Data Mining.rar


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd