ITEC632 Data and Information Visualisation Assignment

Reference no: EM133121757

ITEC632 Data and Information Visualisation - Australian Catholic University

Assessment - Data Mining Project

Artefact - RapidMiner

The primary purpose of this assessment is to provide students with an opportunity to develop data mining and data analysis skills for finding human-interpretable patterns that describe the data.

What are the types of employability skills that I will acquire upon completion of this assessment?

Context

Consider a set of observations on a large number of white wine varieties, covering their chemical properties and their ranking by wine tasters, contained in the white-wines.csv data set. The wine industry has been growing steadily as social drinking of wine is on the rise. The price of a wine largely depends on its appreciation by wine tasters, which may vary considerably. Another key factor in wine certification and quality assessment is physicochemical testing, which is laboratory-based and takes into account factors such as acidity, pH level, presence of sugar and other chemical properties.

For wine producers, it would be of interest if wine tasters' perception of wine quality after tasting could be related to the chemical properties of the wine, so that the certification, quality assessment and quality assurance process for wines becomes more rigorous.

The white-wines.csv data set consists of 4898 records in total, one per white wine variety. All wines are from one wine-producing region. The data set records 12 different properties of each wine. Quality is based on sensory data (wine tasters' perception of the quality of a wine); the remaining properties are chemical, including density, acidity and alcohol content. All chemical properties are coded as continuous numeric variables. Quality is an ordinal variable with a possible ranking from 1 (worst) to 10 (best). Each white wine variety is tasted by three independent tasters, and the final rank assigned is the median of the ranks given by the tasters. See Table 1 White Wines Data Set Data Dictionary for full details of the white-wines.csv data set.
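The ranking rule described above (three tasters, median rank, 1 to 10 scale) can be illustrated with a minimal Python sketch. The function name `final_rank` and the sample ranks are hypothetical and are not taken from the data set or from RapidMiner.

```python
from statistics import median


def final_rank(taster_ranks):
    """Return the median of three tasters' ranks after checking the 1-10 scale."""
    if len(taster_ranks) != 3:
        raise ValueError("expected ranks from exactly three tasters")
    for rank in taster_ranks:
        if not 1 <= rank <= 10:
            raise ValueError(f"rank {rank} is outside the 1-10 scale")
    return median(taster_ranks)


# Hypothetical ranks from three tasters for one wine.
print(final_rank([5, 7, 6]))  # 6
```

Because the median of three values is always one of those values, the final rank stays on the integer 1 to 10 scale and is not pulled towards a single outlying taster.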

Instructions

Task 1) Exploratory Data Analysis
Conduct an exploratory data analysis (EDA) of the white-wines.csv data set using the RapidMiner Studio data mining tool. Summarise your findings by describing the key characteristics of each variable in the white-wines.csv data set, such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values, together with relationships between variables where relevant. Present the summary in a table named Table 1 Results of Exploratory Data Analysis for the White-Wines.csv Data Set.

Discuss the key results of your exploratory data analysis presented in Table 1 and provide a rationale for your selection of the top five variables for predicting a wine taster's ranking of a white wine, drawing on the results of your EDA and relevant literature (about 250 words).
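The assessment requires RapidMiner Studio, but the statistics requested for the EDA table can be cross-checked outside the tool. The following is a minimal Python sketch using only the standard library. The sample rows, the column names (`alcohol`, `density`, `quality`) and the helper names `summarise` and `pearson` are assumptions made for illustration; they are not read from white-wines.csv.

```python
import statistics

# Made-up sample rows; column names and values are illustrative only
# and are not taken from white-wines.csv. None marks a missing value.
rows = [
    {"alcohol": 8.8, "density": 1.0010, "quality": 6},
    {"alcohol": 9.5, "density": 0.9940, "quality": 6},
    {"alcohol": 10.1, "density": 0.9951, "quality": 6},
    {"alcohol": None, "density": 0.9956, "quality": 5},
    {"alcohol": 12.4, "density": 0.9908, "quality": 7},
    {"alcohol": 11.0, "density": 0.9920, "quality": 7},
]


def summarise(values):
    """Descriptive statistics for one variable, ignoring missing values."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),
        "missing": len(values) - len(present),
    }


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (var_x * var_y) ** 0.5


quality = [row["quality"] for row in rows]
for name in ("alcohol", "density"):
    column = [row[name] for row in rows]
    print(name, summarise(column))
    print("  correlation with quality:", round(pearson(column, quality), 3))
```

Ranking the chemical variables by the absolute value of their correlation with quality is one possible starting point for shortlisting five predictors; it is not the only defensible criterion, and the rationale should still draw on the literature.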

Task 2) Building a predictive Linear Regression model
2.1) Build a Linear Regression model for predicting the quality ranking of a white wine using a RapidMiner data mining process, an appropriate set of data mining operators, and a reduced set of variables from the white-wines.csv data set determined by your exploratory data analysis.
Provide these outputs from RapidMiner:
a) Final Linear Regression Model process (diagram)

b) Summary Table of Results of Final Linear Regression Model for white-wines.csv data set.

2.2) Briefly describe your final Linear Regression Model process and discuss the results of the final Linear Regression Model for the white-wines.csv data set, drawing on the key outputs (coefficients, standardized coefficients, t-statistic values, p-values, significance levels, etc.) for predicting wine quality and on relevant supporting literature on the interpretation of a Linear Regression model (about 250 words).
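To make the regression outputs named in Task 2.2 concrete, the sketch below computes them for a single predictor in plain Python. It is an illustration under stated assumptions, not RapidMiner's implementation: it fits one predictor only (the assessment calls for several), the function name `simple_ols` and the sample values are hypothetical, and the p-value uses a normal approximation to the t distribution, which is reasonable only for large samples.

```python
from statistics import NormalDist, mean, stdev


def simple_ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = (sse / (n - 2) / sxx) ** 0.5
    t_stat = slope / std_error

    # Assumption: normal approximation to the t distribution (large n only).
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

    return {
        "coefficient": slope,
        "intercept": intercept,
        # Slope after rescaling x and y to unit standard deviation.
        "std_coefficient": slope * stdev(xs) / stdev(ys),
        "std_error": std_error,
        "t_stat": t_stat,
        "p_value_approx": p_value,
    }


# Made-up values standing in for one chemical property and quality.
alcohol = [8.8, 9.5, 10.1, 10.4, 11.0, 12.4]
quality = [5, 6, 6, 5, 7, 7]
for key, value in simple_ols(alcohol, quality).items():
    print(f"{key}: {value:.4f}")
```

The coefficient gives the expected change in quality per unit change in the predictor, the standardized coefficient allows comparison across predictors measured on different scales, and the t-statistic and p-value indicate whether the coefficient differs detectably from zero.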

Attachment:- Applied Data Mining.rar
