ITEC632 Data and Information Visualisation Assignment

Reference no: EM133121757

ITEC632 Data and Information Visualisation - Australian Catholic University

Assessment - Data Mining Project

Artefact - RapidMiner

The primary purpose of this assessment is to give students an opportunity to develop data mining and data analysis skills for finding human-interpretable patterns that describe the data.

What are the types of employability skills that I will acquire upon completion of this assessment?

Context

Consider a set of observations on a large number of white wine varieties, covering their chemical properties and their ranking by wine tasters, contained in the white-wines.csv data set. The wine industry has been growing steadily as social drinking of wine is on the rise. The price of a wine largely depends on its appreciation by wine tasters, which may vary considerably. Another key factor in wine certification and quality assessment is physicochemical testing, which is laboratory-based and takes into account factors such as acidity, pH level, sugar content and other chemical properties.

For wine producers, it would be of interest to know whether wine tasters' perception of wine quality after tasting can be related to the chemical properties of the wine, so that the certification, quality assessment and assurance processes for wines become more rigorous.

The white-wines.csv data set consists of 4898 records, one for each white wine variety. All wines are from one wine-producing region. The data set records 12 different properties of each wine. Quality is based on sensory data (wine tasters' perception of the quality of a wine); the rest are chemical properties of the wine, including density, acidity and alcohol content. All chemical properties are coded as continuous numeric variables. Quality is an ordinal variable with a possible ranking from 1 (worst) to 10 (best). Each white wine variety is tasted by three independent tasters, and the final rank assigned is the median of the ranks given by the tasters. See Table 1 White Wines Data Set Data Dictionary for full details of the white-wines.csv data set.
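The median rule for the final quality rank can be illustrated with a short Python sketch. The function name final_rank and the example ranks are hypothetical and are not part of the data set, which stores only the final rank.

```python
from statistics import median


def final_rank(taster_ranks):
    """Return the median of the three tasters' ranks for one wine."""
    if len(taster_ranks) != 3:
        raise ValueError("expected exactly three taster ranks")
    if any(not 1 <= rank <= 10 for rank in taster_ranks):
        raise ValueError("ranks must lie between 1 (worst) and 10 (best)")
    return median(taster_ranks)


# Hypothetical ranks given by three tasters to a single wine
print(final_rank([5, 7, 6]))
```

Using the median rather than the mean keeps the final rank on the original 1 to 10 scale and limits the influence of a single outlying taster.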

Instructions

Task 1) Exploratory Data Analysis
Conduct an exploratory data analysis (EDA) of the white-wines.csv data set using the RapidMiner Studio data mining tool. Summarise the findings of your exploratory data analysis by describing the key characteristics of each variable in the white-wines.csv data set, such as maximum and minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values, and, where relevant, its relationships with other variables. Present the summary in a table named Table 1 Results of Exploratory Data Analysis for the White-Wines.csv Data Set.
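The assessment requires RapidMiner Studio, so the following Python sketch is only a way to see what the per-variable statistics mean, not a substitute for the RapidMiner process. It assumes a small inline sample with hypothetical column names and values; the real file and its headers may differ. Blank cells are counted as missing and non-numeric cells as invalid, which is one possible convention.

```python
import csv
import io
from statistics import mean, multimode, stdev

# Hypothetical sample rows; not taken from white-wines.csv.
SAMPLE = """alcohol,pH,quality
8.8,3.00,6
9.5,3.30,6
10.1,,6
9.9,3.19,5
abc,3.26,7
"""


def summarise(text):
    """Return per-column statistics for CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for column in rows[0].keys():
        values, missing, invalid = [], 0, 0
        for row in rows:
            cell = (row[column] or "").strip()
            if cell == "":
                missing += 1  # blank cell treated as a missing value
                continue
            try:
                values.append(float(cell))
            except ValueError:
                invalid += 1  # non-numeric cell treated as invalid
        summary[column] = {
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "stdev": stdev(values),
            "mode": multimode(values),
            "missing": missing,
            "invalid": invalid,
        }
    return summary


for name, stats in summarise(SAMPLE).items():
    print(name, stats)
```

Each dictionary returned by summarise corresponds to one row of the results table requested above.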

Discuss the key results of your exploratory data analysis presented in Table 1 and provide a rationale for why you have selected your top five variables for predicting a wine taster's ranking of a white wine, drawing on the results of your EDA and relevant literature (about 250 words).
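One possible way to support the choice of top variables is to rank each chemical property by the strength of its correlation with quality. The brief does not prescribe this criterion, so the Python sketch below is an assumption, and the column names and values are hypothetical. Pearson correlation treats the ordinal quality rank as numeric; a rank-based measure may suit it better.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_predictors(columns, target, top=5):
    """Return the `top` columns with the largest absolute correlation."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    ordered = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return ordered[:top]


# Hypothetical values for five wines; not taken from white-wines.csv.
quality = [5, 6, 6, 7, 8]
columns = {
    "alcohol": [9.0, 9.8, 10.4, 11.2, 12.5],
    "density": [0.998, 0.996, 0.995, 0.993, 0.991],
    "pH": [3.2, 3.0, 3.3, 3.1, 3.2],
}

for name, r in rank_predictors(columns, quality, top=2):
    print(f"{name}: r = {r:.3f}")
```

Sorting by absolute value keeps strongly negative relationships in the ranking alongside strongly positive ones.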

Task 2) Building a predictive Linear Regression model
2.1) Build a Linear Regression model for predicting the quality ranking of a white wine. Use a RapidMiner data mining process with an appropriate set of data mining operators and the reduced set of variables from the white-wines.csv data set determined by your exploratory data analysis.
Provide these outputs from RapidMiner:
a) Final Linear Regression Model process (diagram)

b) Summary Table of Results of Final Linear Regression Model for white-wines.csv data set.

2.2) Briefly describe your final Linear Regression Model process, and discuss the results of the final Linear Regression Model for the white-wines.csv data set, drawing on the key outputs (coefficients, standardized coefficients, t-statistics, p-values, significance levels etc.) for predicting wine quality and on relevant supporting literature on the interpretation of a Linear Regression Model (about 250 words).
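To make the regression outputs listed above concrete, here is a minimal Python sketch for a single predictor; the assessed model uses several variables and must be built in RapidMiner. The data values are hypothetical. The p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable for a sample as large as 4898 records but not for the tiny example shown here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope
    # Two-sided p-value, normal approximation (assumption, see above)
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {
        "coefficient": slope,
        "intercept": intercept,
        # Slope rescaled to standard-deviation units of x and y
        "standardized": slope * stdev(xs) / stdev(ys),
        "t_stat": t_stat,
        "p_approx": p_approx,
    }


# Hypothetical values for five wines; not taken from white-wines.csv.
alcohol = [9.0, 9.8, 10.4, 11.2, 12.5]
quality = [5, 6, 6, 7, 8]

for key, value in simple_ols(alcohol, quality).items():
    print(f"{key}: {value:.4f}")
```

The coefficient gives the expected change in quality per unit of the predictor, the standardized coefficient allows comparison across variables measured on different scales, and the t-statistic with its p-value indicates whether the coefficient differs reliably from zero.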

Attachment:- Applied Data Mining.rar


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd