Assignment - Using Aggregation Functions for Data Analysis

Reference no: EM133142404

The provided zip file contains the data file [RedWine.txt] and the R code [AggWaFit718.R] for use with the following tasks. Place both in your R working directory. You can use the R script [template.R] to organise your code.

Red wine quality Dataset -

The given dataset, "RedWine.txt", is used to model wine quality based on physicochemical tests. It contains 1,599 red wine samples from the north of Portugal and is a modified version of the data used in the study [1]. The dataset includes five input variables, denoted X1, X2, X3, X4 and X5, and an output variable Y, described as follows:

X1 - citric acid

X2 - chlorides

X3 - total sulfur dioxide

X4 - pH

X5 - alcohol

Y - quality (score between 0 and 10)

[1] P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

Assignment Tasks -

1. Understand the data

(i) Import the txt file (RedWine.txt) and save it to your R working directory.

(ii) Assign the data to a matrix, e.g. using <- as.matrix(read.table("RedWine.txt")), with a matrix name of your choice on the left of <-.

(iii) The variable of interest is quality (Y). To investigate Y, generate a random subset of 440 samples, e.g. using: <-[sample(1:1599,440),c(1:6)]

[The following tasks are based on the 440-sample subset]

(iv) Use scatter plots and histograms to understand the relationship between each of the variables X1, X2, X3, X4, X5 and the variable of interest Y.
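Steps (ii) to (iv) can be sketched as follows. This is a minimal sketch, not a reference solution: so that it runs without RedWine.txt, it builds a simulated stand-in matrix of the same shape (1,599 rows, 6 columns) instead of reading the real file. The names wine, wine_sample and plot_against_y are hypothetical, and the simulated values say nothing about the real wine data.

```r
set.seed(718)  # arbitrary seed so the random subset is reproducible

# Stand-in for as.matrix(read.table("RedWine.txt")): simulated values only,
# NOT the real wine data. Columns follow the X1..X5, Y layout described above.
n <- 1599
wine <- cbind(
  X1 = runif(n, 0, 1),                 # citric acid
  X2 = rlnorm(n, log(0.08), 0.4),      # chlorides
  X3 = rlnorm(n, log(40), 0.6),        # total sulfur dioxide
  X4 = rnorm(n, 3.3, 0.15),            # pH
  X5 = runif(n, 8.5, 14),              # alcohol
  Y  = sample(3:8, n, replace = TRUE)  # quality score
)

# Step (iii): draw 440 rows without replacement, keeping all six columns.
wine_sample <- wine[sample(1:1599, 440), c(1:6)]

# Step (iv): one histogram per column, then a scatter plot of each X against Y.
plot_against_y <- function(d) {
  old_par <- par(mfrow = c(2, 3))
  on.exit(par(old_par))
  for (j in 1:6) hist(d[, j], main = colnames(d)[j], xlab = colnames(d)[j])
  for (j in 1:5) plot(d[, j], d[, 6], xlab = colnames(d)[j], ylab = "Y")
}

# Only draw when running in an interactive session.
if (interactive()) plot_against_y(wine_sample)
```

With the real file in the working directory, the simulated cbind(...) call would be replaced by the read.table call from step (ii).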

2. Transform the data

Choose any four from the five variables (X1, X2, ..., X5). Make appropriate transformations to the chosen four variables and the variable of interest Y individually, so that the values can be aggregated in order to predict the variable of interest. Assign your transformed data along with your transformed variable of interest to an array.
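One possible approach, given here as a sketch under stated assumptions rather than a prescribed method, is to take a log where a histogram shows a long right tail, rescale every chosen variable linearly to [0, 1], and flip any variable that appears to decrease as Y increases. The helper names minmax and negate are hypothetical, and the vectors are made-up illustrations; which variables need which treatment depends on your own sample.

```r
# Linear feature scaling to [0, 1].
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

# Flip a [0, 1] variable so that larger values point the same way as Y.
negate <- function(x) 1 - x

# Tiny made-up vectors for illustration only (not wine data).
x_skewed <- c(1, 10, 100, 1000)  # long right tail
y_raw    <- c(3, 5, 6, 8)        # scores on a 0-10 scale

x_t <- minmax(log(x_skewed))     # log first, then rescale
y_t <- minmax(y_raw)

# Collect the transformed inputs and the transformed output in one array,
# with the output in the last column.
transformed <- cbind(x_t, negate(x_t), y_t)
print(round(transformed, 3))
```

Keeping the output in the last column is a layout assumption made for this sketch; check what layout your fitting code expects.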

3. Build models and investigate the importance of each variable

(i) Copy the AggWaFit718.R file to your working directory and load it into the R workspace using source("AggWaFit718.R")

(ii) Evaluate the following fitting functions on the transformed data:

- A weighted arithmetic mean (WAM)

- Weighted power means (WPM) with P=2

- An ordered weighted averaging function (OWA)
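To make the three function families concrete, here is a minimal base-R sketch of each. The names wam, wpm and owa are hypothetical and are deliberately not the functions defined in AggWaFit718.R; the weights are made up rather than fitted. In the assignment itself, the weights are what the fitting routine learns from the data.

```r
# Weighted arithmetic mean: sum of w_i * x_i, with weights summing to 1.
wam <- function(x, w) sum(w * x)

# Weighted power mean with exponent p (p = 2 is the weighted quadratic mean).
wpm <- function(x, w, p = 2) sum(w * x^p)^(1 / p)

# Ordered weighted average: weights attach to ranks, not to variables.
# Inputs are sorted in increasing order here; some texts sort decreasingly,
# which reverses the meaning of the weight vector.
owa <- function(x, w) sum(w * sort(x))

x <- c(0.2, 0.8, 0.5, 0.1)  # one transformed sample (made-up values)
w <- c(0.4, 0.3, 0.2, 0.1)  # made-up weights, non-negative, summing to 1

wam(x, w)
wpm(x, w, p = 2)
owa(x, w)
```

Because WAM and WPM weights belong to specific variables, they can be read as variable importance. OWA weights belong to positions in the sorted input, so they describe whether the model leans toward low or high values instead.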

4. Use your model for prediction

Using your best-fitting model from Q3, predict the wine quality for the input: X1=1; X2=0.075; X3=41; X4=3.53; X5=9.3.
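A point that is easy to miss: the new input has to pass through the same transformations as the training sample, using the sample's ranges rather than anything computed from the single new point, and the aggregated value then has to be mapped back to the quality scale. The sketch below assumes plain min-max scaling and a weighted arithmetic mean. Every range and weight in it is a PLACEHOLDER to be replaced with values from your own sample and fit, and the choice of X2 to X5 is arbitrary.

```r
# New input from the task statement, restricted to four chosen variables
# (choosing X2..X5 here is an arbitrary illustration).
new_x <- c(X2 = 0.075, X3 = 41, X4 = 3.53, X5 = 9.3)

# PLACEHOLDER ranges and weights: replace with the min/max of your own
# 440-row sample and the weights fitted in Q3.
lo <- c(X2 = 0.01, X3 = 6,   X4 = 2.7, X5 = 8)
hi <- c(X2 = 0.60, X3 = 290, X4 = 4.0, X5 = 15)
w  <- c(0.1, 0.2, 0.2, 0.5)
y_lo <- 3  # placeholder minimum of Y in the sample
y_hi <- 8  # placeholder maximum of Y in the sample

# Scale with the training ranges, clamping to [0, 1] in case the new point
# falls outside what the sample covered.
scaled <- pmin(pmax((new_x - lo) / (hi - lo), 0), 1)

# Aggregate (weighted arithmetic mean here) and undo the scaling of Y.
pred_scaled <- sum(w * scaled)
pred_y <- pred_scaled * (y_hi - y_lo) + y_lo
pred_y
```

If a log or negation was applied to a variable in Q2, the same step must be applied to the new value before aggregating.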

5. Summarise your data analysis procedure in up to 20 slides for a 5-minute presentation. The slides should include the following contents:

- What kinds of data distributions you identified in the raw data.

- Explain the transformations applied to the selected four variables and the variable of interest.

- Include two tables - one with the error measures and correlation coefficients, and one summarising the weights/parameters and any other useful information learned for your data.

- Explain the importance of each of the variables (the four variables that you have selected).

- State which fitting function gives the best-fitting model on your selected data.

- Give your prediction result and comment on whether you think it is reasonable.

- Discuss the best conditions (in terms of your chosen four variables) under which a higher quality wine will occur.

- Comment on the implications and the limitations of the fitting model you used for prediction.
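For the table of error measures and correlation coefficients, the quantities can be computed directly in base R. This is a minimal sketch, assuming the observed and predicted values are available as two numeric vectors on the same scale. The function name fit_summary is hypothetical and the numbers are made up.

```r
# Root mean squared error and average absolute error between observed and
# predicted values, plus Pearson and Spearman correlations.
fit_summary <- function(observed, predicted) {
  c(
    RMSE     = sqrt(mean((observed - predicted)^2)),
    AvAbsErr = mean(abs(observed - predicted)),
    Pearson  = cor(observed, predicted, method = "pearson"),
    Spearman = cor(observed, predicted, method = "spearman")
  )
}

# Made-up values for illustration only.
observed  <- c(0.2, 0.4, 0.6, 0.8)
predicted <- c(0.3, 0.3, 0.7, 0.7)
round(fit_summary(observed, predicted), 3)
```

Lower error values and higher correlations indicate a closer fit. Comparing the same four numbers across WAM, WPM and OWA is one straightforward way to justify the choice of best model.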
