Explain the transformations applied for the variables

Reference no: EM132487891

Assignment - Using aggregation functions for data analysis

The provided zip file contains the data file [Energy20.txt] and the R code [AggWaFit718.R] to use with the following tasks. Include both in your R working directory.

Energy Prediction of Domestic Appliances Dataset

The given dataset, "Energy20.txt", can be used to create models of the energy use of appliances in an energy-efficient house. It provides the energy use of appliances (denoted as Y) for 671 samples and is a modified version of the data used in the study [1]. The dataset includes five input variables, denoted as X1, X2, X3, X4, X5, and the output variable Y, described as follows:

X1: Temperature in kitchen area, in Celsius

X2: Humidity in kitchen area, given as a percentage

X3: Temperature outside (from weather station), in Celsius

X4: Humidity outside (from weather station), given as a percentage

X5: Visibility (from weather station), in km

Y: Energy use of appliances, in Wh

Assignment Tasks -

1. Understand the data

(i) Download the txt file (Energy20.txt) from Future Learn and save it to your R working directory.

(ii) Assign the data to a matrix, e.g. using the.data <- as.matrix(read.table("Energy20.txt"))

(iii) The variable of interest is Energy use of appliances (Y). To investigate Y, generate a random subset of 350 data points, e.g. using: my.data <- the.data[sample(1:671,350),c(1:6)]

(iv) Using scatter plots and histograms, report on the general relationship between each of the variables X1, X2, X3, X4, X5 and the variable of interest Y. Include 5 scatter plots, 6 histograms, and 1 or 2 sentences for each of the variables, including the variable of interest Y.
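Steps (ii) to (iv) might be sketched as follows. This is only an illustration: so that it runs without the real file, it builds a synthetic stand-in for Energy20.txt (all values are made up), and the name var.names is an assumption introduced here, not something the assignment supplies.

```r
# Synthetic stand-in for Energy20.txt: 671 rows, columns X1..X5 and Y.
# The values are invented purely so this sketch runs on its own.
set.seed(1)
n <- 671
the.data <- cbind(
  X1 = rnorm(n, 21, 2),      # kitchen temperature, Celsius
  X2 = runif(n, 30, 55),     # kitchen humidity, %
  X3 = rnorm(n, 7, 5),       # outside temperature, Celsius
  X4 = runif(n, 40, 100),    # outside humidity, %
  X5 = runif(n, 1, 65),      # visibility, km
  Y  = rexp(n, 1 / 90) + 10  # energy use, Wh (right-skewed)
)
# With the real file this would instead be:
# the.data <- as.matrix(read.table("Energy20.txt"))

# Random subset of 350 rows, all six columns.
my.data <- the.data[sample(1:671, 350), c(1:6)]

var.names <- c("X1", "X2", "X3", "X4", "X5", "Y")  # hypothetical labels

# Five scatter plots: each input variable against Y.
par(mfrow = c(2, 3))
for (i in 1:5) {
  plot(my.data[, i], my.data[, 6],
       xlab = var.names[i], ylab = "Y",
       main = paste(var.names[i], "vs Y"))
}

# Six histograms: one per variable, including Y.
par(mfrow = c(2, 3))
for (i in 1:6) {
  hist(my.data[, i], xlab = var.names[i],
       main = paste("Histogram of", var.names[i]))
}
par(mfrow = c(1, 1))
```

Calling set.seed() before sample() makes the 350-row subset reproducible, which helps when the report has to match the saved data.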

2. Transform the data

(i) Choose any four of the five variables (X1, X2, ..., X5). Make appropriate transformations to the chosen four variables and the variable of interest Y so that the values can be aggregated in order to predict the variable of interest. Assign your transformed data, along with your transformed variable of interest, to an array (it should have 350 rows and 5 columns). Save it to a txt file titled "name-transformed.txt" using write.table(your.data,"name-transformed.txt"), where "name" is replaced with your name (you can use your surname or first name).

(ii) Briefly explain the transformations applied for the selected four variables and the variable of interest. (1-2 sentences each).
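One possible way to approach Task 2 is sketched below, assuming min-max scaling to [0, 1], negation of any input that is negatively correlated with Y, and a log transform of a right-skewed Y. These are common choices, not the required ones, and the right transformations depend on what your own plots show. The data are synthetic, and the names minmax, chosen and out.file are assumptions made for the illustration.

```r
# Synthetic stand-in for the 350 x 6 sample (values are invented).
set.seed(2)
n <- 350
my.data <- cbind(
  X1 = rnorm(n, 21, 2), X2 = runif(n, 30, 55), X3 = rnorm(n, 7, 5),
  X4 = runif(n, 40, 100), X5 = runif(n, 1, 65),
  Y  = rexp(n, 1 / 90) + 10
)

# Min-max scaling maps a variable linearly onto [0, 1].
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

chosen <- c(1, 2, 3, 4)  # hypothetical choice: keep X1..X4, drop X5

your.data <- array(0, dim = c(350, 5))
for (j in 1:4) {
  x <- minmax(my.data[, chosen[j]])
  # Negate inputs that fall as Y rises, so every input points the same way.
  if (cor(x, my.data[, 6]) < 0) x <- 1 - x
  your.data[, j] <- x
}

# Y is strictly positive and right-skewed here, so take logs, then scale.
your.data[, 5] <- minmax(log(my.data[, 6]))

# Written to a temporary folder in this sketch; in the assignment the
# file goes to the working directory with your own name in the title.
out.file <- file.path(tempdir(), "name-transformed.txt")
write.table(your.data, out.file)
```

Keep a record of each variable's minimum and maximum before scaling, since the same values are needed later to transform a new input and to map a prediction back to Wh.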

3. Build models and investigate the importance of each variable

(i) Download the AggWaFit718.R file (from Future Learn) to your working directory and load it into the R workspace using source("AggWaFit718.R")

(ii) Use the fitting functions to learn the parameters for

A weighted arithmetic mean (WAM)

Weighted power means (WPM) with p = 0.5, and p = 5,

An ordered weighted averaging function (OWA), and

A Choquet integral.

(iii) Include two tables in your report - one with the error measures and correlation coefficients, and one summarising the weights/parameters and any other useful information learned for your data.

(iv) Compare and interpret the data in your tables. Comment on

a. How good the model is,

b. The importance of each of the variables (the four variables that you have selected),

c. Any interaction between any of those variables (are they complementary or redundant?) and

d. Whether the better models favour higher or lower inputs.

(1-3 paragraphs for part 3(iv))
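To make the quantities in Task 3 concrete, the sketch below writes out what a weighted arithmetic mean, a weighted power mean and an OWA compute, together with the usual error measures. It does not use or reproduce the fitting functions in AggWaFit718.R: the function names (wam, wpm, owa, rmse, av.abs.error), the weights and the data are all assumptions made for illustration, and the Choquet integral is left out for brevity.

```r
# Weighted arithmetic mean: weights are non-negative and sum to 1.
wam <- function(x, w) sum(w * x)

# Weighted power mean with exponent p (p = 0 is the geometric-mean limit).
wpm <- function(x, w, p) {
  if (p == 0) return(prod(x^w))
  sum(w * x^p)^(1 / p)
}

# OWA: weights attach to positions after sorting, not to variables.
# Inputs are sorted in increasing order here (one common convention).
owa <- function(x, w) sum(w * sort(x))

# Error measures for comparing fitted models.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
av.abs.error <- function(obs, pred) mean(abs(obs - pred))

x <- c(0.2, 0.4, 0.6, 0.8)  # one row of transformed inputs
w <- rep(0.25, 4)           # equal weights, purely illustrative

wam(x, w)
wpm(x, w, 0.5)          # pulled towards the lower inputs
wpm(x, w, 5)            # pulled towards the higher inputs
owa(x, c(0, 0, 0, 1))   # all weight on the largest input

# Error table for one model on synthetic transformed data.
set.seed(3)
X <- matrix(runif(350 * 4), ncol = 4)
y <- as.vector(X %*% c(0.4, 0.1, 0.3, 0.2)) + rnorm(350, 0, 0.05)
pred <- apply(X, 1, wam, w = w)

c(RMSE     = rmse(y, pred),
  AvAbs    = av.abs.error(y, pred),
  Pearson  = cor(y, pred),
  Spearman = cor(y, pred, method = "spearman"))
```

For part (iv) d, the power mean inequality is the relevant fact: with the same weights, a power mean with p < 1 never exceeds the arithmetic mean and one with p > 1 never falls below it. A better fit at p = 0.5 than at p = 5 therefore suggests the output tracks the lower inputs more closely, and vice versa.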

4. Use your model for prediction

(i) Choose your best fitting model.

Using your best fitting model, predict the Energy use of appliances for the following input X1=17; X2=39; X3=4; X4=77; X5=32.

(ii) Give your result and comment on whether you think it is reasonable. (1-2 sentences).

(iii) Comment on the best conditions (in terms of your chosen four variables) under which a high Energy use of appliances will occur. (1-2 sentences).
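The prediction step in Task 4 could look roughly like the sketch below, assuming the chosen variables are X1 to X4, that each was min-max scaled, and that the best model turned out to be a weighted arithmetic mean. The ranges and weights are placeholders invented for the example, not results; replace them with the minima and maxima of your own sample and with the weights your fitted model reports.

```r
# Placeholder (min, max) ranges for the chosen inputs and for Y.
# These numbers are invented; use the ranges of your own 350-row sample.
ranges <- rbind(
  X1 = c(14, 26),
  X2 = c(25, 60),
  X3 = c(-5, 26),
  X4 = c(24, 100),
  Y  = c(10, 1080)
)

# Placeholder weights for a weighted arithmetic mean (they sum to 1).
w <- c(0.4, 0.1, 0.3, 0.2)

# The new input from the task; X5 = 32 is unused because X5 was dropped.
new.input <- c(X1 = 17, X2 = 39, X3 = 4, X4 = 77)

# Apply the same scaling that was applied to the training sample.
scaled <- (new.input - ranges[1:4, 1]) / (ranges[1:4, 2] - ranges[1:4, 1])

# Aggregate, then undo the scaling of Y to get a value in Wh.
agg <- sum(w * scaled)
y.hat <- ranges["Y", 1] + agg * (ranges["Y", 2] - ranges["Y", 1])
y.hat
```

Every transformation applied in Task 2 has to be mirrored here. If an input was negated, negate the scaled new value as well; if Y was log-transformed before scaling, the inverse scaling must be followed by exp().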

5. Comparing with a linear regression model

Linear regression is used to predict the value of an outcome variable Y based on one or more input predictor variables X. The equation is Y = β0 + β1X1 + β2X2 + ... + βnXn + ε, where β0 is the intercept, β1, ..., βn are the coefficients and ε is the error term. The built-in function lm() is used to fit linear models in R.

(i) Build your linear model using the same dataset as in Question 3 and describe the summary statistics for your model using the function summary().

(ii) Compare the performance of your linear model with that of your best fitting model from Question 4. Visualise the predicted Y values of both models on the 350 data points and compare them with the true Y values.

(iii) Comment on the differences between the linear model and your best fitting model. (2-4 sentences).
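A minimal sketch of Task 5 follows, using synthetic data in place of the real sample. The data frame df, its coefficients and the stand-in agg.pred (an unweighted mean of scaled inputs, used here only so that there is a second set of predictions to plot) are assumptions for illustration; in the assignment agg.pred would hold the predictions of your best fitting aggregation model.

```r
# Synthetic stand-in for the 350-row sample (values are invented).
set.seed(5)
n <- 350
df <- data.frame(
  X1 = rnorm(n, 21, 2), X2 = runif(n, 30, 55),
  X3 = rnorm(n, 7, 5),  X4 = runif(n, 40, 100)
)
df$Y <- 20 + 4 * df$X1 - 0.5 * df$X2 + 2 * df$X3 + 0.3 * df$X4 +
  rnorm(n, 0, 10)

# Fit Y = b0 + b1*X1 + ... + b4*X4 + error, and inspect the fit.
lin.model <- lm(Y ~ X1 + X2 + X3 + X4, data = df)
print(summary(lin.model))

# Fitted values on the same 350 rows.
lin.pred <- predict(lin.model)

# Stand-in for the aggregation model: equal-weight mean of the
# min-max scaled inputs, mapped back onto the range of Y.
minmax <- function(x) (x - min(x)) / (max(x) - min(x))
scaled <- sapply(df[, 1:4], minmax)
agg.pred <- min(df$Y) + rowMeans(scaled) * (max(df$Y) - min(df$Y))

# Predicted against true Y for both models; the dashed line is y = x.
plot(df$Y, lin.pred, col = "blue", pch = 16,
     ylim = range(c(lin.pred, agg.pred)),
     xlab = "True Y", ylab = "Predicted Y")
points(df$Y, agg.pred, col = "red", pch = 17)
abline(0, 1, lty = 2)
legend("topleft",
       legend = c("Linear model", "Aggregation model (stand-in)"),
       col = c("blue", "red"), pch = c(16, 17))
```

Points close to the dashed line indicate accurate predictions. Applying the same error measures and correlation coefficients to both sets of predictions keeps the comparison consistent with the tables from Question 3.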

Attachment:- Data Analysis Assignment File.rar
