Explain relationship between price and other variables

Reference no: EM131041264

Statistical Models and Methods Linear Models

Please hand in your work by 5.00pm on Thursday 5 May. Work should be submitted via the coursework post boxes in the School. Please remember to complete and attach a coursework submission sheet to your report.

Your report should contain all relevant plots and R output needed to justify your answers/arguments, together with appropriate discussion, but please do not include pages of irrelevant plots/output which are not discussed in the report. The easiest way to include R output in your report is to use R Markdown to produce your report, but you do not have to do so. Your report does not need to contain your R code, though you can include it if you wish. If you are using R Markdown, and do not wish to include your R code, then you can suppress the R code using the echo = FALSE argument, i.e. enclose the code in an {r, echo=FALSE} environment in the Markdown file.

There will be a Moodle forum specifically for answering queries about the coursework, so you may post questions and I will answer them there so that everyone receives the same assistance. Please be careful not to inadvertently give away parts of your answer if you do post a question.

Unauthorised late submission will be penalised by 5% of the full mark per day. Work submitted more than one week late will receive zero marks. You are reminded to familiarise yourself with the guidelines concerning plagiarism in assessed coursework (see the student handbook), and note that this applies equally to computer code as it does to written work.

The Data

Data are available on the recommended prices of used cars in the United States. All cars are the same age, but have done different mileages and have different specifications. You have recently been employed by a used car dealership to build models to describe the dependence of recommended prices on potential explanatory variables, in order to use these models to price your own used cars. The data, which come in two parts, are available on Moodle. They are

TrainData.txt: Training data, which will be used to build models.

TestData.txt: Test data, which will be used to assess predictions from the models built.

They can be read into R (after saving the file in your working directory) using

Train = read.table("TrainData.txt", header = TRUE)

Test = read.table("TestData.txt", header = TRUE)

A description of the variables can be found in the file description.txt.

After reading in the data, you can look at the structure of the data (number of observations, variable types, etc.) using the str() command, e.g. str(Train). For both data sets, you should treat the covariates Cylinder, Doors, Cruise, Sound and Leather as factors (they are treated as integers by default). This can be done using, for example,

Train$Cylinder = factor(Train$Cylinder)
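The same conversion can be applied to all five covariates, in both data sets, in one pass. The following is only a sketch: the small data frames built by make_toy() are made-up stand-ins (not the coursework data) so that the code runs on its own, and make_toy and factor_cols are illustrative names. With the real files, Train and Test would come from the read.table() calls above.

```r
# Made-up stand-in data so the sketch runs without the coursework files.
# With the real data, replace these two objects with the read.table() results.
make_toy <- function(n) {
  data.frame(
    Price    = round(runif(n, 9000, 40000)),
    Mileage  = round(runif(n, 500, 45000)),
    Cylinder = sample(c(4, 6, 8), n, replace = TRUE),
    Doors    = sample(c(2, 4), n, replace = TRUE),
    Cruise   = sample(0:1, n, replace = TRUE),
    Sound    = sample(0:1, n, replace = TRUE),
    Leather  = sample(0:1, n, replace = TRUE)
  )
}
set.seed(1)
Train <- make_toy(40)
Test  <- make_toy(15)

# Columns that are read as integers but should be treated as factors
factor_cols <- c("Cylinder", "Doors", "Cruise", "Sound", "Leather")

# Convert every listed column of the training data to a factor
Train[factor_cols] <- lapply(Train[factor_cols], factor)

# Give the test data the same levels as the training data, so that
# both data sets share one factor coding when predictions are made later
for (v in factor_cols) {
  Test[[v]] <- factor(Test[[v]], levels = levels(Train[[v]]))
}

str(Train)
str(Test)
```

Reusing the training levels is a design choice: a value that occurs only in the test data would become NA rather than silently creating a new level, which makes such cases easy to spot.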

The Task

(a) Using the TRAINING data, investigate models to explain the relationship between Price and the other variables. That is, Price (or transformations of it) is to be the response variable, and all other variables are potential explanatory variables.
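As one possible starting point for part (a), the sketch below fits Price and log(Price) as responses and then tries a reduced model. It is not a model answer: the data are simulated purely so the code runs, the choice of a log transformation and of backward selection by AIC are assumptions made for illustration, and the object names (fit_raw, fit_log, fit_step, fit_small) are hypothetical.

```r
# Simulated stand-in for the training data (NOT the coursework data)
set.seed(2)
n <- 120
Train <- data.frame(
  Mileage  = round(runif(n, 500, 45000)),
  Cylinder = factor(sample(c(4, 6, 8), n, replace = TRUE)),
  Cruise   = factor(sample(0:1, n, replace = TRUE)),
  Leather  = factor(sample(0:1, n, replace = TRUE))
)
Train$Price <- exp(9.5 - 8e-6 * Train$Mileage +
                   0.2 * (Train$Cylinder == "6") +
                   0.5 * (Train$Cylinder == "8") +
                   rnorm(n, sd = 0.1))

# Two candidate responses: Price itself and log(Price).
# The "." on the right-hand side means "all other columns in the data".
fit_raw <- lm(Price ~ ., data = Train)
fit_log <- lm(log(Price) ~ ., data = Train)

# Backward selection by AIC, starting from the full log-price model
fit_step <- step(fit_log, direction = "backward", trace = 0)
formula(fit_step)

# F-test comparing a deliberately reduced model with the full model
fit_small <- update(fit_log, . ~ . - Cruise - Leather)
anova(fit_small, fit_log)

# Adjusted R-squared of each fit. These are on different response scales,
# so they are not directly comparable between fit_raw and fit_log.
c(raw = summary(fit_raw)$adj.r.squared,
  log = summary(fit_log)$adj.r.squared)
```

AIC values are likewise not comparable between models with differently transformed responses, so the choice between Price and log(Price) is better guided by diagnostic plots than by these summaries alone.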

(b) Use your fitted model(s) from (a) to predict the responses for the observations in the TEST data set. That is, for each of the observations in the Test data, use the values of the explanatory variables as input to your model(s) from (a) to obtain fitted/predicted responses for these observations. Compare your predicted responses with the known observed responses from the observations in the Test data, using suitable plots/numerical summaries.
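Part (b) could be approached along the following lines. This is a sketch under stated assumptions: the data are simulated stand-ins, the model is fitted to log(Price) (so predictions are back-transformed with exp()), and RMSE, MAE and interval coverage are just three possible numerical summaries. The names make_toy, pred_int, rmse, mae and coverage are illustrative.

```r
# Simulated stand-ins for the training and test data (NOT the coursework data)
set.seed(3)
make_toy <- function(n) {
  d <- data.frame(
    Mileage  = round(runif(n, 500, 45000)),
    Cylinder = factor(sample(c(4, 6, 8), n, replace = TRUE), levels = c(4, 6, 8)),
    Leather  = factor(sample(0:1, n, replace = TRUE), levels = 0:1)
  )
  d$Price <- exp(9.5 - 8e-6 * d$Mileage +
                 0.25 * (as.integer(d$Cylinder) - 1) +
                 rnorm(n, sd = 0.1))
  d
}
Train <- make_toy(120)
Test  <- make_toy(40)

# Model built on the training data only
fit <- lm(log(Price) ~ Mileage + Cylinder + Leather, data = Train)

# Predictions for the test observations, with 95% prediction intervals,
# on the log scale
pred_log <- predict(fit, newdata = Test, interval = "prediction")

# Back-transform to the price scale
pred_int   <- exp(pred_log)
pred_price <- pred_int[, "fit"]

# Numerical summaries of predictive accuracy
err      <- Test$Price - pred_price
rmse     <- sqrt(mean(err^2))
mae      <- mean(abs(err))
coverage <- mean(Test$Price >= pred_int[, "lwr"] &
                 Test$Price <= pred_int[, "upr"])
c(RMSE = rmse, MAE = mae, coverage = coverage)

# Observed against predicted; points near the dashed line are well predicted
plot(pred_price, Test$Price,
     xlab = "Predicted price", ylab = "Observed price")
abline(0, 1, lty = 2)
```

Note that exp() of a prediction on the log scale estimates the median rather than the mean price, which is worth stating if a log model is used.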

Notes

- As with any analysis, the first step should be to do some exploratory analysis using any relevant plots and numerical summaries.

- For the model fitting, you can/should use any of the techniques we have covered this semester to investigate potential models. The task is deliberately open-ended, as would be the case in real situations working with real data. As this is a realistic situation with real data, there is not necessarily one single correct answer. Your job is to investigate potential models, and provide a summary of what they tell us about the problem we are trying to solve. The important point is that you correctly use the relevant techniques in a logical and principled manner, and provide a concise but insightful summary of your findings and reasoning. (Note however that you do not have to produce a report in a formal "report" format.)

- You should pay attention to whether the model assumptions are being met, for example using suitable diagnostic plots, and consider any transformations of the numerical variables if appropriate. Also consider whether your conclusions depend on a few outlying or influential points.

- You should (briefly and concisely) interpret your model(s) and consider whether they make sense in the context of the problem, for example via interpreting the fitted parameters.

- You do not need to include all your R output, as you will generate lots of output when experimenting with the model fitting. However, you should include the output which is relevant to the arguments that you make when describing the logical developments of your model fitting, and any diagnostic plots which justify changes you make in order to meet the modelling assumptions. Finally, at all stages please remember to explain your reasoning and describe (concisely but accurately) the action you take and why, along with the relevant output.
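The notes on diagnostics, influential points and interpretation can be illustrated with a short sketch. Everything here is an assumption made for illustration: the data are simulated, the model is a log-price model with two covariates, and the cut-off of 4/n for Cook's distance is only a common rule of thumb, not a requirement of the coursework.

```r
# Simulated stand-in for the training data (NOT the coursework data)
set.seed(4)
n <- 100
Train <- data.frame(
  Mileage = round(runif(n, 500, 45000)),
  Leather = factor(sample(0:1, n, replace = TRUE))
)
Train$Price <- exp(9.8 - 8e-6 * Train$Mileage +
                   0.1 * (Train$Leather == "1") +
                   rnorm(n, sd = 0.1))

fit <- lm(log(Price) ~ Mileage + Leather, data = Train)

# Standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Flag potentially influential points using Cook's distance
# (4/n is a rule-of-thumb threshold, chosen here for illustration)
cd      <- cooks.distance(fit)
flagged <- cd > 4 / n
which(flagged)

# Refit without the flagged points and compare the coefficients,
# to see whether the conclusions depend on those points
refit <- lm(formula(fit), data = Train[!flagged, ])
round(cbind(all_data = coef(fit), without_flagged = coef(refit)), 6)

# Interpretation on the log scale: exp(coefficient) is a multiplicative
# effect on price, so 100 * (exp(b) - 1) is a percentage change
pct_leather       <- 100 * (exp(coef(fit)["Leather1"]) - 1)
pct_per_10k_miles <- 100 * (exp(1e4 * coef(fit)["Mileage"]) - 1)
c(leather = unname(pct_leather), per_10k_miles = unname(pct_per_10k_miles))
```

If the two columns of coefficients differ noticeably, that is a sign the conclusions lean on the flagged points and the report should say so.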

Description of the variables

Price: recommended retail price of the car

Mileage: number of miles the car has been driven

Make: manufacturer of the car

Type: body type such as Hatchback, Coupe etc.

Litre: a measure of engine size

Cylinder: number of cylinders in the engine (4,6,8)

Doors: number of doors (2 or 4)

Cruise: indicator variable representing whether the car has cruise control (1 = cruise)

Sound: indicator variable representing whether the car has upgraded speakers (1 = upgraded)

Leather: indicator variable representing whether the car has leather seats (1 = leather)

NOTE: You should change the variables "Cylinder", "Doors", "Cruise", "Sound" and "Leather" to factors, as described on the question sheet.
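With the variables described above, a first exploratory pass might look like the sketch below. The data frame is simulated so the code runs on its own; in particular the Make levels ("MakeA", "MakeB", "MakeC") are placeholders, since the actual manufacturers are not listed here, and mean_by_cyl is an illustrative name.

```r
# Simulated stand-in for the training data (NOT the coursework data);
# the Make names are placeholders
set.seed(5)
n <- 60
Train <- data.frame(
  Mileage  = round(runif(n, 500, 45000)),
  Make     = factor(sample(c("MakeA", "MakeB", "MakeC"), n, replace = TRUE)),
  Type     = factor(sample(c("Hatchback", "Coupe"), n, replace = TRUE)),
  Cylinder = factor(sample(c(4, 6, 8), n, replace = TRUE))
)
Train$Price <- round(exp(9.5 - 8e-6 * Train$Mileage +
                         0.25 * (as.integer(Train$Cylinder) - 1) +
                         rnorm(n, sd = 0.1)))

# Numerical summaries
summary(Train)
table(Train$Make, Train$Type)          # how Make and Type combine
mean_by_cyl <- aggregate(Price ~ Cylinder, data = Train, FUN = mean)
mean_by_cyl
cor(Train$Price, Train$Mileage)

# Plots: price against a numerical covariate and a factor, and the
# distribution of Price with and without a log transformation
op <- par(mfrow = c(2, 2))
plot(Price ~ Mileage, data = Train)
boxplot(Price ~ Make, data = Train)
hist(Train$Price, main = "Price", xlab = "Price")
hist(log(Train$Price), main = "log(Price)", xlab = "log(Price)")
par(op)
```

The cross-tabulation is worth a look before model fitting, because sparse or empty Make and Type combinations limit which interaction terms can be estimated.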
