What is the estimated probability that a property built in 1980

Reference no: EM131988720

The city tax assessor was interested in predicting residential home sales prices in a midwestern city as a function of various characteristics of the home and surrounding property. Data on 300 arm's-length transactions were obtained for home sales during the year 2002. Each line of the data set provides information on 8 variables. The dataset is "house.txt".

The variables are:

price      Sales price of residence (dollars)

sqft       Finished area of residence (square feet)

bed        Total number of bedrooms in residence

air        Categorical variable indicating presence or absence of air conditioning: 1 if yes; 0 otherwise

garage     Number of cars the garage will hold

year       Year property was originally constructed

quality    Categorical variable indicating quality of construction: 1 indicates high quality; 0 indicates low quality

lot        Lot size (square feet)

1. Perform Exploratory Data Analysis.

(a) Read the problem above carefully. Understand all variables.

(b) Referring to the information in the introduction, check and, if necessary, change the variable types in R. Find the five-number summaries of all continuous variables, and for categorical variables count how many observations there are in each category.

(c) Check the relation among variables graphically and numerically. Which variables seem to have a strong relation with the response variable "price"?

(d) Based on the results from previous questions, do you think there exists multicollinearity among predictors?
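For part (b), the two kinds of summary can be sketched as follows. The assignment asks for R, so this is only a language-neutral illustration in Python. The hinge convention is Tukey's, which is what R's fivenum() is documented to return (summary() uses a different quantile rule and can differ slightly). The values are made up for the demonstration and are not taken from house.txt.

```python
from collections import Counter
from statistics import median


def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) using Tukey's hinges."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2               # each half includes the median when n is odd
    lower, upper = xs[:half], xs[n - half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


# Made-up demonstration values (assumption: NOT rows from house.txt).
sqft_demo = [1200, 1500, 1700, 2100, 2600]
air_demo = [1, 0, 1, 1]

summary = five_number_summary(sqft_demo)   # continuous variable
counts = Counter(air_demo)                 # categorical variable: tally per level

print(summary)
print(dict(counts))
```

Variables such as air and quality are coded 0/1 but are categorical, so they get counts rather than a five-number summary.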

2. Suppose only the main effects of the predictor variables are included in the model. Perform backward selection via AIC and via BIC, and report the model chosen by each. (When reporting a chosen model, you may simply write it in formula form, e.g. Y ~ X1 + X2.)
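As a reminder of what the two criteria measure, one common form for a linear model with normal errors is given below. The notation is an assumption, not from the assignment: n is the number of observations (here 300), p the number of estimated regression coefficients including the intercept, and SSE_p the error sum of squares of the candidate model. Additive constants are dropped, so only differences between models are meaningful.

```latex
\mathrm{AIC}_p = n \ln\!\left(\frac{SSE_p}{n}\right) + 2p,
\qquad
\mathrm{BIC}_p = n \ln\!\left(\frac{SSE_p}{n}\right) + p \ln n
```

Backward selection starts from the full main-effects model and repeatedly drops the term whose removal lowers the criterion the most, stopping when no removal helps. In R this is typically done with step(), where the penalty argument k = 2 gives AIC and k = log(n) gives BIC.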

3. Compare the models selected by the above methods, using AIC, BIC, PRESS, and adjusted R squared. Which model do you think is better?
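The two criteria not already used in the selection step can be written as below. The symbols are assumptions introduced here: e_i is the ordinary residual of observation i, h_ii its leverage (the i-th diagonal element of the hat matrix), SSTO the total sum of squares, and p the number of coefficients including the intercept.

```latex
\mathrm{PRESS} = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^{2},
\qquad
R^2_{\mathrm{adj}} = 1 - \frac{SSE/(n - p)}{SSTO/(n - 1)}
```

Smaller AIC, BIC and PRESS are preferred, while a larger adjusted R squared is preferred.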

4. Use a partial F test to decide which model is better. Write the full model, the reduced model, the null and alternative hypotheses, the test statistic value, the p-value and the conclusion.
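The test statistic for comparing two nested models can be sketched as a small function. This is an illustration under stated assumptions: the argument names are hypothetical, the numbers in the example call are placeholders rather than results from house.txt, and the p-value is left out because it requires the F distribution with (df_reduced - df_full, df_full) degrees of freedom, which the Python standard library does not provide. In R, anova(reduced, full) on two nested lm fits reports both the statistic and the p-value.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic for nested linear models.

    sse_*: error sum of squares of each model
    df_*:  residual degrees of freedom of each model (n minus number of coefficients)
    """
    if df_reduced <= df_full:
        raise ValueError("the reduced model must have more residual df than the full model")
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)  # extra SS per dropped term
    denominator = sse_full / df_full                               # MSE of the full model
    return numerator / denominator


# Placeholder numbers for illustration only (assumption: not computed from the data).
f_stat = partial_f(sse_reduced=120.0, df_reduced=12, sse_full=100.0, df_full=10)
print(f_stat)
```

The null hypothesis is that the coefficients of the terms dropped from the full model are all zero; a large statistic (small p-value) favours the full model.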

5. For whichever model you find better in question 4, check the normality assumption and the constant variance assumption using numeric tests. Discuss whether there are any violations of the assumptions.

6. Is multicollinearity an issue here in the above model?
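A common numeric check is the variance inflation factor. In the notation assumed here, R_k^2 is the coefficient of determination obtained by regressing predictor X_k on all the other predictors in the model.

```latex
\mathrm{VIF}_k = \frac{1}{1 - R_k^{2}}
```

A VIF near 1 means X_k is nearly uncorrelated with the other predictors. A frequently quoted rule of thumb treats values above about 10 as a sign of serious multicollinearity, although that cutoff is a convention rather than a formal test.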

7. Following question 5, conduct a Box-Cox transformation. Choose the number from -1, 0, 0.5 and 2 that is closest to the λ found for the model. Refit the model, and re-check the normality and constant variance assumptions. Have the violations been remedied to some extent?
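The transformation itself, and the step of snapping the estimated λ to the nearest allowed value, can be sketched as follows. The function names and the estimate 0.3 are hypothetical and only illustrate the mechanics. The actual λ has to come from the profile likelihood of the fitted model (in R, for example, boxcox() in the MASS package).

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a single positive response value."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        return math.log(y)            # limiting case as lambda -> 0
    return (y ** lam - 1) / lam


def closest_candidate(lam_hat, candidates=(-1, 0, 0.5, 2)):
    """Pick the allowed value nearest to the estimated lambda."""
    return min(candidates, key=lambda c: abs(c - lam_hat))


lam_hat = 0.3                         # hypothetical estimate, NOT from house.txt
chosen = closest_candidate(lam_hat)
print(chosen, box_cox(4.0, chosen))
```

With a chosen value of 0 the response is simply log(price), and with 0.5 it is a shifted and scaled square root of price. The model is then refitted on the transformed response before repeating the diagnostic tests.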

8. Suppose people are also interested in predicting whether a property has high quality. Fit a logistic regression model using "price", "sqft", "bed" and "year" as predictor variables. State the fitted logistic regression function.

9. What is the estimated probability that a property built in 1980, with 4 bedrooms, a total finished area of 1700 square feet, and a sales price of $200000, is a high quality property?
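Once the logistic regression in question 8 is fitted, the estimated probability is the inverse logit of the linear predictor evaluated at the new property's values. The sketch below shows only that arithmetic. The coefficients are all-zero placeholders (an assumption made so that no result is invented here) and must be replaced by the estimates from your own fit, for example from glm(quality ~ price + sqft + bed + year, family = binomial) in R.

```python
import math


def logistic_probability(intercept, coefs, x):
    """P(Y = 1 | x) for a fitted logistic regression with the given coefficients."""
    eta = intercept + sum(coefs[name] * x[name] for name in coefs)  # linear predictor
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)                 # this branch avoids overflow for very negative eta
    return z / (1.0 + z)


# The property described in question 9.
new_home = {"price": 200000, "sqft": 1700, "bed": 4, "year": 1980}

# PLACEHOLDER coefficients (assumption): substitute your fitted estimates.
b0 = 0.0
b = {"price": 0.0, "sqft": 0.0, "bed": 0.0, "year": 0.0}

p_high_quality = logistic_probability(b0, b, new_home)
print(p_high_quality)
```

With the all-zero placeholders the function returns 0.5, which only confirms the arithmetic and is not an answer to the question.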

Attachment:- house.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd