What is the average error on the validation and test data

Reference no: EM131668731

Question: Refer to the scenario described in the Problem below and the file Housing Bubble.

a. Consider the Pre-Crisis worksheet data. Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Predict the sale price using multiple linear regression. Use Price as the output variable and all the other variables as input variables. To generate a pool of models to consider, execute the following steps. In Step 2 of XLMiner's Multiple Linear Regression procedure, click the Best subset option. In the Best Subset dialog box, check the box next to Perform best subset selection, enter 16 in the box next to Maximum size of best subset:, enter 1 in the box next to Number of best subsets:, and check the box next to Exhaustive search. Once you have identified an acceptable model, rerun the Multiple Linear Regression procedure and in Step 2, check the box next to In worksheet in the Score new data area. In the Match variable in the new range dialog box,

(1) specify the NewDataToPredict worksheet in the Worksheet: field,

(2) enter the cell range A1:P2001 in the Data range: field, and

(3) click Match variable(s) with same name(s).
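The 50/30/20 partition in part a can be sketched outside XLMiner as follows. This is a minimal illustration, not XLMiner's own procedure: the function name `partition`, the seed value, and the use of integer row identifiers in place of real worksheet rows are all assumptions made for the example. The row count of 1978 is the number of Pre-Crisis homes given in the Problem statement.

```python
import random

def partition(rows, seed=0):
    """Shuffle rows and split them 50/30/20 into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n = len(shuffled)
    n_train = n * 50 // 100   # 50 percent for training
    n_valid = n * 30 // 100   # 30 percent for validation
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder (about 20 percent) for testing
    return train, valid, test

# Hypothetical stand-in: row identifiers 0..1977 instead of actual home records.
train, valid, test = partition(range(1978), seed=12345)
print(len(train), len(valid), len(test))
```

Every row lands in exactly one of the three sets, so the validation and test errors are measured on homes the model never saw during fitting.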

i. From the generated set of multiple linear regression models, select one that you believe is a good fit. Express the model as a mathematical equation relating the output variable to the input variables.

ii. For your model, what is the RMSE on the validation data and test data?

iii. What is the average error on the validation data and test data? What does this suggest?
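The two measures asked for in ii and iii can be sketched as below. The sketch assumes that a residual is defined as actual price minus predicted price; check the sign convention in your own output before interpreting it. The prices are made-up numbers for illustration, not values from the Housing Bubble file.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: the typical size of a prediction error."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def average_error(actual, predicted):
    """Mean residual (actual - predicted): the direction of systematic error."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(residuals) / len(residuals)

# Made-up example prices (assumption, not data from the worksheet).
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
print("RMSE:", rmse(actual, predicted))
print("Average error:", average_error(actual, predicted))
```

Under this convention, an average error close to zero suggests the model is not systematically biased, a clearly negative value suggests it tends to overpredict, and a clearly positive value suggests it tends to underpredict. RMSE is always nonnegative, so it cannot reveal the direction of the error on its own.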

b. Repeat part a with the Post-Crisis worksheet data.

c. The MLR_NewScore worksheets generated in parts a and b contain the sales price predictions for the 2000 homes in the New Data To Predict worksheet using the pre-crisis and post-crisis data, respectively. For each of these 2000 homes, compare the two predictions by computing the percentage change in predicted price between the pre-crisis and post-crisis models. Let percentage change = (post-crisis predicted price - pre-crisis predicted price) / pre-crisis predicted price. Summarize these percentage changes with a histogram. What is the average percentage change in predicted price between the pre-crisis and post-crisis models?
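The comparison in part c can be sketched as follows. The function names, the bin width of 5 percentage points, and the three pairs of predicted prices are all assumptions chosen for illustration; the real inputs would be the 2000 paired predictions from the two scoring worksheets.

```python
from collections import Counter
import math

def percentage_changes(pre, post):
    """(post-crisis prediction - pre-crisis prediction) / pre-crisis prediction, per home."""
    return [(b - a) / a for a, b in zip(pre, post)]

def histogram(values, bin_width=0.05):
    """Count values per bin; key i covers [i * bin_width, (i + 1) * bin_width)."""
    return Counter(math.floor(v / bin_width) for v in values)

# Made-up predicted prices for three homes (assumption, not worksheet data).
pre = [200000.0, 350000.0, 500000.0]
post = [180000.0, 336000.0, 510000.0]

changes = percentage_changes(pre, post)
mean_change = sum(changes) / len(changes)
print(f"average percentage change: {mean_change:.2%}")

# Text histogram, one row per occupied bin.
for b, count in sorted(histogram(changes).items()):
    low = b * 0.05
    print(f"[{low:+.0%}, {low + 0.05:+.0%}): {'#' * count}")
```

Note that the average is taken over the per-home percentage changes, which is generally not the same as the percentage change between the two average predicted prices.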

Problem: As an intern with the local home builder's association, you have been asked to analyze the state of the local housing market, which has suffered during a recent economic crisis. You have been provided three data sets in the file Housing Bubble. The Pre-Crisis worksheet contains information on 1978 single-family homes sold during the one-year period before the burst of the housing bubble. The Post-Crisis worksheet contains information on 1657 single-family homes sold during the one-year period after the burst of the housing bubble. The New Data To Predict worksheet contains information on homes currently for sale.

a. Consider the Pre-Crisis worksheet data. Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Predict the sale price using k-nearest neighbors with up to k = 20. Use Price as the output variable and all the other variables as input variables. In Step 2 of XLMiner's k-Nearest Neighbors Prediction procedure, be sure to Normalize input data and to Score on best k between 1 and specified value. Check the box next to In worksheet in the Score new data area. In the Match variables in the new range dialog box, (1) specify the New Data To Predict worksheet in the Worksheet: field, (2) enter the cell range A1:P2001 in the Data range: field, and (3) click Match variable(s) with same name(s). Completing the procedure will result in a KNNP_NewScore worksheet that will contain the predicted sales price for each home in New Data To Predict.

i. What value of k minimizes the root mean squared error (RMSE) on the validation data?

ii. What is the RMSE on the validation data and test data?

iii. What is the average error on the validation data and test data? What does this suggest?
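The idea behind "Score on best k" can be sketched as below: normalize the inputs, predict each validation home's price as the mean price of its k nearest training homes, and keep the k with the lowest validation RMSE. This is an illustration under stated assumptions, not XLMiner's implementation: z-score normalization, Euclidean distance, and all function names here are choices made for the example, and the data is made up.

```python
import math

def zscore_params(rows):
    """Per-column mean and standard deviation, computed on the training rows only."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]  # fall back to 1.0 for constant columns
    return means, stds

def normalize(rows, means, stds):
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def knn_predict(train_x, train_y, query, k):
    """Mean output of the k training rows closest to query (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def best_k(train_x, train_y, valid_x, valid_y, max_k=20):
    """Return (k with the lowest validation RMSE, dict of RMSE per k)."""
    means, stds = zscore_params(train_x)
    tx = normalize(train_x, means, stds)
    vx = normalize(valid_x, means, stds)  # reuse the training statistics
    scores = {}
    for k in range(1, min(max_k, len(tx)) + 1):
        preds = [knn_predict(tx, train_y, q, k) for q in vx]
        scores[k] = rmse(valid_y, preds)
    return min(scores, key=scores.get), scores

# Made-up single-feature data (assumption, not worksheet data).
train_x = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
train_y = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
valid_x = [[2.4], [4.6]]
valid_y = [24.0, 46.0]
k, scores = best_k(train_x, train_y, valid_x, valid_y)
print("best k:", k, "validation RMSE:", scores[k])
```

Because k is chosen using the validation set, the validation RMSE is an optimistic estimate for the selected model; the test set RMSE gives the less biased picture, which is why part ii asks for both.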

b. Repeat part a with the Post-Crisis worksheet data.

c. The KNNP_NewScore1 and KNNP_NewScore2 worksheets contain the sales price predictions for the 2000 homes in the New Data To Predict worksheet using the pre-crisis and post-crisis data, respectively. For each of these 2000 homes, compare the two predictions by computing the percentage change in predicted price between the pre-crisis and post-crisis models. Let percentage change = (post-crisis predicted price - pre-crisis predicted price) / pre-crisis predicted price. Summarize these percentage changes with a histogram. What is the average percentage change in predicted price between the pre-crisis and post-crisis models?






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd