Evaluating Predictive Models Using SAS Enterprise Miner

Reference no: EM133534278

Predictive Analytics

Assignment - Building and Evaluating Predictive Models using SAS Enterprise Miner

Objective:
a) Demonstrate knowledge of building different types of predictive models using SAS Enterprise Miner.
b) Demonstrate skill and knowledge in applying predictive models to real-life predictive analytics tasks.
c) Relate theoretical knowledge of predictive models and best practices to application scenarios.

Business Case - Predictive Model for Property Price Prediction

A real estate company in Melbourne is updating its property (housing) price assessment method, and its management wants to build a property price estimation system that helps sellers sell their properties at the best price.

The company's management is keen to trial predictive modeling for this task and has gathered a historical property sales dataset. The dataset contains 18 variables describing previously sold properties. The attributes include the selling price, the year the property was built, the year it was sold, the number of bedrooms, the number of bathrooms, the number of car spots, etc. The list of attributes and their descriptions is given below (a more detailed description can be found in data_description.txt).

The management of the real estate company is considering your team as an external consulting group to develop a reliable predictive model of property selling prices, using the historical dataset described above. You are required to build different predictive models, then compare and contrast them to determine which is the best model for this dataset. You are also provided with a scoring dataset of new properties about to be listed, for which you have to predict the prices.

Q1. Setting up the project and exploratory analysis
Provide a screenshot as evidence for each subsection of Q1.
a. Create a new project and create a data source based on the given datasets. Set the role of Price to Target and make sure the Role and Level assigned to each variable are correct.
b. Carry out a data exploration by using a StatExplore Node. Explain your findings with regard to your property dataset.
c. Create a Data Partition with 70% of the data for training and 30% for validation.
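
In Enterprise Miner the 70/30 split in part (c) is configured on the Data Partition node, not written by hand. As a rough illustration of what such a split does, the following Python sketch shuffles rows reproducibly and cuts them at 70%. The function name `partition`, the seed value, and the stand-in row IDs are assumptions made for this example and do not reflect the node's internal sampling method.

```python
import random


def partition(rows, train_fraction=0.7, seed=12345):
    """Shuffle rows reproducibly, then split them into training and validation lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split can be repeated
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for 1,000 property records (row IDs only).
train, valid = partition(range(1000))
print(len(train), len(valid))
```

Fixing the seed matters when comparing models later: every model should be trained and validated on the same two subsets.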

Q2. Decision tree-based modeling and analysis
Carry out the following modeling tasks for the selected property value dataset.
a. Create two separate Decision Tree models, one based on two-way splits and one based on three-way splits. Provide the relevant diagrams of the decision trees.
For each decision tree:
I. How many leaves are in the optimal tree?
II. Which variable was used for the first split?
III. What were the competing splits for this first split?
b. Which of the decision tree models appears to be better? Justify your answer.
c. Refer to the selected decision tree model in part (b) and
I. Identify two leaf nodes with good predictive performance and two leaf nodes with poor predictive performance.
II. Provide justifications for your selections.
III. Write down the rules for the pathways leading up to each selected leaf node.
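
To make the idea of a "first split" and its "competing splits" more concrete, here is a minimal Python sketch of how a two-way split on a single numeric input could be chosen for an interval target such as Price. It ranks candidate thresholds by reduction in the sum of squared errors, which is a simplifying assumption: the Decision Tree node in Enterprise Miner offers its own splitting criteria, and these may rank splits differently. The function names and the tiny bedroom/price sample are invented for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_two_way_split(xs, ys):
    """Return (threshold, sse_reduction) for the best 'x <= threshold' split."""
    parent = sse(ys)
    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between adjacent distinct input values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        gain = parent - sse(left) - sse(right)
        if best is None or gain > best[1]:
            best = (threshold, gain)
    return best


# Made-up sample: number of bedrooms and price in thousands of dollars.
bedrooms = [1, 2, 2, 3, 3, 4]
prices = [300, 400, 420, 700, 720, 900]
threshold, gain = best_two_way_split(bedrooms, prices)
print(threshold, round(gain))
```

The thresholds that score just below the winner play a role loosely similar to competing splits, and a rule such as "bedrooms <= threshold" is one step of a pathway to a leaf.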

Q3. Regression-based modeling and analysis

a. In preparation for regression, is any imputation of missing values needed? If yes, should you do this imputation before generating the decision tree models? Why or why not?
b. Use an Impute node connected to Data Partition node to handle missing values. Which variables have been imputed?
c. Are there any ordinal variables? Use the Replacement node to assign relevant values.
d. Conduct data exploration with the Variable Clustering node to select the best variables for the model. Describe and justify how you ascertained the best variables for the model.
e. Create a Regression model using the set of variables you identified as suitable in part (d). You can choose stepwise selection and use validation error as the selection criterion.
f. Run the Regression node and view the results.
I. Which variables are included in the final model? Explain what this means to the real estate company (very briefly).

II. What is the validation Average Squared Error (ASE) (or Mean Squared Error (MSE))? What does this value mean for a predictive model?
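
As a reminder of what the error measure in part (f) II captures, the sketch below computes an average squared error as the mean of the squared differences between actual and predicted prices, ASE = (1/n) Σ (yᵢ − ŷᵢ)². The prices are invented for illustration, and the function name is an assumption. Note that software reports may define ASE and MSE with different divisors; this sketch simply divides by the number of cases.

```python
def average_squared_error(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up validation cases: actual and predicted selling prices in dollars.
actual = [500_000, 750_000, 1_200_000]
predicted = [520_000, 700_000, 1_150_000]

ase = average_squared_error(actual, predicted)
rmse = ase ** 0.5  # square root brings the error back to dollar units
print(ase, round(rmse))
```

Because ASE is in squared dollars, taking its square root often makes the figure easier to explain to a non-technical audience such as the real estate company's management.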

Q4. Model Comparison and Scoring

a. Use the Model Comparison node to compare and contrast the results from the decision tree and regression-based analyses. Provide a summary table for comparison. Describe and justify how you ascertained the better model.

b. Would it have been sufficient to use only one modeling technique (decision tree or regression)? Justify your answer using the results from Q4(a).

c. Score the scoring dataset using the best predictive model. Explain the output using plots.

Q5. Extending current knowledge with additional reading - SEMMA

Relate the predictive analytics life cycle from your lectures, the SAS diagram created in this case study, and the SEMMA analytics methodology proposed by SAS to one another. You can use diagrams with brief explanations.

(This section is based on your understanding of the process flow diagram in this case study. The objective of this question is to get you to think more deeply and 'connect' the generic predictive analytics life cycle discussed in the lectures with the vendor- and tool-specific SEMMA methodology from SAS, and then relate both to a specific project using the SAS diagram for that project.)

Course: BUS5PA Predictive Analytics, La Trobe University