Reference no: EM132872417
MIS772 Predictive Analytics - Deakin University
Advanced Predictive Models for Business
AirbnbAI has approached you again to develop one or more RapidMiner processes capable of analysing and predicting customer feedback about stays in Melbourne Airbnb rental properties. AirbnbAI has provided a sample dataset of approximately 1,000 rental listings and 100,000 associated customer reviews. This sample dataset can be downloaded from the unit website.
The provided dataset (A2-AirBNB-Melbourne-dataset.zip) has been partially cleaned up and includes a variety of numerical, nominal and text attributes, and descriptions of these attributes.
AirbnbAI would like you to use RapidMiner to address the following questions:
A) Is there a significant correlation between the sentiment (positive vs negative) of customer reviews of a property and its review score rating?
B) Can the review score ratings of properties be predicted (estimated) based on relevant attributes?
C) What are the most meaningful segments among the rental properties?
AirbnbAI wants you to use RapidMiner to clean up and explore the provided data, conduct text mining and sentiment analysis, and develop, evaluate and optimise linear regression and cluster analysis models.
Individual Tasks and Deliverables
1st Partial Submission (Question A - resubmit this with the final submission)
Exec Summary: Define your problem in business terms and, in doing so, answer Question A. Cross-reference other report sections for support.
Data Preparation: Deal with any duplicates, bad and missing values, and anomalies. Transform selected attributes or create new ones as needed.
Data Exploration: Apply text mining techniques and simplistic sentiment analysis (i.e. simply count positive and negative words) to the review comments, as illustrated in the lectures/seminars in Week 4 (refer also to the partial example process provided).
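As an illustration of the simplistic sentiment approach and of Question A, the following is a minimal Python sketch written outside RapidMiner. It scores each comment as positive-word count minus negative-word count, averages the scores per listing, and computes a Pearson correlation against the review score rating. The word lists, the example comments and ratings, and the names `sentiment_score` and `pearson` are all assumptions made for illustration; none of them come from the provided dataset.

```python
import math
import re

# Tiny illustrative word lists (assumptions); a real analysis would use a fuller lexicon.
POSITIVE = {"great", "clean", "lovely", "friendly", "comfortable"}
NEGATIVE = {"dirty", "noisy", "rude", "broken", "bad"}


def sentiment_score(comment: str) -> int:
    """Return the number of positive words minus the number of negative words."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up example rows: (review comments for one listing, review score rating).
listings = [
    (["Great host, clean and comfortable room"], 98),
    (["Lovely place", "A bit noisy at night"], 90),
    (["Dirty bathroom and rude host"], 62),
    (["Broken heater, bad smell, noisy street"], 55),
]

# Average the comment-level scores so each listing has one sentiment value.
mean_sentiment = [
    sum(sentiment_score(c) for c in comments) / len(comments)
    for comments, _ in listings
]
ratings = [rating for _, rating in listings]
r = pearson(mean_sentiment, ratings)
print(f"Pearson r between mean sentiment and rating: {r:.3f}")
```

The coefficient alone does not establish significance; on the real data a significance test (or the p-value reported by the chosen tool) would still be needed to answer Question A.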
2nd Partial Submission (Questions A and B - resubmit this with the final submission)
Exec Summary: Answer Question B and cross-reference other report sections for support.
Data Exploration: Identify appropriate attributes to predict the review score ratings of properties.
Modelling: Develop a linear regression model to predict the review score rating of properties by selecting appropriate predictive attributes.
Test the linear model and investigate results.
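The modelling and testing steps for Question B can be sketched in Python as follows. This is a simplified illustration outside RapidMiner, assuming a single predictor (a per-listing sentiment score) fitted by ordinary least squares and evaluated with RMSE on held-out rows. The data values and the names `fit_simple_ols` and `rmse` are hypothetical; the actual task would involve several attributes selected from the dataset.

```python
import math


def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up rows: (mean sentiment score, review score rating).
train = [(-3.0, 55), (-2.0, 62), (0.0, 80), (1.0, 86), (3.0, 98)]
test = [(-1.0, 70), (2.0, 93)]

# Fit on the training rows only, then score the unseen test rows.
intercept, slope = fit_simple_ols([x for x, _ in train], [y for _, y in train])
predictions = [intercept + slope * x for x, _ in test]
test_rmse = rmse([y for _, y in test], predictions)
print(f"rating = {intercept:.2f} + {slope:.2f} * sentiment; test RMSE = {test_rmse:.2f}")
```

Keeping the test rows out of the fit is the point of the split: the RMSE then reflects how the model behaves on listings it has not seen.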
Final Submission (Resubmit final versions to answer Questions A, B and C)
Exec Summary: Answer Questions A, B and C, and cross-reference other report sections for support.
Data Exploration: Investigate groups of rental properties and select appropriate attributes for distinguishing clusters. Visualise the clusters (Question C).
Modelling: Develop a cluster model to reveal the most meaningful segments (Question C).
Evaluation and Optimisation: Evaluate and optimise the performance of all models. Report metrics for the best performing models (Questions B, C).
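For Question C, the clustering and evaluation steps can be sketched as a small k-means example in Python. This is an illustration outside RapidMiner under stated assumptions: two made-up attributes (nightly price and guests accommodated), min-max scaling, a deterministic start from the first k points, and within-cluster sum of squares as the comparison metric. The function names `min_max_scale`, `kmeans` and `within_cluster_ss` are hypothetical.

```python
import math


def min_max_scale(rows):
    """Scale each column to [0, 1]; assumes every column has at least two distinct values."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Basic k-means; starts from the first k points so results are reproducible."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))


# Made-up listings: (nightly price, guests accommodated).
raw = [(80, 2), (95, 2), (90, 3), (300, 6), (320, 8), (280, 7)]
scaled = min_max_scale(raw)

labels, centroids = kmeans(scaled, k=2)
wcss_k2 = within_cluster_ss(scaled, labels, centroids)

# Baseline with a single cluster, for comparison when choosing k.
labels_k1, centroids_k1 = kmeans(scaled, k=1)
wcss_k1 = within_cluster_ss(scaled, labels_k1, centroids_k1)
print(labels, round(wcss_k1, 3), round(wcss_k2, 3))
```

Scaling matters here because price spans hundreds while guest counts span single digits; without it, distance would be dominated by price. Comparing the within-cluster sum of squares across values of k is one common way to judge how many segments are worth reporting.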
Attachment: Predictive Analytics.rar