Reference no: EM133129603
Business Intelligence
Practical Report
Assignment
1. Analyse and apply strategies, processes and underlying technologies for effective management of data to make evidence-based decisions;
2. Critically analyse organisational and societal problems using descriptive and predictive analysis and internal and external data sources to generate insight, create value and support evidence-based decision making;
3. Communicate effectively in a clear and concise manner in written report style for both senior and middle management with correct and appropriate acknowledgment of the main ideas presented and discussed.
Assignment Task - Business Intelligence
Task overview:
Task 1: Data Mining and Text Mining Concepts
Drawing on the course textbook and on relevant, current literature on the data mining process and the text mining process, answer the following four sub-tasks:
Task 1.1) Identify and describe the six steps in the CRISP-DM Data mining process (300 words)
Task 1.2) Explain why the first three steps of the CRISP-DM Data mining process are considered the most important, and where one should spend the most time in the data mining process (500 words)
Task 1.3) Identify and describe the three sequential tasks in the Text mining process and explain the main purpose and outcomes of each task (500 words)
Task 1.4) Identify and discuss one application of Text mining that is widely used in a specific industry
Task 2: Exploratory Data Analysis and Linear Regression Analysis
Carefully study the Data Dictionary for the house-prices.csv data set (see Table 1). The data set contains 21 variables, including the target variable price (the third variable in the data dictionary), which gives the price of a house.
Task 2.1) Conduct and report on an exploratory data analysis (EDA) of the house-prices.csv data set using the RapidMiner Studio data mining tool. Note that this will require the use of a number of RapidMiner operators.
Provide the following for Task 2.1:
i. A screen capture of your final EDA process, with a brief description of the process.
ii. Summarise the key results of your exploratory data analysis in Table 2.1 Results of Exploratory Data Analysis for house-prices.csv. Table 2.1 should include the key characteristics of each variable in the house-prices.csv data set, such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values.
iii. Discuss the key results of the exploratory data analysis presented in Table 2.1 and provide a rationale for your selection of the top variables for predicting house price (price). Focus in particular on the relationships of the independent variables with each other and with the dependent variable house price (price), drawing on the results of the EDA and on relevant literature on the determinants of house price.
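The per-variable characteristics requested for Table 2.1 can be illustrated outside RapidMiner with a minimal Python sketch. It is not part of the required RapidMiner process. The column names other than price, the sample rows, and the helper name summarise are hypothetical, since the data dictionary is not reproduced here.

```python
import statistics

# Hypothetical sample rows standing in for house-prices.csv.
# Only "price" is named in the brief; "bedrooms" is an assumed column.
# None marks a missing value.
rows = [
    {"price": 200000.0, "bedrooms": 3},
    {"price": 350000.0, "bedrooms": 3},
    {"price": 275000.0, "bedrooms": 2},
    {"price": 410000.0, "bedrooms": None},
    {"price": 325000.0, "bedrooms": 4},
]


def summarise(values):
    """Return the Table 2.1 style characteristics for one variable."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),
        "missing": len(values) - len(present),
    }


# One summary per column, keyed by column name.
summary = {
    column: summarise([row[column] for row in rows])
    for column in rows[0]
}

for column, stats in summary.items():
    print(column, stats)
```

Invalid values (for example a negative price) would need a rule per variable, which depends on the data dictionary and is left out of this sketch.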
Task 2.2) Build and report on a Linear Regression model for predicting house price (price), using a RapidMiner data mining process and an appropriate set of data mining operators for the house-prices.csv data set, as determined by your exploratory data analysis in Task 2.1. Note that this will require the use of a number of RapidMiner operators.
Provide the following for Task 2.2:
i. A screen capture of your Final Linear Regression Model process, with a brief description of the process.
ii. Table 2.2, named Results of Final Linear Regression Model for Task 2.2, for the house-prices.csv data set.
iii. Discuss the results of the Final Linear Regression Model for the house-prices.csv data set, drawing on the key outputs for predicting house price (price), such as coefficients, standardised coefficients, t-statistic values, p-values and significance levels, and on relevant supporting literature on the interpretation of a Linear Regression Model.
(About 300 words)
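To make the regression outputs named above more concrete, the following minimal Python sketch computes a coefficient, a standardised coefficient and a t-statistic for a regression with a single predictor. It is an illustration only and not the RapidMiner Linear Regression operator. The data values and the function name simple_ols are hypothetical, and the p-value is omitted because it requires the t distribution, which the standard library does not provide.

```python
import math
import statistics


def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares and residual variance (n - 2 degrees of freedom).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_variance = sse / (n - 2)

    # Standard error of the slope and its t-statistic (H0: slope = 0).
    se_slope = math.sqrt(residual_variance / sxx)
    t_slope = slope / se_slope

    # Standardised coefficient: slope rescaled by the two standard deviations.
    standardised = slope * statistics.stdev(x) / statistics.stdev(y)

    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": se_slope,
        "t_slope": t_slope,
        "standardised": standardised,
        "r_squared": 1 - sse / sst,
    }


# Hypothetical data: x could be a size measure, y a price in arbitrary units.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = simple_ols(x, y)
print(result)
```

A large absolute t-statistic indicates that the coefficient is unlikely to be zero. With several predictors, the standardised coefficients allow the relative influence of variables measured in different units to be compared.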
Include all appropriate outputs, such as RapidMiner processes, graphs and tables, that support the key aspects of your exploratory data analysis and linear regression model analysis of the house-prices.csv data set in your Assignment 2 report.
Attachment: Business Intelligence.rar