Describe the six steps in the CRISP-DM data mining process

Reference no: EM133129603

Business Intelligence

Practical Report

Assignment

1. Analyse and apply strategies, processes and underlying technologies for effective management of data to make evidence-based decisions;

2. Critically analyse organisational and societal problems using descriptive and predictive analysis and internal and external data sources to generate insight, create value and support evidence-based decision making;

3. Communicate effectively in a clear and concise manner in written report style for both senior and middle management with correct and appropriate acknowledgment of the main ideas presented and discussed.

Assignment Task - Business Intelligence

Task overview:

Task 1: Data Mining and Text Mining Concepts

Drawing on the course textbook and relevant, current literature on the data mining process and the text mining process, answer the following sub-tasks:

Task 1.1) Identify and describe the six steps in the CRISP-DM data mining process (300 words)

Task 1.2) Explain why the first three steps of the CRISP-DM data mining process are considered the most important, and where one should spend the most time in the data mining process (500 words)

Task 1.3) Identify and describe the three sequential tasks in the text mining process and explain the main purpose and outcomes of each task (500 words)

Task 1.4) Identify and discuss one application of text mining that is widely used in a specific industry
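
As a point of orientation for Task 1.3, many textbook treatments of text mining include a step that turns a collection of documents into a term-document matrix before any knowledge is extracted. The sketch below is only an illustration of that idea in plain Python. It assumes the course textbook's process includes such a step, and the two-sentence corpus and the function names are hypothetical, not part of the assignment materials.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """Return (sorted vocabulary, one row of term counts per document)."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


# Hypothetical two-document corpus, invented for illustration only.
corpus = ["House prices rise", "Prices fall as house sales slow"]
vocab, tdm = term_document_matrix(corpus)

for doc, row in zip(corpus, tdm):
    print(doc, dict(zip(vocab, row)))
```

A real text mining workflow would normally also remove stop words and apply stemming or weighting; those steps are left out here to keep the sketch short.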

Task 2: Exploratory Data Analysis and Linear Regression Analysis

Carefully study the Data Dictionary for the house-prices.csv data set (see Table 1), which contains 21 variables, including the target variable price (the third variable in the data dictionary), which records the price of a house.

Task 2.1) Conduct and report on exploratory data analysis (EDA) of the house-prices.csv data set using the RapidMiner Studio data mining tool. Note this will require the use of a number of RapidMiner operators.
Provide the following for Task 2.1:

i. A screen capture of your final EDA process, with a brief description of that process.

ii. Summarise key results of your exploratory data analysis in Table 2.1 Results of Exploratory Data Analysis for house-prices.csv. Table 2.1 should include key characteristics of each variable in house-prices.csv data set such as maximum, minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values etc.

iii. Discuss the key results of the exploratory data analysis presented in Table 2.1 and provide a rationale for your selection of the top variables for predicting house prices (price). Focus in particular on the relationships of the independent variables with each other and with the dependent variable house price (price), drawing on the results of the EDA and the relevant literature on determinants of house price.
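
The assignment requires RapidMiner Studio for this analysis, so the following is not a substitute for the RapidMiner process. It is only a minimal Python sketch of what the statistics in Table 2.1 mean for a single numeric column, plus a Pearson correlation of the kind item iii asks you to discuss. The column names and all values are hypothetical and were invented for illustration; they are not taken from house-prices.csv.

```python
import math
import statistics


def summarise(values):
    """Summarise one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),    # most frequent value
        "missing": len(values) - len(present),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


# Hypothetical columns, invented for illustration only.
bedrooms = [3, 2, 4, 3, None, 3, 5]
living_area = [1000, 1500, 2000, 2500]
price = [200000, 290000, 410000, 500000]

summary = summarise(bedrooms)
r = pearson(living_area, price)
print(summary)
print(round(r, 3))
```

A correlation close to 1 or -1 between a candidate predictor and price supports including it, while a strong correlation between two predictors is a warning sign of multicollinearity.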

Task 2.2) Build and report on a Linear Regression model for predicting house price (price) using a RapidMiner data mining process and an appropriate set of data mining operators for the house-prices.csv data set, as determined by your exploratory data analysis in Task 2.1. Note this will require the use of a number of RapidMiner operators.

Provide the following for Task 2.2:
i. A screen capture of your Final Linear Regression Model process, with a brief description of that process.

ii. Table 2.2 named Results of Final Linear Regression Model for Task 2.2 for house-prices.csv data set.

iii. Discuss the results of the Final Linear Regression Model for the house-prices.csv data set, drawing on key outputs (coefficients, standardised coefficients, t-statistic values, p-values, significance levels, etc.) for predicting house price (price) and on relevant supporting literature on the interpretation of a Linear Regression Model (about 300 words).
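
To make the outputs named in item iii more concrete, here is a minimal sketch of ordinary least squares with a single predictor, written in plain Python rather than RapidMiner. It shows how a coefficient, its standard error, its t-statistic and its standardised coefficient relate to each other. The data are hypothetical, and a one-predictor model is an assumption made for brevity; the assignment's model will normally use several predictors.

```python
import math
import statistics


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by least squares and report key outputs."""
    n = len(xs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares, used to estimate the error variance.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,
        # Standardised coefficient: slope rescaled to standard-deviation units.
        "std_coef": slope * statistics.stdev(xs) / statistics.stdev(ys),
    }


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = simple_ols(x, y)
print(fit)
```

The p-value for the slope would come from comparing the t-statistic with a t distribution on n - 2 degrees of freedom. The Python standard library has no such distribution, so that step is omitted here; RapidMiner reports it directly in the model output.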

Include all appropriate outputs, such as RapidMiner processes, graphs and tables, that support key aspects of the exploratory data analysis and linear regression model analysis of the house-prices.csv data set in your Assignment 2 report.

Attachment:- Business Intelligence.rar


Reviews

len3129603

4/21/2022 4:39:08 AM

This is Assignment 2, covering data mining and text mining concepts and the 'House Price' data set. You must use RapidMiner Studio for Task 2. The data set file (House Price) is in the attachment. Read the full description carefully before you start so that nothing on the task list is missed; each task explains what needs to be done. These activities test your understanding of the key concepts and theories covered in this course. Please provide brief answers.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd