Perform the necessary tasks using RapidMiner

Reference no: EM131754499

Comparing Methods Assignment

This assignment will be completed in teams of students.

Introduction

The purpose of this assignment is to demonstrate your knowledge and understanding of the analytical techniques and tools learned in the course and to show your understanding of how they relate to a business scenario. This assignment is somewhat different from previous ones: I do not give you very detailed instructions on how to build your analytical process in RapidMiner. Instead, you are expected to do the modeling, validation and performance analysis on the given dataset so that you can answer the questions below and make recommendations that apply to the business situation.

Submission Instruction

Perform the necessary tasks using RapidMiner, answer the questions below and prepare the required screenshots.

- a Word document file with the answers and screenshots to the lettered questions. (Make sure that the lettering of questions stays the same!) Place the team member names at the top of the document. Name your file Comparing Methods Assignment LastName1-LastName2... .docx. (Warning: for full points, make sure that you name documents correctly and keep the answers correctly lettered.)

- the RapidMiner project file, named Comparing Methods Assignment LastName1-LastName2... .rmp. (The project file can be generated from RapidMiner by going to File -> Export Process. Select the destination folder and the name for the file. It will be saved as a .rmp file.)

Instructions

Download the mobile-churn.csv file posted on Canvas. The file contains a dataset collected by a phone company about attrition, in other words, about customers who cancelled their services and possibly signed up with another company. The company is interested in what it could do to keep customers and prevent their defection. Look at the data and make some recommendations based on the findings of your analysis.

Here is the explanation of the variables in the dataset:

a. Gender_Female: female or not
b. PhoneService_Yes: whether the customer has phone service with the company
c. MultipleLines_Yes: whether the customer has multiple line service
d. InternetService_DSL: whether the customer has DSL internet
e. InternetService_Fiber optic: whether the customer has Fiber optic internet
f. StreamingTV_Yes: customer streams TV
g. StreamingMovies_Yes: customer streams movies
h. Contract_One year: type of contract for the customer: 1 yr
i. Contract_Two year: type of contract for the customer: 2 yr
j. PaperlessBilling_Yes: whether the customer signed up for paperless billing
k. PaymentMethod_ Automatic: payment set up to be automatic
l. Retired: 0 for not, 1 for yes
m. Tenure (months): how long the customer has been with the company
n. MonthlyCharges: $ amount of monthly payments for the subscribed services
o. Churn: whether the customer churned (i.e. is not a customer any more)

1. As a first step, build 3 models using different classification techniques (Neural Net: use the default settings; Decision Tree: use gini_index as the criterion; and Logistic Regression: use the default settings) that are capable of classifying customers into 2 categories (churn/no churn). Use the X-validation operator right away for each technique used. Set the number of folds to 3 (this will result in shorter process runtimes). For measuring the performance of the 3 models, look at the following performance measures: Accuracy, Kappa, Lift, F-measure, AUC (NOT the optimistic or pessimistic variants). (Hint: use the binomial classification performance operator to obtain all of these measures.)

For each of the 3 models, make readable screenshots of the following (9 screenshots in total; 9 pts):

- Top level processes
- Parameter settings for the 3 different techniques that are inside the cross validation operator
- Appropriate model results (Network, Tree, Weights)
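To make the idea of 3-fold cross-validation concrete, here is a minimal sketch in plain Python of how rows could be dealt into 3 folds so that each fold keeps roughly the same share of churners. It assumes stratified sampling and is not a description of what RapidMiner does internally; the function names and the label list are hypothetical and are not taken from mobile-churn.csv.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=3, seed=42):
    """Assign each row index to one of k folds, keeping class shares similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical labels: 1 = churn, 0 = no churn (illustration only).
labels = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
folds = stratified_folds(labels, k=3)
for train, test in train_test_splits(folds):
    # Each model would be trained on `train` and scored on `test`.
    print(len(train), len(test))
```

Each row is used for testing exactly once and for training in the other two rounds, which is why the reported performance is an average over the 3 test folds.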

2.

a. Make a screenshot of the confusion matrix output for each of the 3 methods.

b. Prepare a table to report the 5 performance measures for the 3 models. Put the different models in the rows and have 5 columns for the 5 measures.

 

Model | Accuracy | Kappa | Lift | F | AUC
NN    |          |       |      |   |
DT    |          |       |      |   |
LR    |          |       |      |   |

c. Discuss the performance of each of the three models based on the performance measures. Relate the performances to the baseline model (calculate the a priori probabilities first!).

Prepare a visual evaluation of the 3 models by including a screenshot of the ROC comparison chart. (Hint: Use the Compare ROC operator. Have the same models with the same parameters as in the other runs above.)
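As an illustration of the two ideas in this question, the sketch below computes the a priori class probabilities (whose largest value is the accuracy of a baseline that always predicts the majority class) and an AUC value from model confidences. The AUC here uses the rank interpretation: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as one half. The labels, scores and function names are hypothetical and do not come from the assignment dataset or from RapidMiner.

```python
from collections import Counter


def prior_probabilities(labels):
    """Return the share of each class in the data (the a priori probabilities)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def auc_from_scores(scores, labels, positive=1):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = churn, 0 = no churn; scores are churn confidences.
labels = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]

priors = prior_probabilities(labels)
baseline_accuracy = max(priors.values())  # always predict the majority class
auc = auc_from_scores(scores, labels)
print(priors, baseline_accuracy, auc)
```

A model is only useful if its accuracy clearly exceeds the baseline accuracy, and an AUC of 0.5 corresponds to random ranking.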

d. Using the observed performance measures, compare the performance of the 3 models. Do they perform the same? Which one is better or worse, and why?

e. Are the 3 models giving you more or less the same suggestions regarding the important factors/variables? If there are differences, what are they?
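When filling in and comparing the measures for question 2, it can help to see how four of the five follow directly from a confusion matrix. The sketch below uses standard textbook definitions; lift is taken here as precision divided by the overall share of positives, which is one common definition and may differ from the exact formula RapidMiner applies. The counts are hypothetical, and the function does not guard against empty classes or zero denominators.

```python
def binary_measures(tp, fp, fn, tn):
    """Compute accuracy, kappa, lift and F-measure from confusion matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)

    # Lift (assumed definition): precision relative to the positive base rate.
    lift = precision / ((tp + fn) / n)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "lift": lift,
        "f_measure": f_measure,
    }


# Hypothetical counts with churn as the positive class (illustration only).
measures = binary_measures(tp=40, fp=20, fn=60, tn=280)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Kappa and lift both relate the model to what chance or the base rate would give, so they are often more informative than raw accuracy when churners are a minority.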

3. Choose one of the models (possibly the best performing one) and address the following questions: How can you interpret the results of the model? Which attributes seem to matter the most? How do you know it? Discuss their importance and/or effect sizes.
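If the chosen model is the logistic regression, one way to discuss effect sizes is to convert each coefficient into an odds ratio by exponentiating it. The sketch below assumes made-up coefficient values purely for illustration; they are not results from the churn data, and the attribute names are reused from the variable list only as labels.

```python
import math

# Hypothetical logistic regression coefficients (NOT real results).
coefficients = {
    "Contract_Two year": -1.5,
    "InternetService_Fiber optic": 0.9,
    "PaperlessBilling_Yes": 0.3,
}


def odds_ratios(coefs):
    """Convert log-odds coefficients into odds ratios via exp(coefficient)."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


ratios = odds_ratios(coefficients)

# Rank attributes by absolute coefficient size (largest effect first).
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)
for name in ranked:
    print(f"{name}: coefficient={coefficients[name]:+.2f}, odds ratio={ratios[name]:.2f}")
```

An odds ratio below 1 means the attribute is associated with lower odds of churn, and above 1 with higher odds. Comparing coefficient magnitudes in this way is only fair when the attributes are on comparable scales, for example 0/1 indicators; Tenure and MonthlyCharges would need to be standardized first.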

4. How could the results of the model be useful for the telecommunications company? What business recommendations can be suggested based on the results?

Attachment: Mobile-Churn.rar
