Visualisation and model development assessment

Reference no: EM133000457

BDA601 Big Data and Analytics - Laureate International Universities

Assessment - Visualisation and Model Development

Learning Outcome 1: Apply data science principles to the cleaning, manipulation, and visualisation of data;
Learning Outcome 2: Design analytical models based on a given problem; and
Learning Outcome 3: Effectively report and communicate findings to an appropriate audience.

Task Summary
Customer churn, also known as customer attrition, refers to the movement of customers from one service provider to another. It is well known that attracting new customers costs significantly more than retaining existing customers. Additionally, long-term customers are found to be less costly to serve and less sensitive to competitors' marketing activities. Thus, predicting customer churn is valuable to telecommunication industries, utility service providers, paid television channels, insurance companies and other business organisations providing subscription-based services. Customer-churn prediction allows for targeted retention planning.

In this Assessment, you will build a machine learning (ML) model to predict customer churn using the principles of ML and big data tools.
As part of this Assessment, you will write a 1,000-word report that will include the following:
a) A predictive model from a given dataset that follows data mining principles and techniques;
b) Explanations as to how to handle missing values in a dataset; and
c) An interpretation of the outcomes of the customer churn analysis.
Please refer to the Task Instructions (below) for details on how to complete this task.

Task Instructions
1. Dataset Construction

The Kaggle Telco churn dataset is a sample dataset from IBM, containing 21 attributes of approximately 7,043 telecommunication customers. In this Assessment, you are required to work with a modified version of this dataset (the dataset can be found at the URL provided below). Modify the dataset by removing the following attributes: MonthlyCharges, OnlineSecurity, StreamingTV, InternetService and Partner.
As the dataset is in .csv format, any spreadsheet application, such as Microsoft Excel or OpenOffice Calc, can be used to modify it. You will use your resulting dataset, which should comprise 7,043 observations and 16 attributes, to complete the subsequent tasks. The ‘Churn’ attribute (i.e., the last attribute in the dataset) is the target of your churn analysis.
Kaggle.com. (2020). Telco customer churn-IBM sample data sets.
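
The attribute removal described above can also be scripted instead of done in a spreadsheet. The following is a minimal sketch using only Python's standard csv module. The five removed attribute names come from the task; the sample rows, the other column names and the helper name drop_attributes are assumptions made for illustration.

```python
import csv
import io

# Attributes the task asks to remove from the original 21-column dataset.
DROPPED = ["MonthlyCharges", "OnlineSecurity", "StreamingTV", "InternetService", "Partner"]


def drop_attributes(reader, writer, dropped=DROPPED):
    """Copy rows from a csv reader to a csv writer, omitting the dropped columns.

    Returns (number_of_data_rows, kept_column_names).
    """
    header = next(reader)
    keep = [i for i, name in enumerate(header) if name not in dropped]
    kept_names = [header[i] for i in keep]
    writer.writerow(kept_names)
    n_rows = 0
    for row in reader:
        writer.writerow([row[i] for i in keep])
        n_rows += 1
    return n_rows, kept_names


# Tiny in-memory stand-in for the real file (invented rows, only a few columns).
sample = io.StringIO(
    "customerID,Partner,tenure,InternetService,MonthlyCharges,Churn\n"
    "0001,Yes,1,DSL,29.85,No\n"
    "0002,No,34,Fiber optic,56.95,Yes\n"
)
out = io.StringIO()
n_rows, kept = drop_attributes(csv.reader(sample), csv.writer(out))
print(n_rows, kept)
```

With the real file, the same function could be given csv.reader and csv.writer objects wrapped around files opened with newline="". The returned row count and column list can then be checked against the 7,043 observations and 16 attributes that the task expects.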

2. Model Development
From the dataset constructed in the previous step, present appropriate data visualisation and descriptive statistics, then develop a ‘decision-tree’ model to predict customer churn. The model can be developed in Jupyter Notebook using Python and Spark's Machine Learning Library (PySpark MLlib). You can use any other platform if you find it more efficient. The notebook should include the following sections:
a) Problem Statement
In this section, briefly state the context and the problem you will solve in the notebook.
b) Exploratory Data Analysis
In this section, perform both a visual and statistical exploratory analysis to gain insights about the dataset.
c) Data Cleaning and Feature Selection
In this section, perform data pre-processing and feature selection for the model, which you will build in the next section.
d) Model Building
In this section, use the pre-processed data and the selected features to build a ‘decision-tree’ model to predict customer churn.
In the notebook, the code should be well documented, the graphs and charts should be neatly labelled, the narrative text should clearly state the objectives and a logical justification for each of the steps should be provided.
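
To make the model-building step more concrete, the sketch below shows how a decision tree can choose its first split using Gini impurity, one common split criterion. It is plain Python rather than PySpark MLlib, and is meant only to illustrate the idea. The toy rows, the attribute names other than Churn, and the function names are assumptions invented for this example and are not taken from the dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, attribute, target="Churn"):
    """Weighted Gini impurity after partitioning rows by a categorical attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    n = len(rows)
    return sum(len(g) / n * gini(g) for g in groups.values())


def best_attribute(rows, attributes, target="Churn"):
    """Attribute whose split leaves the lowest weighted impurity."""
    return min(attributes, key=lambda a: split_impurity(rows, a, target))


# Invented toy rows: Contract separates churners perfectly, PaperlessBilling does not.
rows = [
    {"Contract": "Month-to-month", "PaperlessBilling": "Yes", "Churn": "Yes"},
    {"Contract": "Month-to-month", "PaperlessBilling": "No", "Churn": "Yes"},
    {"Contract": "Two year", "PaperlessBilling": "Yes", "Churn": "No"},
    {"Contract": "Two year", "PaperlessBilling": "No", "Churn": "No"},
]

root = best_attribute(rows, ["Contract", "PaperlessBilling"])
print(root, split_impurity(rows, "Contract"), split_impurity(rows, "PaperlessBilling"))
```

A full tree repeats this choice recursively on each partition until a stopping rule is met. A library implementation also handles numeric attributes, depth limits and pruning, which this sketch omits.
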
3. Handling Missing Values
The given dataset has very few missing values; however, in a real-world scenario, data scientists often need to work with datasets with many missing values. If an attribute is important for building an effective model and has a significant number of missing values, then data scientists need to come up with strategies to handle those missing values.
From the ‘decision-tree’ model built in the previous step, identify the most important attribute. If a significant number of values were missing in the most important attribute column, implement a method to replace the missing values and describe that method in your report.
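
One simple replacement method is to fill missing entries with the median of the observed values for a numeric attribute, or with the most frequent value for a categorical one. The sketch below assumes missing values are represented as None. The column contents and the function name impute are invented for illustration, and other strategies (for example, model-based imputation) may suit the data better.

```python
from collections import Counter
from statistics import median


def impute(values, strategy):
    """Return a copy of values with None replaced by the median or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "median":
        fill = median(observed)
    elif strategy == "mode":
        fill = Counter(observed).most_common(1)[0][0]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Invented example columns with gaps.
tenure = [1, 34, None, 45, 2, None]
contract = ["Month-to-month", None, "Two year", "Month-to-month"]

tenure_filled = impute(tenure, "median")
contract_filled = impute(contract, "mode")
print(tenure_filled)
print(contract_filled)
```

Whichever method is chosen, the fill value should be computed from the training data only and then reused on the test data, so that the evaluation is not influenced by information from the test set.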

4. Interpretation of Churn Analysis
Modelling churn is difficult because there is inherent uncertainty when measuring churn. Thus, it is important not only to understand any limitations associated with a churn analysis but also to be able to interpret the outcomes of a churn analysis.
In your report, interpret and describe the key findings that you were able to discover as part of your churn analysis. Describe the following facts with supporting details:
• The effectiveness of your churn analysis: In what percentage of cases was your analysis able to correctly identify churn? Can this be considered a satisfactory outcome? Explain why or why not;
• Who is churning: Describe the attributes of the customers who are churning and explain what is driving the churn; and
• Improving the accuracy of your churn analysis: Describe the effects that your previous steps (model development and handling of missing values) had on the outcome of your churn analysis and how the accuracy of your churn analysis could be improved.
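
When judging effectiveness, it can help to separate overall accuracy from the share of actual churners that the model catches (recall). The sketch below computes both from actual and predicted labels. The labels and the function name churn_metrics are invented for illustration, and "Yes" is assumed to mark a churned customer.

```python
def churn_metrics(actual, predicted, positive="Yes"):
    """Accuracy, precision and recall for the churn class from paired label lists."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p == positive:
            fp += 1
        elif a == positive and p != positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Invented labels: 2 churners among 10 customers, and a model that never predicts churn.
actual = ["No"] * 8 + ["Yes"] * 2
always_no = ["No"] * 10

metrics = churn_metrics(actual, always_no)
print(metrics)
```

In this invented case the model is right for 80% of customers yet identifies none of the churners, which is why accuracy alone may not show whether an outcome is satisfactory when churners are a minority.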

Attachment:- Visualisation and Model Development.rar



