Apply key statistical theories and data mining concepts

Reference no: EM133188961

MIS772 Predictive Analytics - Deakin Business School

Assessment - Data Analysis and Report

Learning Outcome 1: Understand and apply key statistical theories and data mining concepts

This assignment aims for students to learn how to:
• Articulate problems and solutions in business terms
• Gain insights from data
• Prepare data for different models
• Develop classification models

Case Study Description
This case concerns a Portuguese banking institution with a growing customer base. The bank manager would like to use data analytics and machine learning to analyse customer data from the bank's direct marketing campaigns. The campaigns were based on phone calls, and often more than one contact with the same client was required.

The data set contains 45,211 observations with 17 variables, as described below:

bank client data:
1 - age (numeric)
2 - job: type of job (categorical)
3 - marital: marital status (categorical)
4 - education: education level (categorical)
5 - default: has credit in default? (binary)
6 - balance: average yearly balance, in euros (numeric)
7 - housing: has housing loan? (binary)
8 - loan: has personal loan? (binary)

related with the last contact of the current campaign:
9 - contact: contact communication type (categorical)
10 - day: last contact day of the month (numeric)
11 - month: last contact month of year (categorical)
12 - duration: last contact duration, in seconds (numeric)
13 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
14 - pdays: number of days that passed after the client was last contacted in a previous campaign (numeric, -1 means the client was not previously contacted)
15 - previous: number of contacts performed before this campaign and for this client (numeric)
16 - poutcome: outcome of the previous marketing campaign (categorical)
17 - y: has the client subscribed to a term deposit? (binary)
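
The -1 code in pdays is a marker, not a day count, so treating it as an ordinary number can distort distance-based models such as k-NN. The sketch below shows one possible way to separate the marker from the genuine values during data preparation. It is written in Python purely for illustration (the assignment itself asks for RapidMiner), and the helper name `split_pdays`, the output field names, and the sample values are assumptions, not part of the brief or the data set.

```python
from typing import Optional, TypedDict


class ContactHistory(TypedDict):
    """Hypothetical pair of derived attributes replacing raw pdays."""
    previously_contacted: bool
    days_since_contact: Optional[int]


def split_pdays(pdays: int) -> ContactHistory:
    """Separate the -1 sentinel from genuine day counts."""
    if pdays == -1:
        # -1 means the client was not contacted in a previous campaign.
        return {"previously_contacted": False, "days_since_contact": None}
    return {"previously_contacted": True, "days_since_contact": pdays}


# Made-up values for illustration only, not taken from the data set.
sample_pdays = [-1, 92, -1, 180]
prepared = [split_pdays(value) for value in sample_pdays]

for raw, row in zip(sample_pdays, prepared):
    print(raw, row)
```

Whether a binary flag, a binned attribute, or another encoding works best depends on the model and should be justified in the report.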

Financial AI is interested in generating insights about the clients, in particular by answering the questions below:

A. What is the distribution of customer age by marital status?

B. What are the (top 5) most popular occupations among the bank customers? Among them, which occupation has the highest average yearly balance? Which occupation has the most people who have completed tertiary education?

C. How can we reliably predict whether a client will subscribe to the term deposit? Define appropriate measures and compare the performance of different classifiers in predicting clients' subscriptions.

Financial AI wants you to use RapidMiner to process and explore the provided data, and then to develop and evaluate classifiers that predict customers' subscriptions to the term deposit while minimising misclassifications.
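
Questions A and B reduce to grouped summaries: ages grouped by marital status, and counts, mean balances and tertiary-education counts grouped by job. The following is a minimal Python sketch of those aggregations, offered only to make the logic concrete; the assignment expects the equivalent steps in RapidMiner. The rows are made up, the function names are hypothetical, and the category value "tertiary" is assumed to be how the education attribute labels tertiary education.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Made-up rows for illustration only; a real analysis would use the provided file.
rows = [
    {"age": 30, "job": "management", "marital": "single", "education": "tertiary", "balance": 1200},
    {"age": 45, "job": "blue-collar", "marital": "married", "education": "secondary", "balance": 300},
    {"age": 52, "job": "management", "marital": "married", "education": "tertiary", "balance": 2500},
    {"age": 38, "job": "technician", "marital": "divorced", "education": "secondary", "balance": 800},
    {"age": 27, "job": "technician", "marital": "single", "education": "tertiary", "balance": 400},
    {"age": 60, "job": "blue-collar", "marital": "married", "education": "primary", "balance": 100},
]


def age_summary_by_marital(rows):
    """Question A: summarise the age distribution within each marital status."""
    ages = defaultdict(list)
    for row in rows:
        ages[row["marital"]].append(row["age"])
    return {
        status: {"n": len(v), "min": min(v), "median": median(v), "max": max(v)}
        for status, v in ages.items()
    }


def top_jobs(rows, k=5):
    """Question B: the k most frequent occupations."""
    return [job for job, _ in Counter(row["job"] for row in rows).most_common(k)]


def mean_balance_by_job(rows, jobs):
    """Question B: average yearly balance for each of the selected occupations."""
    balances = defaultdict(list)
    for row in rows:
        if row["job"] in jobs:
            balances[row["job"]].append(row["balance"])
    return {job: mean(v) for job, v in balances.items()}


def tertiary_count_by_job(rows, jobs):
    """Question B: number of tertiary-educated clients per selected occupation."""
    return Counter(
        row["job"] for row in rows
        if row["job"] in jobs and row["education"] == "tertiary"
    )


age_summary = age_summary_by_marital(rows)
popular = top_jobs(rows, k=5)
avg_balance = mean_balance_by_job(rows, popular)
tertiary = tertiary_count_by_job(rows, popular)

print(age_summary)
print(popular)
print(max(avg_balance, key=avg_balance.get))
print(tertiary.most_common(1))
```

A box plot or histogram per marital status conveys the same information as the age summary visually, which is what the report's exploration section asks for.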

Task and Deliverables:
• Executive Summary: Define your problem and solution in business terms. In doing so, answer questions A, B and C, and cross-reference other report sections for support.
• Data Exploration, Pattern Discovery, and Preparation: Visualise the characteristics of the selected attributes. Use the visualisations to support your answers to questions A and B. Prepare data for predictive modelling. Transform attributes or create new ones as needed. Use appropriate analysis and data visualisation to investigate relationships between attributes (predictors and label). Interpret the results.
• Predictive Modelling: Create and explain two classification models, e.g., k-NN and Decision Tree, to address part of question C. Explain and justify your models' properties. Investigate and deal with any class imbalance.
• Model Evaluation and Improvement: Use hold-out and cross-validation to evaluate the models. Use honest testing. Compare the performance of the different models and select the best. Qualify how much we can trust the answer to question C.
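
To make the evaluation tasks above concrete, the sketch below illustrates two ideas in plain Python: a stratified hold-out split, which keeps the share of subscribers similar in the training and test parts, and class-specific measures (precision, recall, F1) computed from a confusion matrix. It is not the RapidMiner process the assignment requires, and it does not build k-NN or Decision Tree models. The function names, the 30% test fraction, and the 10-in-100 class ratio are assumptions chosen only to show why accuracy alone can mislead when classes are imbalanced.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * test_fraction)
        test.extend(indices[:cut])
        train.extend(indices[cut:])
    return sorted(train), sorted(test)


def evaluate(actual, predicted, positive="yes"):
    """Confusion-matrix measures for the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 10 subscribers among 100 clients, only to illustrate imbalance.
labels = ["yes"] * 10 + ["no"] * 90
train_idx, test_idx = stratified_holdout(labels)
test_labels = [labels[i] for i in test_idx]

# A baseline that always predicts the majority class.
baseline = ["no"] * len(test_labels)
scores = evaluate(test_labels, baseline)
print(scores)
```

On these made-up labels the always-"no" baseline scores high accuracy but zero recall for subscribers, which is why question C asks for appropriate measures rather than accuracy alone. Any real classifier should be compared against such a baseline on data it has not seen during training.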

Attachment: Data Analysis and Report.rar
