Reference no: EM133340172
Assignment - Create clusters of songs based on their attributes using K-means clustering
- ABC is an Asian bank. It sells multiple banking products to its customers, such as savings accounts, credit cards, and investments. It wants to predict which customers will purchase its credit cards. For this purpose, it has various kinds of information about its customers, including their demographic details and banking behavior.
- Once it can predict the chances that a customer will purchase a product, it wants to use that prediction to make pre-payment to the authors.
Data details:
Number of Attributes: 18 (16 predictive attributes, 1 non-predictive, 1 goal field)
- Predict which customers will subscribe to a term deposit (product).
- Use different methods, and suggest the best method for this problem.
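Suggesting a best method requires scoring every candidate in the same way. The following is a minimal sketch of such a yardstick, assuming the target is encoded as 1 for "yes" and 0 for "no". The function name `classification_metrics` and the sample labels are hypothetical and are not taken from the assignment data. If the "yes" class turns out to be a small minority, precision, recall and F1 are likely to be more informative than accuracy alone.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = "yes")."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true labels and predictions, for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Computing the same dictionary on a held-out test set for each method gives a like-for-like comparison.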
Attribute information:
Input variables:
# bank client data:
0 - Id (ID variable)
1 - age (numeric)
2 - job : type of job (categorical: "admin.","unknown","unemployed","management","housemaid","entrepreneur","student","blue-collar","self-employed","retired","technician","services")
3 - marital : marital status (categorical: "married","divorced","single"; note: "divorced" means divorced or widowed)
4 - education (categorical: "unknown","secondary","primary","tertiary")
5 - default: has credit in default? (binary: "yes","no")
6 - balance: average yearly balance, in euros (numeric)
7 - housing: has housing loan? (binary: "yes","no")
8 - loan: has personal loan? (binary: "yes","no")
# related with the last contact of the current campaign:
9 - contact: contact communication type (categorical: "unknown","telephone","cellular")
10 - day: last contact day of the month (numeric)
11 - month: last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec")
12 - duration: last contact duration, in seconds (numeric)
# other attributes:
13 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
14 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric, -1 means client was not previously contacted)
15 - previous: number of contacts performed before this campaign and for this client (numeric)
16 - outcome: outcome of the previous marketing campaign (categorical: "unknown","other","failure","success")
Output variable (desired target):
17 - y: has the client subscribed to a term deposit (product)? (binary: "yes","no")
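Most classifiers need numeric inputs, so the mixed attribute types listed above have to be encoded first. The following is a rough sketch of one possible encoding, assuming each record arrives as a dictionary of strings keyed by the attribute names above. The helper names (`encode_record`, `encode_target`) and the sample record are hypothetical. The `month` attribute is left out for brevity and could be one-hot encoded in the same way; `Id` is skipped because it is non-predictive.

```python
# Category levels copied from the attribute list above.
CATEGORICAL = {
    "job": ["admin.", "unknown", "unemployed", "management", "housemaid",
            "entrepreneur", "student", "blue-collar", "self-employed",
            "retired", "technician", "services"],
    "marital": ["married", "divorced", "single"],
    "education": ["unknown", "secondary", "primary", "tertiary"],
    "contact": ["unknown", "telephone", "cellular"],
    "outcome": ["unknown", "other", "failure", "success"],
}
BINARY = ["default", "housing", "loan"]
NUMERIC = ["age", "balance", "day", "duration", "campaign", "previous"]


def encode_record(record):
    """Turn one raw record (dict of strings) into a flat numeric feature dict."""
    features = {}
    for name in NUMERIC:
        features[name] = float(record[name])
    for name in BINARY:
        features[name] = 1.0 if record[name] == "yes" else 0.0
    # One-hot encode each categorical attribute.
    for name, levels in CATEGORICAL.items():
        for level in levels:
            features[f"{name}={level}"] = 1.0 if record[name] == level else 0.0
    # pdays uses -1 as a sentinel for "not previously contacted";
    # split it into a flag plus a non-negative day count.
    pdays = int(record["pdays"])
    features["previously_contacted"] = 0.0 if pdays == -1 else 1.0
    features["pdays"] = 0.0 if pdays == -1 else float(pdays)
    return features


def encode_target(record):
    """Map the goal field y to 1 ("yes") or 0 ("no")."""
    return 1 if record["y"] == "yes" else 0


# Hypothetical record with made-up values, for illustration only.
sample = {
    "Id": "1", "age": "41", "job": "management", "marital": "married",
    "education": "tertiary", "default": "no", "balance": "1500",
    "housing": "yes", "loan": "no", "contact": "cellular", "day": "12",
    "month": "may", "duration": "180", "campaign": "2", "pdays": "-1",
    "previous": "0", "outcome": "unknown", "y": "no",
}
features = encode_record(sample)
target = encode_target(sample)
```

Splitting `pdays` into a flag and a count is a design choice: it keeps the -1 sentinel from being read as a real number of days by models that treat the column as a continuous quantity.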
Assignment
Answer the following Questions:
1. Which of the following methods is the best to use for classification? Provide reasoning.
   1. Neural Networks
   2. Logistic Regression
   3. Decision Trees
   4. Random Forest
2. For what purpose do we use Market Basket Analysis?
3. Where do we use clustering? Provide real-life examples.
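One common real-life use of clustering is customer segmentation. As an illustration of the K-means technique named in the assignment title, the following is a minimal sketch of Lloyd's algorithm in plain Python. The function name `kmeans`, the choice to seed the centroids with the first k points, and the sample points (age, balance in thousands) are all assumptions made for illustration, not part of the assignment data.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Simplification (assumption): seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the algorithm has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments


# Hypothetical customers as (age, balance in thousands), for illustration only.
points = [(25, 1.0), (60, 9.0), (27, 1.5), (58, 8.0), (30, 2.0), (62, 10.0)]
centroids, assignments = kmeans(points, k=2)
```

In practice the features would usually be scaled first, since an unscaled attribute with a large range can dominate the Euclidean distance, and the starting centroids are normally chosen at random over several restarts.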
Attachment:- K-means clustering.rar