Compare the cluster centroids to characterize clusters

Reference no: EM132296289

Part 1: Frequent Flyers and Marketing

Overview - The dataset EastWestAirlinesCluster.csv contains information on 3,999 passengers who belong to an airline's frequent flyer program.

For each passenger, the data include information on their mileage history and on different ways they accrued or spent miles in the last year. The goal is to try to identify clusters of passengers that have similar characteristics for the purpose of targeting different segments for different types of mileage offers.

In R, your job is to:

  • Apply hierarchical clustering with Euclidean distance and Ward's method. Make sure to normalize the data first. How many clusters appear?
  • Tell me: What would happen if the data were not normalized?
  • Compare the cluster centroids to characterize the different clusters, and try to give each cluster a label.
  • Check the stability of the clusters by removing a random 5% of the data (that is, by taking a random sample of 95% of the records) and repeating the analysis. Does the same picture emerge?
  • Use k-means clustering with the number of clusters that you found above. Does the same picture emerge?
  • Tell me: Which clusters would you target for offers, and what types of offers would you target to customers in that cluster?
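
The Part 1 steps above can be sketched in base R roughly as follows. This is a minimal sketch under stated assumptions, not a solution: it runs on synthetic stand-in data so that it is self-contained, the column names (balance, flight_miles, bonus_trans) are hypothetical rather than taken from EastWestAirlinesCluster.csv, and k = 2 is chosen only because the synthetic data were generated with two groups. Normalization matters because Euclidean distance is dominated by whichever variables are measured on the largest scale.

```r
set.seed(1)

# Synthetic stand-in data with hypothetical column names (an assumption).
# With the real file you would instead start from something like:
#   flyers <- read.csv("EastWestAirlinesCluster.csv")
flyers <- data.frame(
  balance      = c(rnorm(60, 20000, 5000), rnorm(40, 120000, 20000)),
  flight_miles = c(rnorm(60, 200, 80),     rnorm(40, 3000, 600)),
  bonus_trans  = c(rnorm(60, 5, 2),        rnorm(40, 25, 4))
)

# Normalize each column to z-scores so no variable dominates the distance
flyers_z <- scale(flyers)

# Hierarchical clustering: Euclidean distance, Ward's method
# ("ward.D2" is the hclust option that takes unsquared distances)
hc <- hclust(dist(flyers_z, method = "euclidean"), method = "ward.D2")
# plot(hc, labels = FALSE)  # inspect the dendrogram to judge the cluster count

k <- 2  # appropriate for the synthetic data only
hc_cluster <- cutree(hc, k = k)

# Cluster centroids, reported in the original units for easier labeling
centroids <- aggregate(flyers, by = list(cluster = hc_cluster), FUN = mean)
print(centroids)

# Stability check: repeat the analysis on a random 95% of the records
idx <- sample(nrow(flyers), size = floor(0.95 * nrow(flyers)))
hc_sub <- hclust(dist(scale(flyers[idx, ])), method = "ward.D2")
sub_cluster <- cutree(hc_sub, k = k)
print(table(full = hc_cluster[idx], subsample = sub_cluster))

# k-means with the same number of clusters, cross-tabulated against hclust
km <- kmeans(flyers_z, centers = k, nstart = 25)
print(table(hierarchical = hc_cluster, kmeans = km$cluster))
```

Cluster numbers are arbitrary labels, so in the two cross-tabulations agreement shows up as each row having most of its counts in a single column, not necessarily on the diagonal.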

Part 2: Classifying Internet Discussion Posts

Overview - In this problem, you will use the data from the chapter assigned for this week, particularly problem 20.6 Online Discussions on Autos and Electronics, in which the task is to develop a model to classify documents as either auto-related or electronics-related.

In R, your job is to:

  • Load the above file into R and create a label vector.
  • Preprocess the documents. Explain what would be different if you did not perform the "stemming" step.
  • Use the lsa package from R to create 10 concepts. Explain what is different about the concept matrix, as opposed to the TF-IDF matrix.
  • Using this matrix, fit a predictive model (different from the model presented in the chapter illustration) to classify documents as autos or electronics. Compare its performance to that of the model presented in the chapter illustration.
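
To make the difference between the TF-IDF matrix and the concept matrix concrete, here is a minimal base R sketch. It is an illustration under assumptions, not the assignment's method: the four-document corpus and its labels are invented toy data, the TF-IDF weighting is one simple variant, and base R's `svd()` is used as a stand-in for the truncated decomposition that the lsa package performs, so that the example runs without extra packages. Two concepts are used here only because the toy corpus is tiny; the assignment asks for 10.

```r
# Toy corpus (hypothetical posts), already lower-cased and tokenized
docs <- list(
  c("engine", "oil", "brake", "engine"),
  c("brake", "tire", "engine"),
  c("battery", "circuit", "screen"),
  c("screen", "circuit", "battery", "battery")
)
label <- c(1, 1, 0, 0)  # label vector: 1 = autos, 0 = electronics

# Term-document count matrix: rows are terms, columns are documents
terms <- sort(unique(unlist(docs)))
tdm <- vapply(
  docs,
  function(d) as.numeric(table(factor(d, levels = terms))),
  numeric(length(terms))
)
rownames(tdm) <- terms

# TF-IDF: term counts weighted by log(number of docs / docs containing term)
idf <- log(ncol(tdm) / rowSums(tdm > 0))
tfidf <- tdm * idf  # the vector recycles down columns, so row i is scaled by idf[i]

# Truncated SVD: keep only the first n_concepts singular vectors
n_concepts <- 2
s <- svd(tfidf)
doc_concepts <- s$v[, 1:n_concepts, drop = FALSE] %*%
  diag(s$d[1:n_concepts], nrow = n_concepts)

# TF-IDF has one row per term; the concept matrix has one row per document
# and one column per concept, so it is much smaller and dense
print(dim(tfidf))
print(dim(doc_concepts))
print(round(doc_concepts, 3))
```

Each row of `doc_concepts` is a document described by a few concept scores instead of one weight per term, and a matrix of that shape, together with `label`, is what a classifier would be trained on.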

Attachment: Assignment Files.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd