Reference no: EM131113575
Data Mining - Milestone Two
Page length requirements: 1-2 pages
Overview
Milestone Two is centered on tools and visualizations. Describe a simple sampling strategy that might be used to address the business question from a subset of the customer population. What type of information do you need to address the question?
The scenario: Bubba Gump Shrimp Company is a successful retailer of regional food, both in its restaurants and through other retail channels. Bubba Gump began as a small, privately owned restaurant. Thanks to unexpected exposure from a blockbuster movie, Bubba Gump grew rapidly from its humble beginnings and now operates several restaurants, sells branded merchandise through an online retail site, and wholesales its branded merchandise to other retail outlets. Bubba Gump's growth was initially very rapid, driven by the strong demand and high name recognition that followed its movie exposure. After the first few years of rapid growth, sales increased at slower rates and finally leveled off. Sales have declined in each of the last two years.
Bubba Gump Shrimp Company has collected a large amount of data about its business, including restaurant point-of-sale (POS) data, web channel sales performance, customer information through restaurant loyalty programs, and customer and sales transaction data through its website and retail partners. Bubba Gump's leadership has decided to commission an analysis of the company's vast data assets to better understand its customers and look for ways to create new revenue growth.
You have been assigned to plan, conduct, and report on this data mining initiative for Bubba Gump Shrimp Company. The company data that is available to you includes Bubba Gump's restaurant point-of-sale (cash register, credit card) data, its customer database (collected from its restaurant loyalty program and online sales channel), its web store sales transaction data, and customer and sales data from third-party retailers.
All of Bubba Gump's data has recently been integrated in a data warehouse. That enterprise data warehouse was built specifically to support data mining initiatives like the one you have been assigned to conduct, by consolidating data from multiple operations and channels in one place and integrating the data across sources for a complete view of the customer experience. For example, Bubba Gump analysts can now, for the first time, link sales transactions to specific customers at specific restaurants. It also means that you can link customer transactions across channels; that is, for any given customer, you can link their restaurant purchases, their online purchases, and (in some cases) their purchases from third-party retail partners.
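As a rough illustration of what linking transactions across channels could look like, the sketch below groups spend by customer and by channel. The row layout, customer IDs, channel names, amounts, and the helper `spend_by_channel` are all hypothetical; the actual warehouse schema is not described in this brief.

```python
from collections import defaultdict

# Hypothetical rows: (customer_id, channel, amount).
# These values are illustrative only and do not come from the warehouse.
transactions = [
    ("C001", "restaurant", 42.50),
    ("C001", "web", 19.99),
    ("C002", "restaurant", 31.00),
    ("C003", "third_party", 12.75),
    ("C001", "restaurant", 27.25),
]


def spend_by_channel(rows):
    """Total each customer's spend per channel, keyed by customer ID."""
    totals = defaultdict(lambda: defaultdict(float))
    for customer_id, channel, amount in rows:
        totals[customer_id][channel] += amount
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {cid: dict(channels) for cid, channels in totals.items()}


profile = spend_by_channel(transactions)
```

Here `profile["C001"]` holds one customer's restaurant and web totals side by side, which is the kind of cross-channel view the integrated warehouse is meant to make possible.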
You have been selected to develop and execute the data mining analysis plan for Bubba Gump's customer analysis project. Your project will be the first major data mining project conducted against the new Bubba Gump data warehouse. Because Bubba Gump's data was not previously integrated in a single data warehouse, company leadership has never been able to analyze its customers across their complete experience. In other words, customer restaurant purchases, online purchases, and third-party retailer purchases could not be analyzed together previously; each channel had to be analyzed separately.
As a first step, a sample of 500 customers has been selected from the analytics data warehouse and given a survey in exchange for purchase credits at one of Bubba Gump's sales channels. The survey sample was selected from the universe of customers who have made purchases from at least one Bubba Gump outlet (restaurant, web store, etc.). Responses to various customer satisfaction questions were recorded, and historical purchase information has been extracted from the data warehouse for each customer in the sample.
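One simple way to draw such a sample is proportional stratified random sampling. The brief does not say how the 500 customers were actually selected, so the sketch below is only one possible approach. The `channel` field, the population sizes, and the function `stratified_sample` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(customers, key, n, seed=0):
    """Draw about n customers, giving each stratum a share proportional to its size.

    Because each stratum's share is rounded, the total can differ slightly from n.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for customer in customers:
        strata[key(customer)].append(customer)

    total = len(customers)
    sample = []
    for members in strata.values():
        # At least one customer per stratum, never more than the stratum holds.
        k = min(len(members), max(1, round(n * len(members) / total)))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 6,000 restaurant, 3,000 web, 1,000 third-party customers.
population = [
    {"id": i,
     "channel": "restaurant" if i < 6000 else "web" if i < 9000 else "third_party"}
    for i in range(10000)
]

sample = stratified_sample(population, key=lambda c: c["channel"], n=500)
```

Stratifying by channel keeps smaller groups, such as third-party buyers, from being missed by chance. A plain simple random sample (`random.sample(population, 500)`) is the simpler alternative when no stratifying variable matters.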
Your task is to analyze the survey responses to understand whether there are natural "clusters" within Bubba Gump's customer population. You are then to create a visualization of this survey data that describes Bubba Gump's customers across any dimensions that define those subgroups.
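To make the idea of natural "clusters" concrete, here is a minimal sketch of k-means clustering written in plain Python. The brief does not prescribe k-means or any other algorithm. The two features (a satisfaction score and an annual spend figure), the six data points, and the choice of k = 2 are invented for illustration.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Basic k-means (Lloyd's algorithm) on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical survey rows: (satisfaction score, annual spend in hundreds of dollars).
responses = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (4.5, 8.0), (4.8, 8.5), (4.2, 7.9)]
labels, centroids = kmeans(responses, k=2)
```

In practice the features would usually be standardized first so that no single survey dimension dominates the distance calculation, and several values of k would be compared before settling on one.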
Your Assignment
In your response, address the following critical elements:
Analysis Tools
What data mining tools will you use to perform the analysis? Why these particular ones?
Data Visualizations
What data visualizations will you use in your report, and why?
Research Question
What is the specific research question that needs to be addressed? What research question will you work from in order to analyze the given data for meaningful patterns?
Research Measurement
How will you determine whether your research question was answered or whether your hypothesis generation was successful? How will you measure progress?
Follow-Up Questions
What are cogent follow-up questions or explorations that should follow from your initial research?
Research and Support
Are there any published sources or other resources that address your line of inquiry? Where do they fall short? How will they help guide your analysis?
Guidelines
Your assignment must follow these formatting guidelines: double spacing, 12-point Times New Roman font, one-inch margins, and APA citations.