A primary deliverable for this course will be a group-based analytics project proposal in which you will address a business question that can be answered using the analytical techniques we studied. This document should be 5 pages long (approximately 2500 words) and should contain the sections below, which are recognized as best practices in analytics by the Institute for Operations Research and the Management Sciences (INFORMS). Note: for parts IV-V, you are not required to have gathered real data. You will conduct all summaries, visualizations, and hypothesis tests on simulated data. See the video on D2L called "Generating Random Data That Follows a Distribution" for instructions.
I. Business Problem Framing. Provide a detailed description of the business question or problem in terms a layperson can understand. For example: Why is revenue low? Why is a certain process slow? Why are there so many returned products? Why is repeat business so low? Why is turnover so high? How can we accurately predict sales? Identify stakeholders and risks, and delineate the anticipated benefits of finding an answer.
II. Analytics Problem Framing. Explain how the business question can be answered using analytics. What are the variables? What data sources are available? Assess the costs and the probability of success. Discuss the kinds of tools and personnel you will need to complete the project successfully.
III. Data. Explain the source of the data you would collect to adequately answer your question. How and when would it be generated? Who would create it? What format would it be in? Would there be any security or privacy concerns? Would there be any special validity problems with the data? (For example, social media data tends to be not only untrustworthy, but also full of spelling and grammar mistakes.) Note: in these questions, you are asked to comment on problems that would arise with any real data you would collect, even though your project data will be simulated.
IV. Methodology. Explain in detail how you generated the simulated data. What distribution did you use? What characteristics of the data-generating process led you to choose that distribution? Did you use the inverse CDF method? If so, which inverse functions did you use, and with what parameters? Generate descriptive statistics and visualizations to describe your data.
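To illustrate the kind of simulation this section asks you to describe, here is a minimal sketch of the inverse CDF method in Python, using only the standard library. It assumes an exponential distribution, whose CDF F(x) = 1 - exp(-rate * x) inverts to x = -ln(1 - u) / rate. The scenario (minutes between customer arrivals), the function name sample_exponential, the rate of 0.25, and the seed are all hypothetical choices for illustration. They are not part of the assignment, and your own distribution and parameters should follow from your data-generating process.

```python
import math
import random
import statistics


def sample_exponential(rate, n, seed=0):
    """Draw n exponential variates with the inverse CDF method."""
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    # F(x) = 1 - exp(-rate * x)  =>  F^{-1}(u) = -ln(1 - u) / rate
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the log is defined.
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]


# Hypothetical scenario: minutes between customer arrivals, mean of 4 minutes.
wait_times = sample_exponential(rate=0.25, n=5000, seed=42)

# Descriptive statistics for the simulated sample.
summary = {
    "mean": statistics.fmean(wait_times),
    "median": statistics.median(wait_times),
    "stdev": statistics.stdev(wait_times),
    "min": min(wait_times),
    "max": max(wait_times),
}
print(f"n: {len(wait_times)}")
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For an exponential distribution the theoretical mean is 1 / rate and the theoretical median is ln(2) / rate, so comparing those values against the sample summary is a quick sanity check that the simulation behaves as intended.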
V. Model Building. What analytical techniques would you use to answer your question? Examples include the z-test, one-sample t-test, independent-samples t-test, related-samples t-test, ANOVA, chi-squared test, linear regression, logistic regression, decision analysis, and linear programming. Conduct the analysis and report the results. How strong are your results? Is your question answered?
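As one example of conducting and reporting such an analysis on simulated data, the sketch below runs a two-sided one-sample z-test in Python with the standard library. It assumes the population standard deviation is known, which is what distinguishes a z-test from a t-test. The function name one_sample_z_test, the daily-sales scenario, and every numeric value are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical simulated data: 200 days of sales, true mean 530, sd 50.
rng = random.Random(7)
sales = [rng.gauss(530.0, 50.0) for _ in range(200)]

# Test against a hypothetical historical target of 500 units per day.
z, p = one_sample_z_test(sales, mu0=500.0, sigma=50.0)
print(f"z = {z:.2f}, p = {p:.4g}")
print("Reject H0 at alpha = 0.05" if p < 0.05 else "Fail to reject H0")
```

When reporting a result like this, state the hypotheses, the test statistic, the p-value, and the significance level, then translate the conclusion back into the language of the business question.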
VI. Model Deployment. How will the results be used? Is your question answered definitively? If so, what managerial decisions could be made as a result? Did your analysis aid in the creation of software tools that stakeholders can use on an ongoing basis?
VII. Lifecycle Management. Anticipate what you will need to do in the future to ensure that your results remain relevant and useful. Will data need to be periodically refreshed? Will classification models have to be retrained? Under what circumstances would the results be no longer useful?