Reference no: EM132400109
Big Data and Analytics Assignment -
Analytic Report -
Purpose: The purpose of this task is to provide students with practical experience in working in teams to write a data analytics report that provides useful insights, patterns and trends in the chosen/given dataset. This activity will give students the opportunity to show innovation and creativity in applying SAS Analytics, and in designing useful visualization and predictive solutions for various analytics problems.
Project Details:
This is a group assignment and you will complete the task with your team. Your team will be made up of at most 3 members who are all enrolled in the same laboratory - the teams will be allocated by your tutor. It is expected that each team member will contribute equally to the project.
Your team will use SAS Visual Analytics to explore, analyze and visualize the dataset provided. You will receive feedback on the draft about presentation choices, content, analysis, and style.
The aim is to use the allocated dataset to provide interesting insights, trends and patterns in the data. Your intended audience is the CEO and middle management of the U.S. Department of Health and Human Services who are responsible for overseeing the health industry in America.
In addition, each team member will write a short reflection, as part of the report, on their individual experience of working on the project.
Tasks -
- Task 1 - Background information - Write a description of the dataset and project, and its importance for the organization. Discuss the main benefits of using visual analytics to explore big data. In this section, include a justification for the visualizations that you will use and explain how they have been successful in other similar projects. This discussion should be suitable for a general audience. Information must come from at least 6 appropriate sources (2 per student) and be appropriately referenced. [2 to 3 pages].
- Task 2 - Reporting / Dashboards - For your project, perform the relevant data analysis tasks by answering the guided questions provided (see Appendix for questions and dataset) and identify the visualizations you need to develop. Note: remove any missing data points from your visualizations where possible/suitable.
- Task 3 - Additional Visualizations - In addition to the guided questions, it is expected that each student will provide at least two other visualizations of the data (i.e. for a group of 3 students this is 6 extra visualizations). These additional visualizations will be judged in terms of quality of the findings and complexity of analysis. These visualizations should use multi-dimensional, filtering and advanced calculation techniques.
- Task 4 - Justification - Justify why the visualizations in Tasks 2 and 3 were chosen. Note: To ensure that you discuss this task properly, you must include visual samples of the reports you produce (i.e. the screenshots of the BI report/dashboard must be presented and explained in the written report; use 'Snipping tool'), and also include any assumptions that you may have made about the analysis in your Task 2 (i.e. the report to the operational team of the company). [1 to 2 pages].
- Task 5 - Discussion of findings - Using the visualizations created, discuss the findings from the dataset. In this discussion you should explain what each visualization shows. Then summarize the main findings. [3 to 4 pages].
- Task 6 - Executive Summary - A summary of the data analysis, including a brief introduction, the methods used and a list of the key findings [1 page only].
- Task 7 - The Reflection (Individual Task) - Each team member is expected to write a brief reflection about this project in terms of challenges, learning and contribution. [1 to 2 pages].
The report will be approximately 8 to 12 pages in length (not counting cover page and references). The report will include the following in the order provided below:
- A cover page including the names and student IDs of all team members
- Table of Contents
- Table of Figures / Tables
- Executive Summary
- Background
- The body of the report including reports, insights, justifications and visuals
- Discussion of findings
- Conclusion
- References
- Appendices
Appendix: Data Set and Guided Questions
- Teradata - SAS Visual Analytics Data Source - READMIT-HISTORICAL
Guided Questions
1. GROUP TASK: Create a data dictionary for the data source by the group.
2. What is the average number of ICU days with respect to diagnosis group and gender?
3. For each region, what are the most and least common diagnosis groups?
4. For each diagnosis group, which are the most and least popular diseases?
5. What are the top 5 departments with respect to number of patients?
6. What are the top 3 regions with respect to female patient numbers?
7. What are the top 5 places where patients are discharged?
8. What are the top 3 regions with respect to patients of "black" race?
9. What are the top 5 hospitals with respect to the number of visits by Asthma patients?
10. What are the active and inactive months in terms of admission for both male and female patients?
11. What are the top 3 regions with respect to average days spent in hospital? Hint: You need to create a measure to calculate the number of days spent in hospital.
12. What are the top 10 cities with respect to number of patients?
13. What is the trend in the number of patient admissions from October 2011 to June 2012 with respect to region for both male and female patients?
14. Display only the most and least popular months from question 10, one at a time.
15. What is the trend in the number of patients diagnosed with "CHF" only from January 2012 to June 2012?
16. What is the trend of the different diagnosis groups over the months?
17. What are the top 5 departments in terms of number of operations, and how do these operations vary across months?
18. What are the most appropriate predictors of heart disease? Hint: use a decision tree.
19. Create a geo map of the hospitals and their patient numbers.
20. Create a cluster analysis on patient-related data.
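As an illustration of the hint in question 11 only: the derived measure is the difference between the discharge date and the admission date, averaged per region and then ranked. The sketch below shows that calculation in Python on made-up rows, outside SAS Visual Analytics. The field names (region, admit_date, discharge_date) and all sample values are assumptions, not the actual READMIT-HISTORICAL columns or data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; field names and values are illustrative
# assumptions, not taken from the READMIT-HISTORICAL data source.
records = [
    {"region": "North", "admit_date": date(2012, 1, 3), "discharge_date": date(2012, 1, 8)},
    {"region": "North", "admit_date": date(2012, 2, 10), "discharge_date": date(2012, 2, 13)},
    {"region": "South", "admit_date": date(2012, 1, 1), "discharge_date": date(2012, 1, 11)},
    {"region": "South", "admit_date": date(2012, 3, 5), "discharge_date": date(2012, 3, 11)},
    {"region": "East", "admit_date": date(2012, 1, 20), "discharge_date": date(2012, 1, 22)},
    {"region": "West", "admit_date": date(2012, 2, 1), "discharge_date": date(2012, 2, 8)},
    {"region": "West", "admit_date": date(2012, 2, 15), "discharge_date": None},
]


def days_in_hospital(row):
    """Derived measure: whole days between admission and discharge."""
    return (row["discharge_date"] - row["admit_date"]).days


def top_regions_by_average_stay(rows, n=3):
    """Return the n regions with the highest average length of stay."""
    totals = defaultdict(lambda: [0, 0])  # region -> [sum of days, row count]
    for row in rows:
        # Skip rows with a missing date so they do not distort the average.
        if row["admit_date"] is None or row["discharge_date"] is None:
            continue
        entry = totals[row["region"]]
        entry[0] += days_in_hospital(row)
        entry[1] += 1
    averages = {region: total / count for region, (total, count) in totals.items()}
    # Sort regions by average stay, longest first, and keep the first n.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


top3 = top_regions_by_average_stay(records)
print(top3)
```

In SAS Visual Analytics the same idea would be expressed as a calculated item on the two date columns, followed by an average aggregation and a top-N rank on region; the exact column names depend on the data dictionary produced in question 1.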