Reference no: EM133676526
Big Data
Research Report
Introduction
In this assessment you will write a critical analysis report on an academic paper(s) approved by your lecturer in the field of Big Data, or a specific industry case study.
Objective(s)
In this assessment you will search for a recent paper or case study about Big Data and write a report on it. In the report, you need to discuss the proposed idea, methods, findings, critiques, and your recommendations. Your report should be limited to approximately 1,500 words (not including references). Use 1.5 spacing with a 12-point Times New Roman font. Though your report will largely be based on the chosen article, you should use other sources to support your discussion or the chosen paper's premises. Citation of sources is mandatory and must be in the IEEE style.
Assessment Details
Introduction
In this assignment you will be given a small case study, and you will need to apply your knowledge to identify and prioritise the main issues, provide insights, and discuss alternatives.
Objective(s)
In this assessment you will be required to write a report and deliver a 5-10 minute presentation on a real-world application that requires big data tools to store and process its data. You need to identify and describe the datasets in the application and discuss how the data will be handled using at least three big data techniques from the Hadoop ecosystem. The dataset can be either structured or unstructured.
Case Study:
GlobalHealth Innovations Ltd, a leading healthcare organization based in Melbourne, is gearing up for worldwide expansion. As a seasoned Big Data specialist in the company's IT department, you have been assigned the responsibility of creating a comprehensive report. The organization is keen on harnessing the power of Big Data to enhance healthcare delivery, optimize resource management, and improve patient outcomes. With a strategic vision to establish a unified global health platform, the company aims to leverage advanced analytics and Hadoop ecosystem tools to process and analyze diverse healthcare datasets. The goal is to provide personalized patient care, streamline operations, and contribute to medical research by analyzing trends and patterns in healthcare data. Your report will be instrumental in guiding the organization's technological strategy for the international healthcare landscape.
Specific requirements:
Identify a big data application of your choice
Explain the specification of the large dataset
Discuss the reasons to use big data tools
Explain how the large dataset will be handled using at least three big data tools from the Hadoop ecosystem
Conclusion and References
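As an illustration of the processing model behind Hadoop MapReduce, one of the Hadoop ecosystem tools a report might discuss, the following minimal sketch counts visits per department in plain Python. The records, field names, and function names are hypothetical, and the code runs in a single process rather than on a cluster, so it shows only the map, shuffle, and reduce pattern, not Hadoop's own API.

```python
from collections import defaultdict

# Hypothetical healthcare records; the field names are assumptions for illustration.
RECORDS = [
    {"patient_id": 1, "department": "cardiology"},
    {"patient_id": 2, "department": "oncology"},
    {"patient_id": 3, "department": "cardiology"},
    {"patient_id": 4, "department": "radiology"},
    {"patient_id": 5, "department": "cardiology"},
]


def map_phase(record):
    """Emit one (key, value) pair per record, as a mapper would."""
    yield record["department"], 1


def shuffle(pairs):
    """Group values by key, mimicking the shuffle/sort step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


visits_per_department = run_job(RECORDS)
print(visits_per_department)
```

On a real cluster the map and reduce steps would run in parallel across many nodes, with the data stored in a distributed file system such as HDFS; the sketch keeps everything in memory only to make the data flow visible.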
Introduction
In this assessment you will work in groups on a major practical case, applying big data techniques to implement a solution, provide insights from the analytics performed, and make recommendations.
Objective(s)
You will work with your group and leverage the Google Play Store Apps dataset.
Context
While many public datasets (on Kaggle and the like) provide Apple App Store data, there are not many counterpart datasets available for Google Play Store apps anywhere on the web. On digging deeper, I found that the iTunes App Store page uses a nicely indexed, appendix-like structure that allows for simple and easy web scraping. The Google Play Store, on the other hand, uses sophisticated modern techniques (such as dynamic page loading) with jQuery, which makes scraping more challenging.
You will prepare a final report outlining the following tasks:
Task 1: Using Tableau or any other visualization tool, explore the dataset by creating at least six different charts to visualize the attributes and the relationships between them. You must interpret the figures in the report.
Task 2: Propose a data analytics question and build a data analytics model based on this question (e.g. a prediction or clustering model). Implement the model using Python or any other programming language. You must cover the following subtasks:
Propose the data analytics question.
Describe the method used to create the model.
Discuss the model construction.
Create the experiments and discuss the results.
Discuss the challenges of working with the large dataset and how you overcame them.
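The Task 2 subtasks can be sketched end to end as follows. This is a minimal illustration only, not a required approach: the question ("can the number of installs be predicted from the number of reviews?"), the log-scaled variables, and all data values are assumptions, and the rows are synthetic stand-ins rather than the real dataset. It uses a simple least-squares line with an 80/20 train/test split and only the Python standard library.

```python
import math
import random

# Synthetic stand-in for the dataset: every value here is made up for illustration.
# Each row is (log10 of review count, log10 of install count), both hypothetical.
rng = random.Random(0)
rows = []
for _ in range(200):
    log_reviews = rng.uniform(1.0, 6.0)
    log_installs = 1.0 + 0.8 * log_reviews + rng.gauss(0.0, 0.1)
    rows.append((log_reviews, log_installs))

# Experiment setup: shuffle, then hold out 20% of the rows for testing.
rng.shuffle(rows)
split = int(0.8 * len(rows))
train, test = rows[:split], rows[split:]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(points, slope, intercept):
    """Root mean squared error of the fitted line on the given points."""
    squared_errors = [(y - (slope * x + intercept)) ** 2 for x, y in points]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Model construction on the training rows, evaluation on the held-out rows.
slope, intercept = fit_line(train)
test_rmse = rmse(test, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test RMSE={test_rmse:.2f}")
```

With the real data, the synthetic rows would be replaced by values loaded from the dataset file, and the report would compare the test error against a baseline before drawing conclusions.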
Submission requirements:
Your report should be 1,500-2,000 words long and address the tasks. The report structure includes the following: a cover page, an introduction to the case study, a dataset description, responses to the above tasks, and a conclusion.
The presentation should be a maximum of 7 minutes for the whole team. Each member should talk for at least 2 minutes about the project and findings. The entire presentation should cover the dataset, results, and conclusion.
General Instructions
Your writing should be clear and concise and be in your own words.
The report must be in the range of 1,500-2,000 words in length excluding references.
Your report should be a single Word or PDF document and must be submitted through Moodle.
One submission per group; make sure all group members are listed in a contribution table at the end of the report.
One submission per group; make sure all group members are active in the video, each speaking for at least 2 minutes about the project.
Use headings to guide the reader and include tables or diagrams that make the case clearer.
The program code needs to be attached at the end of the report as an Appendix.
The referencing style must follow the IEEE referencing style.