Reference no: EM133775493, Length: 4000 words
Big Data
Distributed Big Data Computing Frameworks Assessment
Objectives: Evaluate and compare various distributed big data computing frameworks, focusing on their architecture, performance, scalability, ease of use, and application areas.
Structure:
Introduction (10%)
Define distributed big data computing.
Importance of distributed computing frameworks in handling big data.
Overview of the report.
Framework Analysis (40%)
Apache Hadoop
Architecture (HDFS, MapReduce, YARN)
Performance and scalability
Pros and cons
Use cases
Apache Spark
Architecture (RDD, DAG, Spark SQL, MLlib)
Performance and scalability
Pros and cons
Use cases
Apache Flink
Architecture (DataStream API, Batch Processing, CEP)
Performance and scalability
Pros and cons
Use cases
Other Relevant Frameworks (e.g., Apache Storm, Apache Samza)
Brief overview
Comparison with the above frameworks
Comparative Analysis
Comparative table highlighting key features, advantages, and disadvantages.
Discussion on the best framework for different use cases (real-time processing, batch processing, machine learning, etc.).
Case Study
Detailed analysis of a real-world application using one of the discussed frameworks.
Evaluation of the chosen framework's performance and impact on the application.
Conclusion
Summary of findings.
Recommendations based on the comparative analysis.
References (not graded but mandatory)
Cite all sources in a consistent format (IEEE referencing, as required by the report document standards).
Presentation:
Objective: Present the key findings from the report in a clear, engaging, and concise manner.
Structure:
Introduction
Brief overview of the topic and purpose of the presentation.
Key Findings
Highlight major points from the framework analysis.
Use visuals (charts, tables, diagrams) to illustrate comparisons.
Case Study Summary
Summarize the case study, focusing on the application and impact of the chosen framework.
Conclusion
Summarize the overall findings and recommendations.
References
Citation and listing of references as per IEEE format.
Students need to demonstrate to the tutor that each team member has made a significant contribution to the report. It is suggested that the group use a collaborative environment such as Google Drive to store documents and work on the assignment. You will also create a document that lists each task and the name of the team member(s) responsible for the task.
The task allocation must be approved by the tutor before commencing other work on the report. The group is also required to discuss its progress with the tutor on a weekly basis.
Additional information regarding this Assessment:
Report document standards
Normal font is Calibri, size 11 point for the body of all documents, with the text fully justified.
Headings should not exceed 14 points in size except on a title page, where larger fonts are appropriate for the title of a report.
Documents should use 1.15 spacing within a paragraph and have an 8-point space between paragraphs.
The report should have footers that include a page number.
Up to 15% of the Report contents may be quoted or paraphrased from other sources provided you acknowledge and cite the original source of the material you use.
Use IEEE referencing on all quoted or paraphrased material.