Explain the characteristics of big data

Reference no: EM133129408, Length: 2000 words

COMP1702 Big Data

Learning Outcome 1: Explain the concept of Big Data and its importance in a modern economy
Learning Outcome 2: Explain the core architecture and algorithms underpinning big data processing
Learning Outcome 3: Analyse and visualize large data sets using a range of statistical and big data technologies
Learning Outcome 4: Critically evaluate, select and employ appropriate tools and technologies for the development of big data applications

Part A

o Task A.1 Explain the main characteristics of Big Data. (Word count: 200 words ±10%)

o Task A.2 Compare Hadoop and Relational Database Systems. Give an application scenario that is well suited to Hadoop and explain your reason. (Word count: 300 words ±10%)

Part B: MapReduce Programming

Suppose that you have a large student file which cannot be stored on a single machine. Each record of this file contains the following fields: (Student_ID, Student_Name, Sex, Age, Module, Grade, Department).

o Task B.1 Please design a MapReduce algorithm (pseudocode or Java code) to output the average grade for each module. The algorithm is expected to be as efficient as possible.

o Task B.2 Describe the algorithm designed. You should explain how the input is mapped into (key, value) pairs by the map stage, i.e., specify what the key is and what the associated value is in each pair, and, if needed, how the key(s) and value(s) are computed. Then you should explain how the output (key, value) pairs of the map stage are processed by the reduce stage to get the final answer(s). You should also analyse the efficiency of the MapReduce algorithm designed. (Word count: 300 words ±10%)
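
As an illustration of the map, combine, and reduce stages that Task B refers to, the following is a minimal single-process Python sketch. It is not a Hadoop job and not a model answer: the function names (`map_record`, `combine`, `reduce_module`, `average_grades`) and the sample records are hypothetical, and the shuffle is simulated with a dictionary. It assumes the map stage uses Module as the key and a (grade sum, record count) pair as the value, so that a combiner can pre-aggregate safely; averaging partial averages would give wrong results when the partial groups differ in size.

```python
from collections import defaultdict


def map_record(record):
    """Map: emit (module, (grade, 1)) for one student record."""
    _sid, _name, _sex, _age, module, grade, _dept = record
    yield module, (float(grade), 1)


def combine(pairs):
    """Combiner: pre-aggregate (sum, count) per module within one input split."""
    partial = defaultdict(lambda: [0.0, 0])
    for module, (total, count) in pairs:
        partial[module][0] += total
        partial[module][1] += count
    for module, (total, count) in partial.items():
        yield module, (total, count)


def reduce_module(module, values):
    """Reduce: add up the partial sums and counts, then divide once."""
    total = sum(t for t, _ in values)
    count = sum(c for _, c in values)
    return module, total / count


def average_grades(splits):
    """Simulate the job: map and combine each split, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for split in splits:
        mapped = (pair for record in split for pair in map_record(record))
        for module, value in combine(mapped):
            shuffled[module].append(value)
    return dict(reduce_module(m, v) for m, v in shuffled.items())


# Hypothetical records, split across two "machines".
sample_splits = [
    [("s1", "Ann", "F", 20, "COMP1702", 70, "CS"),
     ("s2", "Bob", "M", 21, "COMP1702", 50, "CS")],
    [("s3", "Cy", "M", 22, "COMP1702", 60, "CS"),
     ("s4", "Di", "F", 20, "MATH1001", 80, "Maths")],
]

if __name__ == "__main__":
    print(average_grades(sample_splits))
```

With the combiner in place, each split sends at most one pair per module across the simulated shuffle instead of one pair per record, which is the usual efficiency argument for this design.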

Part C: Big Data Project Analysis
The CropY company is a leading provider of precision agriculture services. Precision agriculture is the science of gathering, processing, and analysing temporal, spatial and individual data. It combines this data with other information to support management decisions according to estimated variability, for improved resource use efficiency, productivity, quality, and profitability.

The CropY company now plans to develop a big data project that meets the following requirements: helping worldwide users better understand the implications of the weather and make contingency plans; buying supplies, such as fertilizer and seeds; maintaining and monitoring the quality of yield, whether livestock or crops; knowing the variety of cultivated plants, their growth conditions, and their seed requirements; choosing the type of fertilizer and pesticides, and understanding their conditions of use and their impact on climate, soil, and plants; recognizing the daily water needs of each kind of plant; calculating the median and mean values of yield; studying the conditions of the natural environment; and estimating financial revenue and managing potential risks.

o Task C.1: The volume of big data is expected to be more than 500 petabytes. The data will come from various sensors, satellites, drones, social media, market data, online news feeds, etc. Figure 1 below shows some example data of the CropY company. Some IT technicians plan to build a data warehouse to store data for further data analysis tasks, but others believe a data lake is a better choice. Which choice do you prefer? Please justify your choice. (Word count: 300 words ±10%)

o Task C.2: The data of the CropY company includes a large collection of plants, crops, diseases, symptoms, pests, and relationships between them. The CropY company needs to build an analytical data store which can facilitate queries like: "find all diseases which are directly or indirectly caused by nitrogen deficiency". Please recommend a data store and justify your choice. (Word count: 300 words ±10%)
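
One way to read the example query in Task C.2 is as a reachability problem: starting from one entity, follow "causes" relationships over any number of hops and keep the entities that are diseases. The Python sketch below illustrates that reading with a breadth-first traversal over an in-memory adjacency map. It is only an illustration of the query's shape, not a recommendation of a particular data store; the function name `diseases_caused_by` and the placeholder entities (`disease_A`, `symptom_X`, and so on) are hypothetical and carry no agronomic meaning.

```python
from collections import deque


def diseases_caused_by(causes, node_type, start):
    """Return every disease reachable from `start` via one or more 'causes' edges."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        node = queue.popleft()
        for effect in causes.get(node, ()):
            if effect in seen:
                continue  # already visited; also guards against cycles
            seen.add(effect)
            queue.append(effect)
            if node_type.get(effect) == "disease":
                found.add(effect)
    return found


# Hypothetical placeholder data: cause -> list of direct effects.
sample_causes = {
    "nitrogen deficiency": ["disease_A", "symptom_X"],
    "symptom_X": ["disease_B"],
    "pest_P": ["disease_C"],
}
sample_types = {
    "nitrogen deficiency": "condition",
    "symptom_X": "symptom",
    "pest_P": "pest",
    "disease_A": "disease",
    "disease_B": "disease",
    "disease_C": "disease",
}

if __name__ == "__main__":
    print(sorted(diseases_caused_by(sample_causes, sample_types, "nitrogen deficiency")))
```

Here `disease_A` is a direct effect and `disease_B` is reached indirectly through `symptom_X`, while `disease_C` is not reachable from the starting entity and is excluded.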

o Task C.3: Some prediction and analytics services provided by the CropY company are required to respond within a few seconds of the arrival of new data. That is, they are real-time or near-real-time prediction and analytics tasks. Some IT managers have suggested using a popular distributed processing framework, MapReduce, to implement these tasks. Do you agree with that? Please justify your choice. (Word count: 300 words ±10%)

o Task C.4: The CropY company has decided to move most of its applications and services to the cloud. These applications and services need to be highly available, scalable, and accessible worldwide. Note that some data, such as price and customer data, are confidential. Please design a cloud hosting strategy for this big data project and explain how your design will meet the security, scalability, and high availability requirements. (Word count: 300 words ±10%)

Attachment: Big Data.rar
