How these sources of data facilitate making decisions

Reference no: EM132919660

MITS6005 Big Data

Answer the following questions about big data and the tools and technologies that help businesses grow and make appropriate decisions.

Question 1 The variety of big data refers to the heterogeneous sources and nature of the data. There are three types of data: structured, semi-structured and unstructured.

How do these types of data facilitate decision-making in various businesses? Illustrate the answer with an appropriate case, explaining the roles they may play in the analysis of big data sets for large companies.
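As a rough illustration of the three data types named in Question 1, the sketch below parses one tiny sample of each using only the Python standard library. All sample values and variable names are hypothetical and are not taken from the assignment.

```python
import csv
import io
import json
import re

# Hypothetical samples, one per data type (not from the assignment).
structured_csv = "student_id,home_country\n1,India\n2,Nepal\n"
semi_structured_json = '{"student_id": 1, "contacts": {"email": "a@example.com"}}'
unstructured_text = "The enrolment portal was slow but staff were helpful."

# Structured: fixed columns, so every row can be read as a record.
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, nesting allowed, no fixed schema.
document = json.loads(semi_structured_json)

# Unstructured: free text, so structure has to be extracted (here, word tokens).
words = re.findall(r"[a-z]+", unstructured_text.lower())
```

The point of the sketch is only that each type needs a different access path before it can feed an analysis: a schema for structured data, key lookups for semi-structured data, and some extraction step for unstructured data.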

Question 2 A MapReduce job usually splits the input dataset into independent chunks, which are processed by the map tasks in a completely parallel manner. The MapReduce framework has many phases, amongst which the map phase maps the input to the appropriate intermediate key-value pairs. Discuss the different phases of the MapReduce framework and demonstrate how it works with an appropriate example.
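The phases mentioned in Question 2 can be sketched as a single-process word count in plain Python. This is a minimal simulation under stated assumptions, not Hadoop code: the function names (`split_input`, `map_phase`, `shuffle_and_sort`, `reduce_phase`, `word_count`) are hypothetical, and a real job would run the map and reduce tasks on separate nodes.

```python
from itertools import groupby

def split_input(text, num_splits):
    """Input split: divide the lines into independent chunks."""
    lines = text.splitlines()
    return [lines[i::num_splits] for i in range(num_splits)]

def map_phase(chunk):
    """Map: emit an intermediate (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle_and_sort(mapped_chunks):
    """Shuffle and sort: gather pairs from all mappers, sort by key, group values."""
    all_pairs = sorted(pair for chunk in mapped_chunks for pair in chunk)
    return {
        key: [value for _, value in group]
        for key, group in groupby(all_pairs, key=lambda pair: pair[0])
    }

def reduce_phase(grouped):
    """Reduce: combine the list of values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(text, num_splits=2):
    chunks = split_input(text, num_splits)
    mapped = [map_phase(chunk) for chunk in chunks]  # each chunk is independent
    return reduce_phase(shuffle_and_sort(mapped))

counts = word_count("big data big\ndata tools\nbig")
```

Here `counts` maps each word to its total across both chunks. Because each call to `map_phase` touches only its own chunk, the map step could run in parallel without coordination, which is the property the question describes.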

Question 3 An Australia-based higher education institute, "Victorian Institute of Technology", has international students from ten countries in Asia. It has campuses in all ten countries and stores student details in a Hive database. The data stored in Hive includes fields such as student_id, student_name, student_school, home_state, home_country, and enrollment_year. Each table in Hive can have one or more partition columns that organize the table and optimize queries against it.
Analyze the given table and propose the column(s) that could be chosen as the partition key in Hive. Justify your selection.
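One way to reason about candidate partition columns is to count how many partitions each choice would create. The sketch below is plain Python, not Hive; it builds `column=value` path segments in the style Hive uses for partition directories. The sample records and the helper names `partition_path` and `partition_counts` are hypothetical, and the sketch compares cardinality without claiming which choice is the expected answer.

```python
from collections import Counter

# Hypothetical sample rows using the field names from the question.
students = [
    {"student_id": 1, "home_country": "India", "enrollment_year": 2020},
    {"student_id": 2, "home_country": "India", "enrollment_year": 2021},
    {"student_id": 3, "home_country": "Nepal", "enrollment_year": 2020},
    {"student_id": 4, "home_country": "India", "enrollment_year": 2020},
]

def partition_path(record, partition_cols):
    """Build a Hive-style partition path such as home_country=India/enrollment_year=2020."""
    return "/".join(f"{col}={record[col]}" for col in partition_cols)

def partition_counts(records, partition_cols):
    """Count how many rows would land in each partition."""
    return Counter(partition_path(r, partition_cols) for r in records)

by_country = partition_counts(students, ["home_country"])
by_country_year = partition_counts(students, ["home_country", "enrollment_year"])
by_student = partition_counts(students, ["student_id"])
```

In this sample, `by_student` has one partition per row, which shows why a unique column makes a poor partition key, while `by_country` groups many rows into a few partitions.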

Question 4 The Hadoop Distributed File System (HDFS) is designed to store and transfer massive data sets reliably and to tolerate faults. Discuss how HDFS achieves fault tolerance by replicating the data.
Further, suppose Steve has a Hadoop cluster and a 514 MB file is stored in HDFS (Hadoop 2.x) using the default block size and the default replication factor. Calculate the number of blocks that need to be generated for this file and find the size of each block stored in Hadoop.
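The block arithmetic in Question 4 can be sketched in a few lines of Python. The sketch assumes the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; the function name `hdfs_blocks` and its parameters are hypothetical, and the result changes if the cluster overrides those defaults.

```python
def hdfs_blocks(file_size_mb, block_size_mb=128, replication=3):
    """Split a file into HDFS-style blocks; the last block holds only the remainder."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        sizes.append(remainder)  # the final block is not padded to full size
    return {
        "num_blocks": len(sizes),
        "block_sizes_mb": sizes,
        "total_replicas": len(sizes) * replication,
        "raw_storage_mb": file_size_mb * replication,
    }

result = hdfs_blocks(514)
```

Under these assumptions, 514 = 4 × 128 + 2, so the file occupies four full 128 MB blocks plus one 2 MB block. Each block is then stored three times across the cluster.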

Question 5 NoSQL is an alternative to the traditional relational database system. The use of NoSQL databases has grown significantly, particularly in big companies.
Answer the following questions in relation to NoSQL.
a) Why is NoSQL better than a relational database for big data? Compare and contrast relational databases and NoSQL databases. Your discussion should touch on performance, operational workloads and scale. Compare the circumstances under which you would use one over the other and provide contrasting examples.
b) Which guarantee (consistency, availability or partition tolerance) can be relaxed for each of the following use cases?
1. The data in banking applications should respond accurately to a customer's query.
2. An online store wants to function 24/7, so that shoppers can make purchases exactly when they need to.
3. A distributed system shares data across different regions without failure.

Attachment:- Big Data.rar
