Discuss the features of Apache Hive for data warehousing

Reference no: EM131729467

Part 1: 200 words with reference

Discuss the features of Hadoop

Part 2: 100-125 words with references

Topic Question:

Discuss the features of Apache Hive for data warehousing

Discussion Post:

Apache Hive is an open-source data warehouse system for querying and analyzing large datasets stored in Hadoop files. Hadoop is a framework for handling large datasets in a distributed computing environment.

Hive has three main functions: data summarization, query, and analysis. It supports queries expressed in a language called HiveQL, and it automatically translates these SQL-like queries into MapReduce jobs executed on Hadoop. In addition, HiveQL allows custom MapReduce scripts to be plugged into queries.
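As a rough illustration of how a SQL-like grouped count could map onto MapReduce phases, the sketch below mimics the idea in plain Python. The query text, the `employees` rows, and the function names are all hypothetical. It does not use Hive or Hadoop and should not be read as how Hive compiles queries internally.

```python
from collections import defaultdict

# Hypothetical HiveQL that this sketch mimics (illustration only):
#   SELECT dept, COUNT(*) FROM employees GROUP BY dept;

# Made-up sample rows standing in for a table stored in Hadoop.
employees = [
    {"name": "Ana", "dept": "sales"},
    {"name": "Ben", "dept": "it"},
    {"name": "Cy", "dept": "sales"},
]

def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    return [(row["dept"], 1) for row in rows]

def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values for each key, which yields the count per group."""
    return {key: sum(values) for key, values in grouped.items()}

counts = reduce_phase(shuffle_phase(map_phase(employees)))
print(counts)
```

On a real cluster the map and reduce steps would run on many nodes in parallel; here they run in one process only to show the data flow.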

Hive also enables data serialization/deserialization and increases flexibility in schema design through a system catalog called the Hive Metastore. Hive supports text files, SequenceFiles (flat files consisting of binary key/value pairs), and RCFiles (Record Columnar Files, which store the columns of a table in a columnar fashion).

Reference

What is Apache Hive? - Definition from WhatIs.com. (n.d.).

Critical Reply with reference:

Part 3: 100-125 words with references

Topic Question:

Most inputs are validated by some combination of completeness checks, format checks, range checks, check digits, consistency checks, and database checks. Provide and explain in detail at least two validation methods for these inputs.

Discussion Post:

Input validation is a key function of a data entry system. A check for completeness is often done automatically by specifying that a data field must not be empty. A required data element such as a birth date or address, which must be present for a record to be valid, should make use of such a validation.

However, it is also important to verify that the proper type of information is present and in the proper format. A numeric field should contain numeric data, just as a text field should contain text. It is very difficult to validate text data whose content cannot be known in advance, such as a street address.
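The completeness and format checks described above might look like the following minimal sketch. The field names, the list of required fields, the date pattern, and the five-digit postal code rule are assumptions chosen for illustration, not requirements taken from the textbook.

```python
import re
from datetime import datetime

# Assumed set of required fields for this illustration.
REQUIRED_FIELDS = ("name", "birth_date", "address")

def check_completeness(record):
    """Return the required fields that are missing or blank."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]

def check_format(record):
    """Return the fields whose values do not match the expected format."""
    errors = []
    # Format check: birth date must be a real calendar date in YYYY-MM-DD form.
    try:
        datetime.strptime(str(record.get("birth_date") or ""), "%Y-%m-%d")
    except ValueError:
        errors.append("birth_date")
    # Format check: postal code must be exactly five digits (assumed rule).
    if not re.fullmatch(r"\d{5}", str(record.get("zip_code") or "")):
        errors.append("zip_code")
    return errors

# Hypothetical record with a blank address, an impossible date, and a bad code.
record = {"name": "Pat", "birth_date": "1990-02-30", "address": "", "zip_code": "1234a"}
missing = check_completeness(record)
badly_formatted = check_format(record)
print(missing, badly_formatted)
```

Returning a list of failing field names, rather than a single true/false value, lets a data entry form point the user at every field that needs correcting.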

Although ascertaining the validity of such entries is difficult, a developer must also protect the system against malicious code injections by ensuring certain text strings are not present. SQL injection attacks are carried out by inserting specific text strings into data fields, coercing the database to perform actions not available to users. It is the responsibility of the developer to ensure such strings are not present in text fields before submitting them to the database.
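One widely used complement to screening input strings is to pass user input to the database as a bound parameter, so that it is treated as data and never as SQL. This is a different technique from the string filtering described above. The sketch below uses Python's standard `sqlite3` module with an in-memory database, and the `users` table and `find_user` function are hypothetical.

```python
import sqlite3

# In-memory database with a made-up table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

def find_user(connection, name):
    """Look up users by name; the ? placeholder binds input as a value, not as SQL."""
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    )
    return cursor.fetchall()

# A classic injection attempt is compared literally against the name column.
malicious_input = "alice' OR '1'='1"
injected_rows = find_user(conn, malicious_input)
normal_rows = find_user(conn, "alice")
print(injected_rows, normal_rows)
```

Because the injection string is only ever compared with stored names, it matches no rows instead of changing the meaning of the query.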

References

Dennis, A., Haley Wixom, B., & Roth, R. (2012). Systems Analysis and Design, Fifth Edition.

Critical Reply with reference:

Part 4: 100-125 words with references

Topic Question:

Explain the reasons why the study of HCI has become increasingly important for systems analysts and for the SDLC.

Discussion Post:

HCI, or Human-Computer Interaction, has become increasingly important as computers continue to become more central to everyday human life. The way that humans use computers has arguably reshaped the thought process of the entire millennial generation relative to that of Generation X, and the market for computer software and systems is continually growing.

In its most basic sense, HCI is the study of human and computer interaction and activities ("HCI", 2017). The goal of HCI is to understand how interactions with computers affect individuals' thinking and behavior, in order to determine how to make computer systems safe to use, easy to understand, productive, and enjoyable.

To this end, many elements of HCI focus on improving the interfaces through which people interact with computer software and hardware to ensure that users can pick up and intuitively understand how to operate complex systems.

With regard to the impact of HCI on the Systems Development Life Cycle (SDLC), HCI has become one of the most important considerations because it helps ensure that users can get value out of a system without having to undergo training or push past a learning curve that could discourage user participation.

Unless you are creating a government or corporate system in which use is mandated by the employer, people will always have the option of using a different software platform that may not be as developed or sophisticated on the back end but may support better HCI. Without user support, most systems will fail.

Reference

HCI. (2017). Techopedia.

Critical Reply with reference:

Part 5: 100-125 words with references

Topic Question:

Discuss the features of Hadoop

Discussion Post:

Hadoop is a framework, or platform, built for large amounts of data and used to perform analytical tasks, searches, data retention, log file processing, and more.

Hadoop has two main components: HDFS and MapReduce. HDFS, which stands for Hadoop Distributed File System, replicates data across multiple nodes, which increases data reliability. MapReduce performs computations on large data sets: it divides the computations into parts and assigns each part to a worker node. A quick glance at the characteristics, from Sindol (2014):

Here are the prominent characteristics of Hadoop:

Hadoop provides a reliable shared storage (HDFS) and analysis system (MapReduce).

Hadoop is highly scalable and unlike the relational databases, Hadoop scales linearly. Due to linear scale, a Hadoop Cluster can contain tens, hundreds, or even thousands of servers.

Hadoop is very cost effective as it can work with commodity hardware and does not require expensive high-end hardware.

Hadoop is highly flexible and can process both structured as well as unstructured data.

Hadoop has built-in fault tolerance. Data is replicated across multiple nodes (replication factor is configurable) and if a node goes down, the required data can be read from another node which has the copy of that data. And it also ensures that the replication factor is maintained, even if a node goes down, by replicating the data to other available nodes.

Hadoop works on the principle of write once and read multiple times.

Hadoop is optimized for large and very large data sets. For instance, a small amount of data like 10 MB when fed to Hadoop, generally takes more time to process than traditional systems.

Sindol, D. (2014, January 30). Big Data Basics - Part 3 - Overview of Hadoop.
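The fault tolerance behavior quoted above (read from another copy when a node fails, then restore the replication factor) can be pictured with a toy model. This is only a sketch under stated assumptions: the `ToyCluster` class, the node names, and the replication factor of 3 are hypothetical, and the random placement used here is not the real HDFS block placement policy.

```python
import random

REPLICATION_FACTOR = 3  # assumed value; the quoted text only says it is configurable

class ToyCluster:
    """A toy model of block replication, not the real HDFS algorithm."""

    def __init__(self, node_names):
        # Map each node name to the set of block ids it stores.
        self.nodes = {name: set() for name in node_names}

    def write_block(self, block_id):
        """Store the block on REPLICATION_FACTOR distinct, randomly chosen nodes."""
        for name in random.sample(sorted(self.nodes), REPLICATION_FACTOR):
            self.nodes[name].add(block_id)

    def holders(self, block_id):
        """Return the names of the nodes that currently hold the block."""
        return [name for name, blocks in self.nodes.items() if block_id in blocks]

    def fail_node(self, name):
        """Drop a node, then re-replicate any block that fell below the target."""
        lost_blocks = self.nodes.pop(name)
        for block_id in lost_blocks:
            current = self.holders(block_id)
            candidates = [n for n in self.nodes if n not in current]
            needed = max(REPLICATION_FACTOR - len(current), 0)
            for target in candidates[:needed]:
                self.nodes[target].add(block_id)

cluster = ToyCluster(["n1", "n2", "n3", "n4", "n5"])
cluster.write_block("block-0")
failed = cluster.holders("block-0")[0]   # pick one node that holds the block
cluster.fail_node(failed)
remaining = cluster.holders("block-0")   # the block is still readable elsewhere
print(failed, remaining)
```

After the failure the block is still held by three nodes, none of which is the failed one, which mirrors the two guarantees described in the quoted list.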
