Identify the challenges that will arise in integrating

Assignment Help Other Subject
Reference no: EM133545280 , Length: word count:1500

Big Data and Analytics

Assessment - Design Data Pipeline

Learning Outcome 1: Explain and evaluate the V's of Big Data (volume, velocity, variety, veracity, valence, and value)

Learning Outcome 2: Identify best practices in data collection and storage, including data security and privacy principles; and

Learning Outcome 3: Effectively report and communicate findings to an appropriate audience.

Task Summary

Critically analyse the online retail business case (see below) and write a 1,500-word report that:
a) Identifies various sources of data to build an effective data pipeline;
b) Identifies challenges in integrating the data from the sources and formulates a strategy to address those challenges; and
c) Describes a design for a storage and retrieval system for the data lake that uses commercial and/or open-source big data tools.

Context
A modern data-driven organisation must be able to collect and process large volumes of data and perform analytics at scale on that data. Thus, the establishment of a data pipeline is an essential first step in building a data-driven organisation. A data pipeline ingests data from various sources, integrates that data and stores that data in a ‘data lake', making that data available to everyone in the organisation.

This Assessment prepares you to identify potential sources of data, address challenges in integrating data and design an efficient ‘data lake' using the big data principles, practices and technologies covered in the learning materials.

Case Study
Big Retail is an online retail shop in Adelaide, Australia. Its website, at which its users can explore different products and promotions and place orders, has more than 100,000 visitors per month. During checkout, each customer has three options: 1) to login to an existing account; 2) to create a new account if they have not already registered; or 3) to checkout as a guest. Customers' account information is maintained by both the sales and marketing departments in their separate databases. The sales department maintains records of the transactions in their database. The information technology (IT) department maintains the website.

Every month, the marketing team releases a catalogue and promotions, which are made available on the website and emailed to the registered customers. The website is static; that is, all the customers see the same content, irrespective of their location, login status or purchase history.

Recently, Big Retail has experienced a significant slump in sales, despite its having a cost advantage over its competitors. A significant reduction in the number of visitors to the website and the conversion rate (i.e., the percentage of visitors who ultimately buy something) has also been observed. To regain its market share and increase its sales, the management team at Big Retail has decided to adopt a data-driven strategy. Specifically, the management team wants to use big data analytics to enable a customised customer experience through targeted campaigns, a recommender system and product association.

The first step in moving towards the data-driven approach is to establish a data pipeline. The essential purpose of the data pipeline is to ingest data from various sources, integrate the data and store the data in a ‘data lake' that can be readily accessed by both the management team and the data scientists.

Task Instructions

Critically analyse the above case study and write a 1,500-word report. In your report, ensure that you:

- Identify the potential data sources that align with the objectives of the organisation's data- driven strategy. You should consider both the internal and external data sources. For each data source identified, describe its characteristics. Make reasonable assumptions about the fields and format of the data for each of the sources;

- Identify the challenges that will arise in integrating the data from different sources and that must be resolved before the data are stored in the ‘data lake.' Articulate the steps necessary to address these issues;

- Describe the ‘data lake' that you designed to store the integrated data and make the data available for efficient retrieval by both the management team and data scientists. The system should be designed using a commercial and/or an open-source database, tools and frameworks. Demonstrate how the ‘data lake' meets the big data storage and retrieval requirements; and

- Provide a schematic of the overall data pipeline. The schematic should clearly depict the data sources, data integration steps, the components of the ‘data lake' and the interactions among all the entities.

Reference no: EM133545280

Questions Cloud

Understand the importance of leadership for quality : Understand the importance of leadership for quality. Compare the Total Quality (TQ) view of leadership to several prominent leadership theories
Describe the key customer segments in the market : Identify and describe the key customer segments (buyer groups) in the market in which your product competes. How is your product/service currently positioned
Why do you think cain chooses to focus on extroversion : why do you think Cain chooses to focus on extroversion/introversion? What is the main message Cain hopes you take away from her talk?
What are some industries that might be particularly : What are some advantages to an employer to provide accommodations to workers with disabilities? What are some industries that might be particularly
Identify the challenges that will arise in integrating : BDA601 Big Data and Analytics, Torrens University Australia - Identify the challenges that will arise in integrating the data from different sources
What are eight key components of an effective business model : What are the eight key components of an effective business model? Describe the five primary revenue models used by e-commerce firms.
Explain how robust corporate social responsibility policies : Explain how robust corporate social responsibility policies can enhance a company's reputation and public perception. Examine how having responsible corporate
What is the strategy proposed by norton for matching : Give an example of traditional methodology and briefly explain that methodology. Give an example of agile methodology and briefly explain that methodology.
How well you summarize a culture through a sound : You will be graded on how well you summarize a culture through a sound and validated set of dimensions such as the ones learned in this course.

Reviews

Write a Review

Other Subject Questions & Answers

  The 4th amendment prohibition of general warrants means

The winner of a U.S. Presidential election is not always the winner of the largest number of votes cast by citizens.

  Would the hbm be described as a shared theory

Would the HBM be described as a shared theory? In the second paragraph, second sentence, the key "outcome" of the model is specified. What is it

  What does the cap depend on for medical compensatory damages

Gender and religious discrimination have a $300,000 cap total on non-pecuniary compensatory and punitive damages.

  How does the research address your picot question

Summarize the findings from the articles you selected for your literature review. Describe at least one nursing practice that is supported by the evidence in the articles. Justify your response with specific references to at least 2 of the article..

  How the genders of the five characters affects your reading

Need help with an English research paper. Trifles by Susan Glaspell is what the 2,000 word research needs to be based off of and it.

  Using okashas philosophy of science

I have just had a short introduction (using Okasha's "Philosophy of science - a very short introduction" as main text-book) and covered philosophers such as Popper and Kuhn.

  Describe the health care organization or network

Identify any current or potential issues within the organizational culture and discuss how these issues may affect aspects of the strategic plan.

  Inspire the team to achieve high goals

explain why you characterized him or her as inspiring or uninspiring. You must use information presented in the course materials or outside sources, which you reference, to make your determination.

  Ideology impinges on communicative competency

Discuss how ideology impinges on communicative competency in practice, especially legal venues such as child protective investigations and services.

  What benefits or consequences do the behaviors have

Provide an explanation as to why it is important to address the health behavior you have chosen. What benefits or consequences do the behaviors have on individuals' health

  The distance travelled in a given time period t by a

the distance travelled in a given time period t by a cyclist riding a fixed gear bike depends on the rotations of the

  Do you receive advertisements through e-mail

Do you receive advertisements through e-mail? Are they directed toward any specific audience or product category? Do you pay much attention or just delete them? How much work is it to get off an advertising list?

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd