Reference no: EM133052758

Assignment: Big Data and Data Warehouse
In this assignment you will make recommendations on the collection and management of Big Data on behalf of your client, and you will explain how and where all of this data will be stored.
What database will you use? Will it store raw unstructured data or pre-formatted structured data?
Your choices here depend in part on your earlier choices about the cloud, and they will in turn influence your later choices for networks.
Your client's data might be found in multiple places and in multiple formats. For the purposes of this assignment, assume you CAN get the data, whether by partnering with the data owner, by recommending that the client purchase the rights to the data, by screen-scraping data from the client's own website(s), or by uploading client financial data.
If your proposal also uses video data, there are further considerations. How will you store and use that data? Perhaps you will use software that interprets activity within videos, or perhaps you will plan for constituents to upload mobile phone video to your client's website.
1.	For your chosen business (your client's business) and the industry it operates in, determine whether it is advisable to establish this new data analytics function and database at a cloud service provider (CSP). Explain why, and find similar cases elsewhere.
2.	Where is this Big Data found?
3.	What is the format and type of the database going to be?
4.	How will the data get from wherever it is into this database? Supply a data flow diagram (DFD).
5.	Will you store unformatted data? If so, what application will format the data when you read it for analysis?
6.	Will you store formatted data in a Data Warehouse? If so, supply the schema diagram.
7.	Is this data going to be historical in nature?
8.	Is this data going to include a real-time component? If so, this greatly complicates the scenario: you need to address the impact of outages on data loss, and you will probably need to mention a Helpdesk to support the real-time function. Any real-time component will significantly affect your future networking recommendations.
9.	Will you be recommending some form of data warehouse? If so, will you use ETL formatting or something else?
10.	Will you be recommending a Hadoop structure? If so, where will this be hosted?
11.	Create a workflow diagram (WFD) to show the activities from data generation, to data capture, to analysis of data, to report generation.
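
As a point of reference for questions 5 and 9, the difference between storing unformatted data and formatting it at read time, versus formatting it before loading (ETL), can be sketched as below. This is only an illustration, not a required approach: the field names (customer, amount, ts), the sample records, and the function names are all hypothetical, and a real design would use a proper database or data lake rather than in-memory lists.

```python
import json

# Hypothetical raw records, as they might arrive from a website or an upload.
RAW_EVENTS = [
    '{"customer": "A17", "amount": "19.90", "ts": "2023-01-05"}',
    '{"customer": "B02", "amount": "5.00", "ts": "2023-01-06"}',
    'not valid json',
]


def parse_event(line):
    """Turn one raw line into a structured row, or None if it cannot be parsed."""
    try:
        record = json.loads(line)
        return {
            "customer": record["customer"],
            "amount": float(record["amount"]),
            "date": record["ts"],
        }
    except (ValueError, KeyError, TypeError):
        return None


def etl_load(raw_lines):
    """ETL style: format first, then store only the clean rows."""
    return [row for row in map(parse_event, raw_lines) if row is not None]


def schema_on_read(raw_store):
    """Raw-store style: keep every line as-is, format only at analysis time."""
    for line in raw_store:
        row = parse_event(line)
        if row is not None:
            yield row


# ETL: the stand-in "warehouse" holds formatted rows; bad input is dropped at load.
warehouse = etl_load(RAW_EVENTS)

# Raw store: everything is kept, including the line that cannot be parsed yet.
raw_store = list(RAW_EVENTS)

total_from_warehouse = sum(row["amount"] for row in warehouse)
total_from_raw = sum(row["amount"] for row in schema_on_read(raw_store))
```

Both routes give the same total here. The trade-off is that the ETL route discards the unparseable line at load time, while the raw store keeps it in case a later parser can make use of it.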