Reference no: EM133191064
Question 1) When securing a data warehouse, there are many security considerations and techniques you can use:
a. List the 4 layers of security that need to be considered
b. Provide 2 best practices for each of the layers listed in part a.
Question 2) When implementing a data warehouse, performance is usually a paramount consideration. List 10 items which could have a performance impact on the data warehouse itself and whether that impact would be positive or negative. You should have 10 items and 10 positive/negative indications.
Question 3) If you were handed the requirement to deploy a data warehouse which is continuously available (up and running 24x7x365), describe the following considerations:
a. The difference between high availability and disaster recovery
b. The kinds of outages you must protect against
c. The approach options available to address this requirement
d. The topology options for deploying a continuously available environment
e. Any other considerations
Question 4) Describe the different degrees of latency, how they would be implemented, and some of the key characteristics of each regarding normalization, cleansing, etc. Also include a description of an Operational Data Store (ODS) and how it differs from the Data Warehouse.
Question 5) Big Data is a new and important trend in the marketplace today.
a. List the “4 Vs” or possible characteristics of a Big Data workload.
b. Include a description and example of each
c. List 2 ways that a user or developer can access/work with data in Hadoop
Question 6) We are going to build a logical model (like you did in Assignment 1) for a car rental agency; however, we'll have some enhancements, which include a multi-model database and leveraging sources of data from the Internet of Things. You are only building the logical model.
You do not need to build the physical model. This is a paper exercise. As you build your logical model, here are the considerations and common queries:
a. The FACT table contains rental facts. Each car rental by a customer is a record in the fact table.
b. You must have the following DIMENSION tables. Cars. Customers.
c. You must pick at least FOUR additional DIMENSION tables for your logical model
d. You must have a REFERENCE table which contains Canadian postal codes
e. Part of the data you will be collecting is from sensors and devices from the rental vehicles. You will need an EVENT table which gathers this input (more details below)
f. The actual rental contract will be in JSON format and must be stored with each fact.
g. The following are the common queries that the users will be performing:
i. The service department will be leveraging data from the sensors and devices, like alignment, tire pressure, brake pad wear and fluid levels. They will want to have a maintenance requisition prepared for each rental, prior to its planned return. Assume each sensor has a SENSOR ID which uniquely identifies it. Have a table (maintenance table), not part of the star schema, to meet this need, and a table that stores the information from the sensors and devices (event table) which maps to the vehicle identification number (VIN) and date/time.
1. Define a reference table which maps each sensor ID to a vehicle (VIN)
2. Write the SQL query that will take data from the event table and produce a maintenance record in the maintenance table.
ii. The financial analysts will want to have weekly, monthly, quarterly and annual revenue reports by rental agency location.
1. Write the SQL queries that will generate these financial results.
2. Define a summary table which will help with the performance of these queries
iii. The business analysts will want to have a summary report showing the utilization rate for each vehicle, and a utilization summary for the company's fleet as a whole.
Note: utilization rate means what % of days is a particular car being rented versus sitting on the lot (idle).
1. Write the query that will produce a report for each vehicle in the fleet showing its utilization rate and the summary for the company.
2. Define a summary table which will help with the performance of these queries
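As a reading aid for the note on utilization rate above, here is a minimal sketch of the arithmetic only. The function names, the sample VINs and the day counts are hypothetical, and treating the fleet summary as total rented car-days over total available car-days is an assumption, not something the question specifies.

```python
def utilization_rate(rented_days: int, total_days: int) -> float:
    """Percentage of days in the period on which a car was rented (not idle)."""
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    if not 0 <= rented_days <= total_days:
        raise ValueError("rented_days must be between 0 and total_days")
    return 100.0 * rented_days / total_days


def fleet_utilization(rented_days_by_vin: dict[str, int], total_days: int) -> float:
    """Assumed fleet summary: rented car-days over available car-days."""
    if not rented_days_by_vin:
        raise ValueError("fleet must contain at least one vehicle")
    total_rented = sum(rented_days_by_vin.values())
    available = len(rented_days_by_vin) * total_days
    return 100.0 * total_rented / available


# Hypothetical 30-day period for a two-car fleet.
rented = {"VIN-A": 20, "VIN-B": 10}
per_vehicle = {vin: utilization_rate(days, 30) for vin, days in rented.items()}
fleet = fleet_utilization(rented, 30)
```

The same ratio is what the requested query would need to compute per VIN and for the fleet as a whole.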
iv. The legal team will want to have access to the original signed documents, requested by rental agreement number, to pull up agreements and look at insurance options chosen by customers.
1. Write a query that takes a rental agreement number as input and returns the rental agreement document, in JSON format.
2. Create an INDEX which would help improve the performance of these queries.
Question 7) List ten 'best practice' capabilities you should (or should not) use when building your physical data warehouse. Think in terms of your CREATE TABLE command and what you would / would not want to leverage.
Question 8) Describe the various options you have for dealing with latency requirements in a warehouse. Include a few points for each option that distinguish it from the other options.
(b) Also make sure you describe clearly what an operational data store is in comparison to the data warehouse.
3) If you were asked to build a “never down” (highly available) data warehouse, what are the various deployment considerations you must take into account? Make sure to include a point or two of description for each one.
Question 9) When figuring out the system size for a Data Warehouse / BI System, what are the four key influencing factors? Include a couple of considerations associated with each.