What you would expect for full industrial dw implementation

Reference no: EM131469313

Module Learning Outcomes

Module Learning Outcomes are the official statement of what you are intended to gain from the module. ("On successful completion of the module the student will be able to ....."). All module specifications carry these statements.

Below we list the three Module Learning Outcomes (MLOs) for Data Warehousing. After each outcome (or group of outcomes) we state what this assignment asks you to do in order to demonstrate your achievement of that MLO.

- Apply concepts and justify decisions when modelling, designing and constructing practical examples or paper descriptions of applications in this area.

You are provided with data and a working project definition for a Simple Star. Demonstrate your ability to design, apply and discuss FOUR sets of changes to implement more advanced concepts.

A range of suggestions for concept topic areas is given.

- Describe and critically evaluate the role and relevance of data warehousing and analytical investigation to the solution of business information problems.
and ...

- Explain the concepts that underpin the subject area of data warehousing, making reference to main established concepts and some developing areas.

Using online and/or printed literature sources, critically evaluate your implementation, in particular addressing the question "Except for scale (number of rows of data), what is different about your implementation compared to what you would expect for a full industrial DW implementation?"

Part A Amend/extend an existing Data Warehousing implementation.

You are provided with all the data and a working project definition for a Simple Star. Demonstrate your ability to design, apply and discuss FOUR sets of changes to implement more advanced concepts.

A range of suggestions for concept topic areas is given.

During the module you have been provided with an SQL Database "SalesStarAssignment" containing data tables forming a Simple Star, plus a Visual Studio (aka Data Tools) Project to generate a Data Cube. The teaching notes also provided a small number of NewFacts tables, each representing three months and one day's further information on sales.

Amend and/or extend this application in FOUR of the following ways

- Ensuring data quality

The datasets provided for this assignment have been checked to ensure that all data is valid. Even within the provided FactUpdate tables there are no incorrect dates, and no references to items that are not listed in the Dimension tables.

In real implementations it is very unsafe to assume that all entries in NewFacts are valid, because these tables are usually compiled from data from operational data sources, and those sources may have low quality control.

- Use literature (cite your sources, of course) to find what you believe to be the most common data quality problems that occur with new facts.

- Edit one of the provided FactUpdate tables to include examples of the problems you list.
- Implement processes to detect and resolve the "problems" you introduced into the data set. In your report illustrate and explain your processes.
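
A detection process of this kind might be sketched as follows. This is only an illustration in Python, not the module's method: the column names (`sale_date`, `product_id`, `quantity`), the sample rows and the `find_problems` helper are all hypothetical, and the provided FactUpdate tables may be structured differently. The date and Dimension-reference checks follow the two problems named above; the "quantity must be positive" rule is an added assumption.

```python
from datetime import datetime

# Hypothetical Dimension keys and NewFacts rows; all names are illustrative only.
product_dim = {101, 102, 103}

new_facts = [
    {"row": 1, "sale_date": "2017-03-01", "product_id": 101, "quantity": 2},
    {"row": 2, "sale_date": "2017-02-30", "product_id": 102, "quantity": 1},   # impossible date
    {"row": 3, "sale_date": "2017-03-02", "product_id": 999, "quantity": 4},   # not in Dimension
    {"row": 4, "sale_date": "2017-03-03", "product_id": 103, "quantity": -5},  # assumed rule: > 0
]


def find_problems(fact, valid_products):
    """Return a list of problem labels for one fact row (empty if the row is clean)."""
    problems = []
    try:
        datetime.strptime(fact["sale_date"], "%Y-%m-%d")
    except ValueError:
        problems.append("invalid date")
    if fact["product_id"] not in valid_products:
        problems.append("unknown product")
    if fact["quantity"] <= 0:
        problems.append("non-positive quantity")
    return problems


# Split rows into those safe to load and those held back for review.
clean, rejected = [], []
for fact in new_facts:
    problems = find_problems(fact, product_dim)
    if problems:
        rejected.append((fact["row"], problems))
    else:
        clean.append(fact)

for row, problems in rejected:
    print(f"row {row}: {', '.join(problems)}")
```

Rejected rows are kept together with their reasons rather than silently dropped, so the resolution step (correcting, defaulting or discarding) can be explained in the report.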

- Implementing additional Calculated Measures and Key Performance Indicators (KPIs)

The teaching notes for this module show how to implement Calculated Measures and KPIs.

- Identify further situations where Calculated Measures and KPIs would be useful for data analysis and implement your own examples. In your report illustrate and explain your work. Higher marks are available for examples that go beyond the module's basic teaching notes - cite the literature sources for all techniques you use that go beyond the course notes.
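
As a rough illustration of the logic behind a Calculated Measure and a KPI, here is a sketch in Python rather than in the cube tool itself. The measure names (`sales_amount`, `cost_amount`), the 25% goal and the 80% warning tolerance are all assumptions made for the example; the status values 1, 0 and -1 follow a convention often used by cube tools for good, warning and bad.

```python
def margin_pct(sales_amount, cost_amount):
    """Calculated measure: gross margin as a fraction of sales."""
    if sales_amount == 0:
        return None  # avoid division by zero for empty cells
    return (sales_amount - cost_amount) / sales_amount


def kpi_status(value, goal, tolerance=0.8):
    """Map a value against its goal to a status of 1 (good), 0 (warning) or -1 (bad)."""
    if value is None:
        return -1
    if value >= goal:
        return 1
    if value >= goal * tolerance:
        return 0
    return -1


# Hypothetical monthly totals: (sales_amount, cost_amount)
monthly = {"Jan": (1000.0, 700.0), "Feb": (1000.0, 790.0), "Mar": (1000.0, 900.0)}
GOAL = 0.25  # assumed target margin of 25%

status_by_month = {}
for month, (sales, cost) in monthly.items():
    status_by_month[month] = kpi_status(margin_pct(sales, cost), GOAL)
    print(month, status_by_month[month])
```

The measure is derived from two stored measures instead of being stored itself, and the KPI adds a goal and a status on top of it. That separation is the point to bring out when explaining your own examples.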

- Integration of data from different sources including data conformation

Nearly every Data Warehouse for analytic systems draws data from multiple sources. For example, sales could be made in a number of different countries and reported in slightly different forms (eg different currencies, or using different codes/names for the same items).

Assume the "SalesStarDemo" company is receiving its sales data from two franchised outlets, each of which reports its data in formats that are not encoded in the same way as the provided Facts table. For example, for some fields the two sources don't even use the same labels for the same things (they might sell the same product, but under different branded names).

- Create two tables, each representing a day's sales for one of the franchises (the number of records does not need to be large). Implement and explain processes to generate a unified NewFacts table that is appropriate for then uploading into the Production star.

- Explain any scripts and Staging area tables (eg lookup tables) you create. Wherever you use techniques you read about in the literature, cite your sources.
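
The conformation step could look something like the sketch below. Everything in it is hypothetical: the two franchise layouts, the product labels and codes, the `PRODUCT_LOOKUP` staging table and the fixed exchange rate are invented to illustrate the idea, and a real solution would more likely be written as SQL scripts against Staging tables.

```python
from datetime import datetime

# Staging lookup table (hypothetical): franchise product label -> conformed product key.
PRODUCT_LOOKUP = {
    "A": {"SuperWidget": 101, "MegaGadget": 102},
    "B": {"WGT-01": 101, "GDG-02": 102},
}
# Assumed fixed conversion rates into the warehouse currency (GBP).
RATE_TO_GBP = {"GBP": 1.0, "EUR": 0.85}

# One day's sales from each franchise, each in its own format.
franchise_a = [{"item": "SuperWidget", "amount": 20.0, "currency": "GBP", "date": "2017-03-01"}]
franchise_b = [{"code": "WGT-01", "value": 10.0, "currency": "EUR", "day": "01/03/2017"}]


def conform_a(row):
    """Franchise A already uses ISO dates; only the product label needs mapping."""
    return {
        "product_id": PRODUCT_LOOKUP["A"][row["item"]],
        "sale_date": row["date"],
        "amount_gbp": round(row["amount"] * RATE_TO_GBP[row["currency"]], 2),
        "source": "A",
    }


def conform_b(row):
    """Franchise B uses its own codes, dd/mm/yyyy dates and a different currency."""
    iso_date = datetime.strptime(row["day"], "%d/%m/%Y").strftime("%Y-%m-%d")
    return {
        "product_id": PRODUCT_LOOKUP["B"][row["code"]],
        "sale_date": iso_date,
        "amount_gbp": round(row["value"] * RATE_TO_GBP[row["currency"]], 2),
        "source": "B",
    }


# Unified NewFacts rows, ready to be checked and loaded into the star.
new_facts = [conform_a(r) for r in franchise_a] + [conform_b(r) for r in franchise_b]
for fact in new_facts:
    print(fact)
```

Both sample rows end up with the same `product_id`, showing how differently branded items are conformed to one Dimension key. Keeping a `source` column is a design choice that makes it possible to trace a loaded fact back to its franchise.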

- Cubes with multiple fact tables

The above example discusses a situation where multiple sources are effectively all about the same things (sales by our company), but coded inconsistently. During the Extract-Transform-Load processes, these sources are integrated and recoded into a single NewFacts table, which can then be appended to the Simple Star held within SalesStarDemo.

In other situations the multiple sources may be about related but different things. In this case it is often appropriate to generate Complex Stars with multiple Facts tables, and new Dimension tables.

Assume that our case study company is able (legally! perhaps through a market research company) to get hold of a summary of competitors' sales data on a monthly aggregated basis (i.e. once a month we receive a report listing totals of HOW MUCH each competitor has sold in that month, but not to which customer or on exactly what date). You may need to make other assumptions about the contents or level of detail in the summary report.

- Show and explain the following: Change the provided Data Warehouse to implement the Competitor data as a separate Facts table. Populate the new Facts table with suitable data. Rebuild the Cube such that it now has two Facts tables. Demonstrate the use of the Cube. If you make other design decisions, explain these. Cite any literature sources of help.
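
The key design issue here is grain: our own sales are recorded per day and per customer, while the competitor report is per month only. The Python sketch below illustrates the idea that two Facts tables can only be compared at a Dimension level they share (here, month). The table layouts, the competitor name and all figures are hypothetical, and in the real solution the Cube performs this roll-up for you.

```python
from collections import defaultdict

# Facts table 1 (hypothetical): our own sales at daily grain.
sales_facts = [
    {"date": "2017-01-05", "customer_id": 1, "amount": 300.0},
    {"date": "2017-01-20", "customer_id": 2, "amount": 200.0},
    {"date": "2017-02-10", "customer_id": 1, "amount": 400.0},
]
# Facts table 2 (hypothetical): competitor totals at monthly grain, no customer or day.
competitor_facts = [
    {"month": "2017-01", "competitor": "Rival Ltd", "amount": 1500.0},
    {"month": "2017-02", "competitor": "Rival Ltd", "amount": 600.0},
]

# Roll our daily facts up to the shared Month level.
own_by_month = defaultdict(float)
for fact in sales_facts:
    own_by_month[fact["date"][:7]] += fact["amount"]

competitor_by_month = defaultdict(float)
for fact in competitor_facts:
    competitor_by_month[fact["month"]] += fact["amount"]

# Compare the two Facts tables on the Dimension level they have in common.
share_by_month = {}
for month in sorted(own_by_month):
    total = own_by_month[month] + competitor_by_month[month]
    share_by_month[month] = own_by_month[month] / total
    print(month, f"{share_by_month[month]:.0%}")
```

A question such as "our share of sales per customer" cannot be answered from this data, because the competitor Facts table has no Customer dimension. Limitations like this are worth stating among your design decisions.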

- Other (Eg Visualisation, Slowly Changing Dimensions, Use of Tabular facilities, your own choice)

- The module teaching notes make brief mention of quite a number of other techniques not listed in the "dot" titles above. Learn about one of these from literature (eg online tutorials; give references) and apply the approach to the "SalesDemo" dataset.

Marks will be awarded according to the extent of independent learning you demonstrate.
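
As one example of what such a technique involves, a "Type 2" Slowly Changing Dimension keeps history by expiring the current Dimension row and adding a new one with a new surrogate key. The Python sketch below shows that logic only. The column names (`customer_key`, `valid_from`, `valid_to`, `is_current`), the `apply_scd2` helper and the sample customer are hypothetical, and it assumes the customer already has a current row.

```python
def apply_scd2(dimension, customer_id, new_city, change_date):
    """Type 2 change: expire the current row and append a new current row."""
    current = next(
        r for r in dimension if r["customer_id"] == customer_id and r["is_current"]
    )
    if current["city"] == new_city:
        return  # nothing changed, keep history as it is
    current["valid_to"] = change_date
    current["is_current"] = False
    dimension.append({
        "customer_key": max(r["customer_key"] for r in dimension) + 1,  # new surrogate key
        "customer_id": customer_id,  # business key stays the same
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })


# Hypothetical Customer dimension with one current row.
customer_dim = [
    {"customer_key": 1, "customer_id": "C001", "city": "Sheffield",
     "valid_from": "2016-01-01", "valid_to": None, "is_current": True},
]

apply_scd2(customer_dim, "C001", "Leeds", "2017-03-01")
for row in customer_dim:
    print(row)
```

Existing facts keep pointing at `customer_key` 1, so historical sales are still reported against the old city, while new facts use key 2.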

The total writing for Part A should be around 1200 words. Words beyond 1400 will not be read.

Part B

Using literature sources, critically evaluate your implementation, in particular addressing the question "Except for scale (number of rows of data), what is different about your implementation compared to what you would expect for a full industrial DW implementation?"

To answer Part B well you will have to establish what methods of implementation there are for full commercial or production analytical data warehousing.

You can get this information by reading books, journals and company white papers. You can use video lectures, tutorials, people's online blogs, company adverts too - but do not rely only on non-peer reviewed sources.

Many items, including online books, can be accessed through the Library "Gateway" facility - see the menu bars of Blackboard.

Almost certainly you will discuss what several authors say, highlighting the similarities or differences between their answers (try to analyse why they differ - eg what perspective are the authors taking? when was their document written? does it have a bias towards a particular application/usage sector? etc). You will then use this comparison to review some pieces of your implementation, to explain to what extent your work is representative of what the DW industry or researchers say is the topic of "data warehousing".

It is recommended that you pick a small number of topics (eg three, or four) and discuss these in detail rather than take a large number of topics and only discuss each in trivial depth.

Examples of topics you might discuss are:-

- What methodologies are used to structure data warehousing projects ("Inmon vs Kimball" is a good search starting point). What does the DW industry use? The approach you have used is closest to which?

- Industry-scale Data Warehouses probably use many tools to help automate routine processes. What are the main tools? Which are the processes most often covered by tools? How far do your scripts/processes illustrate the PRINCIPLES covered by the tools?

- Why does real data need so much data cleansing? What is industry's practice regarding data cleansing? In what ways does (or could) your assignment solution simulate what industry does?

- Results of Analytical data investigation are often presented visually (eg via graphs or displays). Why? You will have used a particular tool for your implementation (probably Excel). How representative of analytic visualisation tools is your assignment?

- Managing metadata is important. What is metadata? How does metadata help? What examples of metadata are there in your implementation?

Notice that we recommend that you discuss THREE areas, yet we have already listed FIVE topic examples. This is to illustrate that there is a wide range to choose. You can choose other topic areas yourself.

The total writing for Part B should be around 1200 words. Words beyond 1400 will not be read.

Citations and References

You MUST use APA-style for citations and referencing. This is an SHU requirement (not just a module requirement).


Reviews

len1469313

4/21/2017 2:58:52 AM

- The structure of your report is clear (Note: you can choose the format, layout etc; just choose a clear structure)
- Spelling and grammar are appropriate
- The tone of the language is descriptive and evaluative.
- Writing is concise. As examples: There are no equivalents of “If you look at Figure 1 you will see that it shows …..”. The phrase “Figure 1 shows” is sufficient. There are no equivalents of “If you think about it, you will find that it is a fact that ….”. It is a fact whether or not the reader thinks about it.
- Extent of use of literature is appropriate: you have not used quotations that do not add to the logic of the argument; you have paraphrased where appropriate

Therefore there are no additional marks for writing in the expected style. Marks may be scaled back by up to 20% for failure to use a style appropriate for the context.

Marking criteria: Writing style

It is already expected that you conform to normal academic writing practices:

- Quotations are visibly distinguishable from the main text. Remember that quotations themselves don’t get any marks at all. So quoting a large block of text just uses up word count. However, it is often necessary to make a quotation to THEN follow it up with an analysis of what the quote says. The analysis often gets many marks (if done well, of course).
- Paraphrases, and use of other people’s ideas, must be cited just as well as direct quotes.
- APA referencing + page numbers used in all relevant places
- Writing is academic style, 3rd person.

Criteria: Section A (LO2); Section B (LO1, LO3) (50%). Suitable for publication to a relevant audience with only formatting amendments (eg 85 = suitable for module, 90+ = suitable for conference, 95+ = suitable for journal).

Format of the report

Please use WORD (which allows the tutor to give feedback directly into your document) or if necessary PDF. Do not submit in any other formats.

“Report” format

Although we have used the word “report”, a formal Abstract and Table of Contents, a Table of Figures and a Conclusion or Summary are not needed. Screen images illustrating your implementation are very likely to be helpful: “A picture tells a thousand words”. However, you are not writing a tutorial class on how to implement a particular feature. Do not, for example, have a sequence of screen images which show step-by-step what was done per feature.

Citations and References

You MUST use APA-style for citations and referencing. This is an SHU requirement (not just a module requirement). If you are not sure about APA style, ask your tutors or search SHUSPACE (there are several online guides to the style). Note too that all recent versions of Word have APA-style referencing as one of the Reference / Style options.

For this assignment it is required that you put the page number of any reference you use (websites under four pages in length (if printed) need not have page numbers; audio/video should have time indicators). This helps the reader quickly locate the section of text you are referring to.

- “Section X of the assignment was implemented by using / doing ............. . But (Author, Date, Page) says this is limited because of ...... and suggests solving the issue by ........”
- “[This] was implemented using / doing ...... . In the literature (Author, Date, Page) and (Author, Date, Page) both use very similar methods.”
- “The literature indicates that the method used [in the assignment] scales well. For example (Author, Date, Page) reports on an implementation ten times as big, taking only double the effort”.

Make sure that quotations from sources stand out visually. Quotes longer than a sentence or two should normally be put into italics and given their own indented paragraph.

Use quotations intelligently. Do not simply bump up the number of quotations by quoting obvious facts. For example, the quotation “Business Intelligence is a rapidly growing subject” (TDWI, 2015, p12) may be true, but it does not help you discuss how far your implementation illustrated the principal topics of data warehousing. Here are some examples of how you can use references to note what others are saying in a way that helps you evaluate your work.
