Reference no: EM132225824
Assignment -
Part I - Step-By-Step Instructions Demonstrating How To Create A Query
Click the New Query window and make sure the database dropdown above the Object Explorer window is set to the AdventureworksDW2016 database.
Right-click anywhere in the Query window you just opened. Select the "Design Query in Editor" option. This brings up the Query Designer/Add Table windows. If you don't see the same table options for the AdventureworksDW2016 database, then you didn't have the Query window set to the correct database. You will have to close out of both windows, set the database to AdventureworksDW2016 as indicated above and go through the steps again to get to the Query Designer/Add Table windows.
Start off adding the DimDate, DimGeography, DimProduct and FactInternetSales tables. You can select them all at the same time by holding the CTRL key down and choosing Add, or click Add after selecting each individually. Once they have all been added, then Close the Add Table window.
Make the Query Designer window larger and move the tables around so that you can clearly see the joins that were automatically created for you. Notice that the FactInternetSales table is connected to the DimProduct using the ProductKey. This is a good join and doesn't required any changes.
Next review the connection(s) from the FactInternetSales table to the DimDate table. The DateKey in the DimDate table is connected to the OrderDateKey, the DueDateKey and the ShipDateKey. Right-click the connections to the DueDateKey and the ShipDateKey and remove them. Only the connection to the OrderDateKey should remain. You should only have one connection between each of the tables, otherwise all three have to be true-the OrderDateKey, DueDateKey and ShipDateKey would have to be the same. This situation might be of interest to the users, but in this case we are just wanting the OrderDateKey.
Finally, notice that the DimGeography table is not connected to the FactInternetSales. This means that either there is not a valid connection path between the two tables, or there is one or more additional table(s) needed to make the two tables connect.
Scroll down the fields in the DimGeography table and look for any key fields. In this case, there is only one other key field other than the GeographyKey. Add the additional table named in this other key. You can bring up the Add Table window again by right-clicking in the Query Designer table area. The new table connects to both the FactInternetSales table and the DimGeography table.
Now, go through and select the fields you want in the SELECT section of the query. Start with the fact table and choose all non-key fields. You can always remove fields if you determine later they are not needed. Then review the dimension tables and add the non-key fields of interest.
Click OK and the new query appears in the original query window you opened. Notice the join types default to INNER JOINS. If you click OK in the Query Designer window before you have verified all the connections and a connection is missing, you will see a CROSS JOIN added between the two tables. Do not run the query if a CROSS JOIN is present. The CROSS JOIN gives you all the rows of one of the tables for each row in the other table (i.e. table A rows multiplied by table B rows). If you have a large number of rows in either or both tables, this will create a huge record set that can take up all the resources in the database and potentially the server. (This is not desirable and may force you to delete and re-create your Immersion server.)
After verifying that you do not have any CROSS JOINS, execute the query. Inspect the data returned. Does there appear to be any duplicate data? Add DISTINCT after SELECT and run the query again. If this did not resolve the duplicate data issue, then find the field(s) that are different in the duplicate rows. Determine what table(s) this field is from.
My version of the query is creating multiple rows because of the DimGeography table. This makes sense because as it turns out, there are a number of different cities, state province names, zip codes and IP address locators in each sales territory. If I remove these fields, then the query is only pulling EnglishCountryRegionName from the DimGeography table, but the duplicate data issue is resolved.
If you still have duplicate data after removing fields from one table, you may need to take the same steps for other tables.
Part II - Create Queries and Power BI Visualizations
Review all the dimension and fact tables to make yourself familiar with the information available in this data warehouse. Expand each of the following tables so that you can see the available fields. In a Word document, list the primary and/or foreign key(s) contained in each of these tables and also list the table that the foreign key(s) links to.
DimCustomer, DimDate, DimEmployee, DimGeography, DimOrganization, DimReseller, FactFinance, FactInternetSales, FactResellerSales, FactProductInventory (40 points)
Create a query that pulls distinct internet sales data by quarter and year. Include all the non-key fields from the FactInternetSales table, the primary key(s), the English product name, standard cost, list price, product line, and the territory group, region and country with a product status of "Current".
Run the query and take a screenshot of the results (include the query in the screenshot). Submit this screenshot in a Word document and copy and paste the query text into the Word document, as well. (10 points)
Pull the query into Power BI (How to import query into Power BI). Create 4 different visualizations (use a different type for each) that displays the information in the query in an interesting and useful way. You may add additional fields from the dimension tables already included in the query specifications, if you find that you need them for what you wish to show in these visualizations. The purpose of this part of the assignment is for you to become familiar with using the different date and geography dimensions that would be typical in a data warehouse. Submit the Power BI document(s). (20 points)
Create another query that pulls the same data for reseller sales. Make sure you list the corresponding fields and that they are listed in the same order as the original query. Perform a UNION of the two queries. Run the resultant query and take a screenshot of the results (include the query in the screenshot). Submit this screenshot in a Word document and copy and paste the query text into the Word document, just as before. Re-create the 4 visualizations from above and submit the Power BI document(s), as well.
Attachment:- Assignment Files.rar