Prepare a data prep with text and excel files

Assignment Help Computer Engineering
Reference no: EM131924823

Assignment: Tableau Basic Module

Data Prep with Text and Excel Files

Poorly Formatted Data

Data files are not always well-formatted. Now we will practice using a file that needs some work before it is ready for analysis.

For this assignment we'll be using the "Data Prep - Flights" Excel file, which you can download from the course website. (Global Superstore, our main dataset, is too well-structured! This gives us a messier example to work with.)

Open the file in Excel. Here we have a report in Excel, showing the number of resolved incidents per Employee per month. The "Ideal" tab shows how we wish the data would be formatted - like a database table.

However, sometimes we receive data that looks more like what we see in the "Resolved Incidents" tab. Luckily, there are several features in Tableau Desktop to help automatically reshape Text and Excel files to get them ready for analysis in Tableau.

Let's connect to this Excel file and see if we can work with that poorly formatted sheet.

• In Tableau Desktop,click on Excel, navigate to where you saved the file and click open.

• Drag out the "Resolved Incidents" sheet.

Data Interpreter

Although Tableau can connect to this sheet, we can see here in the preview that there are some issues. There are no column names, the headers from Excel have a lot of nulls, and so on. Tableau has also recognized this and suggests the Data Interpreter (Tableau's built-in tool for preparing your data for analysis).

• Click turn on

• Now we see that those headers and nulls have been stripped out, and our columns are properly identified!

• If we want more specifics on what the Data Interpreter did, we can click "Review Results" on the right. This will open an Excel file describing the changes.

• If we click to the tab we used, Resolved Incidents, we see which fields are being used as headers, in red, and which are considered data, in green

Before we go back to Tableau and our data connection, let's take one more look at that "Ideal" tab. Note that instead of having a column for each month with data underneath, in this format, there is a "Date" column and each row contains the number of resolved incidents for each unique combination of date and employee. This data is in the preferred format for analysis: taller, with more rows, rather than wider, with more columns. Let's see if we can do that in Tableau.

Pivot

Back in Tableau, we want to change the format from that column-per-month layout into a single date column and a single column for Resolved Incidents.

• To do this easily, we'll simply select all the date columns. Click on the first, scroll if necessary, then shift click on the last. We'll open the menu and select "Pivot"

• This pivot feature essentially merges the information from the original columns and rows into two new columns - Pivot field names, and Pivot field values.

• We can see that "Pivot field names" is actually our Date, so we can click to open the menu and select rename.

• Similarly, "Pivot field values" can be renamed "Resolved Incidents"

Split

There's one more thing we can do to prepare this data. Note that the "Employee" field is actually two pieces of information - a location code, A, B, C, D, or E, followed by an Employee ID number. We can split the column based on the hyphen delimiter:

• Click to open the menu and select Split

• There are now two new fields - Employee - Split 1 and Split 2

• We'll use the Metadata Grid view (click the icon to the left of Sort Fields) to rename our split fields

• Click on the name to edit in-line, Split 1 should be Location, and we'll hit tab, Split 2 should be Employee ID

• There's an Abc next to the Date field indicating this column is considered a String. We know it's actually a Date, though, so we can click on the Abc and select Date to update the data type.

Now if we click on Sheet 1, we'll see nice tidy data ready for analysis!

Custom Split

Let's create a viz now: bring Employee ID to the view, Resolved Incidents to Columns, and sort it.

It's clear from this view that there are really two groups of employees - some who resolve a much higher number of incidents than others. Looks like some employees are often able to get through more cases, and they have a Tier II designation.

If we look at our original data set in Excel, we see there's a tab called Tiers. This report adds a -II to the end of an employee ID if they're tier II. Because not all rows have this -II, a standard split won't work. Let's see if we can create a viz that incorporates this Tier designation.

[[side note:both Split and Custom Split require consistent delimiters. If our data has irregular delimiters, Tableau won't be able to split out the data using these options.]]

Open a new Tableaufileand recreate the viz, this time using the Tiers sheet from Excel.

• Remember to usethe Data Interpreter, and Pivot the dates again

• Click on the Employee column to open the menu and select Custom Split

o We can choose our delimiter, we'll use a hyphen
o And now we can say we want to have 3 columns
o This forces Tableau to break off that 3rd column with the tier II indicator

Now finish the viz by yourself without detailed instructions: rename all columns, and do your bar chart as above, and this time color the bars by Tier.

Reference no: EM131924823

Questions Cloud

How process works in relates to essential definition of dss : How the process works in relates to the essential definition of DSS. Explain the important DSS classifications. Design and implementation apply to your project?
Prepare overview in form of table of contents for manual : Design Communications Manual - Your job, in this assignment, is to prepare that overview in the form of table of contents for the manual with brief description
What are the uses for indexes and registers with health data : What procedures should a healthcare worker do when handling medical records in healthcare facilities? Why is this important?
Evaluating leaders in a specific leadership position : List three to five specific skills, abilities, behaviors, attitudes, and/or knowledge areas you believe are important for success in the position.
Prepare a data prep with text and excel files : Prepare a Data Prep with Text and Excel Files. Now we will practice using a file that needs some work before it is ready for analysis.
Explain the strategic importance of cloud computing : Write a brief synthesis and summary of the two articles. How are the topics of the two articles related? What information was relevant and why?
Determine the payback for the new center : Determine the payback for this new center. Determine the net present value using a cost of capital of 15 percent.Should the project be accepted?
Identify a computer system you have recently had experience : Identify a computer system you have recently had experience with. Describing a potential computer security problem related to that system.
List and describe porters competitive forces model : List and describe the five key factors that are contributing to the increasing vulnerability of organizational information resources.

Reviews

Write a Review

Computer Engineering Questions & Answers

  What would be the average disk access time on your system

What would be the average disk access time on your system using a RAID-5 array with two sets of four disks if 25% of the database transactions must wait behind one transaction for the disk to become free?

  Write a program in php and javascript

In PHP and Javascript, when the given function has verified that all of the necessary field have been filled, a cookie is added to the user's computer.

  Discuss how specialized learning and the attainment of an

write a 1-page paper in which you discuss the following topics1.reflect on and write two to three of your personal and

  Calculate the sum of the numbers from one to n

Write an assignment statement to calculate the sum of the numbers from 1 to N using Gauss's formula

  Explain where in operating system the instruction would used

A typical hardware architecture provides an instruction called return from interrupt, abbreviated by something like iret. This instruction switches the mode.

  Demonstrate prims algorithm starting from vertex a

Demonstrate Prim's algorithm starting from vertex A. Write the edges in the order they were added to the minimum spanning tree.

  Write a program that asks the user for the pin

When you use an automated teller machine (ATM) with your bank card. Write a program that asks the user for the PIN no more than three times.

  What does xml stand for

What does XML stand for? Describes the purpose of the new system, the potential start and completion dates, and the key stakeholders and sponsors of the new system.

  How does rsa algorithm differ from data encryption standard

How does the RSA algorithm differ from the Data Encryption Standard (DES)? A security engineer uses a digital certificate and Pretty Good Privacy (PGP) when sending an e-mail. What is the digital certificate used for?

  The presentation of web site or the content

What are the good and useful features and presentation of a web site that you look for.

  List source and destination ethernet addresseson the message

List the source and destination Ethernet addresseson the message. What value is in the Ethernet type field in thismessage? Why?

  Discuss why you think the network is effective and efficient

Examine its network design and explain why you think the network is effective and efficient

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd