Perform an exploratory analysis of your dataset

Reference no: EM132677769

Assignment: In this individual assignment, you will perform an exploratory analysis with the What-If Tool to better understand the structure of your datasets, investigate initial questions, and develop preliminary insights and hypotheses. Your final submission will be a report of the key insights gained during your analysis.

Step 1: Dataset Selection and Initial Questions

Pick two datasets. These can be datasets available in the What-If Tool demos, but you will earn additional points for using datasets that are not available there.

After selecting your datasets, but prior to analysis, write down an initial set of three questions you'd like to investigate about the datasets and the prediction results from ML models.

Step 2: Exploratory Visual Analysis

Next, you will perform an exploratory analysis of your datasets and the results from ML models using the What-If Tool. If you use the provided datasets, you can use the web demo. You can also take the notebooks and revise them for your own datasets and models.

You should consider two different phases of exploration.

In the first phase, you should seek to gain an overview of the structure of your datasets and the results from their models. What is the structure of the datasets? Which features are used? Are there any notable issues with the distributions of the datasets? What is the model performance? Which features contributed the most? Are there any surprising relationships among subsets of data and model results? Are there any fairness issues?
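The first-phase questions about distributions and fairness can also be sanity-checked outside the tool. The following is a minimal sketch in plain Python, not part of the What-If Tool: the records, the group names "A" and "B", and both helper functions are hypothetical and exist only for illustration. It computes the label balance, then the accuracy and positive-prediction rate per group.

```python
from collections import Counter, defaultdict

# Hypothetical records: (group, true_label, predicted_label). Not real data.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 0, 1), ("A", 1, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]

def label_distribution(rows):
    """Share of each true label, to spot a skewed class balance."""
    counts = Counter(label for _, label, _ in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def per_group_metrics(rows):
    """Accuracy and positive-prediction rate for each group."""
    grouped = defaultdict(list)
    for group, label, pred in rows:
        grouped[group].append((label, pred))
    metrics = {}
    for group, pairs in grouped.items():
        n = len(pairs)
        metrics[group] = {
            "accuracy": sum(l == p for l, p in pairs) / n,
            "positive_rate": sum(p for _, p in pairs) / n,
        }
    return metrics

distribution = label_distribution(records)
metrics = per_group_metrics(records)
# Gap in positive-prediction rates between the two hypothetical groups.
parity_gap = abs(metrics["A"]["positive_rate"] - metrics["B"]["positive_rate"])
```

In this toy data both groups have the same accuracy but different positive-prediction rates, which is the kind of observation that could become a reported finding.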

In the second phase, you should investigate your initial questions, as well as any new questions that arise during your exploration. For each question, play with the visualizations in the What-If Tool to find ones that might provide a useful answer. Interact with the tool's functionalities (e.g., datapoint editors, dropdown menus, fairness analysis) to develop better perspectives, explore unexpected observations, or sanity check your assumptions. You should repeat this process for each of your questions, and also feel free to revise your questions or branch off to explore new questions.
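The idea behind a datapoint editor is to change one feature of an example and see how the prediction moves. The following is a minimal sketch of that idea in plain Python. It does not use the What-If Tool API: the logistic model, its weights, the feature names, and the `what_if` helper are all invented for illustration.

```python
import math

# Hypothetical logistic model; the weights and bias are invented.
WEIGHTS = {"age": 0.04, "hours_per_week": 0.05}
BIAS = -4.0

def predict_proba(example):
    """Score one example with the toy logistic model."""
    z = BIAS + sum(WEIGHTS[name] * example[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def what_if(example, feature, new_value):
    """Return (original score, score after editing one feature)."""
    edited = dict(example, **{feature: new_value})  # copy; original untouched
    return predict_proba(example), predict_proba(edited)

datapoint = {"age": 30, "hours_per_week": 40}
before, after = what_if(datapoint, "hours_per_week", 60)

# Does the edit move the example across the decision threshold?
threshold = 0.5
flipped = (before >= threshold) != (after >= threshold)
```

Recording which single-feature edits flip a prediction, and by how much the score changes, gives concrete material for a finding about feature sensitivity.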

What to submit?

You'll submit a single PDF report. For each dataset, you will provide the 10 most interesting or surprising findings (or "insights") with details and screenshots. Your "insights" can include important surprises or issues (such as skewed data distributions or critical fairness issues) as well as responses to your analysis questions. Each finding will consist of a title, a 2-4 sentence description, and screenshots. Provide sufficient detail so that anyone could read through your report and understand what you've learned. You are free, but not required, to annotate your images to draw attention to specific features of the data.

Do not submit a report cluttered with every little thing you tried. Submit a clean, succinct report that highlights the most interesting, insightful observations. You don't need to tell us how the tool works -- we already know that. Think of this like a report to your manager who wants to know what the datasets look like and how the model worked.

The structure of the report will be:

1. Dataset 1

• Which dataset?

• Three initial questions

• 10 most interesting findings

2. Dataset 2

• Which dataset?

• Three initial questions

• 10 most interesting findings


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd