Designing useful visualization and data mining solutions

Reference no: EM133150212

Assignment: Analytics Report

Overview

The purpose of this task is to give students practical experience in writing a data analytics report that provides useful insights, patterns and trends in a chosen dataset, in light of the tasks set out in this document. The dataset will be chosen from the UC Irvine Machine Learning Repository. This activity gives students the opportunity to show innovation and creativity in applying the WEKA data mining software and in designing useful visualization and data mining solutions, presented as an analytics report.

Project Details

You will use an analytical tool (i.e. WEKA) to explore, analyse and visualise a dataset of your choosing. An important part of this work is preparing a good quality report, which details your choices, content, and analysis, and that is of an appropriate style.
The dataset should be chosen from the following repository:

UC Irvine Machine Learning Repository

The aim is to use your chosen dataset to provide interesting insights, trends and patterns in the data. Your intended audience is the CEO and middle management of the company that employs you, who have tasked you with this analysis.

Tasks

Task 1 - Data choice. Choose any dataset from the repository that has at least five attributes, and for which the default task is classification. Transform this dataset into the ARFF format required by WEKA.
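The conversion to ARFF in Task 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the required method (WEKA can also load CSV files directly): the helper name csv_to_arff, the sample data and the list of tokens treated as missing are all hypothetical, and attribute names or labels containing spaces would additionally need quoting.

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation, missing_tokens=("", "?", "NA")):
    """Convert CSV text (header row first) into ARFF text.

    A column is declared numeric when every non-missing value parses as a
    number; otherwise it becomes a nominal attribute listing its labels.
    """
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]  # skip blank lines
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for col, name in enumerate(header):
        values = [r[col].strip() for r in data
                  if r[col].strip() not in missing_tokens]
        if values and all(is_number(v) for v in values):
            lines.append(f"@attribute {name} numeric")
        else:
            labels = ",".join(sorted(set(values)))
            lines.append(f"@attribute {name} {{{labels}}}")
    lines += ["", "@data"]
    for r in data:
        # ARFF marks a missing value with "?"
        cells = ["?" if c.strip() in missing_tokens else c.strip() for c in r]
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


# Hypothetical three-column sample with one missing numeric value.
sample = (
    "sepal_length,colour,species\n"
    "5.1,red,setosa\n"
    ",blue,versicolor\n"
    "6.3,red,virginica\n"
)
arff = csv_to_arff(sample, "demo")
print(arff)
```

The class attribute is left as the last column here, which is where WEKA's Explorer looks for it by default.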

Task 2 - Background information. Write a description of the dataset and project, and its importance for the organization. Provide an overview of what the dataset is about, including where and how it was gathered, and for what purpose. Discuss the main benefits of using data mining to explore datasets such as this. This discussion should be suitable for a general audience. Information must come from at least two appropriate sources and be appropriately referenced.

Task 3 - Data description. Describe how many instances the dataset contains, how many attributes it has and their names, and state which is the class attribute. Include in your description details of any missing values and any other relevant characteristics. For at least five attributes, describe the range of possible values and visualise these in a graphical format.
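The figures Task 3 asks for (instance count, missing values, value ranges) are shown in WEKA's Preprocess panel, but they can also be cross-checked with a short script. The following is a minimal sketch, assuming the data has already been read into a list of string rows; the function names describe and text_bars and the sample rows are hypothetical.

```python
from collections import Counter


def describe(header, rows, missing="?"):
    """Summarise each attribute: missing count plus range or label counts."""
    summary = {}
    for col, name in enumerate(header):
        present = [r[col] for r in rows if r[col] != missing]
        info = {"missing": len(rows) - len(present)}
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None  # at least one value is not a number
        if numbers:
            info["type"] = "numeric"
            info["min"], info["max"] = min(numbers), max(numbers)
        else:
            info["type"] = "nominal"
            info["counts"] = dict(Counter(present))
        summary[name] = info
    return summary


def text_bars(counts, width=20):
    """Render label counts as simple text bars, longest bar = width chars."""
    top = max(counts.values())
    return [f"{label:<12}{'#' * round(width * n / top)} {n}"
            for label, n in sorted(counts.items())]


# Hypothetical sample: one numeric attribute, one nominal, one class.
header = ["age", "outlook", "play"]
rows = [
    ["25", "sunny", "no"],
    ["40", "rainy", "yes"],
    ["?", "sunny", "yes"],
    ["31", "overcast", "yes"],
]
summary = describe(header, rows)
print("instances:", len(rows))
for name, info in summary.items():
    print(name, info)
print("\n".join(text_bars(summary["play"]["counts"])))
```

The text bars are only a quick sanity check; the report itself should use proper charts such as the histograms WEKA draws for each attribute.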

Task 4 - Data preprocessing. Preprocess the dataset attributes using WEKA's filters. Useful techniques include removing certain attributes, exploring different ways of discretizing continuous attributes, and replacing missing values. Discretizing is the conversion of numeric attributes into "nominal" ones by binning numeric values into intervals. Missing values in ARFF files are represented with the character "?". If you replaced missing values, explain what strategy you used to select the replacement values. Use and describe at least three different preprocessing techniques.
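To make two of the Task 4 techniques concrete, here is a minimal Python sketch of mean/mode replacement of missing values and equal-width binning. It illustrates the ideas only and is not WEKA's code; the function names, the bin labels and the sample values are assumptions made for this example, and each column is assumed to contain at least one non-missing value.

```python
from collections import Counter
from statistics import mean


def replace_missing(values, missing="?"):
    """Fill missing entries with the mean (numeric) or the mode (nominal)."""
    present = [v for v in values if v != missing]
    try:
        numbers = [float(v) for v in present]
    except ValueError:
        # Nominal column: use the most frequent label.
        fill = Counter(present).most_common(1)[0][0]
        return [fill if v == missing else v for v in values]
    fill = mean(numbers)
    return [fill if v == missing else float(v) for v in values]


def equal_width_bins(numbers, bins=3):
    """Map each number to an interval label using equal-width binning."""
    low, high = min(numbers), max(numbers)
    width = (high - low) / bins or 1.0  # avoid zero width if all values match
    labels = []
    for x in numbers:
        # The maximum value would land in bin index `bins`; clamp it.
        index = min(int((x - low) / width), bins - 1)
        labels.append(f"bin{index + 1}")
    return labels


ages = ["20", "?", "30", "50"]              # hypothetical numeric column
filled = replace_missing(ages)              # "?" becomes the mean of 20, 30, 50
binned = equal_width_bins(filled, bins=3)   # intervals of width 10 over [20, 50]
outlook = replace_missing(["sunny", "?", "rainy", "sunny"])
print(filled, binned, outlook)
```

Whichever strategy is used, the report should state it explicitly, since replacing a missing value with the mean or mode changes the distribution the mining algorithms see.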

Task 5 - Data mining. Compare and contrast at least three different data mining algorithms on your data, for instance k-nearest neighbour, Apriori association rules and decision tree induction. For each experiment you run, describe the data you used, that is, whether you used the entire dataset or just a subset of it. You must include screenshots and results from the techniques you employ.
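As an illustration of one of the algorithms named in Task 5, the sketch below implements a bare-bones k-nearest-neighbour classifier and scores it on a held-out subset. It is not a substitute for running the experiments in WEKA; the function names, the two-cluster sample data and the fixed train/test split are all hypothetical, and numeric features with Euclidean distance are assumed.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a class by majority vote among the k nearest training points.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of held-out instances whose predicted class is correct."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Hypothetical data: two well-separated clusters labelled "a" and "b".
data = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
    ((1.1, 1.0), "a"), ((5.1, 5.0), "b"),
]
train, test = data[:6], data[6:]  # first six for training, last two held out
print("hold-out accuracy:", accuracy(train, test, k=3))
```

Reporting which instances were used for training and which for testing, as done explicitly here, is the kind of detail the task asks for in each experiment.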

Task 6 - Discussion of findings. Explain your results and include the usefulness of the approaches for the purpose of the analysis. Include any assumptions that you may have made about the analysis. In this discussion you should explain what each algorithm provides to the overall analysis task. Summarize your main findings.

Task 7 - Report writing. Present your work in the form of an analytics report.

Attachment:- Analytics Report.rar

 





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd