NIT6160 Data Warehousing and Mining Assignment

Reference no: EM132572682

NIT6160 Data Warehousing and Mining - Victoria University

Assignment

The goal of this project is to apply association rule mining, classification, and clustering methods to the Mushroom and Groceries data sets. For detailed information about the Mushroom data set, refer to the Machine Learning Repository provided by the University of California, Irvine. You can download the data and read more about it there.

The groceries Dataset
Imagine 10,000 receipts sitting on your table. Each receipt represents a transaction and lists the items that went into a customer's basket. That is exactly what the Groceries data set contains: a collection of receipts, with each line representing one receipt and the items purchased. Each line is called a transaction, and each column in a row represents an item.
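To make the layout described above concrete, here is a minimal sketch of turning receipt lines into transactions and counting how often each item appears. The assignment itself is to be done in R; Python is used here only as a neutral illustration. The sample rows and the function name `read_transactions` are made up for this example and are not taken from the actual Groceries file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the layout described above: one receipt per line,
# one item per comma-separated column. These rows are invented.
SAMPLE = """whole milk,bread,butter
bread,butter
whole milk,yogurt
whole milk,bread
"""


def read_transactions(handle):
    """Return one set of items per non-empty line (one set per receipt)."""
    transactions = []
    for row in csv.reader(handle):
        # Ignore empty cells, which appear when receipts have different lengths.
        items = {cell.strip() for cell in row if cell.strip()}
        if items:
            transactions.append(items)
    return transactions


transactions = read_transactions(io.StringIO(SAMPLE))

# Count the number of receipts that contain each item.
item_counts = Counter(item for t in transactions for item in t)

# Share of receipts containing each item (the item's support).
item_support = {item: n / len(transactions) for item, n in item_counts.items()}

print(len(transactions), "receipts")
for item, share in sorted(item_support.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{item}: {share:.2f}")
```

Storing each receipt as a set ignores item order and duplicate entries on the same receipt, which is the usual assumption in market-basket analysis.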

Task 1: Data Pre-processing
Read the data into R. There are many ways to read CSV tables in R. For more details, refer to the R documentation on data import/export.

For the clustering experiments, the column of class labels needs to be removed. Refer to lecture Module 10 to see how to do so.

Check whether any other pre-processing would benefit the analysis, for example replacing missing values, normalizing attribute ranges, or converting numerical or string values to nominal values.
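The two pre-processing steps named above (removing the class label column and replacing missing values) can be sketched as follows. This is an illustration in Python under stated assumptions, not the R code the assignment asks for: the rows, the column names, and the use of "?" as the missing-value marker are hypothetical, and replacing a missing nominal value with the column's most frequent value is only one possible strategy.

```python
from collections import Counter

# Hypothetical rows in the style of a nominal data set: a "class" label
# column plus feature columns, with "?" assumed to mark a missing value.
rows = [
    {"class": "e", "odor": "n", "stalk_root": "b"},
    {"class": "p", "odor": "f", "stalk_root": "?"},
    {"class": "e", "odor": "n", "stalk_root": "b"},
    {"class": "p", "odor": "f", "stalk_root": "e"},
]


def drop_column(rows, name):
    """Return copies of the rows without the named column."""
    return [{k: v for k, v in row.items() if k != name} for row in rows]


def impute_mode(rows, missing="?"):
    """Replace missing nominal values with each column's most frequent value."""
    columns = list(rows[0].keys())
    modes = {}
    for col in columns:
        observed = [row[col] for row in rows if row[col] != missing]
        # If a column has no observed values at all, leave the marker in place.
        modes[col] = Counter(observed).most_common(1)[0][0] if observed else missing
    return [
        {col: (modes[col] if row[col] == missing else row[col]) for col in columns}
        for row in rows
    ]


# Features for clustering: labels removed, missing values filled in.
features = drop_column(rows, "class")
cleaned = impute_mode(features)

for row in cleaned:
    print(row)
```

Both helpers return new lists, so the original labelled rows stay available for the classification experiments.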

Task 2: Data Mining

• Association rule mining experiments: Use R to explore association rules on the Groceries data set. Try out different algorithms. Visualize the results you find. Report any interesting association rules discovered in the experiments and explain why they are interesting.

• Classification experiments: Use R to construct classifiers on the Mushroom data set. Randomly split the data set into training and test sets (80% vs. 20%). Select at least one classifier from each of the following two categories of classifiers: tree-based models, Bayes classifiers, and rule-based classifiers. Compare the results of the chosen classifiers.

• Clustering experiments: Use R to explore clusters on the Mushroom data set. Select and compare two clustering algorithms from R (e.g. k-means vs. density-based). Use R to visually explore the resulting clusters.

• For all of the above experiments, try different parameter settings to fine-tune the outcome. In principle, select methods that work well on the given data set.
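When judging whether an association rule is "interesting", the standard measures are support, confidence, and lift. The sketch below computes them by hand for a single rule so the arithmetic is visible. It is written in Python purely for illustration (the assignment itself calls for R), and the transactions and the helper names `support` and `rule_metrics` are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, lhs, rhs):
    """Support, confidence, and lift for the rule lhs -> rhs."""
    s_lhs = support(transactions, lhs)
    s_rhs = support(transactions, rhs)
    s_both = support(transactions, set(lhs) | set(rhs))
    # confidence = P(rhs | lhs); lift compares it with the baseline P(rhs).
    confidence = s_both / s_lhs if s_lhs else 0.0
    lift = confidence / s_rhs if s_rhs else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Invented receipts, one set of items per transaction.
transactions = [
    {"whole milk", "bread", "butter"},
    {"bread", "butter"},
    {"whole milk", "yogurt"},
    {"whole milk", "bread"},
]

metrics = rule_metrics(transactions, {"butter"}, {"bread"})
print(metrics)
```

A lift above 1 means the two sides of the rule occur together more often than they would if they were independent. A rule with high confidence but lift near 1 mostly reflects how common the right-hand item already is, which is one way to argue that a rule is or is not interesting.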

Task 3: Prepare a report
Your report should contain the following:

• Theoretical discussion: At most two pages discussing the data pre-processing steps, the motivation for selecting a particular method, and how the parameters were chosen.

• Results: Include results and screenshots of the above experiments.

• Discussion and error analysis: Try to interpret the results of your model. Discuss intuitions or hypotheses that can be obtained by visual inspection of the resulting classes or clusters. Mention any assumptions, and discuss issues that might have affected the model's performance.

• References: If you use information from sources other than the R manual and official website, you should cite them.

Attachment:- Data Warehousing and Mining.zip

Verified Expert

The project uses two data sets: (1) the Mushroom data set and (2) the Groceries data set.

1) The Mushroom data set consists of 23 variables in total: 22 independent variables and 1 dependent variable. In this project, we predicted the edibility of mushrooms (the dependent variable) from the values of the independent variables.

2) The Groceries data set consists of 10,000 transaction records from customers of a grocery shop. In this project, we analyzed the data using association rule mining to reveal shopping trends, such as pairs of items that are bought together. These findings can be used to increase the sales of a particular item through targeted offers.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd