Describe how data mining can help a search engine company

Reference no: EM132533055

1. Suppose that you are employed as a data mining consultant for an Internet search engine company. Describe how data mining can help the company by giving specific examples of how techniques, such as clustering, classification, association rule mining, and anomaly detection can be applied.
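As one illustration of association rule mining in this setting, the sketch below computes support and confidence for a rule over search sessions. The session data and query strings are hypothetical toy values invented for the example, and the helper names `support` and `confidence` are not from any particular library.

```python
# Hypothetical toy data: each set holds the queries issued in one search session.
sessions = [
    {"cheap flights", "hotel deals", "car rental"},
    {"cheap flights", "hotel deals"},
    {"cheap flights", "weather"},
    {"hotel deals", "car rental"},
    {"cheap flights", "hotel deals", "weather"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


rule_support = support({"cheap flights", "hotel deals"}, sessions)
rule_confidence = confidence({"cheap flights"}, {"hotel deals"}, sessions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule with high support and confidence could, for instance, motivate suggesting the consequent query to users who issue the antecedent query.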

2. Identify at least two advantages and two disadvantages of using color to visually represent information.

3. Consider the XOR problem where there are four training points: (1, 1, -), (1, 0, +), (0, 1, +), (0, 0, -). Transform the data into the following feature space:

$\Phi = (1, \sqrt{2}x_1, \sqrt{2}x_2, \sqrt{2}x_1x_2, x_1^2, x_2^2)$.

Find the maximum margin linear decision boundary in the transformed space.
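The feature map can be explored numerically. The following is a minimal sketch, assuming labels + and - are encoded as +1 and -1. It maps the four points, checks the identity Φ(x)·Φ(z) = (1 + x·z)², and confirms that one hand-picked hyperplane separates the classes in the transformed space. That hyperplane (`w` below) is chosen only to show separability; it is not claimed to be the maximum margin solution.

```python
import math
from itertools import product


def phi(x1, x2):
    """Map a 2-D point into the 6-D feature space given in the question."""
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, r2 * x1 * x2, x1 ** 2, x2 ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Training points from the question; '+' is encoded as +1 and '-' as -1 (assumption).
training = [((1, 1), -1), ((1, 0), +1), ((0, 1), +1), ((0, 0), -1)]

# The map corresponds to the degree-2 polynomial kernel (1 + x.z)^2.
kernel_ok = all(
    math.isclose(dot(phi(*x), phi(*z)), (1 + x[0] * z[0] + x[1] * z[1]) ** 2)
    for (x, _), (z, _) in product(training, repeat=2)
)

# Hand-picked separating hyperplane: w.phi(x) = x1 + x2 - 2*x1*x2 - 0.5.
# Illustrative only; not claimed to maximize the margin.
r2 = math.sqrt(2.0)
w = (-0.5, 1 / r2, 1 / r2, -r2, 0.0, 0.0)
scores = [dot(w, phi(*x)) for x, _ in training]
separates = all(s * y > 0 for s, (_, y) in zip(scores, training))

print("kernel identity holds:", kernel_ok)
print("hand-picked hyperplane separates the classes:", separates)
```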

4. Consider the following set of candidate 3-itemsets: {1, 2, 3}, {1, 2, 6}, {1, 3, 4}, {2, 3, 4}, {2, 4, 5}, {3, 4, 6}, {4, 5, 6}

Construct a hash tree for the above candidate 3-itemsets. Assume the tree uses a hash function where all odd-numbered items are hashed to the left child of a node, while the even-numbered items are hashed to the right child. A candidate k-itemset is inserted into the tree by hashing on each successive item in the candidate and then following the appropriate branch of the tree according to the hash value. Once a leaf node is reached, the candidate is inserted based on one of the following conditions:

Condition 1: If the depth of the leaf node is equal to k (the root is assumed to be at depth 0), then the candidate is inserted regardless of the number of itemsets already stored at the node.

Condition 2: If the depth of the leaf node is less than k, then the candidate can be inserted as long as the number of itemsets stored at the node is less than maxsize. Assume maxsize = 2 for this question.

Condition 3: If the depth of the leaf node is less than k and the number of itemsets stored at the node is equal to maxsize, then the leaf node is converted into an internal node. New leaf nodes are created as children of the old leaf node. Candidate itemsets previously stored in the old leaf node are distributed to the children based on their hash values. The new candidate is also hashed to its appropriate leaf node.
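The three insertion conditions can be sketched in code. This is one possible reading, under stated assumptions: the root starts as an empty leaf at depth 0, candidates are inserted in the order listed, each itemset is a sorted tuple, and a split always creates both a left and a right child. The names `Node`, `branch`, `insert`, and `count_nodes` are illustrative.

```python
class Node:
    def __init__(self, depth):
        self.depth = depth
        self.itemsets = []    # candidates stored while the node is a leaf
        self.children = None  # {"L": Node, "R": Node} once the node is internal

    @property
    def is_leaf(self):
        return self.children is None


def branch(item):
    """Hash function from the question: odd items go left, even items go right."""
    return "L" if item % 2 == 1 else "R"


def insert(node, candidate, k, maxsize):
    # Follow internal nodes, hashing on the item at the current depth.
    while not node.is_leaf:
        node = node.children[branch(candidate[node.depth])]
    # Condition 1 (depth == k) and Condition 2 (leaf not yet full).
    if node.depth == k or len(node.itemsets) < maxsize:
        node.itemsets.append(candidate)
        return
    # Condition 3: convert the full leaf into an internal node and redistribute.
    # Assumption: both children are created, even if one stays empty.
    old = node.itemsets
    node.itemsets = []
    node.children = {"L": Node(node.depth + 1), "R": Node(node.depth + 1)}
    for itemset in old + [candidate]:
        insert(node, itemset, k, maxsize)


def count_nodes(node):
    """Return (number of leaf nodes, number of internal nodes) below `node`."""
    if node.is_leaf:
        return 1, 0
    leaves, internal = 0, 1
    for child in node.children.values():
        child_leaves, child_internal = count_nodes(child)
        leaves += child_leaves
        internal += child_internal
    return leaves, internal


candidates = [(1, 2, 3), (1, 2, 6), (1, 3, 4), (2, 3, 4),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)]
root = Node(depth=0)
for c in candidates:
    insert(root, c, k=3, maxsize=2)

leaves, internal = count_nodes(root)
print("leaf nodes:", leaves, "internal nodes:", internal)
```

Re-inserting the new candidate through `insert` after a split means a child that is already full is split again, which is how the sketch interprets "the new candidate is also hashed to its appropriate leaf node".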

(a) How many leaf nodes are there in the candidate hash tree? How many internal nodes are there?

(b) Consider a transaction that contains the following items: {1, 2, 3, 5, 6}. Using the hash tree constructed in part (a), which leaf nodes will be checked against the transaction? What are the candidate 3-itemsets contained in the transaction?
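A brute-force cross-check for the transaction question is to enumerate every 3-item subset of the transaction and test it against the candidate set directly. The sketch below does that and also lists the left/right hash sequence of each subset. A leaf can be reached only if its path from the root is a prefix of one of these sequences. This is a verification aid written for this example, not the hash tree traversal itself.

```python
from itertools import combinations

candidates = {(1, 2, 3), (1, 2, 6), (1, 3, 4), (2, 3, 4),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)}
transaction = (1, 2, 3, 5, 6)

# All 3-item subsets of the transaction, each as a sorted tuple.
subsets = list(combinations(sorted(transaction), 3))

# Candidates that are fully contained in the transaction.
contained = sorted(s for s in subsets if s in candidates)

# Hash sequence of each subset: odd item -> "L", even item -> "R".
paths = sorted({"".join("L" if item % 2 else "R" for item in s) for s in subsets})

print("contained candidates:", contained)
print("hash sequences reached:", paths)
```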

5. Consider a group of documents that has been selected from a much larger set of diverse documents so that the selected documents are as dissimilar from one another as possible. If we consider documents that are not highly related (connected, similar) to one another as being anomalous, then all of the documents that we have selected might be classified as anomalies. Is it possible for a data set to consist only of anomalous objects or is this an abuse of the terminology?

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd