How many pairs are counted on the third pass

Assignment Help Basic Computer Science
Reference no: EM131212144

Using the assumptions of Exercise 22.2.4, suppose we run a three-pass Multistage Algorithm on the dataset. Assuming that on the second pass there are again 100,000 buckets, and the hash function distributes pairs randomly among the buckets, answer the following questions, all in terms of s the ratio of the support threshold to the number of baskets.

a) Approximately how many frequent buckets will there be on the second pass?

b) Approximately how many pairs are counted on the third pass?

Exercise 22.2.4

Consider running the PCY Algorithm on the data of Exercise 22.2.3, with 100,000 buckets on the first pass. Assume that the hash function used distributes the pairs to buckets in a conveniently random fashion. Specifically, the 499,500 little-little pairs are divided as evenly as possible (approximately 5 to a bucket). One of the 100,000 big-little pairs is in each bucket, and the 4950 big-big pairs each go into a different bucket.

a) As a function of s, the ratio of the support threshold to the total number of baskets (as in Exercise 22.2.3), how many frequent buckets are there on the first pass?

b) As a function of s, how many pairs must be counted on the second pass?

Exercise 22.2.3

Imagine that there are 1100 items, of which 100 are "big" and 1000 are "little." A basket is formed by adding each big item with probability 1/10, and each little item with probability 1/100. Assume the number of baskets is large enough that each item set appears in a fraction of the baskets that equals its probability of being in any given basket. For example, every pair consisting of a big item and a little item appears in 1/1000 of the baskets. Let s be the support threshold, but expressed as a fraction of the total number of baskets rather than as an absolute number. Give, as a function of s ranging from 0 to 1, the number of frequent items on Pass 1 of the A-Priori Algorithm. Also, give the number of candidate pairs on the second pass.

Reference no: EM131212144

Questions Cloud

Individual-text file : Store ten student names and their individual score in a text file such as Notepad. There will be one score per student. Write a C# program using Microsoft® Visual Studio® to retrieve the names and the scores.
Identify the technology barriers to the company : Identify the hard and soft technology used for both the domestic and global environments. This is not about computers or software; see lesson plan for details and remember to incorporate critical thinking.
Give the number of candidate pairs on the second pass : Give, as a function of s ranging from 0 to 1, the number of frequent items on Pass 1 of the A-Priori Algorithm. Also, give the number of candidate pairs on the second pass.
Write an essay the narrate an event from your life : Write an essay the narrate an event from your life. Does it the clearly state what the rest of the paragraph is about? Remember: only one point, idea, day, incident, etc. per paragraph.
How many pairs are counted on the third pass : As a function of s, the ratio of the support threshold to the total number of baskets (as in Exercise 22.2.3), how many frequent buckets are there on the first pass?
What are temporary tables : What are temporary tables? When are they useful? Justify with an example.
Compare and contrast minoan and mycenaean art : Explain how prehistoric Minoan and Mycenaean art and architecture may reveal contact with ancient civilizations from Egypt and the Near East.
Limit check and a length check : So what is the difference between a limit check and a length check e.g in the case of creating agoogle account page.
Logical database design and physical database design : Differentiate between logical database design and physical database design. Show how this separation leads to data independence.

Reviews

Write a Review

Basic Computer Science Questions & Answers

  Identifies the cost of computer

identifies the cost of computer components to configure a computer system (including all peripheral devices where needed) for use in one of the following four situations:

  Input devices

Compare how the gestures data is generated and represented for interpretation in each of the following input devices. In your comparison, consider the data formats (radio waves, electrical signal, sound, etc.), device drivers, operating systems suppo..

  Cores on computer systems

Assignment : Cores on Computer Systems:  Differentiate between multiprocessor systems and many-core systems in terms of power efficiency, cost benefit analysis, instructions processing efficiency, and packaging form factors.

  Prepare an annual budget in an excel spreadsheet

Prepare working solutions in Excel that will manage the annual budget

  Write a research paper in relation to a software design

Research paper in relation to a Software Design related topic

  Describe the forest, domain, ou, and trust configuration

Describe the forest, domain, OU, and trust configuration for Bluesky. Include a chart or diagram of the current configuration. Currently Bluesky has a single domain and default OU structure.

  Construct a truth table for the boolean expression

Construct a truth table for the Boolean expressions ABC + A'B'C' ABC + AB'C' + A'B'C' A(BC' + B'C)

  Evaluate the cost of materials

Evaluate the cost of materials

  The marie simulator

Depending on how comfortable you are with using the MARIE simulator after reading

  What is the main advantage of using master pages

What is the main advantage of using master pages. Explain the purpose and advantage of using styles.

  Describe the three fundamental models of distributed systems

Explain the two approaches to packet delivery by the network layer in Distributed Systems. Describe the three fundamental models of Distributed Systems

  Distinguish between caching and buffering

Distinguish between caching and buffering The failure model defines the ways in which failure may occur in order to provide an understanding of the effects of failure. Give one type of failure with a brief description of the failure

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd