What is the expected number of guests that end up

Assignment Help Computer Engineering
Reference no: EM132164251

1. The aptly-named Average Hotel has 100 rooms, each belonging to one of 100 guests. After an evening soiree, all of the guests (not thinking straight) randomly select a room to sleep in for that night. Multiple guests might end up in the same room.

(a) What is the expected number of guests that end up returning to their own hotel room?

[We are expecting: A number, and please show your work.]

(b) What is the expected number of guests that end up in a room with exactly one other person? (Hint: you may find it easier to count by rooms instead of guests.)
[We are expecting a mathematical expression like(6)(8)or a number like48, and please show your work.]

2. LetUkbe the universe of all strings consisting ofknumeric digits. (0000,0123, and9898are part of universeU4butb000,012,9!9!are not.) Letuidenote theithdigit of a stringu?Ukwhere 0=i

LetHkbe a family ofkhash functions mapping universeUkto values{0,1,2, . . . ,9}whereh0?Hk hashes all strings according to their first digit.

(For all stringsuwhereu0= 0,h0(u) = 0; for all strings uwhereu0= 1,h0(u) = 1; for all stringsuwhereu0= 9,h0(u) = 9.) Likewise,h1?Hkhashes all strings according to their second digit. Generally, for all stringsu?Ukwhereui=x,hi(u) =xfor 0=i

(a) What is an example of a maximally-sized subsetU3such thatH3is universal for the subset?

[We are expecting: An example subset.]

(b) WouldHkbe a good family of hash functions (where "good" is defined as universal) to useforUkfork=3?
[We are expecting: A short explanation (2-3 sentences) that answers the question.]

3.(Plagiarism detection) Hash functions are extremely good at what they do. Unsurprisingly, there are many fancier data structures that can be built on top of them. In this problem we will motivate and explore the idea of a "Bloom Filter," which is one example of a fancier structure built on top of hash functions.

Suppose you are hired by someone to make a plagiarism detection software for internal use so as to avoid any potentially embarrassing allegations of plagiarism. Specifically, your goal is to make a lightweight (i.e. fast, and relatively low-memory) piece of software that will take a sentence and output one of the following messages: 1) "potentially problematic, please rewrite", or 2) "fresh like an ocean breeze."

Suppose your goal is the following: if the input sentence is something that you have already seen, you output "potentially problematic" (with probability 1), and if the input is something new, you want to output "fresh" with probability at least 0.99 (its alright if you have a few false-alarms).

(a) (1 pt.) First, you decide to use a hash table. You will make a has table that maps a piece of text to a bucket, then scrape the web for all English sentences, and hash each one to your table. Given a new sentence, you will check to see if it hashes to an empty bucket-if so, you will output option "fresh" otherwise you will output "potential plagiarism." Suppose there are 1 billion unique sentences online. How many buckets will your hash table need to have to have the desired functionality?

[We are expecting: A number (to the nearest order of magnitude) and one to two sentences of justification.]

(b) (2 pt.) You decide that is a little too much space usage, and consider the following approach: you choose 10 hash functions,h1,...,h10that each map sentences to the numbers 1 though 10 billion. You initialize an arrayAof 10 billion bits, initially set to 0. For each sentencesthat you encounter, you computeh1(s),h2(s),...,h10(s), and set the corresponding indices ofAto be 1 (namely you setA[h1(s)]?1, A[h2(s)]?1, . . .). Argue that after processing the 1 billion unique sentences, you expect a (1-1/(10 billion))10 billion˜0.37 fraction of the elements to be 0.

For this part, feel free to assume that thehiare "idealized" hash functions that map each keys to a uniformly random bucket.
[We are expecting: A paragraph with your argument.]

(c) (2 pt.) Now, given a sentences, to check if it might be plagiarized, you compute the 10 hashes of s, and check ifA[h1(s)] =A[h2(s)] =. . .=A[h10(s)] = 1. If so, you output "potential problem," otherwise you output "fresh." Prove that ifsis actually in your set of 1 Billion sentences, that you will output "potential problem" with probability 1, and that ifsisnotin your set of 1 Billion sentences, you will output "fresh" with probability˜1-(1-0.37)10˜0.99.

Again, feel free to assume that the hash functions are "idealized," and that the claim of the previous part holds, namely that after processing the 1 Billion sentences, there are 3.7 billion indices in the arrayAwith value 0.

[We are expecting: Informal mathematical justifications for each of the bounds.]

Reference no: EM132164251

Questions Cloud

Determining the training needs of an organization : Discuss what you would consider as the one most significant roadblock companies are likely to face in implementing the ANSI/AIHA Z10 standards
What is the optimum frame size for frame relay transmission : What is the optimum frame size for frame relay transmission in terms of throughput and delay?
What is the cause for pilot fatigue : Aviation Management Research Proposal Assignment - Dissertation Title - Cause of Pilot fatigue in the general aviation sector of the Indian sub-continent
Describe a good application of a linked list data structure : Describe a good application of a linked list data structure. When is a linked list a good representation of a stack or a queue?
What is the expected number of guests that end up : What is the expected number of guests that end up in a room with exactly one other person?
Explain the relationship between object and class : 1. Extract the question given below with a few sentences: a) Explain the relationship between object and class.
Determine the average time a truck waits : Determine the average time a truck waits for the batch plant and the % of time that the batch plant is idle. What is the largest number of trucks
Define a struct books with fields name : Define a struct 'books' with fields name (string type and size 20) and author (string type and size 20).
Design a training plan based on the findings in your tna : Design a training plan based on the findings and training outcomes revealed in your TNA.

Reviews

Write a Review

Computer Engineering Questions & Answers

  Identify the key problems and issues in the case study

Identify the key problems and issues in the case study. Formulate and include a thesis statement, summarizing the outcome of your analysis in 1-2 sentences.

  How would you implement the different types of glass

You are the Executive Safety Officer (ESO) and was tasked to ensure that the facility is secure. In this assignment discuss.

  Describe methods for increasing the awareness of the aup

Describe methods for increasing the awareness of the AUP, and other policies, within the organization.

  What is the difference between a sequential control

post a 200 to 300-word response to the following questionwhat is the difference between a sequential control structure

  How concepts using the osi model as a framework

Your manager has asked you to describe the frame format of a typical Ethernet packet. Prepare a written report. Be sure to discuss Manchester encoding, 4B/5B encoding, 8B/10B encoding, the cable grades required for different speeds of Ethernet, an..

  Examine the practicality of building multiple interface

Examine the practicality of building multiple interface options for diverse populations, rather than building one interface that meets the needs of the majority of end users.

  Describe a concrete implementation of the mergeable heap ADT

Describe a concrete implementation of the mergeable heap ADT that achieves O(logn) performance for all its operations.

  Compare and contrast steganography and cryptography

Compare and contrast steganography and cryptography. Why steganography and how does it work? List examples of suitable carriers of steganographic payloads.

  Make sure each blog post is connected to a particular user

Make sure each blog post is connected to a particular user. Make sure all posts are publicly accessible but only registered users can add posts.

  What disadvantage of insertion sort does shell sort overcome

What is the advantage of selection sort over all the other methods we studied? What disadvantage of insertion sort does Shell sort overcome?

  Create a web based multimedia presentation for a topic

COMP607 Visual Effects and Animation - Create a graphic composition using various graphics techniques and export to target delivery formats

  Write a program to break a number into its individual digits

Write a program to break a number into its individual digits, for example to turn 1729 into 1, 7, 2 and 9. It is easy to get last digit of a number n as n % 10.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd