Implement a bloom filter

Assignment Help Other Subject
Reference no: EM132209554

Assignment 1:

Task 1:

1. Why is it dangerous to return a reference when you must return an object?

2. What is an implicit interface in the context of compile-time polymorphism?

3. Why should you prefer range member functions to their single-element coun- terparts?

4. Why should you avoid in-place modification of keys in a set?

5. Why should predicate functions be pure?

6. Why should you avoid the default capture modes of lamdbas?

7. What do std::move and std::forward do?

8. How are the strings of a std::set<std::string*> sorted? How you would make them sorted lexicographically? Provide the necessary code.

Task 2:

For this task we provide a graph.hpp file with a defined interface for an unweighted directed Graph class storing its vertices in an adjacency list. Edges connecting vertices to themselves are not allowed. Your task is to implement the functions of the interface and to write tests for each public function using the Catch test framework. Perform your tests not only on trivial graphs but also on larger graphs with at least 6 vertices and 9 edges. We expect you to test each function with valid and invalid input. In total, you should implement at least 15 checks.

Task 3:

In this task you should implement the Munkres algorithm that has a runtime com- plexity of O(n3) and solves the assignment problem. Appropriate pseudo-code that may help guiding your implementation can be found in Algorithm 2 in the scientific publication by Riesen & Bunke. The article is available on our moo- dle. The corresponding code templates for this assignment sheet contain the files munkres_algorithm.cpp/.hpp. Please provide an implementation for the algorithm having the signature Matrix<int> run_munkres_algorithm(Matrix<int> c) in munkres_algorithm.cpp. The source file task3.cpp already contains a simple test-case, which is described below. You may use this file to further test your implementation using different and probably more complex assignment problems.

Input matrix:

250

400

350

400

600

350

200

400

250

Output matrix:

0 1 0

0 0 1

1 0 0

Task 4:

In this task you will implement a Bloom filter. It is a space-efficient probabilistic data structure that is used to test whether an element belongs to a set. False positive matches can occur but not false negatives. Due to its properties elements can only be added but not removed. The more elements the Bloom filter contains, the higher gets the false positive rate. Create a templated class template<typename Key, unsigned int m, typename Hash> BloomFilter that accepts as template parameter the class of the element it will store, the num- ber of bits it uses m and the hash function to use for your Key class. Use the MurmurHash function provided in the murmurhash.hpp. Your hash function object should provide the following operator:

 uint32_t operator()(int key, unsigned int seed) const

The class should be stored in bloom_filter.hpp. As data structure for the bit array use std::bitset. Your class should provide the following functions:

1. BloomFilter(unsigned int k), k is the number of hash functions
2. BloomFilter(std::initializer_list<Key>, unsigned int k)
3. template<typename It> BloomFilter(It first, It last, unsigned int k)
that inserts all elements defined by the iterator range.
4. bool insert(const Key&) that inserts the key and returns if the insertion was successful. If the element was already in the filter then it returns false.
5. bool contains(const Key&) const that returns if the element is present.
6. template<typename It> double false_positive_rate( It positive_begin, It positive_end,
It negative_begin, It negative_end) const
see false positive rate
7. double space_ratio(uint64_t num_elements) const, ratio of the space con-
sumed by your filter in comparison to the minimum size required for num_elements of the Key class
8. uint64_t approx_size() const that approximates the number of elements in the Bloom filter via the formula (k is the number of hash functions and X is the number of bits set to one):

n* = -m/k.ln(1 - x/m)

To determine the number of bytes used to represent e.g. one int use the sizeof operator.

Task 5:

Write a program that takes an adjacency matrix on the standard input and outputs all maximal cliques on the standard output. The adjacency matrix can be read in with the matrix class of assignment

2. Implement the Bron-Kerbosch algorithm to detect the maximal cliques.

Sample input:

0 1 0 0 1 0
1 0 1 0 1 0
0 1 0 1 0 0
0 0 1 0 1 1
1 1 0 1 0 0
0 0 0 1 0 0
Sample output:

{0, 1, 4}
{1, 2}
{2, 3}
{3, 4}
{3, 5}
Sample input2:

0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 1 1 1 0 0 0
0 0 0 1 0 0 1 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 1 1 1 0 1 1 0 0 0
0 0 0 0 0 0 1 1 0 1 0 1 0 0 0
0 0 0 0 0 0 1 1 0 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
Sample output2:

{0, 1, 2}
{1, 3}
{3, 4, 5}
{3, 6, 7}
{6, 7, 9, 10, 11}
{8, 9}
{12, 13, 14}

Assignment 2:

Task 1 :

In this task you should output a topological ordering for a given directed graph. You should read the adjacency matrix from the standard input and output the ordering on the standard output. The output should list the node indices as comma and space separated list. To perform the topological sorting implement Kahn's algorithm and use a std::priority_queue to store the nodes without incoming edges. If the graph is not a directed acyclic graph output an error message on the standard error stream.

Sample input:

0 1 0 0 0
0 0 1 1 0
0 0 0 0 1
0 0 1 0 1
0 0 0 0 0
Sample output:
0, 1, 3, 2, 4

Task 2 :
A group of K students wants to buy N gifts from a store. The seller wants to have as many new customers as possible, so he increases the price for customers buying there multiple gifts.
The price P of gift gi for a customer that has already bought x gifts is Pgi = (x+1)•ci,
with ci begin the original cost for gift gi.

Your task is to find the minimum cost for the group to buy N gifts (in any order) and print it out on the standard output.
The input will be provided on the standard input stream. In the first line we will provide the number N of gifts to buy and the size K of the group. The second line will contain the cost (c0, ..., ci, ..., cN-1) for each gift gi as space-separated floating points.

Sample input:

3 2
2.5 5 6.2

Sample output:

16.2

Task 3 (15 P.):

Given are n cities. Each city ci is connected to its neighboring city ci+1 and the distance between ci and cj is |i - j|.

To enable faster delivery of Christmas presents, beaming stations have been built in some of the cities. However, due to the large energy consumption they have not been turned on yet. Once enabled, each beaming station allows to teleport anything up to a distance < k.

Your task is to find the minimum number of stations that must be enabled to ensure that all cities are within the beaming radius of at least one beaming station, so that their inhabitants receive their gifts in time. Print the minimum number found on the standard output. If there is no solution to the problem print -1.

The input will be provided on the standard input stream. In the first line we will provide the number n of cities and the range k of the beaming stations. The second line will consist of n space-separated 0s and 1s, the kth digit denoting whether the kth city has a beaming station (1) or not (0).

Sample input:
6 2
0 1 1 1 1 0

Sample output:

2

Task 4:
The visitor design pattern allows to decouple algorithms from objects they operate on. It allows adding new virtual functions without changing the classes of these ob- jects. In this task we provide three classes used to model trees (Node, InternalNode, Leaf). Your task is to implement three different visitors on a tree modeled by the classes:

1. Create a LeavesSumVisitor which returns the sum of the leaves in the tree over the getResult member function.

2. Create a RedNodesProductVisitor which returns the product of red nodes over the getResult member function.

3. Create an AdvancedVisitor which returns the absolute difference between the sum of the values of non-leaf nodes at even depth and the sum of the values of black leaf nodes.

Your visitors should implement the TreeVisitor interface provided in visitors.hpp. Your program takes on the standard input a tree and outputs on the standard output the result of the LeavesSumVisitor, the RedNodesProductVisitor and of the AdvancedVisitor, each on a new line. If the input is invalid print an error message on the standard error stream and terminate.

The first line of the input contains the number of nodes n of the tree. The second line contains n space-separated integers denoting the values of the nodes. The third line contains n space-separated strings (Black or Red) denoting the color of the nodes. In the remaining lines the connections of the nodes are described by two space-separated integers on each line, denoting the source and destination node.

The indexing of the nodes starts at 0, and 0 is always the root of the tree. The root is at depth 0. Keep in mind that if a tree has only 1 node this node is a leaf (and the root) and not an internal node.

Sample input:
5
4 7 2 5 12
Red Black Red Red Black 0 1
0 2
2 3
2 4
Sample output:
24
40
15

Task 5 (15 P.):
For this task you should compute gene clusters in which each gene is at most at distance d of at least one other gene of the cluster and lying on the same chromosome. To compute the distance between two genes compare their middle positions.

The genes will be passed as tab-separated text file as first argument. The allowed distance will be passed as second argument and the output file as third argument. The input file consists of three columns: chromosome start stop, start and stop represent a closed interval and start at one. The output file consists of four columns, i.e. the same as the input file and one additional column containing the cluster index. To assign each gene to a cluster, add a new column to the input file: cluster=i, with i being the cluster index (starting at 1). The clusters should be sorted in ascending coordinate order, i.e., with chr1 coming before chr2 (and before chrX and chrY), then according to their starting position. The order of the genes in the output file should be the same as in the input file (this means that the cluster indices will most of the time not be outputted in sorted order).

Sample command:
./task5 sample_input.txt 2000 sample_output.txt

Sample input:

chr2 1300 10800
chr1 21000 24000
chr1 100 21000
chr1 250 18000
Sample output:
chr2 1300 10800 cluster=3
chr1 21000 24000 cluster=2
chr1 100 21000 cluster=1
chr1 250 18000 cluster=1

Reference no: EM132209554

Questions Cloud

Discuss current research that links patient safety outcomes : Discuss current research that links patient safety outcomes to ADN and BSN nurses. Based on some real-life experiences, do you agree or disagree.
What are the primary drivers of information systems design : Key performance indicators: What are the primary drivers of information systems design?
The effect of family control on corporate performance : The objective of this work is to analyze whether family controlled companies are more profitable than non-family firms.
Identify if author has a bias and explain with examples : Use two tools of analysis for better understanding of the Constitution. Choose a Clause or Amendment and analyze.
Implement a bloom filter : Implement the Bron-Kerbosch algorithm to detect the maximal cliques - determine the number of bytes used to represent e.g. one int use the sizeof operator
What is the difference between a DNP and a PhD in nursing : What is the difference between a DNP and a PhD in nursing? Which of these would you choose to pursue if you decide to continue your education to the doctoral.
Minimum of discussing direct and indirect cost : Minimum of discussing direct and indirect cost. Make sure to include the items below:
Analyze and critique the theory and practice of the politics : Analyze and critique the theory and practice of the politics and governments of the United States and California.
Determine how you compete in the current job market : How will increasing your level of education affect how you compete in the current job market? How will increasing your level of education affect your role.

Reviews

len2209554

1/7/2019 2:30:09 AM

Please make sure that you follow all rules stated in the general remarks PDF in the moodle. The number of barbells for each task describes the ex- pected difficulty. The maximum of number of barbells per task will be 4.

Write a Review

Other Subject Questions & Answers

  What are the prevalent health concerns in this community

What are the prevalent health concerns in this community? How will demographics and health reform likely impact the organization

  Select and discuss two victimization theories

Within the Discussion Board area, write that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce..

  Write a formal business letter from your company

Describe what the first steps will be in helping them to resolve their issue. Explain why it's important that these be the first steps and what your role will be in helping them to implement these first steps.

  Outlines how the history and practices of islam shape

Critically outlines how the history and practices of Islam shape the lives of Muslims today in a short paragraph. Compare Islamic culture with other cultures

  Writing assignments

Writing Assignments: Format policy, as outlined in the Student Handbook . Good papers should demonstrate knowledge of the course materials as well as the ability to apply the theories and concepts learned to your organizational situation.

  Sociologists reject the biological definition of race

What are some important reasons as to why sociologists reject the biological definition of race?

  Explain the growth process of managed care

Explain the growth process of managed care. How have Medicare and Medicaid played a role in this growth

  Organization positioning and differentiation strategies

Discuss each organization's positioning and differentiation strategies. Be sure to include the similarities and differences between these two organizations.

  Reflecting on general education

Reflecting on General Education

  Utility of the walker and mullins research

The assigned articles for this week contribute an appealing expansion to the utility of the Walker and Mullins research. What are the basic propositions offered?

  Explain the gap in the research

Explain the gap in the research and how the author's research informs the problem. Post the link to the article.

  What strategies could all of us adopt to minimize barriers

Minimizing Barriers for Low Literacy Patients. What strategies could all of us adopt to minimize barriers and misunderstanding for low literacy patients?

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd