Implement a bloom filter

Assignment Help Other Subject
Reference no: EM132209554

Assignment 1:

Task 1:

1. Why is it dangerous to return a reference when you must return an object?

2. What is an implicit interface in the context of compile-time polymorphism?

3. Why should you prefer range member functions to their single-element coun- terparts?

4. Why should you avoid in-place modification of keys in a set?

5. Why should predicate functions be pure?

6. Why should you avoid the default capture modes of lamdbas?

7. What do std::move and std::forward do?

8. How are the strings of a std::set<std::string*> sorted? How you would make them sorted lexicographically? Provide the necessary code.

Task 2:

For this task we provide a graph.hpp file with a defined interface for an unweighted directed Graph class storing its vertices in an adjacency list. Edges connecting vertices to themselves are not allowed. Your task is to implement the functions of the interface and to write tests for each public function using the Catch test framework. Perform your tests not only on trivial graphs but also on larger graphs with at least 6 vertices and 9 edges. We expect you to test each function with valid and invalid input. In total, you should implement at least 15 checks.

Task 3:

In this task you should implement the Munkres algorithm that has a runtime com- plexity of O(n3) and solves the assignment problem. Appropriate pseudo-code that may help guiding your implementation can be found in Algorithm 2 in the scientific publication by Riesen & Bunke. The article is available on our moo- dle. The corresponding code templates for this assignment sheet contain the files munkres_algorithm.cpp/.hpp. Please provide an implementation for the algorithm having the signature Matrix<int> run_munkres_algorithm(Matrix<int> c) in munkres_algorithm.cpp. The source file task3.cpp already contains a simple test-case, which is described below. You may use this file to further test your implementation using different and probably more complex assignment problems.

Input matrix:

250

400

350

400

600

350

200

400

250

Output matrix:

0 1 0

0 0 1

1 0 0

Task 4:

In this task you will implement a Bloom filter. It is a space-efficient probabilistic data structure that is used to test whether an element belongs to a set. False positive matches can occur but not false negatives. Due to its properties elements can only be added but not removed. The more elements the Bloom filter contains, the higher gets the false positive rate. Create a templated class template<typename Key, unsigned int m, typename Hash> BloomFilter that accepts as template parameter the class of the element it will store, the num- ber of bits it uses m and the hash function to use for your Key class. Use the MurmurHash function provided in the murmurhash.hpp. Your hash function object should provide the following operator:

 uint32_t operator()(int key, unsigned int seed) const

The class should be stored in bloom_filter.hpp. As data structure for the bit array use std::bitset. Your class should provide the following functions:

1. BloomFilter(unsigned int k), k is the number of hash functions
2. BloomFilter(std::initializer_list<Key>, unsigned int k)
3. template<typename It> BloomFilter(It first, It last, unsigned int k)
that inserts all elements defined by the iterator range.
4. bool insert(const Key&) that inserts the key and returns if the insertion was successful. If the element was already in the filter then it returns false.
5. bool contains(const Key&) const that returns if the element is present.
6. template<typename It> double false_positive_rate( It positive_begin, It positive_end,
It negative_begin, It negative_end) const
see false positive rate
7. double space_ratio(uint64_t num_elements) const, ratio of the space con-
sumed by your filter in comparison to the minimum size required for num_elements of the Key class
8. uint64_t approx_size() const that approximates the number of elements in the Bloom filter via the formula (k is the number of hash functions and X is the number of bits set to one):

n* = -m/k.ln(1 - x/m)

To determine the number of bytes used to represent e.g. one int use the sizeof operator.

Task 5:

Write a program that takes an adjacency matrix on the standard input and outputs all maximal cliques on the standard output. The adjacency matrix can be read in with the matrix class of assignment

2. Implement the Bron-Kerbosch algorithm to detect the maximal cliques.

Sample input:

0 1 0 0 1 0
1 0 1 0 1 0
0 1 0 1 0 0
0 0 1 0 1 1
1 1 0 1 0 0
0 0 0 1 0 0
Sample output:

{0, 1, 4}
{1, 2}
{2, 3}
{3, 4}
{3, 5}
Sample input2:

0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 1 0 1 1 1 0 0 0
0 0 0 1 0 0 1 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 1 1 1 0 1 1 0 0 0
0 0 0 0 0 0 1 1 0 1 0 1 0 0 0
0 0 0 0 0 0 1 1 0 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
Sample output2:

{0, 1, 2}
{1, 3}
{3, 4, 5}
{3, 6, 7}
{6, 7, 9, 10, 11}
{8, 9}
{12, 13, 14}

Assignment 2:

Task 1 :

In this task you should output a topological ordering for a given directed graph. You should read the adjacency matrix from the standard input and output the ordering on the standard output. The output should list the node indices as comma and space separated list. To perform the topological sorting implement Kahn's algorithm and use a std::priority_queue to store the nodes without incoming edges. If the graph is not a directed acyclic graph output an error message on the standard error stream.

Sample input:

0 1 0 0 0
0 0 1 1 0
0 0 0 0 1
0 0 1 0 1
0 0 0 0 0
Sample output:
0, 1, 3, 2, 4

Task 2 :
A group of K students wants to buy N gifts from a store. The seller wants to have as many new customers as possible, so he increases the price for customers buying there multiple gifts.
The price P of gift gi for a customer that has already bought x gifts is Pgi = (x+1)•ci,
with ci begin the original cost for gift gi.

Your task is to find the minimum cost for the group to buy N gifts (in any order) and print it out on the standard output.
The input will be provided on the standard input stream. In the first line we will provide the number N of gifts to buy and the size K of the group. The second line will contain the cost (c0, ..., ci, ..., cN-1) for each gift gi as space-separated floating points.

Sample input:

3 2
2.5 5 6.2

Sample output:

16.2

Task 3 (15 P.):

Given are n cities. Each city ci is connected to its neighboring city ci+1 and the distance between ci and cj is |i - j|.

To enable faster delivery of Christmas presents, beaming stations have been built in some of the cities. However, due to the large energy consumption they have not been turned on yet. Once enabled, each beaming station allows to teleport anything up to a distance < k.

Your task is to find the minimum number of stations that must be enabled to ensure that all cities are within the beaming radius of at least one beaming station, so that their inhabitants receive their gifts in time. Print the minimum number found on the standard output. If there is no solution to the problem print -1.

The input will be provided on the standard input stream. In the first line we will provide the number n of cities and the range k of the beaming stations. The second line will consist of n space-separated 0s and 1s, the kth digit denoting whether the kth city has a beaming station (1) or not (0).

Sample input:
6 2
0 1 1 1 1 0

Sample output:

2

Task 4:
The visitor design pattern allows to decouple algorithms from objects they operate on. It allows adding new virtual functions without changing the classes of these ob- jects. In this task we provide three classes used to model trees (Node, InternalNode, Leaf). Your task is to implement three different visitors on a tree modeled by the classes:

1. Create a LeavesSumVisitor which returns the sum of the leaves in the tree over the getResult member function.

2. Create a RedNodesProductVisitor which returns the product of red nodes over the getResult member function.

3. Create an AdvancedVisitor which returns the absolute difference between the sum of the values of non-leaf nodes at even depth and the sum of the values of black leaf nodes.

Your visitors should implement the TreeVisitor interface provided in visitors.hpp. Your program takes on the standard input a tree and outputs on the standard output the result of the LeavesSumVisitor, the RedNodesProductVisitor and of the AdvancedVisitor, each on a new line. If the input is invalid print an error message on the standard error stream and terminate.

The first line of the input contains the number of nodes n of the tree. The second line contains n space-separated integers denoting the values of the nodes. The third line contains n space-separated strings (Black or Red) denoting the color of the nodes. In the remaining lines the connections of the nodes are described by two space-separated integers on each line, denoting the source and destination node.

The indexing of the nodes starts at 0, and 0 is always the root of the tree. The root is at depth 0. Keep in mind that if a tree has only 1 node this node is a leaf (and the root) and not an internal node.

Sample input:
5
4 7 2 5 12
Red Black Red Red Black 0 1
0 2
2 3
2 4
Sample output:
24
40
15

Task 5 (15 P.):
For this task you should compute gene clusters in which each gene is at most at distance d of at least one other gene of the cluster and lying on the same chromosome. To compute the distance between two genes compare their middle positions.

The genes will be passed as tab-separated text file as first argument. The allowed distance will be passed as second argument and the output file as third argument. The input file consists of three columns: chromosome start stop, start and stop represent a closed interval and start at one. The output file consists of four columns, i.e. the same as the input file and one additional column containing the cluster index. To assign each gene to a cluster, add a new column to the input file: cluster=i, with i being the cluster index (starting at 1). The clusters should be sorted in ascending coordinate order, i.e., with chr1 coming before chr2 (and before chrX and chrY), then according to their starting position. The order of the genes in the output file should be the same as in the input file (this means that the cluster indices will most of the time not be outputted in sorted order).

Sample command:
./task5 sample_input.txt 2000 sample_output.txt

Sample input:

chr2 1300 10800
chr1 21000 24000
chr1 100 21000
chr1 250 18000
Sample output:
chr2 1300 10800 cluster=3
chr1 21000 24000 cluster=2
chr1 100 21000 cluster=1
chr1 250 18000 cluster=1

Reference no: EM132209554

Questions Cloud

Discuss current research that links patient safety outcomes : Discuss current research that links patient safety outcomes to ADN and BSN nurses. Based on some real-life experiences, do you agree or disagree.
What are the primary drivers of information systems design : Key performance indicators: What are the primary drivers of information systems design?
The effect of family control on corporate performance : The objective of this work is to analyze whether family controlled companies are more profitable than non-family firms.
Identify if author has a bias and explain with examples : Use two tools of analysis for better understanding of the Constitution. Choose a Clause or Amendment and analyze.
Implement a bloom filter : Implement the Bron-Kerbosch algorithm to detect the maximal cliques - determine the number of bytes used to represent e.g. one int use the sizeof operator
What is the difference between a DNP and a PhD in nursing : What is the difference between a DNP and a PhD in nursing? Which of these would you choose to pursue if you decide to continue your education to the doctoral.
Minimum of discussing direct and indirect cost : Minimum of discussing direct and indirect cost. Make sure to include the items below:
Analyze and critique the theory and practice of the politics : Analyze and critique the theory and practice of the politics and governments of the United States and California.
Determine how you compete in the current job market : How will increasing your level of education affect how you compete in the current job market? How will increasing your level of education affect your role.

Reviews

len2209554

1/7/2019 2:30:09 AM

Please make sure that you follow all rules stated in the general remarks PDF in the moodle. The number of barbells for each task describes the ex- pected difficulty. The maximum of number of barbells per task will be 4.

Write a Review

Other Subject Questions & Answers

  Cross-cultural opportunities and conflicts in canada

Short Paper on Cross-cultural Opportunities and Conflicts in Canada.

  Sociology theory questions

Sociology are very fundamental in nature. Role strain and role constraint speak about the duties and responsibilities of the roles of people in society or in a group. A short theory about Darwin and Moths is also answered.

  A book review on unfaithful angels

This review will help the reader understand the social work profession through different concepts giving the glimpse of why the social work profession might have drifted away from its original purpose of serving the poor.

  Disorder paper: schizophrenia

Schizophrenia does not really have just one single cause. It is a possibility that this disorder could be inherited but not all doctors are sure.

  Individual assignment: two models handout and rubric

Individual Assignment : Two Models Handout and Rubric,    This paper will allow you to understand and evaluate two vastly different organizational models and to effectively communicate their differences.

  Developing strategic intent for toyota

The following report includes the description about the organization, its strategies, industry analysis in which it operates and its position in the industry.

  Gasoline powered passenger vehicles

In this study, we examine how gasoline price volatility and income of the consumers impacts consumer's demand for gasoline.

  An aspect of poverty in canada

Economics thesis undergrad 4th year paper to write. it should be about 22 pages in length, literature review, economic analysis and then data or cost benefit analysis.

  Ngn customer satisfaction qos indicator for 3g services

The paper aims to highlight the global trends in countries and regions where 3G has already been introduced and propose an implementation plan to the telecom operators of developing countries.

  Prepare a power point presentation

Prepare the power point presentation for the case: Santa Fe Independent School District

  Information literacy is important in this environment

Information literacy is critically important in this contemporary environment

  Associative property of multiplication

Write a definition for associative property of multiplication.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd