Probabilistic analysis of hash functions

Reference no: EM13933865

In this assignment, you will write and evaluate 5 different hash functions whose input keys are names. Your evaluation should be based on popular American first names. You can create a person's name by combining any two "first names" from the 2000s list linked below.

https://www.ssa.gov/OACT/babynames/decades/names2000s.html
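One possible way to build test keys is sketched below. The short `SAMPLE_FIRST_NAMES` list and the `make_names` helper are illustrative assumptions, not part of the assignment; in practice the list would be filled from the page linked above.

```python
import random

# Hypothetical placeholder list; replace with names taken from the linked page.
SAMPLE_FIRST_NAMES = [
    "Jacob", "Michael", "Joshua", "Matthew",
    "Emily", "Madison", "Emma", "Hannah",
]

def make_names(count, first_names=SAMPLE_FIRST_NAMES, seed=None):
    """Build `count` two-part names by pairing randomly chosen first names."""
    rng = random.Random(seed)  # a seed makes a run repeatable
    return [
        f"{rng.choice(first_names)} {rng.choice(first_names)}"
        for _ in range(count)
    ]

if __name__ == "__main__":
    for name in make_names(5, seed=1):
        print(name)
```

With only eight first names there are just 64 distinct pairs, so a 500-name sample will contain many repeats; a longer source list avoids that.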

To evaluate each hash function, determine the value each input key maps to and increment a counter for that value. For a good hash function, a histogram of the counts for each value should be uniformly distributed. Your program should also run a χ² test on the resulting output to determine the randomness of the hash function's distribution.
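The counting step can be sketched as follows. The names `tally`, `text_histogram`, and `first_letter_hash` are hypothetical, and `first_letter_hash` is a deliberately weak stand-in used only to have something to count; the table size of 101 is an assumption taken from the example later in the assignment.

```python
TABLE_SIZE = 101  # assumed number of slots

def first_letter_hash(key, table_size=TABLE_SIZE):
    """Deliberately weak stand-in: uses only the first character of the key."""
    return ord(key[0]) % table_size

def tally(keys, hash_fn, table_size=TABLE_SIZE):
    """Return a list in which entry i counts the keys that hash to value i."""
    counts = [0] * table_size
    for key in keys:
        counts[hash_fn(key)] += 1
    return counts

def text_histogram(counts):
    """Render one row of asterisks per hash value."""
    return "\n".join(
        f"{slot:3d} | {'*' * count}" for slot, count in enumerate(counts)
    )

if __name__ == "__main__":
    sample = ["Emily Jacob", "Emma Joshua", "Michael Hannah"]
    print(text_histogram(tally(sample, first_letter_hash)))
```

A hash like this piles keys onto a handful of values, which is the kind of unevenness the histogram is meant to expose.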

Your submission should include your hash functions, a histogram representing the distribution of each function's output, and an analysis of the χ² test results for each function. Determine which one of your custom hash functions should be used to hash American first names.

Your submission should include 4 different hashing functions, using techniques such as those we demonstrated in class (folding, adding, squaring, etc.). You will have a hash table array with a prime number of slots (101, for example). If you run about five hundred names through a hash function, you should get a fairly uniform distribution in the table: not too many slots with 0 or 1 keys, and not too many with 10 or more keys hashing to one location.
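As a rough illustration of the techniques named above, the sketch below shows one possible reading of adding, folding, and squaring. These are assumptions about what was shown in class, not the course's own definitions; the function names, the 2-byte fold width, and the middle-digit rule are all hypothetical choices.

```python
TABLE_SIZE = 101  # prime number of slots, as suggested above

def adding_hash(key, table_size=TABLE_SIZE):
    """Adding: sum the character codes, then reduce modulo the table size."""
    return sum(ord(ch) for ch in key) % table_size

def folding_hash(key, table_size=TABLE_SIZE, chunk=2):
    """Folding: cut the key into chunk-byte pieces, read each piece as a
    big-endian integer, and add the pieces together."""
    data = key.encode("utf-8")
    total = 0
    for start in range(0, len(data), chunk):
        total += int.from_bytes(data[start:start + chunk], "big")
    return total % table_size

def mid_square_hash(key, table_size=TABLE_SIZE):
    """Squaring: square the character-code sum and keep the middle digits."""
    digits = str(sum(ord(ch) for ch in key) ** 2)
    mid = len(digits) // 2
    middle = digits[max(0, mid - 2):mid + 2]  # up to four middle digits
    return int(middle) % table_size

if __name__ == "__main__":
    for fn in (adding_hash, folding_hash, mid_square_hash):
        print(fn.__name__, fn("Emily Jacob"))
```

Note that `adding_hash` ignores character order, so "Emily Jacob" and "Jacob Emily" collide; whether that matters for two-part names is something the histogram and χ² test can reveal.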

You will run each hash function 10 times and, each time, perform the χ² test, which measures whether the distribution is uniform. 8 to 9 "yes" results verify that the hashing function is good; 6-7 or fewer indicate the function is probably not uniform.

A successful program may have 2 or 3 good functions and 1 or 2 poor functions. χ² is computed by the following formula:

$$\chi^2 = \frac{\sum_{0 \le i < r} \left(f_i - N/r\right)^2}{N/r}$$

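Here f_i is the number of keys that hash to value i, N is the total number of keys, and r is the number of possible hash values. If χ² is in the range r ± 2√r, we conclude that the distribution is indeed random; otherwise it may not be. For example, if you generate 500 numbers in the 0-100 range, then N = 500 and r = 101, so the accepted range is 101 ± 2√101, or roughly 80.9 to 121.1.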
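The formula and the r ± 2√r rule can be sketched in code as follows. The helper names (`chi_square`, `looks_uniform`, `run_trials`) are hypothetical, and drawing a fresh random sample of names for each of the 10 runs is an assumption: a deterministic hash applied to the same names would give the same result every time.

```python
import math
import random

def chi_square(counts, n_keys):
    """Sum of (f_i - N/r)^2 over all r values, divided by N/r."""
    expected = n_keys / len(counts)
    return sum((f - expected) ** 2 for f in counts) / expected

def looks_uniform(counts, n_keys):
    """True when chi-square lies within r +/- 2*sqrt(r)."""
    r = len(counts)
    return abs(chi_square(counts, n_keys) - r) <= 2 * math.sqrt(r)

def run_trials(hash_fn, name_pool, trials=10, n_keys=500, table_size=101, seed=0):
    """Count how many trials pass the test, using a new name sample per trial."""
    rng = random.Random(seed)
    passes = 0
    for _ in range(trials):
        counts = [0] * table_size
        for _ in range(n_keys):
            key = f"{rng.choice(name_pool)} {rng.choice(name_pool)}"
            counts[hash_fn(key) % table_size] += 1
        if looks_uniform(counts, n_keys):
            passes += 1
    return passes

if __name__ == "__main__":
    pool = ["Jacob", "Michael", "Joshua", "Emily", "Madison", "Emma"]  # placeholder
    yes = run_trials(lambda k: sum(ord(ch) for ch in k), pool)
    print(f"{yes} of 10 runs passed")
```

The rule is two-sided: a table that is too even also fails, since perfectly equal counts give χ² = 0, which is below 80.9 when r = 101.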
