Probabilistic analysis of hash functions

Assignment Help Data Structure & Algorithms
Reference no: EM13933865

Probabilistic Analysis of Hash Functions

In this assignment, you will write and evaluate 5 different hash functions with whose input keys are names. Your evaluation should be based on popular American first names. You can create a person's name from any two "first names" in the 2000 census link supplied below.

https://www.ssa.gov/OACT/babynames/decades/names2000s.html

To evaluate each hash function, you should determine the value each input key maps to. A counter representing this value should be incremented. For a good hash function, a histogram representing the count for each value should be uniformly distributed. Also your program should run a χ2 test on the resulting output to determine the randomness of the distribution of the hash function.

Assignment submission should include your hash functions, a histogram representing the distribution of each function's output, and an analysis of the χ2 test results for each function. Determine which one of your custom hash functions should be used to hash American first names.

Your submission should include 4 different hashing functions, using techniques such as we demonstrated in class (folding, adding, squaring, etc...). You will have a hash table array of a prime number of slots (101, for example) If you run a couple five hundred names through a hash function, you should get a fairly uniform distribution in the table, no too many with 0 or 1, and not too many with 10 or more keys hashing to a location.

You will run each hash function 10 times, and perform the χ2 test which measures if it is uniform. 8 to 9 "yes" results will verify the hashing function is good. 6-7 or fewer indicates the function is probably not uniform.

A successful program may have 2 or 3 good functions and 1 or 2 poor functions. χ2is computed by the following formula

χ2  = [Σ0≤i<r(fi - N/r)2]/[ N/r]

If χ2 is in the range of r ± 2√r, we conclude that distribution is indeed random. Otherwise it may not be. If you will generate 500 numbers in the 0-100 range. Then N=500 and r=101.

Reference no: EM13933865

Questions Cloud

Relevant models and theoretical perspectives : Consider using relevant models and theoretical perspectives to make your analysis. For example, you might find that one successful organisation has executed its marketing strategy by applying Ansoff's Matrix in a disciplined manner, whereas anothe..
Case study sunset grill at blue : Case Study Sunset Grill at Blue Read the attached case and answer the below questions; 1. What is Sunset Grille's service concept? Evaluate their sustainability.
Increasing fees elastic or inelastic : Is demand for courses at the universities that did not increase their fees elastic or inelastic with respect to universities that did increase their fees? What is the importance of this degree of elasticity?
Record the payment of the merchandise in requirement c : Record the payment of the merchandise in Requirement c in a horizontal statements model like the one shown above.
Probabilistic analysis of hash functions : Probabilistic Analysis of Hash Functions - In this assignment, you will write and evaluate 5 different hash functions with whose input keys are names. Your evaluation should be based on popular American first names
Historical significance of the source : This is the key part of assignment. Explain the historical significance of the source. How does the source aid in the understanding of the time period. How does it compliment the textbooks coverage of the event, person, or place? Do not report ver..
How natural environment has affected or influenced health : Describe your own restorative environment(s). Based on what you have learned what factors contribute to the environment being restorative to you?"
Policies government can implement to achieve economic growth : Discuss the policies which a government can implement to achieve economic growth. To what extent do you agree that economic growth is always beneficial to a country?
Largest indoor amusement park in the world : On average for the Northern Gate 60% of customers arrives batched as follow 2 at a time, 20% 3 at a time and 20% arrives individually. Analysis shown that 70% of arrivals from both gates will play with FormulaRossagame Which is the fastest Roller..

Reviews

Write a Review

Data Structure & Algorithms Questions & Answers

  Create a data flow diagram of the current system

Create a data flow diagram of the current system. Create a system flowchart of the existing system. Analyze the internal control weaknesses in the system.

  Project1 install mysql dbms and dblanguage connector

project1. install mysql dbms and dblanguage connector software on your machine2. create world database using mysql

  Process in which cpu must undertake to read a value from me

On the von Neumann, describe the process that the CPU must undertake to read a value from memory and to write a value from memory and to write a value to memory in terms of what is put into the MAR, MBR, address bus, data bus, and control bus

  Write algorithm to calculate the median using queries

Calculate the median using as few queries as possible. Provide an algorithm which determines the median value using at most O(lg n) queries.

  Produce a tree adt

Combine the code you have written for Binary Search Trees (BSTs) and Red Black 'Rees (RBTs) in previous labs to produce a tree ADT which can be either a simple BST or a self-balancing RBT.

  Neural and tree learning on continuous attributes

Compare and contrast the roles of these numbers in the two models and compare and contrast the methods of learning these numbers in the two models.

  Creating a random file of the signs

Create a random file of the signs of all angles from zero degrees to ninety degrees. Make every entry accurate to three places. Write a program that will show the sign of any angle typed on the keyboard.

  Devise a linear-time algorithm to count the parallel edges

Parallel edge detection: Devise a linear-time algorithm to count the parallel edges in a graph. Write the algorithm in pseudo-code.

  Complete proof that the graph bgais indeed a buffer graph

Complete the proof that the graph BGa (defined in the proof of Theorem 5. 13) is indeed a buffer graph, i. e., for each path P E P there exists a guaranteed path with image P.

  Identify classes, functions, and algorithms

Detailed requirements. Using guidance provided in the text, (specifically chapters 12 and 13) develop your detailed requirements. Develop as many as possible but you must cover some detailed requirements for each of your high level requirements.

  Find the mean number of rounds per contention period

Two CSMA/CD stations are each trying to transmit long documents. After each frame is sent, they contend for the channel using the binary exponential backoff algorithm.

  Calculating an arithmetic mean, median and mode

Calculate an arithmetic mean, median, and mode for up to fifty test scores. The information are contained in a text file. To determine the median, first sort the array.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd