How many pairs are counted on the third pass

Reference no: EM131212144

Using the assumptions of Exercise 22.2.4, suppose we run a three-pass Multistage Algorithm on the dataset. Assume that the second pass again uses 100,000 buckets and that the hash function distributes pairs randomly among the buckets. Answer the following questions in terms of s, the ratio of the support threshold to the number of baskets.

a) Approximately how many frequent buckets will there be on the second pass?

b) Approximately how many pairs are counted on the third pass?
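
Because the second-pass hash is random, one way to explore both questions is a small simulation. The sketch below is not a textbook solution. It assumes the first-pass layout of Exercise 22.2.4 with exactly 5 little-little pairs in every bucket (the exercise says "approximately 5"), measures supports in units of 1/10,000 of the baskets, and uses hypothetical names (`KINDS`, `PASS1_BUCKETS`, `simulate_multistage`) chosen only for illustration.

```python
import random

UNITS = 10_000        # supports are stored in units of 1/10,000 of the baskets
N_BUCKETS = 100_000   # buckets on the second pass, as stated in the exercise

# kind -> (support of the pair, support of its rarest item), in UNITS
KINDS = {
    "big-big": (100, 1000),
    "big-little": (10, 100),
    "little-little": (1, 100),
}

# Assumed first-pass layout: (number of buckets, pairs of each kind per bucket)
PASS1_BUCKETS = [
    (4950, {"big-big": 1, "big-little": 1, "little-little": 5}),
    (95_050, {"big-little": 1, "little-little": 5}),
]


def survivors_of_pass1(s_units):
    """Supports of the pairs that reach the second pass: both items are
    frequent and the pair sits in a frequent first-pass bucket."""
    supports = []
    for n_buckets, contents in PASS1_BUCKETS:
        weight = sum(KINDS[kind][0] * n for kind, n in contents.items())
        if weight < s_units:
            continue  # this bucket type is infrequent on the first pass
        for kind, per_bucket in contents.items():
            pair_support, rarest_item = KINDS[kind]
            if rarest_item >= s_units:
                supports += [pair_support] * (n_buckets * per_bucket)
    return supports


def simulate_multistage(s, seed=0):
    """Return (frequent second-pass buckets, pairs counted on the third pass)
    for one random rehashing of the surviving pairs."""
    s_units = s * UNITS
    rng = random.Random(seed)
    supports = survivors_of_pass1(s_units)
    homes = [rng.randrange(N_BUCKETS) for _ in supports]
    weight = [0] * N_BUCKETS
    for home, support in zip(homes, supports):
        weight[home] += support
    frequent = [w >= s_units for w in weight]
    counted = sum(1 for home in homes if frequent[home])
    return sum(frequent), counted


if __name__ == "__main__":
    print(simulate_multistage(0.005))
```

For s = 0.005, only the 34,650 pairs from the 4950 big-big buckets reach the second pass, and a second-pass bucket is almost certainly frequent only if it receives a big-big pair. The frequent-bucket count should therefore land near 100,000(1 - e^(-0.0495)), or about 4,830. Values of s that fall exactly on a threshold are sensitive to floating-point rounding in this sketch.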

Exercise 22.2.4

Consider running the PCY Algorithm on the data of Exercise 22.2.3, with 100,000 buckets on the first pass. Assume that the hash function used distributes the pairs to buckets in a conveniently random fashion. Specifically, the 499,500 little-little pairs are divided as evenly as possible (approximately 5 to a bucket). One of the 100,000 big-little pairs is in each bucket, and the 4950 big-big pairs each go into a different bucket.

a) As a function of s, the ratio of the support threshold to the total number of baskets (as in Exercise 22.2.3), how many frequent buckets are there on the first pass?

b) As a function of s, how many pairs must be counted on the second pass?
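
The bucket arithmetic behind both parts can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive answer: it treats every bucket as holding exactly 5 little-little pairs (the exercise says "approximately 5", so the totals overcount by 500 pairs when every bucket is frequent), it takes the pair supports 1/100, 1/1000, and 1/10,000 from Exercise 22.2.3, and all names (`BUCKET_TYPES`, `frequent_buckets`, `pairs_counted_pass2`) are hypothetical.

```python
from fractions import Fraction

# Support fractions implied by Exercise 22.2.3
ITEM_SUPPORT = {"big": Fraction(1, 10), "little": Fraction(1, 100)}
PAIR_SUPPORT = {
    "big-big": Fraction(1, 100),
    "big-little": Fraction(1, 1000),
    "little-little": Fraction(1, 10_000),
}

# Assumed layout: (number of buckets, pairs of each kind per bucket)
BUCKET_TYPES = [
    (4950, {"big-big": 1, "big-little": 1, "little-little": 5}),
    (95_050, {"big-little": 1, "little-little": 5}),
]


def bucket_weight(contents):
    """Fraction of baskets that contribute a count to one bucket."""
    return sum(PAIR_SUPPORT[kind] * n for kind, n in contents.items())


def frequent_buckets(s):
    """Number of first-pass buckets whose count reaches the threshold s."""
    return sum(n for n, contents in BUCKET_TYPES if bucket_weight(contents) >= s)


def pairs_counted_pass2(s):
    """Pairs of frequent items that hash to a frequent bucket."""
    total = 0
    for n_buckets, contents in BUCKET_TYPES:
        if bucket_weight(contents) < s:
            continue
        for kind, per_bucket in contents.items():
            if all(ITEM_SUPPORT[item] >= s for item in kind.split("-")):
                total += n_buckets * per_bucket
    return total


if __name__ == "__main__":
    for s in (Fraction(1, 1000), Fraction(1, 200), Fraction(11, 1000), Fraction(1, 50)):
        print(s, frequent_buckets(s), pairs_counted_pass2(s))
```

Under these assumptions a bucket's weight is either 0.0015 (no big-big pair) or 0.0115 (one big-big pair), so the frequent-bucket count steps from 100,000 to 4950 to 0 as s crosses those two values. Passing s as a `Fraction` keeps comparisons exact at the thresholds.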

Exercise 22.2.3

Imagine that there are 1100 items, of which 100 are "big" and 1000 are "little." A basket is formed by adding each big item with probability 1/10, and each little item with probability 1/100. Assume the number of baskets is large enough that each item set appears in a fraction of the baskets that equals its probability of being in any given basket. For example, every pair consisting of a big item and a little item appears in 1/1000 of the baskets. Let s be the support threshold, but expressed as a fraction of the total number of baskets rather than as an absolute number. Give, as a function of s ranging from 0 to 1, the number of frequent items on Pass 1 of the A-Priori Algorithm. Also, give the number of candidate pairs on the second pass.
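
The counts asked for here follow directly from the stated probabilities, and can be sketched in a few lines. The function names below are hypothetical, and the sketch assumes that an item is frequent when its support fraction is at least s and that the second pass of A-Priori counts every pair of frequent items.

```python
from fractions import Fraction
from math import comb

N_BIG, N_LITTLE = 100, 1000
P_BIG, P_LITTLE = Fraction(1, 10), Fraction(1, 100)


def frequent_items(s):
    """Number of items whose support fraction is at least s."""
    count = 0
    if P_BIG >= s:
        count += N_BIG
    if P_LITTLE >= s:
        count += N_LITTLE
    return count


def candidate_pairs(s):
    """Pairs of frequent items, which A-Priori counts on its second pass."""
    return comb(frequent_items(s), 2)


if __name__ == "__main__":
    for s in (Fraction(1, 100), Fraction(1, 20), Fraction(1, 2)):
        print(s, frequent_items(s), candidate_pairs(s))
```

With s at or below 1/100 all 1100 items are frequent, giving 604,450 candidate pairs, which matches the 499,500 + 100,000 + 4950 pairs quoted in Exercise 22.2.4. Between 1/100 and 1/10 only the 100 big items remain, giving 4950 pairs.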






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd