DATA MINING, Basic Statistics

Please break this problem down into layman's terms so that I understand how you arrived at the answer.

1. AllElectronics carries 1000 products, P1, …, P1000. Consider customers Ada, Bob, and Cathy such that Ada and Bob purchase three products in common: P1, P2, and P3. From the other 997 products, Ada and Bob each independently purchase seven at random. Cathy purchases 10 products, randomly selected from the 1000 products. Using Euclidean distance, what is the probability that dist(Ada, Bob) > dist(Ada, Cathy)? What if Jaccard similarity (Chapter 2) is used? What can you learn from this example? (Problem 11.2, page 539)

Book:
Data Mining: Concepts and Techniques, 3rd Edition
Problem 11.2, Page 539
Posted Date: 2/19/2013 3:51:34 PM | Location : United States
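
One way to reason about this, offered as a sketch and not as the book's official solution: treat each customer as a 0/1 vector over the 1000 products. Every customer buys exactly 10 products, so the squared Euclidean distance between two customers is 20 - 2 * (number of products in common), and the Jaccard similarity is (common) / (20 - common). Both depend only on the overlap, so dist(Ada, Bob) > dist(Ada, Cathy) happens exactly when Cathy shares more products with Ada than Bob does, and Jaccard ranks the two pairs the same way. Bob shares 3 products plus X more from his random seven; Cathy shares Y. Assuming the purchases are uniform random draws without replacement, X and Y are independent hypergeometric counts. The function names below are hypothetical, chosen only for illustration.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing `draws` items without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def prob_cathy_closer():
    """P(overlap(Ada, Cathy) > overlap(Ada, Bob)) under the stated assumptions.

    X = extra products Ada and Bob share among the other 997 (each picks 7).
    Y = products Ada and Cathy share (Cathy picks 10 of 1000, Ada holds 10).
    Bob's overlap with Ada is 3 + X, so Cathy is closer when Y >= X + 4.
    """
    total = 0.0
    for x in range(0, 8):
        px = hypergeom_pmf(x, 997, 7, 7)
        for y in range(x + 4, 11):
            total += px * hypergeom_pmf(y, 1000, 10, 10)
    return total


if __name__ == "__main__":
    print(prob_cathy_closer())
```

Under these assumptions the result is very small, roughly one in a million, and it is the same for Euclidean distance and for Jaccard similarity because every basket has the same size. If the baskets had different sizes, the two measures would no longer depend on the overlap alone and could disagree.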






