How many records would you expect to be removed?

Reference no: EM131114153

1. A dataset has 1000 records and 50 variables, with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect to be removed?
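One way to reason about question 1 is sketched below. It assumes that "spread randomly" means each of the 50 values in a record is missing independently with probability 0.05; the variable names are illustrative only. Under that assumption a record is kept only if all 50 of its values are present.

```python
# Assumption: every value is missing independently with probability 0.05.
n_records = 1000
n_variables = 50
p_missing = 0.05

# A record survives only if none of its 50 values is missing.
p_complete = (1 - p_missing) ** n_variables

# Expected number of records with at least one missing value.
expected_removed = n_records * (1 - p_complete)

print(f"P(record is complete)    = {p_complete:.4f}")
print(f"Expected records removed = {expected_removed:.0f}")
```

With these numbers roughly 92% of the records contain at least one missing value, so dropping incomplete records would discard most of the dataset.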

2. Given a database table containing weather data as follows:

Outlook   Temperature  Humidity  Windy  Class: Play
Sunny     Hot          High      False  No
Sunny     Hot          High      True   No
Overcast  Hot          High      False  Yes
Rainy     Mild         High      False  Yes
Rainy     Cool         Normal    False  Yes
Rainy     Cool         Normal    True   No
Overcast  Cool         Normal    True   Yes
Sunny     Mild         High      False  No
Sunny     Cool         Normal    False  Yes
Rainy     Mild         Normal    False  Yes
Sunny     Mild         Normal    True   Yes
Overcast  Mild         High      True   Yes
Overcast  Hot          Normal    False  Yes
Rainy     Mild         High      True   No

Here Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response).

a. Compute the prior probabilities:

P(Play = 'Yes') =
P(Play = 'No') =

b. Compute the conditional probabilities:

P(Outlook = 'Sunny' | Play = 'Yes') =
P(Outlook = 'Sunny' | Play = 'No') =

P(Temperature = 'Mild' | Play = 'Yes') =
P(Temperature = 'Mild' | Play = 'No') =

P(Humidity = 'High' | Play = 'Yes') =
P(Humidity = 'High' | Play = 'No') =

P(Windy = 'False' | Play = 'Yes') =
P(Windy = 'False' | Play = 'No') =
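The priors and conditionals in question 2 are relative frequencies in the weather table, so they can be checked by counting. The sketch below is one way to do that; the names `ROWS`, `prior`, and `conditional` are hypothetical helpers introduced here, and no smoothing is applied.

```python
from collections import Counter
from fractions import Fraction

COLUMNS = ("Outlook", "Temperature", "Humidity", "Windy")
# The 14 weather records; the last field is the class (Play).
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]

class_counts = Counter(row[-1] for row in ROWS)

def prior(label):
    """P(Play = label): fraction of all records with this class."""
    return Fraction(class_counts[label], len(ROWS))

def conditional(column, value, label):
    """P(column = value | Play = label), by counting within the class."""
    idx = COLUMNS.index(column)
    matches = sum(1 for row in ROWS if row[-1] == label and row[idx] == value)
    return Fraction(matches, class_counts[label])

for label in ("Yes", "No"):
    print(f"P(Play={label}) = {prior(label)}")
    for column, value in zip(COLUMNS, ("Sunny", "Mild", "High", "False")):
        print(f"  P({column}={value} | Play={label}) = {conditional(column, value, label)}")
```

`Fraction` prints results in lowest terms, so a count ratio such as 3/9 appears as 1/3.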

3. Use the naïve Bayes classification method to classify the following unknown record and indicate whether to play or not.

(Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False')
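For question 3, naïve Bayes scores each class as its prior multiplied by the per-attribute conditional probabilities, then picks the class with the larger score. The sketch below applies that rule to the weather table using plain frequency estimates; it assumes no Laplace smoothing, and `naive_bayes_scores` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction

# The 14 weather records: Outlook, Temperature, Humidity, Windy, Play.
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]

def naive_bayes_scores(record):
    """Return the unnormalised score P(class) * prod P(value | class) per class."""
    class_counts = Counter(row[-1] for row in ROWS)
    scores = {}
    for label, n_label in class_counts.items():
        score = Fraction(n_label, len(ROWS))  # prior
        for idx, value in enumerate(record):
            matches = sum(
                1 for row in ROWS if row[-1] == label and row[idx] == value
            )
            score *= Fraction(matches, n_label)  # conditional, no smoothing
        scores[label] = score
    return scores

scores = naive_bayes_scores(("Sunny", "Mild", "High", "False"))
prediction = max(scores, key=scores.get)

total = sum(scores.values())
for label, score in scores.items():
    print(f"{label}: score = {float(score):.4f}, normalised = {float(score / total):.3f}")
print("Predicted Play =", prediction)
```

The scores are unnormalised; dividing each by their sum gives the posterior probabilities under the naïve independence assumption, and the ranking of the classes is the same either way.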

4. Association Rule Mining:

Given a transaction database for mining association rules as follows:

Database D

TID  Items
100  A C D
200  B C E
300  A B C E
400  B E

Please use the Apriori algorithm to mine association rules with a minimum support count of 2.

(Please show the derivation process step by step with candidate itemsets.)
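The level-by-level derivation asked for in question 4 can be sketched as follows. This is a minimal illustration of the Apriori join and prune steps on database D, not a tuned implementation. The question gives no minimum confidence, so the sketch only reports each rule's confidence; `apriori` and `rules` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import combinations

TRANSACTIONS = {
    100: {"A", "C", "D"},
    200: {"B", "C", "E"},
    300: {"A", "B", "C", "E"},
    400: {"B", "E"},
}
MIN_SUPPORT = 2  # minimum support count from the question

def support(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in TRANSACTIONS.values() if itemset <= items)

def apriori():
    """Return [(k, candidate counts C_k, frequent itemsets L_k), ...]."""
    items = sorted(set().union(*TRANSACTIONS.values()))
    candidates = [frozenset([item]) for item in items]
    levels = []
    k = 1
    while candidates:
        counts = {c: support(c) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= MIN_SUPPORT}
        levels.append((k, counts, frequent))
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = sorted(
            (c for c in joined
             if all(frozenset(s) in frequent for s in combinations(c, k - 1))),
            key=sorted,
        )
    return levels

def rules(levels):
    """List (antecedent, consequent, confidence) for every frequent itemset."""
    found = []
    for _, _, frequent in levels:
        for itemset, count in frequent.items():
            for size in range(1, len(itemset)):
                for lhs in combinations(sorted(itemset), size):
                    lhs = frozenset(lhs)
                    found.append((lhs, itemset - lhs, Fraction(count, support(lhs))))
    return found

def fmt(itemset):
    return "{" + ", ".join(sorted(itemset)) + "}"

levels = apriori()
for k, counts, frequent in levels:
    print(f"C{k}:", ", ".join(f"{fmt(c)}:{n}" for c, n in counts.items()))
    print(f"L{k}:", ", ".join(f"{fmt(c)}:{n}" for c, n in frequent.items()))

all_rules = rules(levels)
for lhs, rhs, confidence in all_rules:
    print(f"{fmt(lhs)} -> {fmt(rhs)}  confidence = {confidence}")
```

Each `C{k}` line lists the candidate itemsets with their support counts and each `L{k}` line lists the ones that meet the minimum support, which matches the step-by-step format the question asks for. A minimum confidence threshold, if one is given, would simply filter the final rule list.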





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd