Response frequency and the variable monetary

Reference no: EM131892831

ASSIGNMENT

In the current assignment we apply some of the tools to analyze data. The data were collected from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The assignment involves data collected on a random sample of 748 donors. The data were obtained from the UCI Machine Learning Repository.

The file "transfusion.csv" contains the data. The file contains 5 variables:
- recency = The number of months since the last donation. (numeric)
- frequency = The total number of donations. (numeric)
- monetary = Total blood donated (in c.c.). (numeric)
- time = The number of months since the first donation. (numeric)
- march2007 = An indicator. Indicates those that donated blood in March, 2007. (factor)
In the assignment we consider the last four variables.

Comparing Two Samples
Consider "frequency" as a response and "march2007" as an explanatory variable. Plot the relation between the two variables, test the equality of the expectation in the two sub-samples and the equality of the variance. Repeat the same analysis for the case where the response "frequency" is replaced by the log-transformed response: "log(frequency)". In Tasks 1-3 you are asked to describe the results of the analysis.
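The two-sample analysis described above could be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the assignment's solution: the data frame "d" holds synthetic stand-in data so that the block runs without "transfusion.csv", and Welch's t-test ("t.test") and the F-test ("var.test") are one reasonable choice of tests for the expectation and the variance.

```r
# Synthetic stand-in data (an assumption; NOT the real transfusion data).
set.seed(1)
n <- 200
march2007 <- factor(rbinom(n, 1, 0.25))            # two-level indicator
frequency <- 1 + rpois(n, lambda = ifelse(march2007 == "1", 7, 4))
d <- data.frame(frequency, march2007)

# Null graphics device so the sketch also runs non-interactively;
# drop the pdf(NULL)/dev.off() pair in an interactive session.
pdf(NULL)
plot(frequency ~ march2007, data = d)        # factor on the right: box-plots
plot(log(frequency) ~ march2007, data = d)   # same, log-transformed response
invisible(dev.off())

# Equality of expectations in the two sub-samples (Welch two-sample t-test).
t.raw <- t.test(frequency ~ march2007, data = d)
t.log <- t.test(log(frequency) ~ march2007, data = d)

# Equality of variances in the two sub-samples (F-test).
v.raw <- var.test(frequency ~ march2007, data = d)
v.log <- var.test(log(frequency) ~ march2007, data = d)

# A null hypothesis is rejected at the 5% level when its p-value is below 0.05.
p.values <- c(t.raw = t.raw$p.value, t.log = t.log$p.value,
              v.raw = v.raw$p.value, v.log = v.log$p.value)
print(p.values)
```

With the real file one would presumably replace the synthetic block by something like d <- read.csv("transfusion.csv") followed by d$march2007 <- factor(d$march2007), assuming the column names match the variable list above.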

Linear Regression
In Tasks 4-7 you are asked to conduct an analysis similar to the analysis of Tasks 1-3. The difference is that the numerical variable "time" is used as the explanatory variable. The model of linear regression assumes that the expectation of the response is a linear function of the explanatory variable. Another assumption of the model is that the variance of the response is constant for each value of the explanatory variable. Frequently, however, one may observe an increase in the variance for larger values of the explanatory variable. Replacing the response by the log-transformed response is a commonly used method to overcome this difficulty. The analysis that involves the log of the response can be carried out via the replacement of the response "frequency" in the formula by the transformed response "log(frequency)".
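The regression analysis with and without the log transformation could be sketched as follows. Again this is only an illustration under assumptions: the data frame "d" is synthetic (generated so that the spread of the response grows with "time"), and the object names "fit", "fit.log" and "ci.log" are made up for the example.

```r
# Synthetic stand-in data (an assumption; NOT the real transfusion data).
set.seed(2)
n <- 200
time <- sample(2:98, n, replace = TRUE)
frequency <- 1 + rpois(n, lambda = 0.1 * time)   # variance grows with time
d <- data.frame(frequency, time)

# Linear regression for the raw and for the log-transformed response.
fit     <- lm(frequency ~ time, data = d)
fit.log <- lm(log(frequency) ~ time, data = d)

# Scatter plot with the fitted regression line (null device for batch runs).
pdf(NULL)
plot(frequency ~ time, data = d)
abline(fit)
invisible(dev.off())

# p-values for H0: slope of "time" equals zero, in each model.
slope.p     <- summary(fit)$coefficients["time", "Pr(>|t|)"]
slope.log.p <- summary(fit.log)$coefficients["time", "Pr(>|t|)"]

# 95% confidence interval for the slope in the log-response model.
ci.log <- confint(fit.log, "time", level = 0.95)
print(ci.log)
```

The sign of the estimated slope, coef(fit.log)["time"], together with whether the confidence interval excludes zero, indicates whether the fitted line is increasing or decreasing.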

The Relation Between Two Variables
The final Task 8 involves the investigation of the relation between the response "frequency" and the variable "monetary".
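One way to look into such a relation is to plot the points with the regression line and then inspect the residuals. The sketch below uses synthetic data with added noise (an assumption made for illustration, so its output says nothing about the real "monetary" variable); the names "max.abs.resid" and "r" are hypothetical.

```r
# Synthetic stand-in data (an assumption; NOT the real transfusion data).
set.seed(3)
n <- 100
frequency <- 1 + rpois(n, lambda = 5)
monetary  <- 100 * frequency + rnorm(n, sd = 50)   # noisy linear relation
d <- data.frame(frequency, monetary)

fit <- lm(frequency ~ monetary, data = d)

# Scatter plot with the regression line (null device for batch runs).
pdf(NULL)
plot(frequency ~ monetary, data = d)
abline(fit)
invisible(dev.off())

# If every point lay exactly on the line, all residuals would be zero
# up to rounding error; clearly non-zero residuals mean the points
# scatter around the line instead.
max.abs.resid <- max(abs(residuals(fit)))
r <- cor(d$frequency, d$monetary)   # strength of the linear association
print(c(max.abs.resid = max.abs.resid, correlation = r))
```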

Tasks

Comparing Two Samples:

1. Apply the function "plot" to the formula that relates the response "frequency" to the explanatory variable "march2007" in order to produce the two box-plots of the response. Redo the plotting with "frequency" replaced by "log(frequency)". The distribution of the variable "log(frequency)" is:

__ More symmetric, __ Less symmetric compared to the distribution of the variable "frequency".

Mark the most appropriate option and attach the R code that produces the two plots:

2. Mark the null hypotheses that you reject with a significance level of 5% and those that you do not reject:

(Reject/Don't Reject) H0: The expectation of "frequency" is the same in the two subsets,

(Reject/Don't Reject) H0: The expectation of "log(frequency)" is the same in the two subsets.

Explain your answer:

3. Mark the null hypotheses that you reject with a significance level of 5% and those that you do not reject:

(Reject/Don't Reject) H0: The variance of "frequency" is the same in the two subsets,

(Reject/Don't Reject) H0: The variance of "log(frequency)" is the same in the two subsets.

Explain your answer:

Linear Regression:

4. Apply the function "plot" to the formula that relates the response "frequency" to the explanatory variable "time" in order to produce the scatter plot. Add the regression line to the plot. The variability of the variable "frequency", for larger values of the explanatory variable, is:

__ Smaller, __ Larger, __ Constant.

Mark the most appropriate option and attach the R code that produces the plot:

5. Mark the null hypotheses that you reject with a significance level of 5% and those that you do not reject:

(Reject/Don't Reject) H0: The slope of "time" in the regression line of the response "frequency" is equal to zero,

(Reject/Don't Reject) H0: The slope of "time" in the regression line of the response "log(frequency)" is equal to zero.

Explain your answer:

6. The 95%-confidence interval of the slope of "time" in the regression line of the response "log(frequency)" is:
Lower end = ____, Upper end = ____.

Attach the R code that produces the confidence interval:

7. The regression line between "time" as an explanatory variable and "log(frequency)" as a response is:
__ Increasing, __ Decreasing, __ Constant.

Mark the most appropriate option and explain your answer:

The Relation Between Two Variables:

8. Apply the function "plot" to the formula that relates the response "frequency" to the explanatory variable "monetary" in order to produce the scatter plot. Add the regression line to the plot. The points in the scatter plot are:

__ All on the same line, __ Show a linear trend but are not on the same line, __ Don't show a linear trend.

Mark the most appropriate option and attach the R code that produces the plot:

For the assignment you should complete the following 8 tasks. Tasks 1-3 refer to the problem of comparing two samples and Tasks 4-7 refer to regression analysis. In Task 8 the relation between two variables is investigated. Your answers should be short and clear. We recommend that you copy and paste the tasks into the form titled "Submit your Assignment using this Form". You can then write your answers to the tasks in the designated positions that are marked in the text.