Find the correlation coefficient

Reference no: EM13822895

Question 1: Baseball is a sport that generates a lot of data, which fans use to try to predict the factors that lead to successful teams. One fan compiled the team batting average and the team percentage of games won for the 14 American League teams at the end of a recent season. The presumption is that a team with a greater batting average should win more games. Supposing that these data represent a random collection of observations of these two measures, let's explore whether batting average can predict winning percentage. The data are stored in the file Baseball.xls.

  1. Plot the data, and comment on what you observe.

  2. Find the correlation coefficient.

  3. Find the coefficient of determination, R^2, for these data, and interpret its meaning.

  4. Find the sample regression line, and interpret the meaning of the coefficients of your equation.

  5. Is there evidence, at the 5% level of significance, that batting average can be used to predict winning percentage?
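
Parts 2 to 5 can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a worked solution: the function name `simple_regression` is made up for this example, and the two short lists are placeholder values, not the contents of Baseball.xls. The slope test compares the t statistic with a two-sided 5% critical value for n - 2 degrees of freedom; for the 14 teams in the question that would be 12 degrees of freedom (critical value about 2.179), while the six placeholder points below use 4 degrees of freedom.

```python
import math

def simple_regression(x, y):
    """Fit y = b0 + b1*x by least squares and return summary statistics."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                   # slope
    b0 = y_bar - b1 * x_bar          # intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    # Residual sum of squares, then the standard error of the slope.
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return {"b0": b0, "b1": b1, "r": r, "r2": r * r, "t_slope": b1 / se_b1}

# Placeholder values only, NOT the contents of Baseball.xls.
batting_avg = [0.248, 0.255, 0.262, 0.266, 0.270, 0.281]
win_pct = [0.420, 0.470, 0.455, 0.520, 0.510, 0.575]

fit = simple_regression(batting_avg, win_pct)
T_CRIT = 2.776  # two-sided 5% critical t for 4 df (six placeholder points)
print(fit)
print("slope significant at 5%:", abs(fit["t_slope"]) > T_CRIT)
```

In simple regression R^2 equals the square of the correlation coefficient, which is why the sketch returns `r * r` rather than computing it separately.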

Question 2: Physicians are recommending more exercise for patients, especially those who are overweight. One benefit of regular exercise is thought to be a reduction of bad cholesterol. To study the relationship, a doctor selected a sample of patients who did not do regular exercise, and measured their cholesterol level. She then started the patients on a program of exercise, and asked them to record the number of minutes per week that they exercised. After 4 months, she re-measured their cholesterol levels. The data are contained in the file Cholesterol.xls.

  1. Plot the data. Does it appear that the amount of exercise and the change in cholesterol level are related?

  2. Determine the regression equation relating cholesterol reduction to amount of exercise, and find a 95% confidence interval for the intercept. Provide a brief and meaningful written interpretation of the coefficients and the confidence interval.

  3. Can we conclude that exercise affects the change in cholesterol level of the exerciser?

  4. How well does the linear model fit this data? Justify.
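
For part 2, the confidence interval for the intercept is b0 ± t × SE(b0), with SE(b0) = s × sqrt(1/n + x̄² / Sxx). The sketch below assumes that textbook formula. The function name `intercept_confidence_interval` is made up for this example, and the data are placeholders, not the contents of Cholesterol.xls. The question does not give the sample size, so the critical value `t_crit` is passed in by the caller and must match n - 2 degrees of freedom for the real data.

```python
import math

def intercept_confidence_interval(x, y, t_crit):
    """Return (b0, b1, (low, high)) for the fit y = b0 + b1*x.

    t_crit is the two-sided critical t value for n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual standard error s, then the standard error of the intercept.
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return b0, b1, (b0 - t_crit * se_b0, b0 + t_crit * se_b0)

# Placeholder values only, NOT the contents of Cholesterol.xls.
minutes_per_week = [30, 60, 90, 120, 150, 180]
cholesterol_reduction = [2.0, 5.5, 7.0, 11.5, 13.0, 17.5]

b0, b1, ci = intercept_confidence_interval(
    minutes_per_week, cholesterol_reduction, t_crit=2.776  # 4 df, 95%
)
print(f"intercept {b0:.3f}, slope {b1:.4f}, 95% CI for intercept {ci}")
```

The intercept is the predicted reduction for a patient who does no exercise, so an interval that contains zero would be consistent with no change in the absence of exercise.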

Question 3: Hardwood trees are harvested in a selective manner for the manufacture of fine furniture. Environmental groups want as few trees as possible to be selected for cutting, while companies feel that they need a certain amount of wood for manufacturing. To help each group predict the volume of lumber in a selected tree, various measurements are made before the tree is cut. Unfortunately, volume is not easily determined before harvesting.

Two common measurements made before cutting down the tree are DBH (the diameter of the tree at breast height, 4.5 feet off the ground) and the height of the tree measured with sighting instruments. After the tree is harvested the volume of lumber may be measured.

Both groups believe that a regression model relating volume to diameter and/or height will be helpful. The data file below gives the diameters, heights, and volumes of 31 trees harvested in the Allegheny National Forest in Pennsylvania.

The data are contained in the file Wood.xlsx.

  1. Estimate the two simple regression models and the multiple regression model that is appropriate for these data.

  2. Which model would you recommend that the two groups use? Why?

  3. A tree with a height of 72 and a diameter of 15.9 has just arrived at the mill. What volume can be expected?
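
The three models can be fitted and compared with a small least-squares routine. The sketch below solves the normal equations in plain Python and uses adjusted R^2 as one possible comparison criterion. That criterion is an assumption of this example, not something the question prescribes. The helper names `ols`, `adjusted_r2` and `predict` are made up for this example, and the data are placeholders, not the contents of Wood.xlsx.

```python
def ols(rows, y):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, row):
    """Fitted value for one observation."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def adjusted_r2(rows, y, beta):
    """Adjusted R^2, which penalises extra predictors."""
    n, p = len(y), len(beta) - 1
    y_bar = sum(y) / n
    sse = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - (sse / (n - p - 1)) / (sst / (n - 1))

# Placeholder values only, NOT the contents of Wood.xlsx.
diameter = [9.0, 11.0, 12.5, 14.0, 16.0, 18.0]
height = [65, 70, 72, 75, 78, 82]
volume = [12.0, 18.5, 24.0, 31.0, 42.0, 55.0]

models = {
    "diameter only": [(d,) for d in diameter],
    "height only": [(h,) for h in height],
    "diameter and height": list(zip(diameter, height)),
}
for name, rows in models.items():
    beta = ols(rows, volume)
    print(name, beta, adjusted_r2(rows, volume, beta))

# Part 3: prediction for diameter 15.9 and height 72 from the two-variable fit.
beta_full = ols(models["diameter and height"], volume)
print(predict(beta_full, (15.9, 72)))
```

With the real file, the same three fits would be run on all 31 trees, and the printed prediction would only be meaningful for the model chosen in part 2.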

Question 4: Lotteries are important sources of revenue for governments and charities. Many people have criticized lotteries, however, as taxes on the poor and uneducated. To explore the issue, a sample of 100 adults was asked how much they spend on lottery tickets and a number of socio-economic variables. The study was meant to test the following beliefs:

I. Relatively uneducated people spend more on lotteries than do educated people.

II. Older people spend more on lotteries than do younger people.

III. People with more children spend more than people with fewer.

IV. Relatively poor people spend a greater proportion of their income on lotteries than the better off.

The file Lottery.xls contains data for the 100 respondents on the amount spent on lottery tickets as a percentage of household income, number of years of education, age, number of children, and personal income (in thousands of dollars).

  1. Develop a regression model relating lottery expenditures to all of the variables.

  2. Test each of the four beliefs at the 5% level using your model. What conclusions can you draw?
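
Each belief predicts the sign of one coefficient, so a natural reading of part 2 is a one-sided t test per coefficient. The sketch below assumes that reading. The function name `one_sided_t_test` is made up for this example, and the estimates and standard errors are placeholders standing in for real regression output, not results from Lottery.xls. With 100 respondents, four predictors and an intercept there are 95 degrees of freedom, for which the one-sided 5% critical value is about 1.661.

```python
def one_sided_t_test(estimate, std_error, expected_sign, t_crit):
    """Test H0: coefficient = 0 against H1: coefficient has the expected sign.

    expected_sign is +1 or -1. Returns (t statistic, reject H0?).
    """
    t_stat = estimate / std_error
    return t_stat, t_stat * expected_sign > t_crit

# Sign each belief predicts for its coefficient.
EXPECTED_SIGNS = {
    "education": -1,  # I.   less educated people spend more
    "age": +1,        # II.  older people spend more
    "children": +1,   # III. people with more children spend more
    "income": -1,     # IV.  poorer people spend a larger share of income
}

# Placeholder (estimate, standard error) pairs, NOT results from Lottery.xls.
hypothetical_output = {
    "education": (-0.60, 0.20),
    "age": (0.02, 0.03),
    "children": (0.10, 0.25),
    "income": (-0.05, 0.01),
}

T_CRIT = 1.661  # one-sided 5% critical t, 95 degrees of freedom
for name, (est, se) in hypothetical_output.items():
    t_stat, reject = one_sided_t_test(est, se, EXPECTED_SIGNS[name], T_CRIT)
    print(f"{name}: t = {t_stat:.2f}, belief supported at 5%: {reject}")
```

A belief counts as supported only when the coefficient is significant and has the predicted sign. A significant coefficient with the opposite sign does not support it.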

Question 5: A large corporation was recently accused of discriminating against female managers. A random sample of 100 managers from the firm found that the mean annual salary of the 38 female managers in the sample was $76,189, and the mean annual salary of the 62 male managers was $97,832. This looks like pretty damning evidence of discrimination. The CEO of the corporation was indignant, claiming that the firm followed a strict policy of equal pay for equal work, and that maybe some other factor or factors were responsible for the perceived differences. He has asked you to look into this, and you were able to find the number of years of education and years of experience for each member of the sample. These data are contained in the file Discrimination.xls, which records the member's gender/sex as a 0 for males and a 1 for females.

  1. Do these data taken as a whole explain a substantial amount of the variation in salaries among these managers?

  2. What is your best estimate of the systematic difference between male and female salaries?

  3. Does it appear that gender/sex is a significant factor in the differences in salary in this sample?
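
One way to set up the analysis is a multiple regression with the 0/1 gender variable as an indicator. The model below is an assumed specification, and the variable names are illustrative, since the column headers in Discrimination.xls are not given. The unadjusted gap in the sample means is $97,832 - $76,189 = $21,643. The coefficient on the indicator estimates the gap that remains after holding education and experience fixed.

```latex
\text{Salary}_i = \beta_0 + \beta_1\,\text{Education}_i + \beta_2\,\text{Experience}_i
                + \beta_3\,\text{Female}_i + \varepsilon_i,
\qquad \text{Female}_i \in \{0, 1\}

H_0 : \beta_3 = 0 \quad \text{versus} \quad H_1 : \beta_3 \neq 0
```

Under this setup, part 1 corresponds to the overall fit of the model (R^2 and the F test), part 2 to the estimate of β3, and part 3 to the t test of the hypotheses above.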




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd