Find a suitable regression model

Reference no: EM13114804

Question 1.

The text file "xydata5" contains the variables x1, y1, x2 and y2. For this question, hand in your R commands, output and comments together.

(a) (i) Use appropriate R output to find a suitable regression model to predict y1 using x1 as an explanatory variable. Comment on the output used and justify your choice of model.

(ii) Produce the summary output and give the estimated equation for your chosen model. Use your equation to calculate the fitted value and the residual for an observation with values of x1 = 14 and y1 = 11.6.

(iii) What is the predicted value of y1 when x1 = 14?
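
The workflow asked for in part (a) might be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data frame `xy` is filled with made-up numbers standing in for "xydata5" (the real file would be read with something like `read.table()`, whose exact file name and format are not given here), and the straight-line and quadratic candidates are just two plausible models, not necessarily the ones the real data call for.

```r
# Synthetic stand-in data: NOT the real "xydata5" values.
set.seed(1)
xy <- data.frame(x1 = 1:20)
xy$y1 <- 2 + 0.5 * xy$x1 + rnorm(20, sd = 0.5)

# Two candidate models: a straight line and a quadratic.
fit_lin  <- lm(y1 ~ x1, data = xy)
fit_quad <- lm(y1 ~ x1 + I(x1^2), data = xy)

# Residuals against fitted values for the straight line;
# visible curvature would suggest the straight line is inadequate.
plot(fitted(fit_lin), resid(fit_lin),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# The row for I(x1^2) shows whether the quadratic term is needed.
print(summary(fit_quad)$coefficients)

# Fitted value and residual for the observation (x1 = 14, y1 = 11.6),
# using the straight line purely for illustration.
new_obs     <- data.frame(x1 = 14)
fitted_14   <- unname(predict(fit_lin, newdata = new_obs))
residual_14 <- 11.6 - fitted_14   # residual = observed - fitted
```

With the real data, the same `predict()` call on the chosen model gives the predicted value asked for in (iii).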

(b) (i) Use appropriate R output to find a suitable regression model to predict y2 using x2 as an explanatory variable. Comment on the output used and justify your choice of model.

(ii) Produce the summary output and give the estimated equation for your chosen model. Use your equation to calculate the fitted value and the residual for an observation with values of x2 = 10 and y2 = 1.75.

(iii) What is the predicted value of y2 when x2 = 10?
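
As a reminder of the arithmetic in (ii), assuming for illustration that the chosen model is a straight line (the symbols below are notation introduced here, not taken from the assignment; a transformed model would change the right-hand side of the first equation):

```latex
\hat{y}_2 = \hat{\beta}_0 + \hat{\beta}_1 x_2,
\qquad
e = y_2 - \hat{y}_2
\quad\Rightarrow\quad
e = 1.75 - \bigl(\hat{\beta}_0 + 10\,\hat{\beta}_1\bigr)
\ \text{at } x_2 = 10 .
```

The point prediction at x2 = 10 is the same quantity, the fitted value from the estimated equation.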

• For each of these questions we want you to do the full analysis in R and write a Report. Your answers should be in three parts.

First: Technical Notes on the analysis. Refer to the Appendix of your Lecture Work Book and to the Case Studies for examples of Technical Notes. You do not need to quantify confidence intervals in these notes. Some of the information in the Technical Notes may be repeated in the Executive Summary.

Second: Executive Summary of the main findings of the analysis. Refer to the Appendix of your Lecture Work Book and to the Case Studies for examples of Executive Summaries.

Third: R output. Include all R output used in answering the questions together as an appendix at the end of your assignment. This is for the markers to refer to if you make any mistakes in your analysis, so they can consider giving partial credit. There are no marks allocated for the R output; all the marks are for the Technical Notes and Executive Summary.

Please try to keep this section as small as possible; use the layout.20x() command to save space for multiple plots. Remember: when you cut and paste R output into a word processor, you should use a fixed-width font such as Courier.

• These questions require the same type of Technical Notes and Executive Summaries as the final exam.

Question 2.

Data were collected from an experiment that was conducted to assess the effects of height and other experimental conditions on the weight of wood from young poplar trees. All comparisons should be made to the control treatment. The resulting data is stored in the text file "poplar", which contains the following variables:

Weight   the dry weight of wood (in kg)
Height   the height of the poplar tree (in metres)
Treatmt  the experimental conditions that the tree was subjected to:
           1 = control
           2 = fertilizer
           3 = irrigation
           4 = fertilizer and irrigation
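
Because Treatmt is stored as the codes 1 to 4, it needs to be treated as a factor with control as the baseline if all comparisons are to be made to the control treatment. A minimal sketch of one way to set this up is below. The data frame is synthetic (not the real "poplar" values), the level labels are names chosen here for readability, and the interaction model is only one possible starting point.

```r
# Synthetic stand-in data: NOT the real "poplar" values.
set.seed(2)
poplar <- data.frame(
  Height  = runif(40, min = 2, max = 8),
  Treatmt = rep(1:4, each = 10)
)
poplar$Weight <- 0.1 * poplar$Height + 0.05 * poplar$Treatmt +
  rnorm(40, sd = 0.05)

# Convert the numeric codes to a factor; otherwise lm() would fit
# a single slope for Treatmt as if it were a measurement.
poplar$Treatmt <- factor(
  poplar$Treatmt,
  levels = 1:4,
  labels = c("control", "fertilizer", "irrigation", "fert+irrig")
)

# Under R's default treatment contrasts the first level ("control")
# is the baseline, so every Treatmt coefficient is a comparison to control.
fit <- lm(Weight ~ Height * Treatmt, data = poplar)
print(summary(fit)$coefficients)

# Sequential tests, including whether the Height:Treatmt interaction is needed.
print(anova(fit))
```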

Question 3.

Data on 102 male and 100 female athletes were collected at the Australian Institute of Sport. It was of interest to predict athletes' lean body mass (LBM) using physical attributes and the sport they played. It was of particular interest to compare lean body mass of swimmers to that of players of all other sports. The resulting data is stored in the text file "aussiesport", which contains the following variables:

Lean    the athlete's lean body mass
Height  the athlete's height (in cm)
Weight  the athlete's weight (in kg)
Skin    the sum of the athlete's skin folds
BMI     the athlete's body mass index (weight/height²)
Sport   the sport played:
          b.ball = basketball
          field = field events
          gym = gymnastics
          netball = netball
          row = rowing
          swim = swimming
          tennis = tennis
          track = running
          w.polo = water polo

Analyse the data, hand in your R output and write both Technical Notes and an Executive Summary for the analysis. Your Technical Notes should describe each step of the model building process.
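
Since the comparison of interest is swimmers against every other sport, one convenient setup is to make "swim" the reference level of Sport before fitting. The sketch below shows the idea on synthetic data (not the real "aussiesport" values) with a deliberately small model. The explanatory variables used here are an illustrative assumption, not a recommended final model.

```r
# Synthetic stand-in data: NOT the real "aussiesport" values.
set.seed(3)
sports <- c("b.ball", "field", "gym", "netball", "row",
            "swim", "tennis", "track", "w.polo")
aussie <- data.frame(
  Sport  = factor(rep(sports, each = 6)),
  Weight = rnorm(54, mean = 75, sd = 10)
)
aussie$Lean <- 0.8 * aussie$Weight + rnorm(54, sd = 3)

# Factor levels default to sorted order, so "b.ball" would be the baseline.
print(levels(aussie$Sport)[1])

# Make swimmers the baseline: each Sport coefficient then estimates
# the difference between that sport and swimming.
aussie$Sport <- relevel(aussie$Sport, ref = "swim")

fit <- lm(Lean ~ Weight + Sport, data = aussie)
print(summary(fit)$coefficients)
```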



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd