Find effect of additional pack of cigarettes on birthweight

Reference no: EM132388719

Questions -

1. The value of R2 in a multiple regression cannot be high if the estimates of the regression slope coefficients are shown to be insignificantly different from zero on the basis of t tests of significance, since in that case most of the variation in y must be unexplained, and hence the value of R2 must be low. True, false, or uncertain? Explain.

2. Suppose that average worker productivity at manufacturing firms (avgprod) depends on two factors, average hours of training (avgtrain) and average worker ability (avgabil):

avgprod = β0 + β1avgtrain + β2avgabil + u

Assume that this equation satisfies the Gauss-Markov assumptions.

If grants for training programs have been given to firms whose workers have less than average ability, so that avgtrain and avgabil are negatively correlated, what is the likely bias in β1~, obtained from the simple regression of avgprod on avgtrain?
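The direction of the bias can be checked with a small simulation. This is only a sketch on synthetic data: the population coefficients (0.5 and 2.0), the strength of the negative correlation, and the sample size are all assumptions chosen for illustration, not values from the problem. It compares the simple-regression slope with the value implied by the omitted variable relationship, beta1 + beta2 * delta1, where delta1 is the slope from regressing avgabil on avgtrain.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population values, chosen only for illustration.
beta0, beta1, beta2 = 1.0, 0.5, 2.0

avgabil = rng.normal(size=n)
# Training is targeted at low-ability firms, so the two are negatively correlated.
avgtrain = -0.6 * avgabil + rng.normal(size=n)
u = rng.normal(size=n)
avgprod = beta0 + beta1 * avgtrain + beta2 * avgabil + u

# Slope from the simple regression of avgprod on avgtrain (avgabil omitted).
c = np.cov(avgtrain, avgprod)
beta1_tilde = c[0, 1] / c[0, 0]

# Slope from regressing the omitted variable on the included one.
delta1 = np.cov(avgtrain, avgabil)[0, 1] / np.var(avgtrain, ddof=1)

print("simple-regression slope:", beta1_tilde)
print("beta1 + beta2 * delta1 :", beta1 + beta2 * delta1)
```

With a positive beta2 and a negative delta1, the two printed numbers should agree closely and both fall below the assumed beta1.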

3. Consider a regression equation that relates infant birth weight to cigarette smoking and family income:

(bwght)^ = β0^ + β1^ cigs + β2^ faminc,

where

bwght = child birth weight, in ounces

cigs = number of cigarettes smoked by the mother while pregnant, per day

faminc = annual family income, in thousands of dollars

Estimating the equation yields:

(bwght)^ = 116.97 (1.05) - .463 (.092) cigs + .093 (.029) faminc,

where standard errors are in parentheses next to the estimated coefficients.

(a) State precisely the estimated effect of cigarettes on birth weight based on the results from the model above.

(b) Suppose that instead of ounces, birth weight were measured in pounds. Can you use the results from the regression equation above to find the effect of an additional cigarette on birth weight in pounds (lbs.)? (Hint : 1 lb = 16 ounces)

(c) Now suppose we wanted to find the effect of an additional pack of cigarettes on birth-weight (in ounces again). A pack contains 20 cigarettes. Use the results from the estimated equation above to find the effect of an additional pack of cigarettes on birthweight (in ounces).

(d) How would the coefficient on family income change if family income were measured in terms of dollars, rather than in terms of thousands of dollars (e.g., suppose you defined a new variable famincdol = 1000 x faminc, what would be the coefficient on famincdol?)

(e) When switching from ounces to pounds in our measurement of birth weight, or from thousands of dollars to dollars in our measurement of family income, will the t statistics associated with our estimated coefficients or the R-squared of the regression change?
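The rescaling questions in parts (b) through (e) can be explored numerically. The sketch below uses synthetic data, not the data behind the estimates above: the data-generating values, sample size, and the helper `ols` are all assumptions made for illustration. It refits the same model after changing the units of the dependent variable and of each regressor, so the coefficients, t statistics, and R-squared can be compared across fits.

```python
import numpy as np

def ols(y, regressors):
    """Return coefficients, standard errors, t statistics, and R-squared."""
    X = np.column_stack([np.ones(len(y)), regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, se, beta / se, r2

# Synthetic data (hypothetical values, for illustration only).
rng = np.random.default_rng(1)
n = 500
cigs = rng.poisson(2, size=n).astype(float)
faminc = rng.uniform(5, 65, size=n)            # thousands of dollars
bwght = 117 - 0.46 * cigs + 0.09 * faminc + rng.normal(0, 20, size=n)  # ounces

b_oz, se_oz, t_oz, r2_oz = ols(bwght, np.column_stack([cigs, faminc]))
# Dependent variable in pounds instead of ounces.
b_lb, se_lb, t_lb, r2_lb = ols(bwght / 16, np.column_stack([cigs, faminc]))
# Smoking measured in packs (20 cigarettes) instead of cigarettes.
b_pk, se_pk, t_pk, r2_pk = ols(bwght, np.column_stack([cigs / 20, faminc]))
# Income measured in dollars instead of thousands of dollars.
b_usd, se_usd, t_usd, r2_usd = ols(bwght, np.column_stack([cigs, 1000 * faminc]))

print("cigs coefficient, ounces vs pounds:", b_oz[1], b_lb[1])
print("cigs vs packs coefficient:", b_oz[1], b_pk[1])
print("faminc vs famincdol coefficient:", b_oz[2], b_usd[2])
print("t statistics:", t_oz, t_lb, t_pk, t_usd)
print("R-squared:", r2_oz, r2_lb, r2_pk, r2_usd)
```

Comparing the printed rows shows which quantities scale with the units and which do not.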

4. Suppose we want to estimate the model:

yi = β0 + β1xi1 + β2xi2 + β3xi3 + · · · + βkxik + ui, for i = 1, 2, . . . , n

We can write this model in matrix notation as

y (n x 1) = X (n x (k+1)) β ((k+1) x 1) + u (n x 1)

(a) What assumptions are required for unbiasedness of OLS?

(b) In matrix notation, demonstrate the unbiasedness of the ((k+1) x 1) OLS estimator β^ using the assumption that:

X' ((k+1) x n) u (n x 1) = 0 ((k+1) x 1)

For each step, state any necessary assumptions.

(c) Now derive the formula for the variance of β^ using matrix notation.
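A Monte Carlo check can make the matrix results concrete. The following is a sketch under stated assumptions, not a derivation: the design matrix is held fixed, the errors are drawn as independent normals with mean zero and constant variance, and the coefficient vector, sample size, and error standard deviation are hypothetical. It compares the average of β^ = (X'X)^(-1) X'y across replications with β, and the empirical covariance of β^ with σ2 (X'X)^(-1), the variance formula that holds under homoskedastic, uncorrelated errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
sigma = 1.5                                   # hypothetical error std. dev.
beta = np.array([1.0, 0.5, -2.0, 0.25])       # (k+1) x 1, hypothetical values

# Fixed design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
XtX_inv = np.linalg.inv(X.T @ X)

reps = 5000
estimates = np.empty((reps, k + 1))
for r in range(reps):
    u = rng.normal(0, sigma, size=n)          # E[u | X] = 0, Var(u | X) = sigma^2 I
    y = X @ beta + u
    estimates[r] = XtX_inv @ X.T @ y          # beta_hat = (X'X)^(-1) X'y

mean_beta_hat = estimates.mean(axis=0)
empirical_cov = np.cov(estimates, rowvar=False)
theoretical_cov = sigma**2 * XtX_inv

print("true beta        :", beta)
print("mean of beta_hat :", mean_beta_hat)
print("max |cov difference|:", np.abs(empirical_cov - theoretical_cov).max())
```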

Data

Obs   GPA   ACT   (GPA)^   Residuals
1     3     21
2     3.4   24
3     3     26
4     3.5   27
5     3.6   28
6     3     25
7     2.7   25
8     2.7   22

(a) Compute the variance of ACT.

(b) Estimate the relationship between GPA and ACT using OLS, obtaining the intercept and slope estimates in the equation:

(GPA)^ = β0^ + β1^ACT

(c) Calculate the predicted values, (GPA)^, and residuals, u^, for each observation, and enter them in the table above.

(d) Compute the SST (i.e., the total sum of squares, or the total variation in the dependent variable). What proportion of the variation in GPA does ACT explain?
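The hand calculations in parts (a) through (d) can be cross-checked with a few lines of NumPy. This sketch uses the eight observations from the table above. One assumption to flag: the variance is computed with the n - 1 divisor (sample variance); if the question intends the n divisor, change `ddof` accordingly.

```python
import numpy as np

# Data from the table above.
gpa = np.array([3.0, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 2.7])
act = np.array([21, 24, 26, 27, 28, 25, 25, 22], dtype=float)

# (a) Sample variance of ACT (assumes the n - 1 divisor).
var_act = act.var(ddof=1)

# (b) Simple-regression slope and intercept.
slope = np.sum((act - act.mean()) * (gpa - gpa.mean())) / np.sum((act - act.mean()) ** 2)
intercept = gpa.mean() - slope * act.mean()

# (c) Fitted values and residuals for each observation.
fitted = intercept + slope * act
resid = gpa - fitted

# (d) Total sum of squares, sum of squared residuals, and R-squared.
sst = np.sum((gpa - gpa.mean()) ** 2)
ssr = np.sum(resid ** 2)
r2 = 1 - ssr / sst

print("var(ACT):", var_act)
print("intercept, slope:", intercept, slope)
for i, (f, e) in enumerate(zip(fitted, resid), start=1):
    print(i, round(f, 4), round(e, 4))
print("SST, SSR, R-squared:", sst, ssr, r2)
```

As a sanity check on any hand-computed column, the residuals should sum to zero up to rounding.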

5. Below are the results from two regressions of wage on education, experience, tenure, and age.

Unrestricted Model:

log(wage)^ = 5.48(.123) + .075(.007) educ + .013(.003)tenure + .018(.134)exper - .0001(.0006) exper2

SSR = 140, SST = 165, R2 = .155, n = 935

where SSR is the sum of squared residuals and SST is the total sum of squares: SST = Σ (yi - ȳ)², summed over i = 1, . . . , n.

Restricted Model:

 log(wage)^ = 5.83(.006) + .061(.006)educ + .016(.003) tenure

SSR = 143, SST = 165, R2 = .136, n = 935

Let tcα denote the one-tailed critical value from the t distribution at significance level α (hint: you can deduce the two-tailed values from these):

tc(α = .05) = 1.65

tc(α = .025) = 1.96

Use the following critical value of the F distribution for any F-tests in this problem: Fc(α = .05) = 3.

For the questions below, specify which test you would conduct, state the null and alternative hypotheses, and compute the relevant test statistic.

If you do not have all the information you need to compute the test statistic, indicate what information is missing. You should still write out the formula for the test statistic.

(a) Is there evidence at the 5% level that the effect of experience on log(wage) is quadratic?

(b) Test at the 5% significance level whether exper and expersq are individually significant (i.e., different from zero). Then test whether they are jointly significant (again at the 5% significance level).

(c) Test whether the effect of experience is equal to the effect of tenure in the first (unrestricted) model (i.e., test only whether the effects of exper and tenure are equal; ignore exper2).

(d) Is there evidence at the 5% significance level that the effect of tenure in the first (unrestricted) model is strictly positive?

(e) In the first model, what is the effect of 5 additional years of experience on log(wage)?

(f) (2 points) Based just on the signs of the estimated coefficients on exper and expersq (and ignoring for the moment whether or not they are statistically different from zero), how would you characterize the effect of experience on log(wage) over the sample distribution of exper?

(i) initially positive but diminishing

(ii) initially positive but increasing

(iii) initially negative and diminishing (becoming more negative)

(iv) initially negative and increasing (becoming less negative)

(v) constant
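The test statistics in this problem follow the standard t and F formulas, which can be written as two small helper functions. This is a sketch: the function names are hypothetical, and it assumes the restricted model differs from the unrestricted one only by dropping exper and exper2, so that q = 2 restrictions are tested and the unrestricted model has k = 4 slope coefficients. The inputs are the figures reported above.

```python
def f_statistic(ssr_r, ssr_ur, q, n, k):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

def t_statistic(coef, se, null_value=0.0):
    """t = (estimate - hypothesized value) / standard error."""
    return (coef - null_value) / se

# Joint test of exper and exper2 using the reported sums of squared residuals.
F = f_statistic(ssr_r=143, ssr_ur=140, q=2, n=935, k=4)

# Individual t statistics from the unrestricted model.
t_exper = t_statistic(0.018, 0.134)
t_expersq = t_statistic(-0.0001, 0.0006)
t_tenure = t_statistic(0.013, 0.003)

print("F:", F)
print("t (exper, exper2, tenure):", t_exper, t_expersq, t_tenure)
```

Each statistic is then compared with the critical values given above: 1.65 or 1.96 for t tests, depending on whether the alternative is one-sided or two-sided, and 3 for the F test. Note that part (c) also needs the covariance between the two coefficient estimates, which the reported output does not include.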

6. Use the following table (attached) to answer the questions below. Note that each column of the table represents a separate regression model of wage on different subsets of explanatory variables that include education, experience, tenure, and IQ. In each model, the dependent variable is wage (not log(wage)). Blank cells in a column indicate that the given variable was omitted from that particular model. For example, the first column shows the results of a model of wage on educ; the second column shows the results of a model of wage on educ and experience, etc. Observations (N) and R2 of each model are reported at the bottom.

(a) Use the formula for omitted variables bias to explain why we see the coefficient on educ increase when we add exper to the model.

(b) Explain why we see a large decrease in the magnitude of the estimated coefficient on educ, β^educ, when we add IQ (a measure of ability or intelligence) to the model (compare the effect of educ in columns (3) and (4)).

(c) Explain why the standard error of the estimate of β^educ gets larger as we add more variables to the model.

(d) Suppose someone had only run the regression equations shown in columns (1) and (4) and concluded that omitting IQ from the regression of wage on education induced only a small bias, pointing to the decline in the coefficient on educ from 60.21 to 57.20. Is that a fair comparison? What is a better comparison to gauge the bias due to omitting IQ from the simple regression model?

(e) Suppose that you produce a histogram of the residuals from the model in column (4), and it shows that the residuals do not appear to be normally distributed (i.e., they do not follow a bell-shaped distribution). A fellow student claims that because of this, any hypothesis testing would be invalid. How would you respond? Do you have any suggestions to address the apparent violation of the normality assumption?

(f) The variable wage measures total monthly earnings, in dollars. Suppose we had another variable, hours, that measured hours worked per month, and added that to our regression model. How would the ceteris paribus interpretation of the coefficient on education, β^educ, change in this new model relative to the model that omitted hours?

7. Use the diagram to answer the following questions. Each circle represents the total variation in the respective variable.

[Figure: 1050_figure.png]

(a) Indicate the variation used in OLS regression to estimate β^x in the following model:

y = β0 + βx x + u

(b) Indicate the variation used in OLS to estimate β^x in the following model:

y = β0 + βx x + βz z + u

(c) What variation represents σ2, the error variance of the regression, in the model in part (a)?

(d) What happens to the variation represented by the red area in the OLS regression in part (b)? Why?

(e) What area(s) represent σ2, the error variance of the regression, in the model in part (b)?

(f) What area(s) represent the SSE (or the explained variation) from the regression in part (b)?

(g) What area represents the error variance, or unexplained variation, of the following regression:

x = γ0 + γz z + η

where η is the error term.

(h) The area representing the error variance from the regression in part (g) also represents the residuals of the regression of x on z. What area represents the information used for the slope estimate when you regress y on the residuals from the regression of x on z, η^? What area is this equivalent to? (Hint: Think about the "partialling out" result.)
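The "partialling out" result mentioned in the hint can be verified numerically. The sketch below uses synthetic data with hypothetical coefficients, and the helper `fit` is introduced only for illustration. It compares the coefficient on x from the regression of y on x and z with the slope from regressing y on η^, the residuals from the regression of x on z.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Synthetic data (hypothetical coefficients); x and z are correlated.
z = rng.normal(size=n)
x = 0.7 * z + rng.normal(size=n)
y = 1.0 + 2.0 * x - 1.0 * z + rng.normal(size=n)

def fit(dep, *cols):
    """OLS with an intercept; returns [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(dep)), *cols])
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    return coef

# Multiple regression of y on x and z.
beta_full = fit(y, x, z)

# Step 1: regress x on z and keep the residuals eta_hat.
gamma = fit(x, z)
eta_hat = x - (gamma[0] + gamma[1] * z)

# Step 2: regress y on eta_hat alone.
beta_partial = fit(y, eta_hat)

print("coefficient on x, y on (x, z):", beta_full[1])
print("slope, y on eta_hat          :", beta_partial[1])
```

The two printed slopes coincide up to floating-point error, which is the sense in which only the part of x not shared with z is used to estimate the coefficient on x.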

Attachment:- Assignment File.rar
