Determines child test score in kindergarten

Reference no: EM13793166

Question 1. Suppose the process that determines child test score in Kindergarten is given by

test score_i = β₀ + β₁preschool_i + β₂income_i + ε_i

where β₁, β₂ > 0 and preschool_i is a continuous measure of "preschool quality."

Preschool quality is purchased with cash according to the linear model

preschool_i = α₀ + α₁income_i + η_i

You should interpret both of these linear models as structural models of human behavior. In other words, if you gave a random family another dollar, the family would indeed purchase α₁ more units of preschool quality, and the family would purchase other children's stuff that has an additional effect of β₂ on test scores. This means that the total effect on test scores of giving the family another dollar is:

(a) α₀β₁
(b) α₁
(c) β₂ + β₁α₁
(d) α₁β₁ + β₂
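
The total effect can be traced by substituting the preschool equation into the test score equation. A sketch of that substitution, assuming both structural equations hold exactly as written above:

```latex
\begin{align*}
\text{test score}_i
  &= \beta_0 + \beta_1\,(\alpha_0 + \alpha_1\,\text{income}_i + \eta_i)
     + \beta_2\,\text{income}_i + \epsilon_i \\
  &= (\beta_0 + \beta_1\alpha_0)
     + (\beta_1\alpha_1 + \beta_2)\,\text{income}_i
     + (\beta_1\eta_i + \epsilon_i)
\end{align*}
```

The coefficient on income in the second line collects the direct channel (β₂) and the channel that runs through preschool purchases (β₁α₁).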

Question 2. Let's try to simulate the model in Stata. The simulation generates a fake dataset of 1000 students for this problem, assuming that α₀ = β₀ = 0, α₁ = 0.5, and β₁ = β₂ = 1.

(You can easily play around by modifying these parameter assumptions.)

Regress testscore on preschool. As an estimator for β₁, this regression's coefficient is

(a) biased upward

(b) biased downward
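
The Stata simulation script is not reproduced here. As a stand-in, the following is a minimal Python sketch of the same data-generating process using only the standard library. It assumes that income, η and ε are independent standard normal draws (the assignment does not state their distributions), and the helper names `simulate` and `ols_slope` are hypothetical.

```python
import random


def simulate(n=1000, a0=0.0, a1=0.5, b0=0.0, b1=1.0, b2=1.0, seed=0):
    """Draw n fake students from the two structural equations.

    Assumption: income, eta and epsilon are independent N(0, 1) draws.
    """
    rng = random.Random(seed)
    income = [rng.gauss(0, 1) for _ in range(n)]
    preschool = [a0 + a1 * inc + rng.gauss(0, 1) for inc in income]
    testscore = [
        b0 + b1 * pre + b2 * inc + rng.gauss(0, 1)
        for pre, inc in zip(preschool, income)
    ]
    return income, preschool, testscore


def ols_slope(x, y):
    """Slope from an OLS regression of y on x and a constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


income, preschool, testscore = simulate()

# "Short" regression of testscore on preschool, leaving income out.
short_slope = ols_slope(preschool, testscore)
print("slope on preschool:", round(short_slope, 3), "(true beta_1 is 1.0)")
```

Comparing the printed slope with the true β₁ = 1, and re-running with other values of `a1` and `b2`, shows how the omitted income term moves the estimate.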

Question 3. A big city government is thinking about implementing a program that will raise a family's preschool quality by 1 unit. An analyst uses the regression you ran in the previous question to make the case that this program is a highly effective way to improve student test scores; in particular, that it is better than direct income supports.

You counter that students who go to good preschools probably come from wealthier backgrounds, so at the very least the analyst should be controlling for family income. The analyst retorts, "These kinds of families wouldn't spend a dime of their own income on preschool!"
The analyst is proposing the testable null hypothesis that, at least for the families he is considering,

(a) β₂ = 0
(b) β₁ = 0
(c) α₁ = 0
(d) α₀ = 0

Question 4. Were the analyst's claim true, his estimator for β₁ would be

(a) unbiased and consistent

(b) biased and inconsistent
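
The link between the analyst's claim and the short regression can be written with the usual omitted-variable formula. A sketch, assuming η_i is uncorrelated with income_i and ε_i is uncorrelated with both regressors (assumptions added here, not stated in the assignment):

```latex
\[
\operatorname{plim}\,\hat\beta_1^{\text{short}}
  = \beta_1 + \beta_2\,
    \frac{\operatorname{Cov}(\text{preschool}_i,\ \text{income}_i)}
         {\operatorname{Var}(\text{preschool}_i)}
  = \beta_1 + \beta_2\,\alpha_1\,
    \frac{\operatorname{Var}(\text{income}_i)}
         {\operatorname{Var}(\text{preschool}_i)}
\]
```

The second term depends on α₁, so its size hinges on how strongly income drives preschool purchases.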

Question 5. You gather appropriate data and show that even in the analyst's population, α₁ is significantly positive. The analyst then suggests that now that we have the nice data you collected, why don't we run the regression

test score_i = β₀ + β₁preschool_i + β₂income_i + ε_i

and then check whether β₁ exceeds β₂c, where c is the cost of a unit of government-provided preschool in dollars?

He reasons that if β₁ > β₂c, the government should provide preschool, since the test score return per dollar for preschool exceeds that for income supports; otherwise, the government should provide income supports.

Why might the analyst be wrong?

(a) The estimator for β₁ is inconsistent because of the inclusion of income on the RHS of the regression

(b) One effect of income supports may be to increase family preschool purchases, thus the effect of income supports is probably greater than β₂ (too-many variables bias).

Multicollinearity.

Question 6. Now we want to know whether income supports during preschool or before preschool are better.

We consider the model

test score_i = β₀ + β₁income_before_i + β₂income_during_i + ε_i

Try part II of the simulation. There, income_during is income_before plus a very small income change.

Regress test score on income_before and income_during. You'll notice that at N = 100 observations, the confidence intervals for these parameters are very wide. Try N = 1000 observations by changing the -set obs 100- code to -set obs 1000-. Then try N = 10000. Notice that you need a ton of observations before the confidence intervals really start to tighten. Why?

(a) Hard to distinguish income_before from income_during.

(b) The OLS estimators for β₁ and β₂ are inefficient.

(d) Hard to distinguish income_during from the constant.
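
Part II of the Stata simulation is not shown here, so the following Python sketch is only a stand-in. It assumes income_before is standard normal, income_during adds noise with standard deviation 0.1, and β₁ = β₂ = 1 with a standard normal error. All of these values and the helper names are hypothetical choices for illustration.

```python
import math
import random


def two_regressor_ols(x1, x2, y):
    """OLS of y on a constant, x1 and x2. Returns (b1, b2, se1, se2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12  # shrinks toward 0 as x1 and x2 line up
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(d1, d2, dy))
    sigma2 = rss / (n - 3)  # three estimated parameters incl. the constant
    return b1, b2, math.sqrt(sigma2 * s22 / det), math.sqrt(sigma2 * s11 / det)


def simulate_se(n, seed=0):
    """Conventional standard errors of both slopes for a sample of size n."""
    rng = random.Random(seed)
    before = [rng.gauss(0, 1) for _ in range(n)]
    during = [b + rng.gauss(0, 0.1) for b in before]  # nearly identical
    score = [b + d + rng.gauss(0, 1) for b, d in zip(before, during)]
    _, _, se1, se2 = two_regressor_ols(before, during, score)
    return se1, se2


standard_errors = {n: simulate_se(n) for n in (100, 1000, 10000)}
for n, (se1, se2) in standard_errors.items():
    print(n, round(se1, 3), round(se2, 3))
```

The `det` term in the denominator of both variances is the place to look: it measures how much independent variation the two income variables have.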

Heteroskedasticity and weighted least squares.

The above analysis assumed that you could get individual-level data on income, preschool enrollment, and Kindergarten test scores. You might be able to do this in a survey dataset like the ECLS-K, but more generally you might find yourself working with averages, for instance at the school district level.

Question 7. We might consider weighting by total district population/enrollment in the previous regressions because

(a) larger school districts are more important than smaller school districts

(b) the dependent and independent variables, which are averages, will be estimated more precisely for larger school districts

(d) all of the above
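
The precision argument for weighting can be sketched in one line. This assumes each district average is built from n_d independent individuals whose errors share a common variance σ²; the symbols n_d, σ² and the bar notation are introduced here for illustration.

```latex
\[
\operatorname{Var}(\bar\epsilon_d) = \frac{\sigma^2}{n_d},
\qquad
\hat\beta^{\text{WLS}}
  = \arg\min_{\beta}\ \sum_d n_d\,\bigl(\bar y_d - \bar x_d'\beta\bigr)^2
\]
```

Under those assumptions the error in a district average is heteroskedastic, with smaller variance in larger districts, and weighting each district by n_d offsets that.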

Question 8. In the regression following this question, r_1 is fall K reading test score, incthous_1 is household income in thousands in fall K, and age_1 is the child's age in months at the time of taking the test.

Richard's friend Kyle is 3 months younger than him, but attended the same school and was in the same cohort, and their families have about the same total household income. If they took this exam on the same day in Kindergarten, what do you expect Kyle's score would be relative to Richard's? (For your information, a standard deviation of r_1 is about 10.)

(a) about 1.14 scaled points more (or 11.4% of a standard deviation)
(b) about 0.38 scaled points less (or 3.8% of a standard deviation)
(c) about 0.38 scaled points more (or 3.8% of a standard deviation)
(d) about 1.14 scaled points less (or 11.4% of a standard deviation)

Question 9. Download the dataset kindergarten_version2.dta from the course website.

Generate a new variable equal to the child's growth in math exam score from Fall K to Spring K, m_2 minus m_1. Generate a new variable equal to the child's age growth in months, age_2 minus age_1. Regress math growth on age growth and enroll_1. Use robust standard errors.

Let β_E be the coefficient for enroll_1 in this regression. Consider the hypothesis test
H₀ : β_E = 0
H_A : β_E < 0

What's the smallest significance level at which you can reject the null hypothesis? (Careful: this is a one-tailed test!)

(a) about 1.2%
(b) about 11.8%
(c) about 5.9%
(d) about 4.1%
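
Regression output normally reports two-sided p-values, so the one-tailed answer needs a conversion. A small Python sketch of that conversion, assuming a symmetric test statistic such as a t-statistic; the function name and the numbers in the example are hypothetical and do not come from the dataset.

```python
def one_sided_p(two_sided_p, estimate, alternative="less"):
    """Convert a two-sided p-value (for H0: beta = 0) to a one-sided one.

    alternative="less" corresponds to H_A: beta < 0,
    alternative="greater" to H_A: beta > 0.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    if alternative == "less":
        sign_agrees = estimate < 0
    else:
        sign_agrees = estimate > 0
    half = two_sided_p / 2
    # If the estimate points the way H_A says, half the two-sided mass counts;
    # otherwise the one-sided p-value is the complement.
    return half if sign_agrees else 1 - half


# Hypothetical illustration: two-sided p of 0.10 with a negative estimate.
example = one_sided_p(0.10, estimate=-0.5, alternative="less")
print(example)
```

The sign check matters: a coefficient with the "wrong" sign for H_A gives a one-sided p-value above 50%, however small the two-sided value is.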

Question 10. The ECLS-K uses a complicated sampling scheme, and to account for this the National Center for Education Statistics (NCES) includes sampling weights sample_weight which they recommend we use in estimation.

Re-run your previous regression using these sample weights (put "[w=sample_weight]" before the comma in your regression).

With this new specification, what's the smallest significance level at which you can reject the null hypothesis?

(a) about 0.1%
(b) about 10.4%
(c) about 5.2%
(d) about 3.4%

Question 11. Finally, the ECLS-K is a clustered sample. This means that the NCES first samples schools and then samples students within schools. This sampling approach violates OLS assumption 2 (simple random sample), since the NCES is not "shaking up the whole country" and drawing children at random. Because child outcomes are probably positively correlated within school, the conventional standard errors are likely understated.

One very general (and in many ways "hands-free") way to control for this is to use "cluster-robust" standard errors. As the name implies, these standard errors are robust to heteroskedasticity, and also take into account within-cluster correlation.

Try it: replace the "robust" option in your current regression with "vce(cluster schlid)". This will tell Stata to calculate cluster-robust standard errors, where the clusters are school IDs.

With this new specification, what's the smallest significance level at which you can reject the null hypothesis?

(a) about 100%
(b) about 15%
(c) about 30%
(d) about 10%
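
To see what clustering does in the simplest possible case, the Python sketch below compares two standard errors for a sample mean (a regression on a constant only). It assumes 50 schools with 20 students each, a shared school effect and an individual error that are both standard normal; these numbers and names are hypothetical. The cluster-robust figure is the plain sandwich formula without the finite-sample correction that Stata applies.

```python
import math
import random


def naive_and_cluster_se(values, cluster_ids):
    """Standard error of a sample mean: i.i.d. formula vs. cluster-robust."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]
    # i.i.d. formula: sample standard deviation over sqrt(n)
    naive = math.sqrt(sum(e * e for e in resid) / (n - 1) / n)
    # cluster-robust sandwich: sum residuals within each cluster first
    totals = {}
    for e, g in zip(resid, cluster_ids):
        totals[g] = totals.get(g, 0.0) + e
    cluster = math.sqrt(sum(t * t for t in totals.values())) / n
    return naive, cluster


rng = random.Random(0)
values, cluster_ids = [], []
for school in range(50):
    school_effect = rng.gauss(0, 1)  # shared by every student in the school
    for _ in range(20):
        values.append(school_effect + rng.gauss(0, 1))
        cluster_ids.append(school)

naive_se, cluster_se = naive_and_cluster_se(values, cluster_ids)
print("i.i.d. SE:", round(naive_se, 3), "cluster-robust SE:", round(cluster_se, 3))
```

The clustered figure accounts for the shared school effect, which the i.i.d. formula treats as 1000 independent pieces of information.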

Question 12. Open the kindergarten_version2.dta dataset, and plot a histogram of income_1. There are some crazily large incomes. We know that OLS and other expectation-based analyses do not behave well when there are very large outliers. What to do?

One approach is to log very right-skewed variables like this. Apparently the income_1 variable is never less than 1, so this will work in this case: -gen logincome_1 = log(income_1)-. A histogram of logincome_1 is much closer to normal, especially in the upper tail (you can assess this using -qnorm-, as you learned in the last assignment).

Recall that the test scores were also very right-skewed. Log the math test score, creating a new variable logm_1. Then regress log math score on log income. In other words, fit the model

log(math score) = β₀ + β₁log(income) + ε

The standard approach to interpret this regression is to differentiate both sides w.r.t. income, treating math score as a function of income.

d/d income log(math score) = β₁ d/d income log(income)

I'm guessing it makes sense to assume ε is not a function of income under OLS assumption 1.

If you play around with this expression you'll get

%Δmath score = β₁%Δincome

%Δmath score/%Δincome = β₁

thus β₁ is interpreted as an elasticity. Which is lovely and very economic.
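
The "playing around" step is the chain rule applied to both logs. A sketch of the intermediate lines, treating ε as not depending on income:

```latex
\begin{align*}
\frac{d}{d\,\text{income}}\log(\text{math score})
  &= \beta_1\,\frac{d}{d\,\text{income}}\log(\text{income}) \\
\frac{1}{\text{math score}}\,\frac{d\,\text{math score}}{d\,\text{income}}
  &= \beta_1\,\frac{1}{\text{income}} \\
\frac{d\,\text{math score}/\text{math score}}{d\,\text{income}/\text{income}}
  &= \beta_1
\end{align*}
```

The last line is a ratio of proportional changes, which is what the % notation abbreviates.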

This approach (using differentiation) has for some reason never struck me as intuitive, because I cannot see it with discrete changes in income using the original conditional expectations model. Nevertheless, you should remember that in a log-log regression like this, we give β₁ the interpretation of an elasticity: it's the % change in the outcome variable expected from a 1% increase in the RHS variable. For example, if β₁ = 3, then a 3% increase in math score is expected from a 1% increase in income.
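
For discrete changes, the elasticity reading is an approximation that is good for small percentage changes. A short Python sketch, holding ε fixed and using the illustrative β₁ = 3 from the text (the function name is hypothetical):

```python
def pct_change_in_outcome(beta1, pct_change_in_x):
    """Exact % change in y implied by log(y) = b0 + beta1 * log(x)
    when x rises by pct_change_in_x percent and the error is held fixed."""
    ratio = (1 + pct_change_in_x / 100) ** beta1
    return 100 * (ratio - 1)


for pct in (1, 10, 50):
    exact = pct_change_in_outcome(3, pct)
    approx = 3 * pct  # the elasticity rule of thumb
    print(pct, round(exact, 2), approx)
```

For a 1% income change the exact figure is 3.03%, close to the rule-of-thumb 3%. The gap grows with the size of the change.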

According to your estimates, a 1% increase in income is associated with about a

(a) 0.12% increase in math test score

(b) 1.2% increase in math test score

(d) 1.9% increase in math test score
