What are possible omitted variables

Reference no: EM131307484

Be sure to read each question carefully and answer all parts. For all Stata questions, be sure to provide the log output (which should be edited and commented so that it is easier to grade). 

Q1. Find an article THAT YOU ARE INTERESTED IN: any article (or book chapter, etc.) from other classes, the news, research journals, policy briefs, etc. that has statistics in it (in the broad sense: it could have anything we have learned in class, or it could have something "statistical" outside the scope of 631).

a. Print it out or copy it, and attach it.

b. Either:

1. Talk about something in this article that this class has made understandable for you, explaining what you are referring to, what part of the class has clarified it for you, and what you think it means.

2. Talk about something in this article that you do not understand, explaining what you are referring to and what you do not understand about it.  Ask questions that you think might help clarify it.

Q3. Use a graphical analysis (which often looks more impressive than just doing statistical analysis) to support your answer to the following question: You are concerned with staff wasting time over lunch.  Government employees in your branch are allowed a 60-minute lunch break.  For a week you monitor employees (without their knowledge) and how long they spend at lunch.  You consider a lunch break of longer than 65 minutes unacceptable and a waste of taxpayer resources.  Do your employees have a problem with long lunch breaks?  (Note:  you may want to enter your data twice and compare the two copies to make sure you have no data entry errors.  Also be sure to explain how your graphical analysis supports your answer.)  What statistics could you run to support your graphical analysis?  Run them.

Minutes Spent at Lunch

64           93           66           68
60           65           63           85
78           86           73           77
69           63           64           87
93           61           80           65
82           75           60           64
62           84           75           63
63           61           70           67
76           73           72           91
80           70           89           82
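The graphical check asked for in Q3 can be sketched outside Stata. The block below is a minimal Python sketch, not the expected Stata solution: it prints a text histogram in 5-minute bins, counts the breaks over the 65-minute limit, and computes a one-sample t-statistic against 65. Treating the 40 values as independent observations is an assumption; the assignment does not say whether there is one value per employee.

```python
import math
import statistics
from collections import Counter

# Minutes spent at lunch, copied row by row from the table above.
minutes = [
    64, 93, 66, 68,
    60, 65, 63, 85,
    78, 86, 73, 77,
    69, 63, 64, 87,
    93, 61, 80, 65,
    82, 75, 60, 64,
    62, 84, 75, 63,
    63, 61, 70, 67,
    76, 73, 72, 91,
    80, 70, 89, 82,
]
LIMIT = 65  # breaks longer than this are considered unacceptable

n = len(minutes)
mean = statistics.mean(minutes)
sd = statistics.stdev(minutes)  # sample standard deviation
over_limit = sum(m > LIMIT for m in minutes)

# One-sample t-statistic for H0: mean = 65 against Ha: mean > 65.
t_stat = (mean - LIMIT) / (sd / math.sqrt(n))

# Text histogram: one star per observation, 5-minute bins.
bins = Counter((m // 5) * 5 for m in minutes)
for start in sorted(bins):
    print(f"{start:2d}-{start + 4:2d} | {'*' * bins[start]}")

print(f"n = {n}, mean = {mean:.2f}, sd = {sd:.2f}")
print(f"breaks over {LIMIT} minutes: {over_limit} of {n}")
print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
```

In Stata itself, the corresponding commands would likely be `histogram` for the plot and `ttest` with a hypothesized value of 65 for the test, applied to whatever variable name you give the lunch data.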

***Copy and paste the Stata results for each part of question 4.

Q4. Download the GSS dataset from e-campus if you do not already have this dataset.  This is a shortened version of the full dataset which was downloaded from the NORC website.  Documentation can be found here: https://gss.norc.org/Get-Documentation. (Use STATA)  

a. Exploring

i. Explore the dataset with Stata using the tools you have learned.  As with all questions, show your output.  (Note that if your output is too lengthy, before printing your solutions you may clip out the middle part with a note that you have done so.)  

ii. For what years is this dataset available in the shortened version I uploaded?

b. Is this dataset longitudinal, panel, cross-section, repeated cross section, case study, some combination (if so, of what), or some other form of dataset?

c. Cut the dataset so you only have data for the year 2012 left.

d. Look carefully through the variables and variable descriptions. You have been asked to compare various statistics by gender and age.

i. Pick an outcome that you are interested in from the list of variables available.  Make sure that it is not missing from the dataset.

ii. Cut the dataset down so that the only variables remaining are for year, sex, age, and your variable(s) of interest.

iii. Save this smaller set as a new dataset.

iv. Explore this new dataset. 

e. How is your outcome of choice coded?

i. Is it coded in a way that will make sense for analysis? 

ii. If it is coded in a way that will make sense for analysis, say N/A for this part.  If it is not:

1. Can it be coded in a way that will make sense for analysis (hint:  you may want to turn a categorical variable into a binary variable or a continuous variable)? 

2. Do so if it can (otherwise you may want to choose another outcome, and move back to part d). 

3. Explain any assumptions you made when you changed this variable (or made a new variable from the old one).

f. Test to see if males and females act differently with respect to your variable. Be sure to include your hypothesis tests and whether or not your results are significant.  In your opinion, is the magnitude of the difference big?

g. Create a variable for older.  Explain how you define "older" vs. "younger," and any other assumptions that you make.  [Hint:  be EXTRA careful with missing values]

h. Are your results different for gender (from f) if you look only at older people?  Be sure to include your hypothesis tests and whether or not your results are significant.

i. Reopen the original dataset.

i. If you have not already, create a .do file that walks through steps d through h (so that it does not cut the original dataset by year).

ii. Cut the dataset so the year is 2000.  If your outcome variable in your .do file does not exist for the year you have picked, choose another outcome that does and modify your .do file accordingly.

iii. Do your .do file on the dataset for the year 2000.  [As always, show your output.]

iv. Are your results on the hypothesis tests from 4f and 4h different for the year 2000 compared to 2012?  If so, why might they be different?
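Part g warns about missing values because, in Stata, numeric missing values sort above every number, so a comparison such as "age greater than 50" is true for a missing age. The Python sketch below illustrates the safe pattern of leaving the indicator missing whenever age is missing. The cutoff of 50 and the sample ages are hypothetical, chosen only for illustration.

```python
from typing import Optional

CUTOFF = 50  # hypothetical definition of "older"; choose and justify your own


def older(age: Optional[int]) -> Optional[int]:
    """Return 1 if age is above CUTOFF, 0 if not, and None when age is missing."""
    if age is None:
        return None  # keep missing as missing instead of coding it 0 or 1
    return 1 if age > CUTOFF else 0


ages = [23, 67, None, 50, 81]  # made-up ages; None marks a missing response
flags = [older(a) for a in ages]
print(flags)
```

A Stata version of the same idea would be along the lines of `generate older = age > 50 if !missing(age)`, again with the cutoff being your own choice.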

***Show STATA output for question 5

Q5. Using the GSS

a. Pick two variables of your choice.

b. Make sure they are in an appropriate format for a scattergram.

c. Create a scattergram using STATA. 

d. Repeat the scattergram using the jitter option. 

e. Repeat again with the sunflower option. 

f. How are these plots different?

Q6. You have been given a large budget, have successfully bribed your local IRB, and have access to a local prison population. Just kidding!  You actually have access to hundreds of psychology undergraduate students.  You have been asked to evaluate whether or not listening to classical music while studying for Psych 101 benefits students' midterm test scores.  [Note that you must answer the question in the context of the problem; it is not sufficient to just copy from your class notes.]

a. What is the best way to answer this question? 

b. Using the procedure you have learned in class, formally guide your (not-bribed) IRB committee through the steps you would take to answer this question, including the pros and cons of each step if there are any.

c. What kind of statistical analysis would you use at the end? 

Q7. You want to know the relationship between the number of US troops per citizen in an occupied foreign city and the civilian death rate in those cities.  Your statistical team comes back with the following information:  "Using data from all US occupied foreign cities in the past 10 years, we ran the following regression:  Y = BX + alpha, where Y is the civilian death rate (measured from 0 to 100) and X is the ratio of US troops to citizens times 100 (also measured from 0 to 100).  We found that B = 19 and the standard error on B is 5.1.  Alpha is .3.  The R^2 on this regression is .35."

a. What is the t-statistic for X?  Is B significant? (Use STATA if necessary)

b. What does B mean in this case?

c. Your newly hired analyst points out, "Your R^2 is only .35.  Therefore we should ignore this regression because the fit is really low."  Is (s)he right?  What should you explain to him/her?

d. What are possible omitted variables?

e. After you have explained part c to your analyst, (s)he recommends that, based on this regression, you remove all US troops from all occupied cities.  Why does (s)he recommend this?  Should you take his/her advice?  Why or why not?

f. How might you better answer this question?  (Note:  the actual numbers on the Iraq war alone from the Brookings Institution show a small negative sign on B.)
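For part a, the t-statistic is the coefficient divided by its standard error. The short Python sketch below works through that arithmetic with the numbers quoted in Q7. Because the sample size is not given, it assumes a large sample and uses the normal approximation for the p-value and the 95% interval; with only a few cities, a t distribution would be the better reference.

```python
import math

b = 19.0   # estimated slope B from the question
se = 5.1   # standard error of B from the question

t_stat = b / se

# Two-sided p-value under the normal approximation (assumes a large sample).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

# Approximate 95% confidence interval for B.
z = 1.96
ci_low, ci_high = b - z * se, b + z * se

print(f"t = {t_stat:.2f}")
print(f"two-sided p (normal approx.) = {p_value:.4f}")
print(f"95% CI for B: ({ci_low:.2f}, {ci_high:.2f})")
```

The interval is a reminder that significance says the slope is distinguishable from zero, not that the regression identifies a causal effect.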

Q8. You are working for the department of public health.  Your supervisor has recently had a bad experience involving NoDoz and is sure that caffeine is evil.  (S)he tells you to find all the literature you can on the evils of caffeine so that your office can make a public health announcement against the stuff.

a. Should you ignore the literature on the potential benefits of caffeine?  Why or why not?  (Give both moral and immoral answers.)

b. You find a medical science article on heart attacks and caffeine intake.  It runs the regression Y = BX + alpha on a group of 70,000 men aged 45 to 65 over a period of 5 years, where X is the number of cups of coffee a person drinks each day and Y is the number of heart attacks the man has had during those 5 years.

i. B = .0000002, and the reported t-statistic is 47.  What does this mean in words?  Is this relationship significant?  Is this an important relationship? 

ii. What are possible omitted variables?

c. You find another medical science article on the effect of caffeine on fetal growth.  It follows 10 mothers throughout their pregnancies and finds the following:  Y = BX + alpha where X = number of cups of coffee the mother drinks each day over two cups and Y = fetal birth weight in pounds. 

i. B = -2, t = .9.  What does this mean in words?  Is the relationship significant?  Is it important?

ii. What are possible omitted variables?

iii. Does this regression justify looking for other articles on this topic?  Why or why not?  Would your answer be the same if n =10,000?
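The two articles in Q8 contrast statistical significance with practical importance. The Python sketch below recovers the implied standard errors from B and t and compares each t-statistic with a two-sided 5% critical value. It assumes a simple regression with an intercept, so the degrees of freedom are n - 2; the critical values 1.96 (large sample) and 2.306 (8 degrees of freedom) are taken from standard tables.

```python
def summarize(label, b, t, critical):
    """Print the implied standard error and the 5% two-sided decision."""
    se = abs(b) / abs(t)  # because t = B / SE
    significant = abs(t) > critical
    print(f"{label}: B = {b}, SE = {se:.3g}, t = {t}, "
          f"significant at 5%: {significant}")
    return se, significant


# Part b: 70,000 men, so the normal critical value is a close approximation.
se_b, sig_b = summarize("heart attacks", 0.0000002, 47, 1.96)

# Part c: 10 mothers, so df = 10 - 2 = 8 and the critical value is 2.306.
# The question reports t = .9 without a sign; only its magnitude is used.
se_c, sig_c = summarize("birth weight", -2, 0.9, 2.306)

# Size of the effects, which is a separate question from significance.
print(f"part b, 10 extra cups a day: {10 * 0.0000002:.1e} heart attacks over 5 years")
print("part c, 1 extra cup a day: -2 pounds of birth weight")
```

The first estimate is precise but tiny, while the second is large but too noisy to distinguish from zero with only 10 mothers.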

***See attached file for question nine.

Q9. Use the Stata command ttesti to answer the MB&B t-test problem of your choice from chapters 11, 12, or 14 (make sure it is a problem that can be solved using ttesti, not ttest or sampsi).  Provide your hypotheses and log file output, and make it clear which question you are addressing.  (Yes, you may use a problem that you have already solved by hand and/or that has solutions in the back of the book.)
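Stata's ttesti runs a t-test from summary statistics rather than raw data; its one-sample form takes the number of observations, the sample mean, the sample standard deviation, and the hypothesized mean. The Python sketch below reproduces that calculation so a Stata result can be checked by hand. The numbers are hypothetical and are not taken from any MB&B problem.

```python
import math


def one_sample_t(n, mean, sd, mu0):
    """Return (t statistic, degrees of freedom) from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Hypothetical summary statistics, for illustration only.
t_stat, df = one_sample_t(n=25, mean=52.0, sd=10.0, mu0=50.0)
print(f"t = {t_stat:.3f}, df = {df}")
```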

Q10.  Give (real or hypothetical) examples of the following.  Do not use either the examples that I gave you from your class notes or from Wikipedia:

a. A situation where someone is led astray by misuse of the Representativeness Heuristic.

b. A situation where someone is led astray by misuse of the Availability Heuristic.

c. A situation where Framing could change someone's answer to a survey question.

Attachment: Assignment.rar



Reviews

len1307484

12/10/2016 5:58:24 AM

I'm attaching the HW assignment (Quant_2016) as well as a supplemental document (Quant problems for question 9). For Question 9, you can pick any one of the copied problems that deals with t-tests; you need to answer only one. You need to do all questions which are not cross checked.



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd