Reference no: EM132376521
R Stats Assignment - This is a statistics assignment using R code.
- The assignment requires answers to all SEVEN questions.
- The assignment deals with causality; the research methods include Randomized Controlled Trials (RCTs) and Observational Studies.
- The correct research design needs to be identified for questions 3, 5 and 6.
- Please ensure that your working directory is set correctly and that the data file ('chechen.csv') is in the right location on your computer. The chechen.csv data file is attached separately.
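A minimal sketch of the loading step, assuming the file is named 'chechen.csv' as stated above. The helper name load_chechen is hypothetical and only illustrates checking the working directory before reading the file:

```r
# Hypothetical helper: read the data only if the file can be found,
# otherwise report where R is currently looking.
load_chechen <- function(path = "chechen.csv") {
  if (!file.exists(path)) {
    message("Cannot find '", path, "' in ", getwd(),
            "; call setwd() on the folder that holds the file.")
    return(NULL)
  }
  read.csv(path)
}

chechen <- load_chechen()
if (!is.null(chechen)) {
  print(dim(chechen))      # number of rows and columns
  print(summary(chechen))  # quick overview of each variable
}
```

If the message appears, getwd() shows the current working directory and setwd() changes it.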
ASSIGNMENT -
Analyze the relationship between indiscriminate violence and insurgent attacks using data on Russian artillery fire in Chechnya from 2000 to 2005 (during the Second Chechen War). The Russian military doctrine of Harass & Interdict implies that targets were chosen at random, i.e. that violence was used indiscriminately.
Theory: There are two competing views. One holds that indiscriminate violence increases insurgent attacks by creating more cooperative relationships between citizens and insurgents. The other holds that indiscriminate violence can be effective in suppressing insurgent activities.
Dataset: The dataset was constructed around 159 events in which Russian artillery shelled a village. For each event, the data record the village where the shelling took place, whether that village was in Groznyy, how many people were killed, and the number of insurgent attacks in the 90 days before and the 90 days after the date of the event. These data are augmented with the same information for a set of demographically and geographically similar villages that were not shelled during the same time periods.
The names and descriptions of variables in the data file 'chechen.csv' are as follows:
Name - Description
'village' - Name of village
'groznyy' - Variable indicating whether a village is in Groznyy (1) or not (0)
'fire' - Whether Russians struck a village with artillery fire (1) or not (0)
'deaths' - Estimated number of individuals killed during Russian artillery fire or NA if not fired on
'preattack' - The number of insurgent attacks in the 90 days before being fired on
'postattack' - The number of insurgent attacks in the 90 days after being fired on
Note that the same village may appear in the dataset several times as shelled and/or not shelled because Russian attacks occurred at different times and locations.
Question 1 - How many villages were shelled by the Russian military? How many were not? (The 'unique' function, which returns a set of unique values from the input vector, may be useful here).
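A minimal sketch of how 'unique' could be used here. The data frame toy is made up for illustration and is not the real data; only the column names 'village' and 'fire' come from the variable list above:

```r
# Made-up toy data (NOT the real chechen.csv); village "A" appears
# both as shelled and as not shelled, as the note above allows.
toy <- data.frame(
  village = c("A", "A", "B", "C", "C", "D"),
  fire    = c(1, 0, 1, 0, 0, 1),
  stringsAsFactors = FALSE
)

# Distinct village names among shelled and non-shelled observations
shelled_villages   <- unique(toy$village[toy$fire == 1])
unshelled_villages <- unique(toy$village[toy$fire == 0])

length(shelled_villages)    # number of distinct shelled villages
length(unshelled_villages)  # number of distinct non-shelled villages

# Villages that show up in both groups
in_both <- intersect(shelled_villages, unshelled_villages)
in_both
```

Because a village can appear in both groups, the two counts need not add up to the total number of distinct villages.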
Question 2 - Were artillery attacks on Groznyy more lethal than attacks on villages outside of Groznyy? (Groznyy was the epicenter of much fighting during both Chechen Wars, and much of the city was destroyed by Russian artillery fire.) Conduct the comparison in terms of the mean and median.
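One way to sketch the mean and median comparison, using a made-up toy data frame (not the real data). The choice to restrict to shelled observations is an assumption that follows from 'deaths' being NA when a village was not fired on:

```r
# Made-up toy data (NOT the real chechen.csv)
toy <- data.frame(
  groznyy = c(1, 1, 0, 0, 0, 0),
  fire    = c(1, 1, 1, 1, 1, 0),
  deaths  = c(4, 0, 1, 0, 2, NA)
)

# Keep only observations where artillery fire occurred
shelled <- subset(toy, fire == 1)

# Mean and median deaths, split by the 'groznyy' indicator
mean_deaths   <- tapply(shelled$deaths, shelled$groznyy, mean, na.rm = TRUE)
median_deaths <- tapply(shelled$deaths, shelled$groznyy, median, na.rm = TRUE)

mean_deaths
median_deaths
```

The results are named by the value of 'groznyy', so the element "1" refers to Groznyy and "0" to all other villages.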
Question 3 - Compare the average number of insurgent attacks after the event ('postattack') for observations describing a shelled village and for the others. Also, compare the quartiles. Would you conclude that indiscriminate violence reduces insurgent attacks? Why or why not? What kind of research design is this, and what is its key assumption?
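A rough sketch of the group comparison, on made-up toy data (not the real data). It assumes the comparison uses 'postattack'. Comparing treated and untreated units at one point in time like this is usually called a cross-section comparison:

```r
# Made-up toy data (NOT the real chechen.csv)
toy <- data.frame(
  fire       = c(1, 1, 1, 1, 0, 0, 0, 0),
  postattack = c(0, 1, 2, 5, 0, 2, 3, 7)
)

# Average post-period attacks by shelling status
post_mean <- tapply(toy$postattack, toy$fire, mean)
post_mean

# Difference in means: shelled minus not shelled
cross_section_diff <- post_mean[["1"]] - post_mean[["0"]]
cross_section_diff

# Quartiles (0%, 25%, 50%, 75%, 100%) for each group
quart_shelled   <- quantile(toy$postattack[toy$fire == 1])
quart_unshelled <- quantile(toy$postattack[toy$fire == 0])
quart_shelled
quart_unshelled
```

The toy numbers say nothing about the real result; they only show the mechanics of the comparison.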
Question 4 - Considering only the pre-shelling periods, what is the difference between the average number of insurgent attacks for observations describing a shelled village and observations that do not? Why do you need to conduct this comparison? What does the result suggest to you about the validity of comparison used for the previous question?
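The pre-shelling check can be sketched the same way, again on made-up toy data (not the real data), this time using 'preattack':

```r
# Made-up toy data (NOT the real chechen.csv)
toy <- data.frame(
  fire      = c(1, 1, 1, 0, 0, 0),
  preattack = c(1, 2, 3, 2, 2, 5)
)

# Average pre-period attacks by shelling status
pre_mean <- tapply(toy$preattack, toy$fire, mean)
pre_mean

# Difference in pre-period means: shelled minus not shelled
pre_diff <- pre_mean[["1"]] - pre_mean[["0"]]
pre_diff
```

A pre-period difference close to zero would suggest the two groups were similar before the shelling; a large one would suggest they were not.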
Question 5 - Create a new variable called 'diffattack' by calculating the difference in the number of insurgent attacks in the before and after periods. Among observations describing villages that were shelled, did the number of insurgent attacks increase after being fired on? Give a substantive interpretation of the result. What kind of research design is this, and how might it address concerns about confounding? What is its key assumption?
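A minimal sketch of creating 'diffattack', on made-up toy data (not the real data). The sign convention, after minus before, is an assumption; with it, a positive value means attacks increased. Comparing the same units before and after treatment is usually called a before-and-after design:

```r
# Made-up toy data (NOT the real chechen.csv)
toy <- data.frame(
  fire       = c(1, 1, 1, 1, 0, 0),
  preattack  = c(2, 3, 1, 4, 2, 5),
  postattack = c(1, 3, 0, 2, 3, 7)
)

# Assumed convention: after minus before
toy$diffattack <- toy$postattack - toy$preattack

# Average change among shelled observations only
mean_diff_shelled <- mean(toy$diffattack[toy$fire == 1])
mean_diff_shelled

# Distribution of the change among shelled observations
summary(toy$diffattack[toy$fire == 1])
```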
Question 6 - Compute the mean difference in the 'diffattack' variable between observations where villages were shelled and those where they were not. Does this analysis support the claim that indiscriminate violence reduces insurgent attacks? Is the validity of this analysis improved over the analyses you conducted in the previous questions? Why or why not? Again, what kind of research design is this? Specifically, explain what additional factor this analysis addresses when compared to the analyses conducted in the previous questions.
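The computation described here can be sketched as follows, on made-up toy data (not the real data), with 'diffattack' assumed to be after minus before. Subtracting the average change in the comparison group from the average change in the treated group is usually called a difference-in-differences estimate:

```r
# Made-up toy data (NOT the real chechen.csv)
toy <- data.frame(
  fire       = c(1, 1, 1, 0, 0, 0),
  preattack  = c(2, 3, 1, 2, 2, 5),
  postattack = c(1, 2, 0, 3, 2, 7)
)

# Assumed convention: after minus before
toy$diffattack <- toy$postattack - toy$preattack

# Average change within each group
mean_diff_shelled   <- mean(toy$diffattack[toy$fire == 1])
mean_diff_unshelled <- mean(toy$diffattack[toy$fire == 0])

# Change in shelled group minus change in non-shelled group
did_estimate <- mean_diff_shelled - mean_diff_unshelled
did_estimate
```

The non-shelled group's change stands in for what would have happened to the shelled villages over the same period without shelling.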
Question 7 - Conduct the three research designs from above while excluding Groznyy from the analyses. How many fewer observations do these analyses use compared to those above? Why might you want to do these additional analyses? Do they weaken or strengthen your conclusions in the previous answers?
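A sketch of rerunning all three comparisons on a subset, using made-up toy data (not the real data). The helper name three_designs is hypothetical, and 'diffattack' is again assumed to be after minus before:

```r
# Made-up toy data (NOT the real chechen.csv)
toy <- data.frame(
  groznyy    = c(1, 1, 0, 0, 0, 0, 0, 0),
  fire       = c(1, 0, 1, 1, 1, 0, 0, 0),
  preattack  = c(6, 5, 2, 3, 1, 2, 2, 5),
  postattack = c(4, 6, 1, 2, 0, 3, 2, 7)
)

# Hypothetical helper: returns the three estimates for a data frame d
three_designs <- function(d) {
  diffattack <- d$postattack - d$preattack  # assumed: after minus before
  shelled <- d$fire == 1
  c(
    cross_section = mean(d$postattack[shelled]) - mean(d$postattack[!shelled]),
    before_after  = mean(diffattack[shelled]),
    diff_in_diff  = mean(diffattack[shelled]) - mean(diffattack[!shelled])
  )
}

# Drop observations located in Groznyy
no_groznyy <- subset(toy, groznyy == 0)

# How many observations were removed
n_dropped <- nrow(toy) - nrow(no_groznyy)
n_dropped

res_full <- three_designs(toy)
res_sub  <- three_designs(no_groznyy)
rbind(full = res_full, without_groznyy = res_sub)
```

Placing the two sets of estimates side by side makes it easy to see whether excluding Groznyy changes the sign or size of any of them.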
Textbook - Quantitative Social Science: An Introduction, by Kosuke Imai. ISBN 978-0-691-16703-9.
Attachment:- R Stats Assignment Files.rar