Reference no: EM132724047, Length: 1500 words
Statistics for Finance Project
Complete the tasks below:
Question 1. Using Capital IQ, download data for at least 50 US firms for the period 2010-2019. Present a table of summary statistics for all the variables used in the project (including the components of Tobin's Q). Make sure to include: mean, standard deviation, min, max, 25th percentile, 50th percentile, 75th percentile, number of firms, and number of firm-year observations. Note that one firm may have data available for all 10 years while another has data for, say, only 5 years. This is fine as long as you have a minimum of 5 observations per firm. Label this table Table 1: Summary Statistics and include a legend at the end of the table that defines each variable; e.g. Price denotes Day Close Price, Equity denotes Total Common Equity, etc.
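The statistics requested for Table 1 can be computed per variable as in the minimal sketch below. The panel values, the firm identifiers and the function name `summarize` are hypothetical and stand in for the data downloaded from Capital IQ; the percentile convention (Python's "inclusive" method) is also an assumption, since other software may interpolate differently.

```python
import statistics

def summarize(values):
    """Return the per-variable summary statistics requested for Table 1."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
        "p25": q1,
        "p50": median,
        "p75": q3,
        "n_obs": len(values),  # number of firm-year observations
    }

# Hypothetical firm-year panel: (firm id, year, total assets in $ millions).
panel = [
    ("A", 2015, 100.0), ("A", 2016, 110.0), ("A", 2017, 120.0),
    ("B", 2015, 40.0), ("B", 2016, 50.0),
]

assets = [row[2] for row in panel]
stats = summarize(assets)
stats["n_firms"] = len({row[0] for row in panel})  # distinct firms
print(stats)
```

The same function would be applied to each variable in turn (price, shares outstanding, total assets, equity, Tobin's Q and so on) to fill one row of the table.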
Question 2. Firm size is measured as log(Total Assets). Performance is measured with Tobin's Q: (total assets plus market value of equity less book value of equity) divided by total assets, where market value of equity equals price per share times the total number of shares outstanding. Choose the largest and the smallest firm for which you have 10 years of data. Is average performance statistically different between these two firms? Answer yes or no and show two different ways in which you could reach this conclusion. Make sure to show all the details of your tests and present a table for each test. Label these tables Table 2A: Differences in Performance_1 and Table 2B: Differences in Performance_2.
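As an illustration of the definition above, the sketch below computes Tobin's Q from its components and then a Welch two-sample t-statistic for the difference in mean Q between two firms. The Welch test is only one possible choice of test, not one prescribed by the brief; the function names and all input numbers are hypothetical.

```python
import math
import statistics

def tobins_q(total_assets, price, shares_outstanding, book_equity):
    """Tobin's Q = (total assets + market equity - book equity) / total assets."""
    market_equity = price * shares_outstanding
    return (total_assets + market_equity - book_equity) / total_assets

def welch_t(sample_a, sample_b):
    """Welch t-statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(sample_a) / len(sample_a)
    vb = statistics.variance(sample_b) / len(sample_b)
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df

# Illustrative inputs only (not real firm data).
q_example = tobins_q(total_assets=200.0, price=10.0,
                     shares_outstanding=30.0, book_equity=100.0)

# Hypothetical yearly Tobin's Q values for the largest and smallest firm.
q_large = [1.0, 2.0, 3.0]
q_small = [2.0, 3.0, 4.0]
t_stat, df = welch_t(q_large, q_small)
print(q_example, t_stat, df)
```

The t-statistic would then be compared with the critical value of the t distribution for the computed degrees of freedom at the chosen significance level.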
Question 3. You will run a cross-sectional regression. Therefore, compute time averages for every variable for each firm. You will end up with 50 cross-sectional observations, one per firm. Run a simple regression analysis to assess whether larger firms are associated with better performance. Note that you should run this regression using 50 observations. Label this table Table 3: Simple Regression.
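The two steps described above (collapsing the panel to firm-level time averages, then regressing average performance on average size) can be sketched as follows. This is a minimal illustration under the assumption of a single regressor with an intercept; the panel values and the function names `firm_averages` and `simple_ols` are hypothetical.

```python
import math
from collections import defaultdict

def firm_averages(panel):
    """Average each firm's yearly (size, q) pairs into one cross-sectional row."""
    grouped = defaultdict(list)
    for firm, size, q in panel:
        grouped[firm].append((size, q))
    return {
        firm: (
            sum(s for s, _ in rows) / len(rows),
            sum(q for _, q in rows) / len(rows),
        )
        for firm, rows in grouped.items()
    }

def simple_ols(x, y):
    """OLS of y on x with an intercept: returns intercept, slope, se(slope), t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)  # n - 2 residual degrees of freedom
    return intercept, slope, se_slope, slope / se_slope

# Hypothetical panel rows: (firm id, log total assets, Tobin's Q), two years per firm.
panel = [
    ("A", 4.0, 1.2), ("A", 4.2, 1.4),
    ("B", 6.0, 1.9), ("B", 6.2, 2.1),
    ("C", 8.0, 2.4), ("C", 8.2, 2.8),
    ("D", 9.0, 2.5), ("D", 9.4, 2.9),
]
averages = firm_averages(panel)
sizes = [avg[0] for avg in averages.values()]
qs = [avg[1] for avg in averages.values()]
print(simple_ols(sizes, qs))
```

With the real data, `sizes` and `qs` would each hold 50 values, one per firm.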
Question 4. Discuss whether the coefficient on size is statistically significant at the 5% level and interpret the coefficient.
Question 5. Is the beta estimator for size unbiased? Explain.
Question 6. Using SIC codes, add industry dummy variables to your regression. Label this table Table 4: Industry Controls. Explain what these dummy variables are controlling for.
Question 7. Add the following variables to the regression (don't forget also to include the industry dummy variables):
a. A sensible explanatory variable of your choice (you may need to look at some academic papers to make a good choice and include references in Appendix 2).
b. A sensible dummy variable of your choice (you may need to look at some academic papers to make a good choice and include references in Appendix 2).
Include the table with results. Label this table: Table 5: Multiple Regression Model
Question 8. Discuss whether the variable chosen in 7a above is statistically significant at the 5% level and interpret your result.
Question 9. Discuss whether the variable chosen in 7b above is statistically significant at the 5% level and interpret your result. (50 words max)
Question 10. Using an F-test, assess whether the industry dummies are jointly significant. Show the F-statistic and the critical value, and discuss your results. (50 words max)
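One common way to form the joint F-statistic compares the residual sum of squares (RSS) of the regression without the industry dummies (restricted) against the regression with them (unrestricted). The sketch below assumes that formulation; the function name and all the numbers are hypothetical and only illustrate the arithmetic.

```python
def joint_f_statistic(rss_restricted, rss_unrestricted,
                      n_restrictions, n_obs, n_params_unrestricted):
    """F = ((RSS_r - RSS_u) / q) / (RSS_u / (n - k)).

    q is the number of dummies being tested and k is the number of
    estimated coefficients in the unrestricted model, intercept included.
    """
    df_denominator = n_obs - n_params_unrestricted
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_denominator
    return numerator / denominator, df_denominator

# Illustrative numbers only: 50 firms, intercept + size + 3 industry dummies.
f_stat, df_den = joint_f_statistic(
    rss_restricted=120.0,
    rss_unrestricted=90.0,
    n_restrictions=3,
    n_obs=50,
    n_params_unrestricted=5,
)
print(f_stat, df_den)
```

The statistic is compared with the 5% critical value of the F distribution with (q, n - k) degrees of freedom, taken from statistical tables or from software.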
Question 11. Compare the coefficient of size in Table 3 vs that of Table 5. Explain why they are different. (100 words max)
Question 12. Compare the R-square of Table 3 to that of Table 5. Which model is better? (100 words max)
Question 13. Your variable of interest is size. Even after running your multiple regression model, your model is likely to suffer from endogeneity. Give an example of an omitted variable that could lead to a positive bias and include a brief explanation. (100 words max)
Question 14. Suppose you add each firm's average number of employees to the multiple regression model. Would you expect your main results to change? Explain. (100 words max)
Question 15. In Lecture 4 pg. 63 we used an example based on the binomial distribution to explain the Central Limit Theorem (CLT). Make up your own example and show that the CLT works (originality will be rewarded). Feel free to include graphs or figures if this adds value to your explanation. (200 words max)
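A simulation is one way to show the CLT at work. The sketch below is an illustrative assumption, not the lecture's example: it draws from a strongly skewed exponential distribution, standardizes the sample means, and reports the share falling within ±1.96, which should approach 0.95 as the sample size grows. The function names, the seed and the sample sizes are hypothetical choices.

```python
import math
import random
import statistics

def standardized_sample_means(n_draws, n_samples, rate=1.0, seed=0):
    """Standardized means of n_draws exponential(rate) draws, n_samples times."""
    rng = random.Random(seed)
    mu = 1.0 / rate      # population mean of the exponential distribution
    sigma = 1.0 / rate   # population standard deviation of the exponential
    z_values = []
    for _ in range(n_samples):
        sample_mean = statistics.fmean(
            rng.expovariate(rate) for _ in range(n_draws)
        )
        z_values.append((sample_mean - mu) / (sigma / math.sqrt(n_draws)))
    return z_values

def share_within(z_values, bound=1.96):
    """Fraction of standardized means inside [-bound, bound]."""
    return sum(abs(z) <= bound for z in z_values) / len(z_values)

# Coverage for small, moderate and large sample sizes.
for n in (2, 30, 200):
    z = standardized_sample_means(n, 2000)
    print(n, round(share_within(z), 3))
```

A histogram of the standardized means for each sample size would make the same point graphically.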
Question 16. Using daily prices for 2019 for the largest of your 50 firms (you do not need to include this variable in the descriptive statistics or any of the questions above), what is the predicted price for January 1st 2020 based on the random walk model? Clearly show how you estimated this price and discuss whether this is a good prediction. (200 words max)
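Under a random walk without drift, P_t = P_(t-1) + e_t with mean-zero shocks, so the best forecast of any future price is the last observed price. The sketch below assumes that model and also includes an optional drift term estimated as the average daily price change; the function name and the price series are hypothetical.

```python
import statistics

def random_walk_forecast(prices, steps_ahead=1, with_drift=False):
    """Forecast a future price from a chronologically ordered price list."""
    last_price = prices[-1]
    if not with_drift:
        # Pure random walk: the expected future price equals the last price.
        return last_price
    # Random walk with drift: add the average daily change per step ahead.
    changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
    return last_price + steps_ahead * statistics.fmean(changes)

# Illustrative closing prices only (not real market data).
prices = [100.0, 101.0, 99.5, 102.0]
print(random_walk_forecast(prices))
print(random_walk_forecast(prices, with_drift=True))
```

With the actual 2019 series, `prices[-1]` would be the last available closing price of 2019.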