Reference no: EM131504107 
Instructions for the Data Analysis and Statistical Modelling Assignment
Step 1: Find or collect a Dataset
For this project, you must either find published, existing data or collect your own. Possible sources of existing data include almanacs, magazines, journal articles, textbooks, web resources, athletic teams, newspapers, reference materials, campus organizations, professors with experimental data, electronic data repositories, and the sports pages. Alternatively, collect your own data from fellow students, neighbours, or friends.
The dataset you select must have at least 25 cases. It must also have at least two categorical variables and at least two quantitative variables. Choose or collect a dataset that interests you!
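The minimums above are easy to check before starting the analysis. The following is only a minimal sketch, assuming the dataset has been loaded as a list of dictionaries (for example with `csv.DictReader`); the function name `check_dataset` and the sample column names are hypothetical, and the sample rows are made up for illustration.

```python
def check_dataset(rows, categorical, quantitative, min_cases=25):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if len(rows) < min_cases:
        problems.append(f"only {len(rows)} cases; need at least {min_cases}")
    if len(categorical) < 2:
        problems.append("need at least two categorical variables")
    if len(quantitative) < 2:
        problems.append("need at least two quantitative variables")
    # A quantitative variable should be present and numeric in every case.
    for name in quantitative:
        try:
            for row in rows:
                float(row[name])
        except (KeyError, TypeError, ValueError):
            problems.append(f"'{name}' is missing or non-numeric in some case")
    return problems


# Hypothetical data: 30 made-up cases with two categorical and two
# quantitative variables (values stored as text, as a CSV reader would).
sample = [
    {"gender": "F" if i % 2 else "M", "smokes": "never",
     "age": str(18 + i % 5), "years": "0"}
    for i in range(30)
]
print(check_dataset(sample, ["gender", "smokes"], ["age", "years"]))  # prints []
```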
Step 2: Analyse Your Data!
See the description below of what analysis should be included. Use technology to automate calculations and graphs.
Step 3: Write Your Report
Cut and paste all relevant computer output with your analysis. Be sure to include both computer output and your discussion of that output in every case. As you discuss each analysis, be sure to interpret what you are finding in the context of your particular data situation. Include all of the following:
- Introduction: How did you find or collect your data? (If you found the data, give a clear reference. If you collected the data, describe clearly the data collection process you used.) What are the cases? What are the variables? What population do you believe the sample might generalize to? Is the sample data from an experiment or an observational study? Include a copy of the dataset.
- Analysis of One Quantitative Variable: For at least one of the quantitative variables, include summary statistics (mean, standard deviation, five number summary) and at least one graphical display. Are there any outliers? Is the distribution symmetric, skewed, or some other shape?
- Analysis of One Categorical Variable: For at least one of the categorical variables, include a frequency table and a relative frequency table.
- Analysis of One Relationship between Two Categorical Variables: Use your own data to carry out a chi-square test for association between the two categorical variables. State the hypotheses of the test. Conduct the test, showing all details such as expected counts, contribution of each cell to the chi-square statistic, degrees of freedom used, and the p-value. State a clear conclusion in context. If the results are significant, which cells contribute the most to the chi-square statistic? For these cells, are the observed counts greater than or less than expected? Whether or not the results are significant, describe the relationship as if you were writing an article for your campus paper. If the results are significant, can we infer a causal relationship between the variables?
- Analysis of One Relationship between a Categorical Variable and a Quantitative Variable: Include a side-by-side histogram and describe it. Does there appear to be an association between the two variables? If so, describe it. Also, use some summary statistics to compare the groups.
- Analysis of One Relationship between Two Quantitative Variables: For at least one pair of quantitative variables, include a scatterplot and discuss it.
- Conclusion: Briefly summarize the most interesting features of your data.
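The details requested for the chi-square test for association (expected counts, contribution of each cell, degrees of freedom, p-value) can be worked through in a few lines of code. The following is a minimal sketch using only the Python standard library, not a prescribed method: the names `chi2_sf` and `chi_square_association` are hypothetical, the p-value is computed from the closed-form upper tail of the chi-square distribution for whole-number degrees of freedom, and the example counts are made up for illustration.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        return math.exp(-y) * sum(y ** i / math.factorial(i)
                                  for i in range(df // 2))
    tail = sum(y ** (i - 0.5) / math.gamma(i + 0.5)
               for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


def chi_square_association(pairs):
    """pairs: one (row_category, column_category) tuple per case."""
    counts = Counter(pairs)  # observed count in each cell
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    df = (len(rows) - 1) * (len(cols) - 1)
    if df < 1:
        raise ValueError("each variable needs at least two categories")
    n = len(pairs)
    row_tot = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_tot = {c: sum(counts[(r, c)] for r in rows) for c in cols}
    # Expected count = row total * column total / grand total.
    expected = {(r, c): row_tot[r] * col_tot[c] / n
                for r in rows for c in cols}
    # Contribution of each cell = (observed - expected)^2 / expected.
    contrib = {cell: (counts[cell] - e) ** 2 / e
               for cell, e in expected.items()}
    stat = sum(contrib.values())
    return {"expected": expected, "contributions": contrib,
            "statistic": stat, "df": df, "p_value": chi2_sf(stat, df)}


# Made-up 2 x 2 example: 60 cases, categories "A"/"B" and "x"/"y".
example_pairs = ([("A", "x")] * 10 + [("A", "y")] * 20 +
                 [("B", "x")] * 20 + [("B", "y")] * 10)
result = chi_square_association(example_pairs)
print(result["statistic"], result["df"], result["p_value"])
```

In this made-up example every expected count is 15, so each cell contributes 25/15 to a statistic of about 6.67 on 1 degree of freedom. A common guideline is that the chi-square approximation is reliable only when the expected counts are not too small (often taken as at least 5 per cell), so it is worth checking the expected counts before interpreting the p-value.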
Topic or Resource Suggestions
Use one of these, come up with your own idea, or find your own source. Many sites report frequency counts from survey results.
- For students: frequency of smoking (never, occasionally, frequently), gender, age, number of years smoking, etc.
- For students: academic division (business, accounting, TESOL, ...), whether the student has a Mac, PC, or neither, age, and number of trimesters completed.
- Whether a person plans to vote in the next election, political party affiliation (yes or no), age, and number of years affiliated with the party.