Basic descriptive statistics and graphs

A health agency is conducting an evaluation of all the hospitals in its region. The main aim of this assignment is to produce a report summarising the characteristics of the data. The agency needs a report that studies the number of admissions, type of control, and type of service at each of these hospitals.

The data are contained in a file called 'hospitals.xls', which is available on Moodle. Check with your lab tutor how to obtain this file. For this assignment, you must use Excel to conduct the statistical analyses.

The columns of the file contain the following information:






Hospital Number

Type of control (1 = government, non-federal; 2 = nongovernment, not-for-profit; 3 = for-profit; 4 = federal government)

Type of service (1 = general medical; 2 = psychiatric)

Number of beds

Number of admissions

Before you begin any analysis, you must take a random sample of 140 hospitals from the 200 provided in the data set. To do this, use the Random Sample Generator available on Moodle (Random Sample Generator-sem1_15). Check with your tutor how to obtain a random sample of 140 hospitals using this generator. All of your answers to the assignment tasks below must be based on your sample of 140 hospitals. Keep a safe copy of your sample, since the Random Sample Generator cannot be used to reproduce it.
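For intuition only, the idea of drawing 140 of the 200 hospitals without replacement can be sketched as follows. This is an illustrative Python sketch, not the Moodle Random Sample Generator: the function name, the seed value, and the assumption that hospitals are numbered 1 to 200 are all hypothetical. Your assignment sample must still come from the generator.

```python
import random

def draw_sample(population_size=200, sample_size=140, seed=None):
    """Return sorted hospital numbers drawn at random without replacement.

    Assumes hospitals are numbered 1..population_size (an assumption,
    not something stated in the data file).
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

sample = draw_sample(seed=2015)  # hypothetical seed
print(len(sample), "hospitals selected, first five:", sample[:5])
```

Because no hospital can be drawn twice, the sample always contains 140 distinct hospital numbers. Without a fixed seed, each run gives a different sample, which is why a safe copy matters.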

The main aim of this assignment is to investigate the effect of various variables on the number of admissions in hospitals. Each of the tasks below guides you through descriptive statistical analyses that will help you to formulate a conclusion about this aim. Your responses should be well written and supported by appropriate statistical analyses. Incorporate the relevant computer output within the body of each response.

Assignment Tasks:

Task 1: Basic Descriptive Statistics and Graphs

For the variable Admissions, obtain basic descriptive statistics as well as a histogram and a boxplot. Include the plots and the descriptive statistics output in your assignment. Briefly describe the shape, location and spread of the distribution of Admissions, linking your description to the descriptive statistics. That is, state which measure of centre and which measure of spread best describe the distribution, and give the reason(s) why. Comment on the presence of any outliers (for example, what effect would they have on the summary statistics?).
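As a cross-check on the reasoning (the assignment itself must be done in Excel), the sketch below computes the usual summary measures and flags outliers with the common 1.5 x IQR rule. The admissions values are made up for illustration and are not taken from hospitals.xls; the function name is hypothetical. The "inclusive" quartile method is used on the assumption that it corresponds to Excel's QUARTILE.INC.

```python
import statistics

def describe(values):
    """Summarise centre, spread, and 1.5 x IQR outliers for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": median,
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }

# Made-up values with one very large hospital, for illustration only.
admissions = [100, 200, 300, 400, 500, 600, 700, 800, 5000]
summary = describe(admissions)
for name, value in summary.items():
    print(name, value)
```

In this toy example the single large value pulls the mean well above the median and inflates the standard deviation, while the median and IQR are unaffected. That is the kind of argument to make when choosing between mean and standard deviation or median and IQR for a skewed distribution.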

Task 2: Side-by-Side Boxplots

Produce a side-by-side boxplot that visually depicts the relationship between Admissions and Service. The boxplots should show the distribution of the number of admissions for both the general medical and psychiatric hospitals. Include the output in your assignment. Describe what is shown by the boxplots in terms of the shape, spread and location of each distribution.
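Each box in a side-by-side boxplot is drawn from the five-number summary of one group. The sketch below, using made-up (service, admissions) pairs rather than the real data, shows how those summaries could be computed per service type. The function name and the sample values are hypothetical, and the quartile method is assumed to correspond to Excel's QUARTILE.INC.

```python
import statistics
from collections import defaultdict

SERVICE_LABELS = {1: "general medical", 2: "psychiatric"}

def five_number_by_service(rows):
    """Return (min, Q1, median, Q3, max) of admissions for each service type."""
    groups = defaultdict(list)
    for service, admissions in rows:
        groups[service].append(admissions)
    result = {}
    for service, values in sorted(groups.items()):
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        result[SERVICE_LABELS[service]] = (min(values), q1, median, q3, max(values))
    return result

# Made-up (service code, admissions) pairs, for illustration only.
rows = [(1, 1000), (1, 2000), (1, 3000), (1, 4000), (1, 5000),
        (2, 100), (2, 200), (2, 300), (2, 400), (2, 500)]
summaries = five_number_by_service(rows)
for label, numbers in summaries.items():
    print(label, numbers)
```

Comparing the medians gives the difference in location, comparing Q3 minus Q1 gives the difference in spread, and the position of the median within the box hints at the shape.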

Task 3: Pivot Tables

Construct a pivot table to compare the type of service with the type of control. Place control in the rows, service in the columns, and a count of hospitals in the body of the table.

Obtain a second pivot table by converting the frequencies in each column into relative frequencies within that column (right-click any number in the table, choose Show Values As, then % of Column Total).

Include both tables in your assignment. You need to compare the percentage of the 4 types of control within each of the general medical and psychiatric hospitals. Discuss whether the types of control differ within the two types of service by referring to the two pivot tables.
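The arithmetic behind the two pivot tables can be sketched as follows: the first table counts hospitals for each combination of control and service, and the second divides each count by its column (service) total. The rows below are made up for illustration and the function names are hypothetical; the actual tables must be produced in Excel.

```python
from collections import Counter

def crosstab(rows):
    """Count hospitals for each (control, service) combination."""
    return Counter(rows)

def column_percentages(counts):
    """Convert counts to percentages of each service (column) total."""
    column_totals = Counter()
    for (control, service), n in counts.items():
        column_totals[service] += n
    return {
        (control, service): 100 * n / column_totals[service]
        for (control, service), n in counts.items()
    }

# Made-up (control code, service code) pairs, for illustration only.
rows = [(1, 1), (1, 1), (2, 1), (3, 1), (1, 2), (4, 2)]
counts = crosstab(rows)
percentages = column_percentages(counts)
for (control, service), pct in sorted(percentages.items()):
    print(f"control {control}, service {service}: {pct:.1f}%")
```

Because each column sums to 100%, the percentages can be compared directly between general medical and psychiatric hospitals even when the two groups differ in size, which the raw counts alone do not allow.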

Task 4: Conclusion

From the tasks above, you should have some idea of the variables in the hospital data set and how some of them relate to one another. In this section, write a short paragraph summarising your main findings.

