Write a program to correctly import the data

Reference no: EM132324318

STA 581 Programming Project

1. The data set 'WeightChanges2.xlsx' contains weights of 50 patients measured over a 12-month period. All patients have an initial weight measurement (weight0), but not all are measured the same number of times over the 12-month period.

a. Write a program to correctly import the data as an .xls file and create a SAS data file, named 'weight_mult1'. Create the following new variables:

i) 'months', giving the number of nonmissing months, after the initial month, for which each patient has a weight measurement. (Hint: Use a SAS function.)

ii) 'avg_weight', giving the average of weight1 through weight12 for each patient. Create a label 'average weight' for this variable.

iii) 'weight_class', which categorizes the patients into weight classes, based on their initial weight, as follows:

[Figure 736_figure.png: table of weight_class cutoffs by initial weight; image not reproduced here]

Format this variable so that in all printed output, weight_class will print as 'low' for weight_class 1, 'med' for weight_class 2 and 'high' for weight_class 3.
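One possible shape for part (a) is sketched below; it is not a reference solution. It assumes the workbook sits in the current SAS working directory, that its columns are named weight0 through weight12, and that the XLSX engine (SAS/ACCESS Interface to PC Files) is available. The names weight_raw, wtclass, cut1 and cut2 are made up for illustration, and the two cutoff values are placeholders only, because the real cutoffs are in the figure that is not reproduced here.

```sas
/* Import the workbook; adjust the path to wherever the file is stored. */
proc import datafile="WeightChanges2.xlsx"
            out=work.weight_raw        /* hypothetical intermediate name */
            dbms=xlsx
            replace;
    getnames=yes;
run;

/* User-defined format so weight_class prints as low/med/high. */
proc format;
    value wtclass 1 = 'low'
                  2 = 'med'
                  3 = 'high';
run;

/* PLACEHOLDER cutoffs: replace with the values from the assignment figure. */
%let cut1 = 150;
%let cut2 = 200;

data weight_mult1;
    set work.weight_raw;

    /* N() counts nonmissing arguments; MEAN() ignores missing values. */
    months     = n(of weight1-weight12);
    avg_weight = mean(of weight1-weight12);

    /* Classify on the initial weight using the placeholder cutoffs. */
    if missing(weight0) then weight_class = .;
    else if weight0 < &cut1 then weight_class = 1;
    else if weight0 < &cut2 then weight_class = 2;
    else weight_class = 3;

    label  avg_weight   = 'average weight';
    format weight_class wtclass.;
run;
```

Attaching the format in the DATA step stores it with the data set, so later procedures print 'low', 'med' and 'high' without a separate FORMAT statement, provided the format catalog is available in that session.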

b. Write a program to compute the mean, standard deviation and number of nonmissing observations for 'avg_weight', separately for each of the 'weight_class' groups, and create a single table containing the values, making sure the formatted values of weight_class are displayed. Also create a plot containing side-by-side boxplots for the groups. Create separate titles to display for the means table and plot, respectively. Finally, create a PDF file containing the results.

Submit the SAS program and log results, and attach the PDF file.
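A minimal sketch of part (b) follows, assuming 'weight_mult1' already exists with a user-defined format attached to weight_class. The PDF file name and both title strings are illustrative choices, not part of the assignment.

```sas
/* Route all output between the two ODS PDF statements into one file. */
ods pdf file="weight_results.pdf";   /* hypothetical file name */

title 'Summary of average weight by weight class';
proc means data=weight_mult1 mean std n maxdec=2;
    class weight_class;   /* formatted values label the groups */
    var avg_weight;
run;

title 'Boxplots of average weight by weight class';
proc sgplot data=weight_mult1;
    vbox avg_weight / category=weight_class;
run;

title;            /* clear the title so it does not carry over */
ods pdf close;
```

Using a CLASS statement rather than BY keeps all three groups in a single table and does not require sorting the data first.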

2. Suppose for the data in #1 we are interested in only those patients with three or more weight measurements (after the initial measurement). Generate the same statistics you did in #1(b), for this group only, in two ways:

a) Create a new data file containing only patients with 3 or more measurements and use PROC MEANS to compute the statistics.

b) Leave the file created in #1 intact, and use a WHERE statement in the MEANS procedure to restrict analysis to patients with three or more measurements.

For part (a), turn in the program and the output from a PROC PRINT, showing the contents of the new file, and also the output from PROC MEANS. For part (b), turn in the program and the output from the MEANS procedure.
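The two approaches might look like the sketch below. It assumes 'weight_mult1' contains the 'months' and 'avg_weight' variables from #1; the data set name weight_3plus is made up for illustration.

```sas
/* (a) New data set holding only patients with 3 or more measurements. */
data weight_3plus;
    set weight_mult1;
    if months >= 3;        /* subsetting IF keeps qualifying rows only */
run;

proc print data=weight_3plus;
run;

proc means data=weight_3plus mean std n;
    class weight_class;
    var avg_weight;
run;

/* (b) Same statistics, filtering inside the procedure instead. */
proc means data=weight_mult1 mean std n;
    where months >= 3;     /* weight_mult1 itself is left unchanged */
    class weight_class;
    var avg_weight;
run;
```

Both PROC MEANS steps should report identical statistics, which gives a quick consistency check between the two methods.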

3. The data from #1 will need to be analyzed using a procedure that requires univariate data representation rather than the current multivariate representation. Modify the DATA step from #1 to create a data file called 'weight_uni' that has a univariate representation of the data (create a variable called 'time' to index the month of measurement). Use PROC CONTENTS to give a summary of the data file.

a) How can you tell from the results of PROC CONTENTS that your program appeared to work?

b) Use PROC PRINT to print the data for the first two patients only. (Note: it is not acceptable to print data for all patients and then print or "cut and paste" only the observations for patients 1 and 2!)

Turn in one program including all parts as well as the output from both PROC CONTENTS and PROC PRINT, and finally the contents of the Log window.
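A hedged sketch of the restructuring is given below. It assumes each row of 'weight_mult1' is one patient and that the weights are stored in weight0 through weight12. Because the source does not name a patient identifier, the sketch builds one from the row number (patient_row, a hypothetical name). Treating weight0 as time 0 and dropping missing measurements are design choices, not requirements stated in the assignment.

```sas
data weight_uni;
    set weight_mult1;
    patient_row = _n_;                  /* hypothetical id: row number */

    /* Array indexed 0..12 so the subscript equals the month. */
    array w{0:12} weight0-weight12;
    do time = 0 to 12;
        weight = w{time};
        if not missing(weight) then output;   /* one row per measurement */
    end;

    keep patient_row time weight weight_class;
run;

proc contents data=weight_uni;
run;

/* Print only the first two patients by filtering, not by trimming output. */
proc print data=weight_uni;
    where patient_row <= 2;
run;
```

Under these choices, PROC CONTENTS should list 'time' and 'weight' among the variables and report more than 50 observations, since each patient contributes one row per nonmissing measurement.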

4. Write a SAS program to generate 1000 random values from a Student's t distribution with 2 degrees of freedom, as well as 1000 random values from a standard normal distribution. Generate descriptive statistics for each of the two simulated distributions, as well as side-by-side boxplots of the two simulated distributions (the boxplots must appear in the same plot).

(Hint: Generate both distributions in the same data step, and create a classification variable to identify the distribution.) Submit a copy of the program as well as the output.
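Following the hint, one way to set this up is sketched below. The data set name sim, the variable names dist and x, and the seed value are arbitrary choices made for illustration.

```sas
data sim;
    call streaminit(581);        /* arbitrary seed for reproducibility */
    length dist $ 8;             /* classification variable */
    do i = 1 to 1000;
        dist = 't(2)';
        x = rand('T', 2);        /* Student's t with 2 degrees of freedom */
        output;

        dist = 'normal';
        x = rand('NORMAL');      /* standard normal: mean 0, sd 1 */
        output;
    end;
    drop i;
run;

/* Descriptive statistics for each simulated distribution. */
proc means data=sim n mean std min q1 median q3 max;
    class dist;
    var x;
run;

/* Side-by-side boxplots in a single plot. */
proc sgplot data=sim;
    vbox x / category=dist;
run;
```

Stacking both samples in one column with a classification variable is what lets a single VBOX statement draw the two boxplots on shared axes.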

Attachment: Assignment Files.rar






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd