Calculate and display the mean square error

Reference no: EM132221899

Project 1 -

Part I: Longitudinal data, sometimes referred to as panel data, track the same sample at different points in time. There are two common formats for longitudinal data, short and long format.

The short form is intuitive and good for presentation, but it is not suited to analysis procedures such as regression. PROC REG and PROC GLM accept only long-format data, in which each variable occupies one and only one column.

The dataset epilepsy.txt from Blackboard is recorded in the short form, where each time point has its own column for a given ID and Treatment.
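To make the difference between the two formats concrete, here is a minimal sketch of the reshaping logic. It is written in Python purely for illustration (the assignment itself asks for SAS DATA steps and, later, R). The column names (`id`, `treatment`, `age`, `baseline`, `visit1`, `visit2`) and all values are made up, so the real epilepsy.txt layout may differ.

```python
# Toy short-format records: one row per patient, one column per time point.
# Column names and values are hypothetical, not taken from epilepsy.txt.
short_rows = [
    {"id": 1, "treatment": "placebo", "age": 31, "baseline": 11, "visit1": 5, "visit2": 3},
    {"id": 2, "treatment": "drug", "age": 30, "baseline": 8, "visit1": 4, "visit2": 2},
]

# Columns that hold repeated measurements, in time order.
TIME_COLUMNS = ["baseline", "visit1", "visit2"]


def to_long(rows, time_columns):
    """Return one record per (patient, time point) instead of one per patient."""
    long_rows = []
    for row in rows:
        for time_index, column in enumerate(time_columns):
            long_rows.append({
                "id": row["id"],
                "treatment": row["treatment"],
                "age": row["age"],
                "time": time_index,    # 0 = baseline, 1 = first visit, ...
                "count": row[column],  # the seizure count now lives in one column
            })
    return long_rows


long_rows = to_long(short_rows, TIME_COLUMNS)
for record in long_rows:
    print(record)
```

In the long result, the seizure count is a single variable and time is an explicit column, which is the layout a regression procedure expects.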

1) Import the epilepsy data set.

2) Convert it into long format using DATA steps.

3) Since the baseline is an 8-week seizure count and the rest are 2-week counts, convert all seizure counts into weekly rates.

4) Create one table displaying average age and weekly seizure rate at baseline by treatment.

5) Create a scatter plot of age (x axis) vs weekly seizure rate at baseline (y axis) with colored dots based on treatment.

6) Fit a regression model using PROC REG or PROC GLM, with weekly rate as the response and age, treatment, and time as predictors*.

7) Create and display a data set containing the original and the predicted value for each patient.

8) Calculate and display the mean square error (MSE).

*Because of the repeated measurements and the type of response, a proper model would be more complicated than basic linear regression, but we ignore that here since the purpose of this project is practice.
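The arithmetic behind steps 3 and 8 can be sketched as follows. This is an illustrative Python sketch, not the SAS solution the assignment asks for. The function names, the coding of baseline as `time == 0`, and the example numbers are assumptions. The assignment does not say whether MSE should be divided by the number of observations or by the error degrees of freedom (which is what the ANOVA table in PROC REG reports); the sketch divides by the number of observations.

```python
def weekly_rate(count, time):
    """Convert a seizure count to a weekly rate.

    Assumes time 0 is the 8-week baseline and every later visit covers 2 weeks.
    """
    weeks = 8 if time == 0 else 2
    return count / weeks


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Made-up example: 16 seizures over the baseline period, 5 over one 2-week visit.
rates = [weekly_rate(16, 0), weekly_rate(5, 1)]
# Made-up predictions, only to show the MSE call.
mse = mean_squared_error(rates, [1.5, 3.0])
print(rates, mse)
```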

Part II - We need to use cross-validation to test the predictive ability of our model, since it is not appropriate to test a model on the same data it was built on.

For each patient i,

1) Modify the original data by deleting his/her observations, so that the model-building process does not include the ith patient.

2) Build the model and output the estimated values, and save the predicted values of seizure count belonging to the ith patient.

3) Create a %macro to do (1) and (2), and use %do loop to repeat these steps for each patient.

Combine the results,

4) Create and display a data set containing the original and the predicted value for each patient.

5) Merge them with original response values.

6) Calculate and display the mean square error (MSE).

Please clean your final output by suppressing any unnecessary output that is not asked for. As usual, comment each statement you use.
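The leave-one-patient-out procedure in Part II can be sketched as below. This is a rough Python illustration of the loop that the %macro and %do would carry out in SAS, not the required solution. The field names (`id`, `age`, `treated`, `time`, `rate`), the 0/1 coding of treatment, the numeric coding of time, and the toy data are all assumptions. The toy rates are generated from an exact linear rule so that the held-out predictions can be checked by eye.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(r):
    """Predictors for one long-format record: intercept, age, treatment, time."""
    return [1.0, r["age"], r["treated"], r["time"]]


def fit_ols(rows):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    xs = [design_row(r) for r in rows]
    ys = [r["rate"] for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, r):
    """Predicted rate for one record under coefficients beta."""
    return sum(b * x for b, x in zip(beta, design_row(r)))


def leave_one_patient_out(rows):
    """For each patient: refit without that patient, then predict that patient's rows."""
    results = []
    for pid in sorted({r["id"] for r in rows}):
        train = [r for r in rows if r["id"] != pid]  # step 1: drop patient pid
        beta = fit_ols(train)                        # step 2: build the model
        for r in rows:
            if r["id"] == pid:                       # keep only held-out predictions
                results.append({"id": pid, "time": r["time"],
                                "observed": r["rate"], "predicted": predict(beta, r)})
    return results


# Made-up patients: (id, age, treated). Rates follow an exact linear rule.
patients = [(1, 20, 0), (2, 30, 1), (3, 25, 0), (4, 40, 1), (5, 35, 0), (6, 22, 1)]
rows = [{"id": pid, "age": age, "treated": trt, "time": t,
         "rate": 1.0 + 0.1 * age - 0.5 * trt - 0.2 * t}
        for pid, age, trt in patients for t in range(3)]

results = leave_one_patient_out(rows)
cv_mse = sum((r["observed"] - r["predicted"]) ** 2 for r in results) / len(results)
print(len(results), cv_mse)
```

Because the toy rates are exactly linear in the predictors, the cross-validated MSE here is essentially zero; with real data it would normally exceed the MSE computed on the data used to fit the model.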

Project 2 -

Part I (Redo Project 1):

1) Import the epilepsy data previously used in project 1.

2) Convert from short to long format in R.

3) Redo Part II of project 1 in R.

Part II (Plot):

Using the results obtained and the ggplot2 package, create a scatter plot of predicted values vs. original values that has:

1) ID numbers (1 to 59) as the markers.

2) The color of the markers depends on age.

3) Two panels based on treatment using facet_grid().

4) x and y variables labeled properly.

Attachment:- Assignment Files.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd