Modelling the shape and scale parameters problem

Reference no: EM132393122

Assignment -

Learning Outcomes:

1. Identify challenges in data analytics: be able to critically evaluate and select appropriate solutions.

2. Demonstrate an understanding of the core methods and algorithms used in data analytics.

3. Analyse and manipulate data sets to extract statistics and features and provide analytic insights.

4. Critically evaluate, select and employ appropriate tools, technologies and data models to provide answers to analytics questions.

All answers require a rigorous explanation to support the findings.

PART 1 - Demonstrate the need for smooth functions

The air quality data: The data set airquality is one of the data frames available in R within the standard package datasets. It contains daily air quality measurements in New York from May to September 1973.

R data file: airquality in package datasets, with dimensions 153 x 6 (observations x variables)

Ozone: in ppb

Solar.R: in Langleys (lang)

Wind: in mph

Temp: in F

Month: Month (1-12)

Day: Day of month (1-31)

(a) Here we will use Ozone as the response variable and Solar.R, Wind and Temp as explanatory variables. (We will not consider Month and Day.) The data can be plotted using:

data(airquality)

plot(airquality[-c(5,6)])

Comment on the plot.

(b) Fit a standard regression model (i.e., with a normal distribution and constant variance) using the function lm().

(c) Extract the fit summary (using summary()) and discuss the coefficients and their standard errors. Use the function termplot() and comment on the term plot.

(d) Check and comment on the residuals using plot().

(e) Fit the same model using the gamlss() function, but note that the data set airquality has some missing observations (i.e. NA values). The gamlss() function does not work with NAs, so the missing values need to be removed before fitting the model.

(f) Summarize the fitted gamlss model using summary(). Plot the fitted terms using the corresponding function for gamlss, called term.plot().

(g) Check the residuals using the plot() and wp() functions.

(h) Comment on the worm plot. Note the warning message that some points fall outside the worm plot. Increase the limits of the vertical axis by using the argument ylim.all = 2 in wp().

(i) Since the fitted normal distribution does not seem to be adequate, try fitting different distributions (e.g. gamma (GA), inverse Gaussian (IG) and Box-Cox Cole and Green (BCCGo)) to the data. Compare them with the normal distribution using GAIC with penalty k = 2 (i.e. AIC).

(j) Has the model improved according to the AIC? Use the term.plot() output to see the fitted smooth functions for the predictor of μ for your chosen distribution. Use the plot() and wp() output to check the residuals.
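As a rough guide to how steps (b) to (i) fit together, the following is a minimal sketch rather than a model answer. It assumes the gamlss package is installed; the object names (m_lm, aq, m_no, m_ga, m_ig, m_bccgo, aic_tab) are illustrative choices, and trace = FALSE is only used to silence the iteration output.

```r
# Sketch only: assumes the gamlss package is installed.
library(gamlss)

data(airquality)

# (b)-(d) Normal linear model with constant variance.
# lm() silently drops rows that contain NA in the model variables.
m_lm <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)
summary(m_lm)
termplot(m_lm, partial.resid = TRUE, se = TRUE)
op <- par(mfrow = c(2, 2))
plot(m_lm)                      # standard residual diagnostics
par(op)

# (e) gamlss() does not accept NA values, so keep only complete cases
# of the variables used in the model (Month and Day are not needed).
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])
m_no <- gamlss(Ozone ~ Solar.R + Wind + Temp, data = aq,
               family = NO, trace = FALSE)

# (f)-(h) Summary, fitted terms and residual checks.
summary(m_no)
term.plot(m_no, pages = 1)
plot(m_no)
wp(m_no, ylim.all = 2)          # wider vertical limits for the worm plot

# (i) Same predictor for mu, different response distributions.
m_ga    <- gamlss(Ozone ~ Solar.R + Wind + Temp, data = aq,
                  family = GA, trace = FALSE)
m_ig    <- gamlss(Ozone ~ Solar.R + Wind + Temp, data = aq,
                  family = IG, trace = FALSE)
m_bccgo <- gamlss(Ozone ~ Solar.R + Wind + Temp, data = aq,
                  family = BCCGo, trace = FALSE)

# GAIC with k = 2 is the AIC; smaller values indicate a better trade-off
# between fit and complexity.
aic_tab <- GAIC(m_no, m_ga, m_ig, m_bccgo, k = 2)
aic_tab
```

The same term.plot(), plot() and wp() calls can then be applied to whichever model is chosen in (j).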

PART 2 - Modelling the shape and scale parameters

The abdom data (available with the gamlss package) record abdominal circumference (y) against gestational age (x). Fit different response distributions and choose the 'best' model according to the GAIC criterion.

(a) Load the abdom data and print the variable names.

(b) Fit the normal distribution model, using pb() to fit P-spline smoothers for the predictors of μ and σ, with automatic selection of the smoothing parameters.

(c) Fit alternative response distributions:

a. two-parameter distributions: GA, IG, GU, RG, LO,

b. three-parameter distributions: PE, TF, BCCG,

c. four-parameter distributions: BCT, BCPE.

(d) Apply pb() to all parameters of each distribution. Make sure to use different model names.

(e) Compare the fitted models using GAIC with each of the penalties k=2, k=3 and k=log(length(abdom$y)).

(f) Check the residuals for your chosen model, say m, by plot(m) and wp(m).

(g) For a chosen model, say m, look at the total effective degrees of freedom using edfAll(m), plot the fitted parameters using fittedPlot(m, x=abdom$x), and plot the data using plot() and add the fitted μ against x using lines().

(h) For a chosen model, examine the centile curves using centiles().
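The steps of Part 2 might be sketched as follows. This is an illustration under assumptions, not a complete solution: it assumes the gamlss package is installed, fits only one distribution from each group in (c) (LO, TF and BCT, chosen arbitrarily), and uses hypothetical object names. Models with smoothers on every parameter may produce convergence warnings; the choice m <- m_lo is a placeholder that should be replaced by whichever model the GAIC comparison favours.

```r
# Sketch only: assumes the gamlss package is installed
# (it also attaches gamlss.data, which contains abdom).
library(gamlss)

# (a) Load the data and print the variable names.
data(abdom)
names(abdom)

# (b) Normal model with P-spline smoothers for mu and sigma.
m_no <- gamlss(y ~ pb(x), sigma.formula = ~ pb(x),
               data = abdom, family = NO, trace = FALSE)

# (c)-(d) One example per group, with pb() on every parameter.
m_lo  <- gamlss(y ~ pb(x), sigma.formula = ~ pb(x),
                data = abdom, family = LO, trace = FALSE)
m_tf  <- gamlss(y ~ pb(x), sigma.formula = ~ pb(x),
                nu.formula = ~ pb(x),
                data = abdom, family = TF, trace = FALSE)
m_bct <- gamlss(y ~ pb(x), sigma.formula = ~ pb(x),
                nu.formula = ~ pb(x), tau.formula = ~ pb(x),
                data = abdom, family = BCT, trace = FALSE)

# (e) Compare the models under the three penalties.
GAIC(m_no, m_lo, m_tf, m_bct, k = 2)                      # AIC
GAIC(m_no, m_lo, m_tf, m_bct, k = 3)
GAIC(m_no, m_lo, m_tf, m_bct, k = log(length(abdom$y)))   # SBC/BIC

# Placeholder choice for illustration only.
m <- m_lo

# (f) Residual checks.
plot(m)
wp(m)

# (g) Effective degrees of freedom, fitted parameters, data with fitted mu.
edfAll(m)
fittedPlot(m, x = abdom$x)
plot(y ~ x, data = abdom)
o <- order(abdom$x)                       # draw the line in increasing x
lines(abdom$x[o], fitted(m, "mu")[o], col = "red")

# (h) Centile curves for the chosen model.
centiles(m, xvar = abdom$x)
```

Extending the comparison to the remaining distributions in (c) only requires repeating the gamlss() call with a different family and a different model name.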

Attachment:- Assignment Files.rar
