Analyze the goodness of fit of each model

Reference no: EM131264309

Bayesian regression problem

The best-fitting model is not necessarily the best model: a good fit to the data must be balanced against model complexity. The purpose of this exercise is to illustrate this idea with the regression problem discussed previously.

Table 1 (see appendix) contains 8 observations, stored as the vectors x and y in INPUT.mat. Use load('INPUT.mat') in MATLAB to read in these values. This MATLAB file can be downloaded from the CCLE course website (problem sets).

We can define all possible polynomial regression models as:

$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_p x^p + \varepsilon$, where $\varepsilon \sim \mathrm{Normal}(0, \sigma^2)$

In this exercise, we consider seven possible models: p = 0, 1, 2,..., 6.
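As a minimal sketch of how a model of order p turns coefficients into a prediction, the polynomial can be evaluated with Horner's rule. This is written in Python rather than the MATLAB the exercise asks for; the function name `poly_predict` and the example coefficients are hypothetical and are not the Table 2 values.

```python
def poly_predict(coeffs, x):
    """Evaluate b0 + b1*x + ... + bp*x**p using Horner's rule.

    coeffs is [b0, b1, ..., bp], lowest order first, so a model of
    order p has p + 1 coefficients.
    """
    result = 0.0
    for b in reversed(coeffs):
        result = result * x + b
    return result


# Hypothetical coefficients for an order-2 model (NOT taken from Table 2).
example_coeffs = [1.0, 2.0, 3.0]
print(poly_predict(example_coeffs, 2.0))  # 1 + 2*2 + 3*4 = 17.0
```

The order-0 model is the special case with a single coefficient, where the prediction is the constant b0 for every x.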

Our goal is to decide which one of the seven models is the "best" one to explain the observed data. For each model with specific parameters β, "goodness of fit" can be measured by the likelihood term P(y|β,M). Here, β is a vector of regression coefficients, and M indicates a polynomial regression model of order p. Model evidence is measured by P(y|M), which describes how likely the data are to have been generated by a polynomial model of order p.

(a) Analyze the goodness of fit of each model M by computing the likelihood for each model, based on the predefined regression coefficients b provided in INPUT.mat (also listed for each model in Table 2 of the appendix, which gives more detail).

In statistical terms, the likelihood describes how the data are generated (sampled) from a model. The assumption that ε follows a normal distribution with zero mean and a constant standard deviation σ specifies the variation in data generation.

We can write the likelihood probability distribution as

$y_i \sim \mathrm{Normal}(\hat{y}_i, \sigma^2)$, where $\hat{y}_i = b_0 + b_1 x_i + b_2 x_i^2 + \cdots + b_p x_i^p$

Here, $\hat{y}_i$ (called "y-hat") is the predicted value of y for the ith observation. We further assume that each data point is independently sampled. Then, for a particular regression model M with order p, the likelihood is given by

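$P(y \mid \beta, M) = \prod_{i=1}^{8} \phi(y_i; \hat{y}_i, \sigma^2)$

$\phi(x; \mu, \sigma^2)$ refers to the probability density function of the normal distribution with mean μ and standard deviation σ (i.e., the normpdf function in MATLAB, which takes the standard deviation σ as its argument, not the variance). Please use σ = 5 for your likelihood calculation.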
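The likelihood computation can be sketched as follows, in Python rather than MATLAB. Working with the sum of log-densities is equivalent to taking the log of the product over the 8 observations. The helper names (`normal_logpdf`, `poly_predict`, `log_likelihood`) and the three data points are hypothetical placeholders; the real x, y, and b come from INPUT.mat and are not reproduced here.

```python
import math

SIGMA = 5.0  # standard deviation specified in the exercise


def normal_logpdf(value, mean, sigma):
    """Natural log of the normal density with the given mean and std dev."""
    z = (value - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def poly_predict(coeffs, x):
    """Evaluate b0 + b1*x + ... + bp*x**p (coeffs lowest order first)."""
    result = 0.0
    for b in reversed(coeffs):
        result = result * x + b
    return result


def log_likelihood(coeffs, xs, ys, sigma=SIGMA):
    """log P(y | beta, M): sum of per-observation log-densities."""
    return sum(
        normal_logpdf(y, poly_predict(coeffs, x), sigma)
        for x, y in zip(xs, ys)
    )


# Placeholder data and coefficients (assumptions, NOT Table 1 / Table 2).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
coeffs = [1.0, 2.0]  # an order-1 model that happens to fit exactly

ll = log_likelihood(coeffs, xs, ys)
print("log-likelihood:", ll)
print("likelihood:", math.exp(ll))
```

Calling `log_likelihood` once per model, with that model's coefficient vector, gives the seven values to plot against p.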

Since likelihoods across different models could differ by orders of magnitude, for a better illustration, it is more advantageous to plot the natural logarithm of the likelihoods instead of the raw likelihood values. Present a plot of log-likelihood against the orders of polynomial p. What trend do you observe from the log-likelihood plot? Which model gives you the "best fit"?

(b) Evaluate each model M by its model evidence P(y|M), which is given by

$P(y \mid M) = \int_{-\infty}^{+\infty} P(y \mid \beta, M)\, P(\beta)\, d\beta$

Computing this integral analytically is hard. Instead, we use the discrete approximation:

$\int_{-\infty}^{+\infty} P(y \mid \beta, M)\, P(\beta)\, d\beta \approx \frac{1}{N} \sum_{j=1}^{N} P(y \mid \beta_j, M)$
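One way to see why a plain average approximates the integral, assuming the $\beta_j$ are drawn independently from the prior $P(\beta)$: the evidence is the expected value of the likelihood under the prior, and a sample mean estimates an expectation.

```latex
P(y \mid M)
  = \int P(y \mid \beta, M)\, P(\beta)\, d\beta
  = \mathbb{E}_{\beta \sim P(\beta)}\!\left[ P(y \mid \beta, M) \right]
  \approx \frac{1}{N} \sum_{j=1}^{N} P(y \mid \beta_j, M),
  \qquad \beta_j \sim P(\beta).
```

The prior density does not appear explicitly in the sum because it is already accounted for by the way the samples are drawn.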

To simplify your calculation, we assume the prior P(β) to be a uniform distribution, i.e., $\beta_k \sim \mathrm{Uniform}(A, B)$, where $A = b_k - 0.5$ and $B = b_k + 0.5$, for $k = 0, 1, \ldots, p$ (p is the order of the polynomial regression of model M, and $b_k$ is the kth value in Table 2 for model M).

Use a sampling approach to implement the Bayesian model. Sample N sets of β values for each model M according to the prior distribution. For each sampled βj, compute P(y|βj, M), which is given by the likelihood equation. Use N = 500.
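The sampling procedure can be sketched as below, again in Python rather than MATLAB. Each coefficient is drawn from Uniform(bk - 0.5, bk + 0.5), the likelihood is evaluated for each draw, and the N = 500 values are averaged. The function names, the fixed seed, and the placeholder data are assumptions for illustration; they are not the values in INPUT.mat.

```python
import math
import random

SIGMA = 5.0      # standard deviation specified in the exercise
N_SAMPLES = 500  # N specified in the exercise


def poly_predict(coeffs, x):
    """Evaluate b0 + b1*x + ... + bp*x**p (coeffs lowest order first)."""
    result = 0.0
    for b in reversed(coeffs):
        result = result * x + b
    return result


def likelihood(coeffs, xs, ys, sigma=SIGMA):
    """P(y | beta, M): product of normal densities, computed via logs."""
    log_lik = 0.0
    for x, y in zip(xs, ys):
        z = (y - poly_predict(coeffs, x)) / sigma
        log_lik += -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    return math.exp(log_lik)


def model_evidence(b, xs, ys, n_samples=N_SAMPLES, rng=None):
    """Monte Carlo estimate of P(y | M) under the uniform prior."""
    if rng is None:
        rng = random.Random()
    total = 0.0
    for _ in range(n_samples):
        # One draw of the whole coefficient vector from the prior.
        beta = [rng.uniform(bk - 0.5, bk + 0.5) for bk in b]
        total += likelihood(beta, xs, ys)
    return total / n_samples


# Placeholder data and coefficients (assumptions, NOT Table 1 / Table 2).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
b = [1.0, 2.0]

evidence = model_evidence(b, xs, ys, rng=random.Random(0))
print("estimated model evidence:", evidence)
```

Because the estimate is random, it varies from run to run unless the generator is seeded; fixing the seed here is only a convenience for reproducibility.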

Present a bar chart of model evidence against the orders of polynomial p. Which model gives you the highest model evidence?

(c) Write a short paragraph to discuss which regression model is the best for this set of data.

Attachment:- Assignment.rar

Reviews

len1264309

11/3/2016 3:32:33 AM

Please look at Question 3, part (b) in the attached file. It deals with the Bayesian sampling approach using MATLAB.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd