Compute the Linear Regression parameters

Reference no: EM132271449

Project - Statistics

In this project we will use the statistics-based commands available in MATLAB. A MATLAB data file called "StatsData.mat" is available on the website; you will need to download it.

This project involves writing programs that perform statistical analysis of data and establish which data sets are related and which are not. We will also run an experiment that demonstrates the concept of a Confidence Interval.

1. Regression and Correlation:

a) Using MATLAB, compute the Linear Regression (LR) parameters and the correlation coefficient (CC) between each of the first four rows and the fifth row of "StatsData.mat". In other words, compute the LR parameters and CC between row 1 and row 5, then between row 2 and row 5, then between row 3 and row 5, and finally between row 4 and row 5.
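One possible way to organize the four single-variable fits is sketched below. It is only an illustration: the variable name stored inside the data file is not stated in the assignment, so the sketch builds a synthetic 5-row matrix called `data` (a hypothetical name) in which row 5 is made to depend on row 1. `polyfit` of degree 1 returns the slope and intercept, and the off-diagonal entry of `corrcoef` gives the CC.

```matlab
% Synthetic stand-in for the real data (ASSUMPTION: the real matrix would be
% loaded from the data file; its variable name is not given in the brief).
rng(0);                                   % reproducible random numbers
N = 1000;
data = randn(5, N);                       % rows 1-4: candidate independent rows
data(5, :) = 2*data(1, :) + 1 + 0.1*randn(1, N);   % row 5 depends on row 1 only

y = data(5, :);                           % dependent variable (row 5)
slope = zeros(4, 1);
intercept = zeros(4, 1);
cc = zeros(4, 1);
for k = 1:4
    x = data(k, :);
    p = polyfit(x, y, 1);                 % least-squares line y = p(1)*x + p(2)
    slope(k) = p(1);
    intercept(k) = p(2);
    R = corrcoef(x, y);                   % 2x2 correlation matrix
    cc(k) = R(1, 2);                      % off-diagonal entry is the CC
    fprintf('row %d vs row 5: slope = %.4f, intercept = %.4f, CC = %.4f\n', ...
            k, slope(k), intercept(k), cc(k));
end
```

With the real data, only the construction of `data` would change; the loop itself does not depend on how the matrix was obtained.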

b) Select the two rows with the highest CC from the independent rows (row 1, row 2, row 3 and row 4). Then perform a multivariate regression on the data in "StatsData.mat", with the fifth row as the dependent variable and the two selected rows as the independent variables. Be sure to compute the Coefficient of Determination (CD) and then its square root, which has the same magnitude as the CC.

The selection of the two rows does not need to be done in software; it can be "hard coded" into the program.
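A minimal sketch of the two-variable regression follows, assuming ordinary least squares with an intercept term solved by the backslash operator. The matrix `data` is synthetic, and the hard-coded selection `sel = [1 3]` is purely illustrative; with the real data the two rows would be whichever ones produced the highest CC in part (a).

```matlab
% Synthetic stand-in for the real data (ASSUMPTION, not the assignment file).
rng(1);
N = 1000;
data = randn(5, N);
data(5, :) = 3*data(1, :) - 2*data(3, :) + 0.5 + 0.2*randn(1, N);

sel = [1 3];                              % hard-coded row selection (illustrative)
y = data(5, :).';                         % dependent variable as a column vector
X = [ones(N, 1), data(sel, :).'];         % design matrix: intercept + two rows
b = X \ y;                                % least-squares coefficients [b0; b1; b2]

yhat = X * b;                             % fitted values
SSres = sum((y - yhat).^2);               % residual sum of squares
SStot = sum((y - mean(y)).^2);            % total sum of squares
CD = 1 - SSres / SStot;                   % coefficient of determination
Rmult = sqrt(CD);                         % square root of CD
fprintf('b0 = %.4f, b1 = %.4f, b2 = %.4f, CD = %.4f, sqrt(CD) = %.4f\n', ...
        b(1), b(2), b(3), CD, Rmult);
```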

c) Based on the CC computed from the 5 cases of regression analysis performed, what can be said about the data?

What data or terms appear to be related to the dependent variable and which are not related?

Is this consistent with results of the case where two rows were used for regression?

2. Histograms, PDF's and Confidence Intervals:

a) Assuming that the mean and variance of the entire row are essentially the same as the parameters of the hidden process, compute the 95% Confidence Interval (CI) for each row in "StatsData.mat", assuming a sample size of 64 (note that sqrt(64) = 8).
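As a sketch of the arithmetic, the interval for a 64-point average can be taken as the row mean plus or minus 1.96 standard errors, where 1.96 is the two-sided 95% quantile of the normal distribution and the standard error is the row standard deviation divided by sqrt(64) = 8. The data below are synthetic, and the row length of 64000 is inferred from the 1000 subsections of 64 points mentioned in part (c).

```matlab
% Synthetic stand-in for the real data (ASSUMPTION, not the assignment file).
rng(2);
data = 5 + 2*randn(5, 64000);             % 5 rows, 1000 blocks of 64 samples each

n = 64;                                   % sample size stated in the assignment
z = 1.96;                                 % two-sided 95% normal quantile
mu = mean(data, 2);                       % per-row mean (column vector)
sigma = std(data, 0, 2);                  % per-row standard deviation
halfWidth = z * sigma / sqrt(n);          % 1.96 * sigma / 8
ciLow = mu - halfWidth;
ciHigh = mu + halfWidth;
for k = 1:size(data, 1)
    fprintf('row %d: 95%% CI = [%.4f, %.4f]\n', k, ciLow(k), ciHigh(k));
end
```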

b) Produce a histogram of the data in each row of the matrix, setting the number of bins to the square root of the number of samples. Use the histograms to plot an estimate of the probability density of each row. Also plot the matching Gaussian distribution for the row, based on the mean and variance of the entire row.
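One way to turn a histogram into a density estimate is to let `histcounts` normalize the counts so that the bar areas sum to one. The sketch below does this for a single synthetic row `x` (a hypothetical stand-in for one row of the real matrix), rounds the square root of the sample count to get the bin count, and evaluates the Gaussian PDF at the bin centers directly from its formula so that no toolbox function is needed.

```matlab
% Synthetic stand-in for one row of the real data (ASSUMPTION).
rng(3);
x = 10 + 2*randn(1, 64000);

nbins = round(sqrt(numel(x)));            % square root of the number of samples
[he, edges] = histcounts(x, nbins, 'Normalization', 'pdf');  % density estimate
centers = edges(1:end-1) + diff(edges)/2; % bin centers

mu = mean(x);                             % mean of the entire row
sigma = std(x);                           % standard deviation of the entire row
g = exp(-(centers - mu).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));  % Gaussian PDF

figure;
bar(centers, he, 1);                      % histogram estimate of the density
hold on;
plot(centers, g, 'r', 'LineWidth', 1.5);  % matching Gaussian
hold off;
xlabel('value');
ylabel('probability density');
legend('histogram estimate', 'matching Gaussian');
```

Repeating this for every row only requires wrapping the body in a loop over the row index.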

c) Then compute the average of each 64-point subsection of each row. There will be 1000 of these 64-point subsections in each row. These will be referred to hereafter as the Short Interval Averages (SIA's). Compute the mean and variance of the 1000 SIA's and compare them to the predicted mean and variance for a 64-point average. The mean of the SIA's should be the same as the mean of the row, while the variance should be the variance of the row divided by the length of the short intervals (64).
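The block averaging can be sketched with `reshape`: because MATLAB stores arrays column by column, reshaping a 64000-element row into a 64-by-1000 matrix places each consecutive 64-point subsection in its own column, and a column-wise `mean` then yields the 1000 SIA's. The row used here is synthetic.

```matlab
% Synthetic stand-in for one row of the real data (ASSUMPTION).
rng(4);
n = 64;                                   % length of each short interval
M = 1000;                                 % number of short intervals
row = 3 + 1.5*randn(1, n*M);

blocks = reshape(row, n, M);              % column j holds samples (j-1)*n+1 .. j*n
sia = mean(blocks, 1);                    % 1-by-M vector of Short Interval Averages

siaMean = mean(sia);                      % measured mean of the SIA's
siaVar = var(sia);                        % measured variance of the SIA's
predMean = mean(row);                     % predicted mean: mean of the row
predVar = var(row) / n;                   % predicted variance: row variance / 64
fprintf('mean: measured %.4f, predicted %.4f\n', siaMean, predMean);
fprintf('variance: measured %.5f, predicted %.5f\n', siaVar, predVar);
```

Since the blocks are of equal length and cover the whole row, the mean of the SIA's matches the row mean up to rounding error, while the variance agrees only statistically.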

d) Count the number of times the SIA's fall inside the 95% CI bounds for each row. Convert this count to an estimate of probability and compare it to 95%. How well do they match, noting that some of the distributions are not Gaussian?
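A sketch of the counting step for one row is given below, assuming the CI is the row mean plus or minus 1.96 times the row standard deviation divided by 8. The row is synthetic Gaussian data, so the estimated probability should land close to 0.95; for the non-Gaussian rows of the real data the agreement is the thing to examine.

```matlab
% Synthetic stand-in for one row of the real data (ASSUMPTION).
rng(5);
n = 64;
M = 1000;
row = randn(1, n*M);

mu = mean(row);
sigma = std(row);
lo = mu - 1.96 * sigma / sqrt(n);         % lower 95% CI bound for a 64-point average
hi = mu + 1.96 * sigma / sqrt(n);         % upper 95% CI bound

sia = mean(reshape(row, n, M), 1);        % the 1000 Short Interval Averages
inside = sum(sia >= lo & sia <= hi);      % how many SIA's fall within the bounds
pHat = inside / M;                        % estimated coverage probability
fprintf('%d of %d SIA''s inside the CI, estimated probability = %.3f\n', ...
        inside, M, pHat);
```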

e) Finally, compute the Squared-Sum-Difference (SSD) between the histogram estimate and the Gaussian PDF, using the following formula:

$$\mathrm{SSD} = \sum_{n=1}^{N} \left(\mathrm{HE}_n - \mathrm{PDF}_n\right)^2$$

where HE_n is the Histogram Estimate at bin n, and PDF_n is the PDF at bin n.
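The SSD can be sketched as a sum of squared differences over the bins, with the Gaussian PDF evaluated at the bin centers. To show how the value behaves, the example below computes it for a synthetic Gaussian row and a synthetic uniform row; both data sets and the helper name `gaussPdf` are assumptions made for illustration, not part of the assignment. Note that the raw SSD also depends on the bin width and the scale of the data, so comparisons are most meaningful between rows binned the same way.

```matlab
% Gaussian PDF evaluated at points c for mean m and standard deviation s.
gaussPdf = @(c, m, s) exp(-(c - m).^2 ./ (2*s^2)) ./ (s*sqrt(2*pi));

rng(6);
xg = randn(1, 64000);                     % synthetic Gaussian row (ASSUMPTION)
xu = rand(1, 64000);                      % synthetic uniform row (ASSUMPTION)
nbins = round(sqrt(numel(xg)));           % square root of the number of samples

% Gaussian row: histogram estimate vs. matching Gaussian PDF
[heG, edgesG] = histcounts(xg, nbins, 'Normalization', 'pdf');
cG = edgesG(1:end-1) + diff(edgesG)/2;    % bin centers
ssdGauss = sum((heG - gaussPdf(cG, mean(xg), std(xg))).^2);

% Uniform row: same procedure, but the Gaussian is a poor match
[heU, edgesU] = histcounts(xu, nbins, 'Normalization', 'pdf');
cU = edgesU(1:end-1) + diff(edgesU)/2;
ssdUnif = sum((heU - gaussPdf(cU, mean(xu), std(xu))).^2);

fprintf('SSD (Gaussian data) = %.4f\n', ssdGauss);
fprintf('SSD (uniform data)  = %.4f\n', ssdUnif);
```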

Based on the histogram plots, and the SSD, what type of distribution is each row, and can the SSD be used as a measure of how Gaussian a set of data is?

Attachment:- Assignment Files.rar

Compute the Linear Regression parameters : EECE 540, Project - Statistics. Using MATLAB, compute the Linear Regression (LR) parameters and the correlation coefficient (CC)