Assumptions in regression, Applied Statistics

Assumptions in Regression

To understand the properties underlying the regression line, let us return to the example of the model exam and the main exam. We can estimate a student's main exam points if we know his or her points on the model exam. As stated earlier, a student who scores 85 on the model exam should receive main exam points in the vicinity of 75 to 95.

If we knew the model exam scores of all students along with their main exam scores, we would have the population of values. The mean and the variance of the population of model exam scores would be μx and σx², respectively. The corresponding quantities for the main exam points are μy and σy².
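As a small illustration of these population quantities, the sketch below computes μx, σx², μy and σy² for a made-up "population" of five students. The scores are hypothetical and are not taken from the example in the text; the point is only that a population variance divides by N rather than N - 1.

```python
from statistics import fmean, pvariance

# Hypothetical scores for a tiny "population" of five students
# (illustrative values, not data from the text).
model_exam = [60, 70, 80, 85, 95]   # X: model exam points
main_exam = [58, 72, 77, 88, 93]    # Y: main exam points

mu_x = fmean(model_exam)        # population mean of X
var_x = pvariance(model_exam)   # population variance of X (divides by N)
mu_y = fmean(main_exam)         # population mean of Y
var_y = pvariance(main_exam)    # population variance of Y (divides by N)

print("mu_x =", mu_x, "var_x =", var_x)
print("mu_y =", mu_y, "var_y =", var_y)
```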

The assumptions in regression are:

  1. The relationship between X and Y is linear: the mean of Y at any given value X = x is E(Y|X = x) = A + Bx.

  2. At each value of X, the distribution of Y is normal, and the variance of Y is the same at every x. This implies that the errors (the E's) all have the same variance, σ².

  3. The Y-values are independent of each other.

  4. No assumption is made regarding the distribution of X.
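The four assumptions above can be read as a recipe for generating data. The following is a minimal sketch, assuming hypothetical values A = 8.5, B = 0.9 and σ = 5 (chosen only so that a model exam score of 85 gives main exam points centred near 85 and mostly within 75 to 95). The function name `simulate_main_exam` is illustrative, not part of any library.

```python
import random
from statistics import fmean, pvariance

def simulate_main_exam(x_values, A, B, sigma, seed=0):
    """Draw one Y per x from Y = A + B*x + E, with E ~ Normal(0, sigma^2).

    Each error is drawn independently (assumption 3) with the same
    sigma at every x (assumption 2); the x values are simply taken as
    given (assumption 4).
    """
    rng = random.Random(seed)
    return [A + B * x + rng.gauss(0.0, sigma) for x in x_values]

# Hypothetical parameter values, assumed for illustration only.
A, B, SIGMA = 8.5, 0.9, 5.0

# Many simulated students at two fixed model exam scores.
xs = [60] * 2000 + [85] * 2000
ys = simulate_main_exam(xs, A, B, SIGMA)

for x in (60, 85):
    group = [y for xi, y in zip(xs, ys) if xi == x]
    # The group mean should be close to A + B*x (assumption 1) and the
    # group variance close to SIGMA**2 at both values of x (assumption 2).
    print(x, A + B * x, round(fmean(group), 2), round(pvariance(group), 2))
```

Under these assumed values, the simulated means at x = 60 and x = 85 differ, while the spread around each mean is about the same.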

    Since we do not have all of the students' model exam points and main exam points, we must estimate the regression line E(Y|X = x) = A + Bx from a sample.
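The text does not name an estimation method at this point. One common choice is ordinary least squares, sketched below under that assumption. The sample scores are hypothetical, and `fit_line`, `a` and `b` are illustrative names for the sample estimates of A and B.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Return least-squares estimates (a, b) of the intercept A and slope B."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    # Sum of cross-products and sum of squares about the sample means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # estimated slope
    a = y_bar - b * x_bar    # estimated intercept; line passes through (x_bar, y_bar)
    return a, b

# Hypothetical sample of five students (illustrative values only).
model_exam = [60, 70, 80, 85, 95]
main_exam = [58, 72, 77, 88, 93]

a, b = fit_line(model_exam, main_exam)

# Estimated mean main exam score for a student with 85 on the model exam.
y_hat_85 = a + b * 85
print("a =", round(a, 3), "b =", round(b, 3), "estimate at x = 85:", round(y_hat_85, 2))
```

The value `a + b * x` is the height of the fitted line at x, which serves as the estimate of the true mean of Y at that x.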

    The figure shows a line that has been constructed on the scatter diagram. Note that the line appears to pass through the middle of the plotted points. The term [equation image: 2148_simple linear regression.png] is the estimate of the true mean of the Y values at any particular X = x.

    Figure 8: Assumptions in regression [image: 682_assumptions in regression.png]
Posted Date: 9/15/2012 5:05:34 AM | Location : United States






