Define the term multicollinearity, Applied Statistics

Question:

(a) (i) Define the term multicollinearity.

(ii) Explain why it is important to guard against multicollinearity.
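As an illustration of part (a), here is a minimal sketch of one way to detect multicollinearity between two predictors. It assumes a model with exactly two predictors, in which case the variance inflation factor (VIF) reduces to 1 / (1 - r^2), where r is their Pearson correlation. The function names and the data are hypothetical, and the "VIF above 10" threshold mentioned in the comments is only a common rule of thumb.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF of either predictor when the model has exactly two predictors."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        # Perfect collinearity: the coefficient variance is unbounded.
        return math.inf
    return 1.0 / (1.0 - r_squared)

# Hypothetical predictors: x2 is almost exactly 2 * x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 8.0, 9.8]
x3 = [2, 1, 3, 1, 2]

vif_high = vif_two_predictors(x1, x2)  # far above the rule-of-thumb value of 10
vif_low = vif_two_predictors(x1, x3)   # close to 1, i.e. no inflation
```

A large VIF means the coefficient estimates for the correlated predictors have inflated variance, which is why the regression results become unstable.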

(b) (i) Sometimes we encounter missing values in databases with a large number of fields. A common method of handling missing values is simply to omit from the analysis the records or fields with missing values. Explain why this may be dangerous.

(ii) Data analysts have therefore turned to methods that replace a missing value with a substitute chosen according to various criteria. Briefly describe three possible replacement values for missing data.
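To make part (b)(ii) concrete, the sketch below fills missing entries with the mean, the median, or the mode of the observed values. These three are only one possible choice of replacement values (a fixed constant or a value drawn at random from the observed distribution are others). The function name `impute` and the use of `None` as the missing-value marker are assumptions made for this example.

```python
import statistics

def impute(values, strategy="mean"):
    """Return a copy of `values` with None replaced by a summary statistic.

    `strategy` is one of "mean", "median" or "mode" (hypothetical names).
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")

    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mode":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return [fill if v is None else v for v in values]

# Hypothetical field with two missing entries.
ages = [23, None, 31, 23, None, 40]
by_mean = impute(ages, "mean")
by_median = impute(ages, "median")
by_mode = impute(ages, "mode")
```

The mode is the natural choice for categorical fields, while the mean and median apply to numerical ones; the median is less affected by outliers.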

(c) Variables tend to have ranges that vary greatly from each other. Data miners should normalise the numerical variables to standardise the scale of effect each variable has on the results. Name two techniques for normalisation and explain how they differ.
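Two widely used techniques that fit part (c) are min-max normalisation and z-score standardisation. The sketch below is one way to write them, assuming the sample standard deviation (divisor n - 1) for the z-score; the function names are hypothetical.

```python
import statistics

def min_max(values):
    """Rescale values to the interval [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Centre on the mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD, divisor n - 1
    if sd == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mean) / sd for v in values]

# Hypothetical variable measured on a large scale.
incomes = [10, 20, 30]
scaled = min_max(incomes)
standardised = z_score(incomes)
```

Min-max output is bounded by 0 and 1 but depends entirely on the two extreme values, whereas z-scores are unbounded and express each value as a number of standard deviations from the mean.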

(d) The usual measure used to evaluate estimation and prediction models is the mean square error (MSE). Write down the expression for the MSE.
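For part (d), the most common form of the expression is MSE = (1/n) * sum over i of (y_i - yhat_i)^2, where y_i is the actual value and yhat_i the predicted value. Some regression texts instead divide the sum of squared errors by n - m - 1, with m the number of predictors; the sketch below assumes the simpler 1/n form, and the function name is hypothetical.

```python
def mse(actual, predicted):
    """Mean squared error: average of squared differences (divisor n)."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must not be empty")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical actual and predicted values.
y = [1, 2, 3]
y_hat = [1, 2, 5]
error = mse(y, y_hat)  # squared errors are 0, 0 and 4
```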

(e) (i) Explain briefly the term measures of variability.

(ii) Give four examples of typical measures of variability.
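Four measures that are often given for part (e)(ii) are the range, the interquartile range, the variance and the standard deviation. The sketch below computes them with Python's `statistics` module, assuming sample (n - 1) versions of the variance and standard deviation and the module's default quartile method; the function name and the data are hypothetical.

```python
import statistics

def variability_summary(values):
    """Return four common measures of spread for a numeric sample."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # sample variance
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }

# Hypothetical sample.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = variability_summary(scores)
```

The range and interquartile range are based on positions in the ordered data, while the variance and standard deviation are based on squared deviations from the mean.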

Posted Date: 11/20/2013 5:26:30 AM | Location : United States






