Define the term multicollinearity, Applied Statistics

Question:

(a) (i) Define the term multicollinearity.

(ii) Explain why it is important to guard against multicollinearity.

(b) (i) Sometimes we encounter missing values in databases with a large number of fields. A common method of handling missing values is simply to omit from the analysis the records or fields with missing values. Explain why this may be dangerous.

(ii) Data analysts have turned to methods that replace a missing value with a value substituted according to various criteria. Briefly give three possible choices of replacement value for missing data.

(c) Variables tend to have ranges that vary greatly from each other. Data miners should normalise the numerical variables to standardise the scale of effect each variable has on the results. Name two techniques for normalisation and explain how they differ.

(d) The usual measure used to evaluate estimation and prediction models is the mean square error (MSE). Write down the expression for the MSE.

(e) (i) Explain briefly the term measures of variability.
(ii) Give four examples of typical measures of variability.

Posted Date: 11/20/2013 5:26:30 AM | Location : United States
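
A rough sketch for parts (c) and (d), offered as an illustration and not as a model answer. It assumes the two normalisation techniques meant in (c) are min-max normalisation, which rescales values to the range [0, 1], and z-score standardisation, which centres values on the mean and divides by the standard deviation. For (d) it assumes the simple form MSE = (1/n) * sum of (y_i - yhat_i)^2; some textbooks divide by the residual degrees of freedom instead of n. The function names are hypothetical, and the population standard deviation is an arbitrary choice here.

```python
from statistics import mean, pstdev


def min_max_normalise(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalisation is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardise(values):
    """Centre values on their mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population (not sample) standard deviation
    if sigma == 0:
        raise ValueError("z-score standardisation is undefined for constant data")
    return [(v - mu) / sigma for v in values]


def mean_square_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    incomes = [10, 20, 30]
    print(min_max_normalise(incomes))
    print(z_score_standardise(incomes))
    print(mean_square_error([3, 5, 7], [2, 5, 9]))
```

The main difference the sketch brings out: min-max output is bounded by the observed minimum and maximum, so a single extreme value compresses everything else, whereas z-scores are unbounded and express each value as a number of standard deviations from the mean.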







Related Questions
In a study of outcomes for patients who had been in the Intensive Care Unit (ICU) at a large hospital, the records from the last 150 patients who had been in the ICU for more than one

Grouped Data  In order to find the median, the median class is to be first located and then interpolation is to be used by assuming that items are evenly spaced over the entire

Steps in ANOVA The three steps which constitute the analysis of variance are as follows: To determine an estimate of the population variance from the variance that exi

How can we graph a trend line by the semi-averages and least squares methods?

A medical researcher has 100 bone cancer patients in a study. Every patient's condition is one of six types, type "A" to type "F". The 100 patients split as follows: x There


Geometric Mean is defined as the nth root of the product of the numbers to be averaged. The geometric mean of numbers X_1, X_2, X_3, ..., X_n is given as

Try different numbers of clusters in your program (K=2...15) and build a plot that shows the dependency between the number K and the value of the RSS function on the last iteration. What is th

Differentiate between prediction, projection and forecasting.

Henry Kaiser suggested a rule for selecting a number of components m less than the number needed for perfect reconstruction: set m equal to the number of eigenvalues greater than I