Q. Construction of an explanatory model?
Construction of a sample: Multiple regression generally requires a large sample (ideally between 2,000 and 15,000 individuals).
Note that for time series data, far fewer observations are required.
Data collection: Reliable data should be collected from a monitoring system, from a questionnaire survey, or from a combination of both.
Calculation of coefficients: Coefficients can be calculated comparatively easily using statistical software that is both accessible and affordable for PC users.
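To make the calculation step concrete, here is a minimal sketch of what such software does when it estimates coefficients by ordinary least squares. It assumes the usual normal-equation approach; the function name fit_ols and the small data set are hypothetical and chosen only for illustration. In practice a statistical package would be used instead.

```python
def fit_ols(X, y):
    """Return [intercept, b1, ..., bk] minimising the sum of squared residuals."""
    n = len(X)
    # Design matrix with a leading column of ones for the intercept.
    A = [[1.0] + [float(v) for v in row] for row in X]
    p = len(A[0])
    # Augmented normal-equation system [A^T A | A^T y].
    M = [
        [sum(A[i][r] * A[i][c] for i in range(n)) for c in range(p)]
        + [sum(A[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; coefficients are not unique")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[r][p] / M[r][r] for r in range(p)]


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
coeffs = fit_ols(X, y)  # intercept first, then one coefficient per predictor
```

Because the illustrative data contain no noise, the estimated coefficients recover the intercept 1 and the slopes 2 and 3 up to rounding error. With real data the fit is never exact, which is why the model then has to be tested.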
Test of the model: The model aims to explain as much of the variability in the observed changes as possible. To check how useful a linear regression equation is, examine the coefficient of determination R², the square of the (multiple) correlation coefficient R. It gives the proportion of the variability in the Y variable that is explained by the X variables. For example, a correlation coefficient of 0.9 means that 81% (0.9² = 0.81) of the variability in Y is captured by the variables X1 to Xk used in the equation. The part that remains unexplained is the residual (ε). The smaller the residuals, the better the fit of the model. Analysis of the residuals is a very important step: it is at this stage that one sees how well the model fits the phenomena one wants to explain. Residual analysis also shows whether the tool has estimated the effects in a reasonable way. If significant anomalies are detected, the regression model must not be used to estimate effects, and the original causal model must be re-examined to see whether further predictor variables can be introduced.
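The test step can be sketched in a few lines. The example below computes the share of variability explained and the residuals from observed and fitted values, then flags unusually large residuals. The function names, the sample numbers, and the cut-off of 2 standard deviations are assumptions made for illustration, not a prescribed procedure.

```python
import math


def r_squared(y, y_hat):
    """Share of the variability in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))  # unexplained part
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)                # total variability
    return 1.0 - ss_res / ss_tot


def flag_large_residuals(y, y_hat, threshold=2.0):
    """Indices of observations whose residual exceeds `threshold` times the
    root-mean-square residual (a rough, assumed rule of thumb)."""
    residuals = [obs - fit for obs, fit in zip(y, y_hat)]
    scale = math.sqrt(sum(e ** 2 for e in residuals) / len(residuals))
    if scale == 0:
        return []  # perfect fit: nothing to flag
    return [i for i, e in enumerate(residuals) if abs(e) / scale > threshold]


# Hypothetical observed values and fitted values from some regression model.
observed = [2.0, 4.1, 5.9, 8.2, 9.8]
fitted = [2.0, 4.0, 6.0, 8.0, 10.0]

r2 = r_squared(observed, fitted)
suspicious = flag_large_residuals(observed, fitted)
```

Here the residuals are small relative to the spread of the observed values, so r2 is close to 1 and no observation is flagged. A low value, or residuals that show a clear pattern or a few extreme points, would be the kind of anomaly that calls for re-examining the causal model.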