Data reduction, Applied Statistics

Principal component analysis (PCA) is among the oldest multivariate statistical methods of data reduction. It simplifies a dataset by reducing its many dimensions to a few for analysis. It produces a small number of derived variables that are uncorrelated and that account for most of the variation in the original data set. By reducing the number of variables in this way, we can understand the underlying structure of the data. The derived variables are combinations of the original variables. For example, suppose students take 10 examinations, and some students do well in one examination while others do better in another. It is difficult to compare one student with another when there are 10 marks to consider. One obvious way of comparing students is to calculate the mean score.

The mean score is itself a constructed combination of the existing variables. However, one might get a more useful comparison of overall performance by considering other constructed combinations of the 10 exam marks. PCA is one way of constructing such combinations, and it does so in a way that accounts for the maximum possible variation in the original data. We can then compare students' performance by considering this much smaller number of variables.
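The exam example can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a definitive implementation: the marks are randomly generated stand-in data, and the names `marks`, `pca`, `weights`, `scores`, and `explained` are hypothetical names chosen for this sketch. It builds the combinations from the eigenvectors of the sample covariance matrix, which is one standard way of computing principal components.

```python
import numpy as np

# Hypothetical data (an assumption for illustration): 50 students, 10 exam marks each.
# Each mark is an overall-ability term plus exam-specific noise.
rng = np.random.default_rng(0)
ability = rng.normal(60, 10, size=(50, 1))
marks = ability + rng.normal(0, 5, size=(50, 10))


def pca(data, n_components):
    """Return component scores, weights, and the share of variance explained."""
    centered = data - data.mean(axis=0)        # remove each exam's mean mark
    cov = np.cov(centered, rowvar=False)       # 10 x 10 sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    weights = eigvecs[:, :n_components]        # each column defines one combination
    scores = centered @ weights                # derived variables, one row per student
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, weights, explained


scores, weights, explained = pca(marks, n_components=2)
mean_score = marks.mean(axis=1)                # the "obvious" combination, for comparison

print("share of variance per component:", np.round(explained, 3))
print("correlation between the two components:",
      np.round(np.corrcoef(scores, rowvar=False)[0, 1], 6))
```

Each student is now described by two scores instead of 10 marks, and the two scores are uncorrelated by construction. Note that the sign of each component is arbitrary, so a high first-component score may correspond to either strong or weak overall performance until the weights are inspected.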

PCA states and then solves a well-defined statistical problem and, except in special cases, always gives a unique solution with some very nice mathematical properties. We can even describe some very artificial practical problems for which PCA provides the exact solution. The difficulty comes in trying to relate PCA to real-life scientific problems; the match is simply not very good. PCA does often provide a good approximation to common factor analysis, but that feature is now unimportant since both methods are now easy enough to carry out.
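For reference, the well-defined problem can be written out as follows. The notation is an assumption of this sketch and does not come from the text above: x is the vector of p original variables, Σ is its covariance matrix, and a_k is the weight vector of the k-th component.

```latex
% k-th principal component: the combination z_k = a_k^T x with the largest
% variance, uncorrelated with all earlier components.
\[
  a_k \;=\; \arg\max_{a}\; \operatorname{Var}\!\left(a^{\top}x\right)
        \;=\; \arg\max_{a}\; a^{\top}\Sigma\,a
  \quad\text{subject to}\quad
  a^{\top}a = 1,\qquad a^{\top}\Sigma\,a_j = 0 \;\; (j < k).
\]
% The solution is given by the eigenvectors of \Sigma, ordered by eigenvalue:
\[
  \Sigma\,a_k = \lambda_k a_k,
  \qquad
  \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0,
  \qquad
  \operatorname{Var}(z_k) = \lambda_k .
\]
```

One reading of the "special cases" mentioned above is repeated eigenvalues: when two eigenvalues are equal, the corresponding weight vectors are not uniquely determined. Even when all eigenvalues are distinct, each a_k is unique only up to a change of sign.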

Posted Date: 4/4/2013 3:43:13 AM | Location : United States






