Data reduction, Applied Statistics

Principal component analysis (PCA) is among the oldest multivariate statistical methods of data reduction. It simplifies a dataset by reducing multidimensional data to a lower number of dimensions for analysis. It produces a small number of derived variables that are uncorrelated and that account for most of the variation in the original data set. By reducing the number of variables in this way, we can understand the underlying structure of the data. The derived variables are linear combinations of the original variables. For example, it might be that students take 10 examinations and some students do well in one examination while other students do better in another. It is difficult to compare one student with another when we have 10 marks to consider. One obvious way of comparing students is to calculate the mean score.

This is a constructed combination of the existing variables. However, one might get a more useful comparison of overall performance by considering other constructed combinations of the 10 exam marks. PCA is one way of constructing such combinations, doing so in such a way as to account for the maximum possible variation in the original data. We can then compare students' performance by considering this much smaller number of variables.
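The construction described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming NumPy is available; the function name `pca`, the synthetic marks, and the choice of two components are assumptions made for the example and are not part of the original post.

```python
import numpy as np

def pca(marks, n_components):
    """PCA via eigendecomposition of the covariance matrix.

    marks: array of shape (students, exams).
    Returns (scores, components, explained_ratio).
    """
    # Centre each exam so that every column has mean zero.
    centered = marks - marks.mean(axis=0)
    # Covariance matrix between exams (columns are the variables).
    cov = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]     # weights of each combination
    scores = centered @ components             # derived variables per student
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return scores, components, explained_ratio

# Synthetic (made-up) data: 200 students, 10 exams, one shared "ability"
# factor plus independent noise per exam. Purely illustrative.
rng = np.random.default_rng(0)
ability = rng.normal(60, 12, size=(200, 1))
marks = ability + rng.normal(0, 5, size=(200, 10))

scores, components, explained_ratio = pca(marks, n_components=2)
print(explained_ratio)   # share of total variance carried by each component
```

With data generated this way, the first component should weight all 10 exams roughly equally, so it behaves much like the mean score; with real marks the weights would generally differ from exam to exam.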

PCA states and then solves a well-defined statistical problem and, except in special cases, always gives a unique solution with some very nice mathematical properties. We can even describe some very artificial practical problems for which PCA provides the exact solution. The difficulty comes in trying to relate PCA to real-life scientific problems; the match is simply not very good. In practice, PCA often provides a good approximation to common factor analysis, but that feature is now unimportant since both methods are now easy enough to carry out.
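For reference, the well-defined problem mentioned above is usually written as follows. The notation is an assumption introduced here, not taken from the original post: x is the vector of original variables, Σ its covariance matrix, and a_k the weight vector of the k-th derived variable.

```latex
% First component: the unit-length combination with maximum variance.
a_1 = \arg\max_{a^\top a = 1} \operatorname{Var}(a^\top x)
    = \arg\max_{a^\top a = 1} a^\top \Sigma a

% The solutions are eigenvectors of \Sigma, ordered by eigenvalue:
\Sigma a_k = \lambda_k a_k, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge 0

% Resulting properties of the derived variables:
\operatorname{Var}(a_k^\top x) = \lambda_k, \qquad
\operatorname{Cov}(a_j^\top x,\, a_k^\top x) = 0 \quad (j \ne k)
```

One plausible reading of the "special cases" is repeated eigenvalues: when two values of λ_k coincide, the corresponding weight vectors are no longer determined uniquely. Even with distinct eigenvalues, each a_k is unique only up to sign.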

Posted Date: 4/4/2013 3:43:13 AM | Location : United States






