Data reduction, Applied Statistics

Principal component analysis (PCA) is among the oldest multivariate statistical methods of data reduction. It simplifies a dataset by reducing multidimensional data to fewer dimensions for analysis. It produces a small number of derived variables that are uncorrelated and that account for most of the variation in the original data set. By reducing the number of variables in this way, we can understand the underlying structure of the data. The derived variables are combinations of the original variables. For example, it might be that students take 10 examinations and some students do well in one examination while other students do better in another. It is difficult to compare one student with another when we have 10 marks to consider. One obvious way of comparing students is to calculate the mean score.

The mean score is itself a constructed combination of the existing variables. However, one might get a more useful comparison of overall performance by considering other constructed combinations of the 10 exam marks. PCA is one way of constructing such combinations, doing so in such a way as to account for the maximum possible variation in the original data. We can then compare students' performance by considering this much smaller number of variables.
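As a rough illustration of the exam example, the sketch below builds principal components from a students-by-exams table of marks using NumPy. The function name `pca`, the variable names, and the simulated marks are all assumptions made for this illustration; they do not come from the original post, and the data are randomly generated rather than real results.

```python
import numpy as np

def pca(marks, n_components=2):
    """Principal components of a (students x exams) array of marks.

    Returns the component scores, the weights (one column per component),
    and the share of total variance each retained component explains.
    """
    X = np.asarray(marks, dtype=float)
    centered = X - X.mean(axis=0)            # remove each exam's mean
    cov = np.cov(centered, rowvar=False)     # exams x exams covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]        # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    weights = eigvecs[:, :n_components]
    scores = centered @ weights              # derived variables per student
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, weights, explained

# Hypothetical data: 50 students, 10 exams, one shared "ability" plus noise.
rng = np.random.default_rng(0)
ability = rng.normal(60, 12, size=(50, 1))
marks = ability + rng.normal(0, 6, size=(50, 10))

scores, weights, explained = pca(marks, n_components=2)
mean_score = marks.mean(axis=1)              # the "obvious" combination

print("share of variance explained:", explained)
print("first-component weights:", weights[:, 0])
```

The mean score gives every exam the same weight of 1/10, whereas the first principal component chooses its weights to capture as much of the variation between students as possible. The sign of a component is arbitrary, so its scores may run in the opposite direction to the mean score.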

PCA states and then solves a well-defined statistical problem and, except in special cases, always gives a unique solution with some very nice mathematical properties. We can even describe some very artificial practical problems for which PCA provides the exact solution. The difficulty comes in trying to relate PCA to real-life scientific problems; the match is simply not very good. PCA often provides a good approximation to common factor analysis, but that feature is now unimportant since both methods are now easy enough to carry out.
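For reference, the well-defined problem can be written out in its standard textbook form. The notation here ($x$ for the vector of original variables, $S$ for their sample covariance matrix, $a_k$ for the weights of the $k$-th component) is assumed for this sketch and does not appear in the original post.

```latex
% First component: the unit-length weight vector with maximum variance
a_1 = \arg\max_{a^\top a = 1} \operatorname{Var}(a^\top x)
    = \arg\max_{a^\top a = 1} a^\top S a .

% Later components solve the same problem subject to being
% uncorrelated with the earlier ones; the solutions are eigenvectors of S:
S a_k = \lambda_k a_k , \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 , \qquad
\operatorname{Var}(a_k^\top x) = \lambda_k .

% Share of total variation accounted for by the first m components
\frac{\sum_{k=1}^{m} \lambda_k}{\sum_{j=1}^{p} \lambda_j} .
```

Under this formulation, the solution is unique up to the sign of each $a_k$ whenever the eigenvalues are distinct; repeated eigenvalues are presumably the kind of special case the text refers to.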

Posted Date: 4/4/2013 3:43:13 AM | Location : United States






