Data reduction, Applied Statistics

Principal component analysis (PCA) is among the oldest multivariate statistical methods of data reduction. It simplifies a dataset by reducing multidimensional data to fewer dimensions for analysis. It produces a small number of derived variables that are uncorrelated and that account for most of the variation in the original data set. By reducing the number of variables in this way, we can understand the underlying structure of the data. The derived variables are linear combinations of the original variables. For example, suppose students take 10 examinations, and some students do well in one examination while others do better in another. It is difficult to compare one student with another when there are 10 marks to consider. One obvious way of comparing students is to calculate the mean score.

The mean is itself a constructed combination of the existing variables. However, one might get a more useful comparison of overall performance by considering other constructed combinations of the 10 exam marks. PCA is one way of constructing such combinations, and it does so in a way that accounts for the maximum possible variation in the original data. We can then compare students' performance by considering this much smaller number of variables.
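The exam example can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the marks are synthetic (a made-up "general ability" term plus noise), and the function name `pca` and all variable names are hypothetical, not taken from the text. The sketch computes the components from the eigen-decomposition of the sample covariance matrix.

```python
import numpy as np


def pca(data):
    """Principal components from the sample covariance matrix.

    Returns (explained_ratio, weights, scores), ordered from the
    component with the largest variance to the smallest.
    """
    centered = data - data.mean(axis=0)          # remove each exam's mean
    cov = np.cov(centered, rowvar=False)         # 10 x 10 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                  # derived (uncorrelated) variables
    explained_ratio = eigvals / eigvals.sum()    # share of total variance
    return explained_ratio, eigvecs, scores


# Synthetic marks (an assumption for illustration): 200 students, 10 exams,
# each mark = 60 + a shared ability term + exam-specific noise.
rng = np.random.default_rng(0)
n_students, n_exams = 200, 10
ability = rng.normal(0.0, 10.0, size=(n_students, 1))
marks = 60.0 + ability + rng.normal(0.0, 5.0, size=(n_students, n_exams))

explained_ratio, weights, scores = pca(marks)
mean_score = marks.mean(axis=1)

print("Share of variance, first three components:", np.round(explained_ratio[:3], 3))
print("|corr(first component, mean score)|:",
      round(abs(np.corrcoef(scores[:, 0], mean_score)[0, 1]), 3))
```

In data built this way the first component carries most of the variation and tracks the mean score closely. With real marks the weights would generally be unequal, which is where PCA can differ from a simple average.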

PCA states and then solves a well-defined statistical problem and, except in special cases, always gives a unique solution with some very nice mathematical properties. One can even describe some very artificial practical problems for which PCA provides the exact solution. The difficulty comes in relating PCA to real-life scientific problems, where the match is simply not very good. PCA often provides a good approximation to common factor analysis, but that feature is now unimportant since both methods are now easy enough to carry out.
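The statistical problem that PCA solves can be written compactly. The notation here is an assumption introduced for illustration and does not appear in the text above: S is the sample covariance matrix of the p original variables, and w_k is the weight vector of the k-th derived variable.

```latex
\begin{aligned}
w_1 &= \arg\max_{\|w\| = 1} \; w^{\top} S\, w, \\
w_k &= \arg\max_{\substack{\|w\| = 1 \\ w \perp w_1, \dots, w_{k-1}}} \; w^{\top} S\, w,
      \qquad k = 2, \dots, p, \\
S\, w_k &= \lambda_k w_k,
      \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 .
\end{aligned}
```

The maximizers are the eigenvectors of S, and the k-th derived variable has variance \lambda_k, so its share of the total variation is \lambda_k / \sum_j \lambda_j. Each w_k is determined up to sign when the eigenvalues are distinct; tied eigenvalues are the special cases in which the solution is not unique.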


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd