K-means cluster analysis, Advanced Statistics

K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and reassigned, if appropriate, to a different cluster in an attempt to optimize a predefined numerical criterion that measures, in some sense, the 'quality' of the cluster solution. Many such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria arising from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
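As a rough illustration of the description above, the following Python sketch computes W, B and T for a given partition and performs one reassignment sweep under a chosen criterion. It assumes NumPy is available; the function names `scatter_matrices` and `reassign_pass` are hypothetical, introduced only for this example, and the sweep recomputes W from scratch for every trial move, which is simple but slow. Since T = W + B and T does not depend on the partition, minimizing trace W is equivalent to maximizing trace B.

```python
import numpy as np


def scatter_matrices(X, labels):
    """Return (W, B, T) for data X (n x p) and one cluster label per row."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))  # within-groups sums of squares and cross products
    B = np.zeros((p, p))  # between-groups sums of squares and cross products
    for k in np.unique(labels):
        Xk = X[labels == k]
        mean_k = Xk.mean(axis=0)
        D = Xk - mean_k
        W += D.T @ D
        d = (mean_k - grand_mean)[:, None]
        B += len(Xk) * (d @ d.T)
    C = X - grand_mean
    T = C.T @ C  # total sums of squares and cross products
    return W, B, T


def reassign_pass(X, labels, criterion):
    """One sweep: move each observation to the cluster that most lowers criterion(W).

    `criterion` is any function of W, e.g. np.trace or np.linalg.det.
    """
    X = np.asarray(X, dtype=float)
    labels = np.array(labels)  # work on a copy
    clusters = np.unique(labels)
    for i in range(len(X)):
        if np.sum(labels == labels[i]) == 1:
            continue  # assumption: never empty a cluster
        best_k = labels[i]
        best_val = criterion(scatter_matrices(X, labels)[0])
        for k in clusters:
            if k == labels[i]:
                continue
            trial = labels.copy()
            trial[i] = k
            val = criterion(scatter_matrices(X, trial)[0])
            if val < best_val:
                best_k, best_val = k, val
        labels[i] = best_k
    return labels
```

Calling `reassign_pass(X, labels, np.trace)` applies the trace criterion and `reassign_pass(X, labels, np.linalg.det)` the determinant criterion; in practice the sweep would be repeated until no observation changes cluster.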

 

Posted Date: 7/30/2012 1:31:04 AM | Location : United States






