K-means cluster analysis, Advanced Statistics

K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, where appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross-products (W, B and T, respectively), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria derived from these matrices are as follows:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape will not necessarily be spherical.
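
The description above can be made concrete with a small sketch. The Python code below is an illustration only, not a reference implementation: the function names (scatter_matrices, trace_w, det_w, reassign_pass), the toy data and the rule for never emptying a cluster are all assumptions introduced here. It computes W, B and T for a given partition (these satisfy T = W + B) and then sweeps through the observations, moving each one to another cluster only when that lowers the chosen criterion.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Return (W, B, T): within-groups, between-groups and total
    matrices of sums of squares and cross-products for a partition."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in np.unique(labels):
        Xg = X[labels == g]
        group_mean = Xg.mean(axis=0)
        dev = Xg - group_mean
        W += dev.T @ dev                      # spread inside group g
        d = (group_mean - grand_mean).reshape(-1, 1)
        B += len(Xg) * (d @ d.T)              # spread of group means
    return W, B, T

def trace_w(X, labels):
    """Criterion 1: trace of W (to be minimized)."""
    return np.trace(scatter_matrices(X, labels)[0])

def det_w(X, labels):
    """Criterion 2: determinant of W (to be minimized)."""
    return np.linalg.det(scatter_matrices(X, labels)[0])

def reassign_pass(X, labels, k, criterion=trace_w):
    """One sweep over the observations: each is moved to the cluster
    that gives the lowest criterion value, or left where it is.
    Assumption of this sketch: a cluster is never allowed to become empty."""
    labels = np.array(labels, copy=True)
    for i in range(len(labels)):
        current = labels[i]
        if np.sum(labels == current) == 1:
            continue  # moving the only member would empty the cluster
        best_label, best_value = current, criterion(X, labels)
        for g in range(k):
            if g == current:
                continue
            labels[i] = g                     # trial reassignment
            value = criterion(X, labels)
            if value < best_value:
                best_label, best_value = g, value
        labels[i] = best_label                # keep the best option found
    return labels

# Hypothetical toy data: two tight groups, with the last point
# deliberately placed in the wrong cluster of the initial partition.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
initial_labels = np.array([0, 0, 0, 0, 1, 1, 1, 0])

final_labels = reassign_pass(X, initial_labels, k=2, criterion=trace_w)
print("trace W before:", trace_w(X, initial_labels))
print("trace W after: ", trace_w(X, final_labels))
print("final partition:", final_labels)
```

Passing criterion=det_w to reassign_pass switches the sweep to the second criterion. In practice the sweep would be repeated until no observation changes cluster; a single pass is shown here to keep the sketch short.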

 

Posted Date: 7/30/2012 1:31:04 AM | Location : United States






