K-means cluster analysis, Advanced Statistics

K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and reassigned, if appropriate, to a different cluster in an attempt to optimize some predefined numerical criterion that measures, in some sense, the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering the properties of the within-groups, between-groups and total matrices of sums of squares and cross-products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. These matrices satisfy T = W + B, and T does not depend on the partition, so making W 'small' makes B correspondingly 'large'. The two most common clustering criteria arising from these matrices are as follows

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
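
The reassignment idea and the two criteria can be illustrated with a small sketch. It is only one possible reading of the description above, not a reference implementation: the function names (within_matrix, reassign_pass and so on), the toy data and the deliberately poor initial partition are all assumptions made for the example, and the code recomputes W from scratch after every trial move, which is simple but slow.

```python
# Illustrative sketch only: names and data are hypothetical, chosen for this example.

def mean(rows):
    """Column-wise mean of a list of observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def within_matrix(data, labels):
    """W: pooled sums of squares and cross-products about each cluster mean."""
    p = len(data[0])
    w = [[0.0] * p for _ in range(p)]
    for g in set(labels):
        members = [x for x, lab in zip(data, labels) if lab == g]
        centre = mean(members)
        for x in members:
            d = [xi - ci for xi, ci in zip(x, centre)]
            for i in range(p):
                for j in range(p):
                    w[i][j] += d[i] * d[j]
    return w

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, result = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def reassign_pass(data, labels, k, criterion):
    """Visit each observation in turn and move it to whichever cluster
    gives the lowest value of criterion(W); keep it in place otherwise."""
    labels = list(labels)
    for idx in range(len(data)):
        current = labels[idx]
        if labels.count(current) == 1:
            continue  # assumption: never empty a cluster
        best, best_val = current, criterion(within_matrix(data, labels))
        for g in range(k):
            if g == current:
                continue
            labels[idx] = g  # trial move
            val = criterion(within_matrix(data, labels))
            if val < best_val:
                best, best_val = g, val
        labels[idx] = best
    return labels

# Toy data: two well-separated groups, with a deliberately poor initial partition.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
initial = [0, 0, 1, 1, 1, 0]

final = reassign_pass(data, initial, k=2, criterion=trace)
w_final = within_matrix(data, final)
print(final)  # [0, 0, 0, 1, 1, 1]
print(trace(within_matrix(data, initial)), "->", trace(w_final))
```

Passing criterion=det instead of criterion=trace switches the same pass to the determinant criterion. In practice the pass would be repeated until no observation changes cluster, and a single pass from a different starting partition need not reach the same solution.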

 

Posted Date: 7/30/2012 1:31:04 AM | Location : United States






