K-means cluster analysis, Advanced Statistics

K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and reassigned, if appropriate, to a different cluster in an attempt to optimize some predefined numerical criterion that measures, in some sense, the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering the properties of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria arising from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although this shape will not necessarily be spherical.
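The two criteria and the reassignment step can be illustrated with a minimal Python sketch. It is not a reference implementation: the function names (within_scatter, trace, determinant, relocate), the toy data and the starting partition are all assumptions made for illustration. W is taken here to be the pooled within-groups matrix, that is, the sum over clusters of the outer products of each observation's deviation from its cluster mean. The relocation loop recomputes W from scratch for every candidate move, which is slow but keeps the logic easy to follow.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]

def within_scatter(points: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> Matrix:
    """Pooled within-groups matrix W of sums of squares and cross products."""
    p = len(points[0])
    W = [[0.0] * p for _ in range(p)]
    for g in range(k):
        members = [x for x, lab in zip(points, labels) if lab == g]
        if not members:
            continue  # an empty cluster contributes nothing to W
        mean = [sum(x[j] for x in members) / len(members) for j in range(p)]
        for x in members:
            d = [x[j] - mean[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    W[a][b] += d[a] * d[b]
    return W

def trace(M: Matrix) -> float:
    return sum(M[i][i] for i in range(len(M)))

def determinant(M: Matrix) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, det = len(A), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            return 0.0  # treated as singular
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det  # a row swap flips the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= f * A[c][j]
    return det

def relocate(points, labels, k: int, criterion: Callable[[Matrix], float]) -> List[int]:
    """Take each observation in turn and move it to the cluster that most
    reduces criterion(W); repeat full passes until no move helps."""
    labels = list(labels)
    moved = True
    while moved:
        moved = False
        for i in range(len(points)):
            original = labels[i]
            best_g = original
            best_val = criterion(within_scatter(points, labels, k))
            for g in range(k):
                if g == original:
                    continue
                labels[i] = g  # try the move
                val = criterion(within_scatter(points, labels, k))
                if val < best_val - 1e-12:
                    best_g, best_val = g, val
            labels[i] = best_g  # keep the best move, or restore the original
            if best_g != original:
                moved = True
    return labels

# Toy data (assumed for illustration): two well-separated groups in the plane,
# with a deliberately poor starting partition.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
start = [0, 1, 0, 1, 0, 1]
final = relocate(points, start, k=2, criterion=trace)
W_final = within_scatter(points, final, 2)
print(final, trace(W_final), determinant(W_final))
```

Passing criterion=determinant instead of trace applies the second criterion. Note that the determinant of W is zero whenever W is singular (for example, when the observations within every cluster lie on a common line), so a practical implementation would need to guard against such degenerate partitions; this sketch does not.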

 

Posted Date: 7/30/2012 1:31:04 AM | Location : United States






