K-means cluster analysis, Advanced Statistics

K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and reassigned, if appropriate, to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering the within-groups, between-groups and total matrices of sums of squares and cross products (W, B and T respectively), which can be defined for every partition of the observations into a given number of groups. The two most common clustering criteria arising from these matrices are as follows:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape need not be spherical.
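The reassignment procedure and the two criteria described above can be illustrated with a minimal sketch in plain Python. This is an illustration under stated assumptions, not a reference implementation: the function names (within_scatter, trace, determinant, reassign), the sample points and the initial partition are all hypothetical, and the sketch recomputes W from scratch for every trial move, which is simple but slow. It also does nothing to stop a group from becoming empty.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def within_scatter(points: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> Matrix:
    """Pooled within-groups matrix W of sums of squares and cross products."""
    p = len(points[0])
    w = [[0.0] * p for _ in range(p)]
    for g in range(k):
        members = [x for x, lab in zip(points, labels) if lab == g]
        if not members:
            continue  # an empty group contributes nothing to W
        mean = [sum(x[j] for x in members) / len(members) for j in range(p)]
        for x in members:
            d = [x[j] - mean[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    w[a][b] += d[a] * d[b]
    return w


def trace(m: Matrix) -> float:
    """Sum of the diagonal elements (first criterion)."""
    return sum(m[i][i] for i in range(len(m)))


def determinant(m: Matrix) -> float:
    """Determinant by Gaussian elimination with partial pivoting (second criterion)."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # treated as singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return det


def reassign(points: Sequence[Sequence[float]], labels: Sequence[int], k: int,
             criterion: Callable[[Matrix], float]) -> List[int]:
    """Take each observation in turn and move it to the group that most
    lowers the criterion; stop when a full pass makes no move."""
    labels = list(labels)
    improved = True
    while improved:
        improved = False
        for i in range(len(points)):
            original = labels[i]
            best_g = original
            best = criterion(within_scatter(points, labels, k))
            for g in range(k):
                if g == original:
                    continue
                labels[i] = g  # trial move
                value = criterion(within_scatter(points, labels, k))
                if value < best - 1e-12:
                    best, best_g = value, g
            labels[i] = best_g
            if best_g != original:
                improved = True
    return labels


# Hypothetical data: two compact groups, with a deliberately poor start.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
initial = [0, 1, 0, 1, 0, 1, 0, 1]
final = reassign(points, initial, 2, trace)
w_final = within_scatter(points, final, 2)
```

Passing determinant instead of trace as the criterion argument switches the sketch to the second criterion. Because a move is accepted only when it strictly lowers the criterion, the loop stops at a local optimum, which depends on the initial partition and need not be the best possible solution.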

 

Posted Date: 7/30/2012 1:31:04 AM | Location : United States






