Implement a simple k-means method, Applied Statistics

An unclassified data set contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and the groups of similar data.

1. Implement a simple k-means method that can handle real-valued attributes. The program must also support the Euclidean, City Block, Squared Euclidean and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.
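One possible starting point for item 1 is sketched below in plain Python. It is only an illustration under stated assumptions, not a reference solution: the names `DISTANCES` and `kmeans`, the `max_iter` cap, and the choice to leave an empty cluster's centroid unchanged are all made up for this sketch, and no weights are used. Note that the mean update is the exact minimiser only for the Squared Euclidean distance; with the other three metrics it is a common heuristic.

```python
import math


def euclidean_squared(a, b):
    # Sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    # Square root of the squared Euclidean distance.
    return math.sqrt(euclidean_squared(a, b))


def city_block(a, b):
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup table so the metric can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "city_block": city_block,
    "euclidean_squared": euclidean_squared,
    "chebyshev": chebyshev,
}


def kmeans(points, centroids, metric="euclidean", max_iter=100):
    """Assign points to the nearest centroid, then move each centroid to
    the mean of its members, until the assignment stops changing."""
    dist = DISTANCES[metric]
    centroids = [list(c) for c in centroids]  # do not modify the caller's data
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: per-attribute mean of the points in each cluster.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans(points, centroids, metric="chebyshev")` runs the same loop with a different distance, so the four metrics can be compared on identical initial centroids.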

2. The archive contains the unlabeled data set test.txt and the initial centroids data set centroids.txt. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set includes 350 samples and the initial centroids set consists of 15 samples. Data instances in both files have 90 attributes.
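A minimal sketch of reading the two files follows. It assumes one data instance per line with whitespace-separated values, and it tolerates literal square brackets around a line in case they are part of the file format; the post does not say which is the case. The function names `parse_rows` and `load_rows` are hypothetical.

```python
def parse_rows(lines, n_attributes=90):
    """Turn text lines into lists of floats, one list per data instance."""
    rows = []
    for line in lines:
        # Drop surrounding whitespace and optional enclosing brackets.
        text = line.strip().strip("[]").strip()
        if not text:
            continue  # skip blank lines
        values = [float(token) for token in text.split()]
        if len(values) != n_attributes:
            raise ValueError(
                f"expected {n_attributes} attributes, got {len(values)}"
            )
        rows.append(values)
    return rows


def load_rows(path, n_attributes=90):
    """Read a whole file (for example test.txt) with parse_rows."""
    with open(path) as handle:
        return parse_rows(handle, n_attributes)
```

After loading, `len(load_rows("test.txt"))` should be 350 and `len(load_rows("centroids.txt"))` should be 15 if the files match the description above; checking both counts before clustering is a cheap way to catch a parsing problem early.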

Posted Date: 4/1/2013 5:55:54 AM | Location : United States






