Implement a simple k-means method, Applied Statistics

An unclassified data set contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and the groups of similar data instances.

1. Implement a simple k-means method that can handle real-valued attributes. Your program must also support the Euclidean, City Block, Euclidean Squared and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set contains 350 samples and the initial centroids set contains 15 samples. Data instances in both files have 90 attributes.
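
The file format described in item 2 could be read along the following lines. This is only a sketch under assumptions the assignment does not confirm: values are taken to be separated by whitespace (commas are tolerated), the square brackets may or may not appear literally in the files, and each line holds one data instance. The names parse_rows and load_dataset are hypothetical helpers introduced for illustration.

```python
def parse_rows(text, n_attributes=90):
    """Parse one data instance per line into a list of float lists.

    Assumption: values are separated by whitespace and/or commas, and any
    literal square brackets around a row can be ignored.
    """
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        cleaned = line.replace("[", " ").replace("]", " ").replace(",", " ")
        fields = cleaned.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != n_attributes:
            raise ValueError(
                f"line {line_no}: expected {n_attributes} values, got {len(fields)}"
            )
        rows.append([float(field) for field in fields])
    return rows


def load_dataset(path, n_attributes=90):
    """Read a whole file and parse it with parse_rows."""
    with open(path, encoding="utf-8") as handle:
        return parse_rows(handle.read(), n_attributes)
```

With the archive unpacked, load_dataset("test.txt") would be expected to return 350 rows and load_dataset("centroids.txt") 15 rows, each with 90 values. Checking these counts right after loading is a cheap way to catch a wrong assumption about the separator.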

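The method asked for in item 1 might be sketched as below, in plain Python with no third-party libraries. The function names, the DISTANCES table and the max_iter parameter are illustrative choices, not part of the assignment. Two simplifications are assumed: centroids are always updated as the arithmetic mean of their members, whichever distance is selected (the mean is the natural update for Euclidean Squared; for City Block a per-attribute median would be the matching choice), and an empty cluster simply keeps its previous centroid. No weights are used.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def euclidean_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup table so the distance can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "euclidean_squared": euclidean_squared,
    "city_block": city_block,
    "chebyshev": chebyshev,
}


def kmeans(points, centroids, metric="euclidean", max_iter=100):
    """Run k-means from the given initial centroids.

    Returns (labels, centroids); labels[i] is the index of the centroid
    nearest to points[i] under the chosen distance.
    """
    dist = DISTANCES[metric]
    centroids = [list(c) for c in centroids]  # do not modify the caller's data
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

For the assignment data, points would be the 350 instances from test.txt and centroids the 15 instances from centroids.txt, for example kmeans(data, initial_centroids, metric="chebyshev"). Running all four distances from the same initial centroids makes it easy to compare the resulting groupings.
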
Posted Date: 4/1/2013 5:55:54 AM | Location : United States






