Implement a simple k-means method, Applied Statistics

You are given an unlabeled data set that contains hidden structure. The task in this assignment is to perform a thorough cluster analysis that reveals this structure and identifies groups of similar data instances.

1. Implement a simple k-means method that can handle real-valued attributes. Your program must also support the Euclidean, city block, squared Euclidean and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.
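A minimal sketch of what task 1 asks for, in plain Python with no third-party libraries. The function names, the `DISTANCES` lookup table, the `max_iter` parameter and the rule that an empty cluster keeps its previous centroid are all assumptions made for illustration; the assignment does not prescribe them. Weights are left out because the task makes them optional.

```python
import math

def euclidean_squared(a, b):
    # Sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(euclidean_squared(a, b))

def city_block(a, b):
    # Also called Manhattan or L1 distance.
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical lookup table so the metric can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "euclidean_squared": euclidean_squared,
    "city_block": city_block,
    "chebyshev": chebyshev,
}

def kmeans(points, centroids, metric="euclidean", max_iter=100):
    """Return (labels, centroids) after iterating assignment and update steps."""
    dist = DISTANCES[metric]
    centroids = [list(c) for c in centroids]  # copy, so the input is not modified
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Note that the mean update minimizes the within-cluster sum of squared Euclidean distances. With the city block or Chebyshev distance the loop still runs, but the mean is no longer the minimizer of the within-cluster distance (for city block, the coordinate-wise median is), so treat those variants as a simple heuristic.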

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set includes 350 samples and the initial centroids set consists of 15 samples. Data instances in both files have 90 attributes.
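The files described in task 2 could be read with a sketch like the one below. It assumes one data instance per line with whitespace-separated values, and it strips square brackets in case they are literally present in the files; the post does not make either point explicit. The names `parse_rows` and `load_rows` are hypothetical.

```python
def parse_rows(lines, n_attributes=90):
    """Parse an iterable of text lines into a list of float vectors."""
    rows = []
    for line_no, line in enumerate(lines, start=1):
        # Assumption: values are whitespace-separated; brackets are optional.
        text = line.strip().strip("[]").strip()
        if not text:
            continue  # skip blank lines
        values = [float(token) for token in text.split()]
        if len(values) != n_attributes:
            raise ValueError(
                f"line {line_no}: expected {n_attributes} values, got {len(values)}"
            )
        rows.append(values)
    return rows

def load_rows(path, n_attributes=90):
    # Thin wrapper that reads the file at `path` line by line.
    with open(path) as handle:
        return parse_rows(handle, n_attributes)
```

If the files match the description, `load_rows("test.txt")` should return 350 rows and `load_rows("centroids.txt")` should return 15 rows, each with 90 values. Checking both lengths before clustering is a cheap way to catch a parsing problem early.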

Posted Date: 4/1/2013 5:55:54 AM | Location : United States






