Implement a simple k-means method, Applied Statistics

You are given an unclassified data set that contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis in order to reveal that structure and identify groups of similar data.

1. Implement a simple k-means method that can handle real-valued attributes. Your program must also support the Euclidean, City Block, Squared Euclidean and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set contains 350 samples and the initial centroids set contains 15 samples. Data instances in both files have 90 attributes.
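
The two tasks above could be approached along the following lines. This is only a minimal sketch in plain Python, not a reference solution. The function names (parse_rows, kmeans), the DISTANCES table and the max_iter parameter are illustrative assumptions and do not come from the assignment. The parser also assumes that each row sits on its own line with whitespace-separated values and that any square brackets or commas are decoration, which should be checked against the actual files. Weights are left out.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def euclidean_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical lookup table so the distance can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "euclidean_squared": euclidean_squared,
    "city_block": city_block,
    "chebyshev": chebyshev,
}

def parse_rows(text):
    """Parse one sample per line (assumed layout); brackets and commas are ignored."""
    rows = []
    for line in text.splitlines():
        cleaned = line.replace("[", " ").replace("]", " ").replace(",", " ")
        values = [float(tok) for tok in cleaned.split()]
        if values:  # skip blank lines
            rows.append(values)
    return rows

def kmeans(points, centroids, distance="euclidean", max_iter=100):
    """Run k-means from the given initial centroids; return (labels, centroids)."""
    dist = DISTANCES[distance]
    centroids = [list(c) for c in centroids]  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

if __name__ == "__main__":
    # File names are taken from the assignment text.
    with open("test.txt") as f:
        data = parse_rows(f.read())
    with open("centroids.txt") as f:
        start = parse_rows(f.read())
    labels, final = kmeans(data, start, distance="city_block")
    print(labels)
```

Note that the mean update minimizes the within-cluster sum of squared Euclidean distances. With the City Block or Chebyshev distance the mean is still usable, but it is no longer the exact minimizer of the chosen distance (for City Block, the coordinate-wise median is), so convergence behaviour may differ between the four options.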

Posted Date: 4/1/2013 5:55:54 AM | Location : United States
