Implement a simple k-means method, Applied Statistics

An unclassified data set contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and identifies groups of similar data points.

1. Implement a simple k-means method that can handle real-valued attributes. The program must also support the Euclidean, City Block, Squared Euclidean and Chebyshev distances. You are free to use weights (for features or for data instances) in the program if necessary.
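A minimal sketch of what item 1 asks for, in plain Python. The function names, the `DISTANCES` lookup table and the `max_iter` parameter are illustrative assumptions, not part of the assignment. No weights are applied. Centroids are updated as the arithmetic mean of their members, which is the standard k-means step and is strictly optimal only for the Squared Euclidean distance; with the other distances it is a common simplification.

```python
import math


def euclidean_squared(a, b):
    # Sum of squared per-attribute differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(euclidean_squared(a, b))


def city_block(a, b):
    # Also known as the Manhattan or L1 distance.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest per-attribute absolute difference.
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup table so the metric can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "city_block": city_block,
    "euclidean_squared": euclidean_squared,
    "chebyshev": chebyshev,
}


def kmeans(points, centroids, metric="euclidean", max_iter=100):
    """Return (labels, centroids) after iterating until assignments stop changing."""
    dist = DISTANCES[metric]
    centroids = [list(c) for c in centroids]  # copy, do not mutate the input
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans(points, centroids, metric="chebyshev")` switches the distance without touching the rest of the loop. Leaving an empty cluster's centroid unchanged is one possible policy; reseeding it from a distant point is another.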

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set contains 350 samples and the initial centroids set contains 15 samples. Every data instance in both files has 90 attributes.
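The two files could be read with a small loader along the following lines. This is a sketch under assumptions: it supposes one instance per line with whitespace-separated values, and it strips square brackets in case they appear literally in the file, which the format description does not make clear. The name `load_vectors` and its `expected_dim` parameter are hypothetical.

```python
def load_vectors(path, expected_dim=90):
    """Read one real-valued instance per line; return a list of float lists."""
    rows = []
    with open(path) as fh:
        for line_no, line in enumerate(fh, start=1):
            # Assumption: brackets, if present, only wrap the whole line.
            text = line.strip().strip("[]")
            if not text:
                continue  # skip blank lines
            values = [float(tok) for tok in text.split()]
            if expected_dim is not None and len(values) != expected_dim:
                raise ValueError(
                    f"{path}:{line_no}: expected {expected_dim} values, "
                    f"got {len(values)}"
                )
            rows.append(values)
    return rows
```

With the files from the archive, `load_vectors("test.txt")` should return 350 rows and `load_vectors("centroids.txt")` should return 15, each of length 90; checking those counts is a cheap way to confirm the parsing assumption.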

Posted Date: 4/1/2013 5:55:54 AM | Location : United States






