Implement a simple k-means method, Applied Statistics

An unclassified data set contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and identifies groups of similar data.

1. Implement a simple k-means method that can handle real-valued attributes. Your program must also support the Euclidean, City Block, Squared Euclidean and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.
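As a starting point, the requirement above could be sketched roughly as follows. This is a minimal illustration and not a reference solution: the names `DISTANCES` and `kmeans`, the `max_iter` parameter, and the stopping rule (stop when assignments no longer change) are assumptions that do not come from the assignment text. Centroids are updated with the plain arithmetic mean for every metric, which is the usual choice for the Euclidean distances but only a simplification for City Block and Chebyshev. Weights are left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean_squared(a: Vector, b: Vector) -> float:
    # Sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a: Vector, b: Vector) -> float:
    # Square root of the squared Euclidean distance.
    return euclidean_squared(a, b) ** 0.5


def city_block(a: Vector, b: Vector) -> float:
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Vector, b: Vector) -> float:
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup table; the key names are an assumption.
DISTANCES: Dict[str, Callable[[Vector, Vector], float]] = {
    "euclidean": euclidean,
    "euclidean_squared": euclidean_squared,
    "city_block": city_block,
    "chebyshev": chebyshev,
}


def kmeans(
    points: Sequence[Vector],
    centroids: Sequence[Vector],
    metric: str = "euclidean",
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    dist = DISTANCES[metric]
    centers = [list(c) for c in centroids]  # copy so the input is not modified
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop iterating
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Calling `kmeans(points, centroids, metric="chebyshev")` would switch the distance without touching the rest of the loop, which keeps the four metrics easy to compare on the same data.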

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set includes 350 samples and the initial centroids set consists of 15 samples. Data instances in both files have 90 attributes.
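Reading the two files might look something like the sketch below. The exact file layout is not shown in the assignment, so this assumes one sample per line, values separated by whitespace, and square brackets that may or may not literally appear around each row. The helper names `parse_rows` and `load_matrix` and the `expected_dim` parameter are hypothetical.

```python
from typing import Iterable, List


def parse_rows(lines: Iterable[str], expected_dim: int = 90) -> List[List[float]]:
    rows: List[List[float]] = []
    for number, line in enumerate(lines, start=1):
        # Assumption: brackets around a row are optional decoration.
        text = line.strip().strip("[]").strip()
        if not text:
            continue  # skip blank lines
        values = [float(token) for token in text.split()]
        if len(values) != expected_dim:
            raise ValueError(
                f"line {number}: expected {expected_dim} values, got {len(values)}"
            )
        rows.append(values)
    return rows


def load_matrix(path: str, expected_dim: int = 90) -> List[List[float]]:
    # Read a whole file and parse it row by row.
    with open(path, "r", encoding="utf-8") as handle:
        return parse_rows(handle, expected_dim)


if __name__ == "__main__":
    # File names and sample counts come from the assignment text.
    data = load_matrix("test.txt")
    centroids = load_matrix("centroids.txt")
    print(len(data), "samples,", len(centroids), "initial centroids")
```

If the files follow the description, the two counts printed should be 350 and 15; a mismatch would suggest that the layout assumptions above need adjusting.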

Posted Date: 4/1/2013 5:55:54 AM | Location : United States






