Implement a simple k-means method, Applied Statistics

You are given an unclassified data set that contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and the groups of similar data.

1. Implement a simple K-means method that can handle real-valued attributes. The program must also support the Euclidean, City Block, Euclidean Squared and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set contains 350 samples and the initial centroids set contains 15 samples. Every data instance in both files has 90 attributes.
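
One possible starting point for the two steps above is sketched below in plain Python. It is only a sketch under stated assumptions, not a reference solution: the function names (load_points, kmeans) and the DISTANCES table are hypothetical, the files are assumed to hold one instance per line with whitespace-separated values (square brackets, if literally present, are stripped), and no weights are used. The centroid update is the ordinary arithmetic mean, which is the natural choice for the Euclidean Squared distance; for the other distances it is kept here as a simplification.

```python
import math

def euclidean_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(euclidean_squared(a, b))

def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical lookup table so the metric can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "city_block": city_block,
    "euclidean_squared": euclidean_squared,
    "chebyshev": chebyshev,
}

def load_points(path):
    """Read one instance per line (assumed format: whitespace-separated reals)."""
    points = []
    with open(path) as fh:
        for line in fh:
            line = line.strip().strip("[]")  # assumption: brackets may be present
            if line:
                points.append([float(v) for v in line.split()])
    return points

def kmeans(points, centroids, metric="euclidean", max_iter=100):
    """Return (labels, centroids) after iterating from the given initial centroids."""
    dist = DISTANCES[metric]
    centroids = [list(c) for c in centroids]  # copy so the input is not modified
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

With the files from the archive in the working directory, a call such as kmeans(load_points("test.txt"), load_points("centroids.txt"), metric="chebyshev") would run the clustering with the 15 supplied initial centroids; changing the metric argument switches between the four distances.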

Posted Date: 4/1/2013 5:55:54 AM | Location : United States






