Implement a simple k-means method, Applied Statistics

Assignment Help:

You are given an unclassified data set that contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and identifies groups of similar data.

1. Implement a simple k-means method that handles real-valued attributes. Your program must also support the Euclidean, city block (Manhattan), squared Euclidean, and Chebyshev distances. You are free to use weights (for features or for data instances) in the program if necessary.
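
As a rough illustration of item 1, the sketch below shows one possible way to structure a k-means routine with a selectable distance. It is not a reference solution: the names `DISTANCES` and `kmeans`, the `max_iter` parameter, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for this example. It uses no weights. The centroid update is the plain arithmetic mean, which is the exact minimizer only for the squared Euclidean distance; with the other three distances it should be read as a simple heuristic.

```python
import math

def euclidean_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(euclidean_squared(a, b))

def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical lookup table so the metric can be chosen by name.
DISTANCES = {
    "euclidean": euclidean,
    "city_block": city_block,
    "euclidean_squared": euclidean_squared,
    "chebyshev": chebyshev,
}

def kmeans(points, centroids, metric="euclidean", max_iter=100):
    """Assign points to the nearest centroid, then recompute centroids as means.

    Stops when assignments no longer change or after max_iter passes.
    A centroid with no assigned points is left where it is (an assumption).
    """
    dist = DISTANCES[metric]
    centroids = [list(c) for c in centroids]  # copy so the input is not mutated
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans(points, centroids, metric="chebyshev")` switches the distance without touching the rest of the loop, which keeps the comparison between the four distances straightforward.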

2. The archive contains the unlabeled data set test.txt and the initial centroids data set centroids.txt. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set contains 350 samples and the initial centroids set contains 15 samples. Every data instance in both files has 90 attributes.
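
A minimal sketch of how such files might be read follows. The exact on-disk layout is not shown in the assignment text, so this assumes one instance per line with whitespace-separated values, and it tolerates optional surrounding square brackets. The function name `load_instances` and its `n_attributes` parameter are hypothetical.

```python
def load_instances(path, n_attributes=90):
    """Read one data instance per line as a list of floats.

    Assumes whitespace-separated values; surrounding [ ] are stripped if present.
    Raises ValueError when a line does not have n_attributes values.
    """
    rows = []
    with open(path) as fh:
        for line_no, line in enumerate(fh, start=1):
            text = line.strip().strip("[]")
            if not text:
                continue  # skip blank lines
            values = [float(tok) for tok in text.split()]
            if len(values) != n_attributes:
                raise ValueError(
                    f"{path}:{line_no}: expected {n_attributes} values, got {len(values)}"
                )
            rows.append(values)
    return rows
```

After loading, checking that `len(load_instances("test.txt"))` is 350 and `len(load_instances("centroids.txt"))` is 15 is a quick way to confirm the files match the description above.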


