Implement a simple k-means method, Applied Statistics

Assignment Help:

An unclassified data set contains hidden structure. The task in this assignment is to perform a comprehensive cluster analysis that reveals this structure and identifies groups of similar data.

1. Implement a simple K-means method that can handle real-valued attributes. Your program must also support the Euclidean, City Block, Euclidean Squared and Chebyshev distances. You are free to use any kind of weights (for features or data instances) in the program if necessary.

2. Find the unlabeled data set test.txt and the initial centroids data set centroids.txt in the archive. Both files have the following format: [attribute1_value attribute2_value ... attribute90_value]. The unlabeled data set includes 350 samples and the initial centroids set consists of 15 samples. Data instances in both files have 90 attributes.
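
The two steps above could be sketched roughly as follows. This is only a minimal illustration in Python with NumPy, not a reference solution. The function names pairwise_distance and kmeans, the metric strings and the max_iter parameter are hypothetical names chosen for this example. The sketch updates centroids with the arithmetic mean for every metric, which is an assumption: the mean minimizes the Euclidean Squared objective, while for City Block distance a per-attribute median is sometimes used instead. No weights are applied.

```python
import numpy as np


def pairwise_distance(X, C, metric="euclidean"):
    """Return an (n, k) matrix of distances between rows of X and rows of C."""
    diff = X[:, None, :] - C[None, :, :]  # shape (n, k, d) via broadcasting
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "sqeuclidean":  # Euclidean Squared
        return (diff ** 2).sum(axis=2)
    if metric == "cityblock":  # City Block (Manhattan)
        return np.abs(diff).sum(axis=2)
    if metric == "chebyshev":  # largest absolute difference over attributes
        return np.abs(diff).max(axis=2)
    raise ValueError(f"unknown metric: {metric}")


def kmeans(X, init_centroids, metric="euclidean", max_iter=100):
    """Plain K-means starting from the given centroids (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    C = np.array(init_centroids, dtype=float)  # copy so the input is untouched
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = pairwise_distance(X, C, metric).argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(C)):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                C[j] = members.mean(axis=0)
    return labels, C
```

Assuming test.txt and centroids.txt hold plain whitespace-separated numbers, one row per instance and without literal brackets (the assignment does not say whether the brackets are part of the file), the data could be loaded with np.loadtxt("test.txt") and np.loadtxt("centroids.txt") and passed to kmeans once per metric to compare the resulting groupings.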


Related Discussions:- Implement a simple k-means method

Multi linear regression model problem, Multi linear regression model problem ...

Multi linear regression model problem: An investigator is studying the relationship between weight (in pounds) and height (in inches) using data from a sample of 126 high school

Ogive percentile, How do I determine the 40th percentile in an ogive graph?

How do I determine the 40th percentile in an ogive graph?

Write out the estimator of the linear combination, Now, let's look at a dif...

Now, let's look at a different linear combination. Suppose we are interested in comparing the average mean log income for no college education ( 16). 1. Write out the linear com

Ogive curves, real-time applications on graphical representation of ogive...

Real-time applications on graphical representation of ogive curves

Postneonatal mortality rate, Mid year population 440000 Late fetal death...

Mid year population 440000; Late fetal death 29; No. of live birth 5200; No. of infant death 423; No. of maternal death 89; No. of infant deaths i

Managerial report, A. Compute descriptive statistics for each stock and the...

A. Compute descriptive statistics for each stock and the S&P 500. Comment on your results. Which stocks are most volatile?

Chi square test, application of chi square test in civil engineering

application of chi square test in civil engineering

Coefficient of variation, Coefficient of Variation or C.V. To compare t...

Coefficient of Variation or C.V. To compare the variability between two or more series, the coefficient of variation is used. It is a relative measure of dispersion, it innovated and used

What are the coefficients of the linear combination, For the following ques...

For the following questions we are interested in a comparison of the 16 years education vs. > 16 years. (Recall we did the analysis on the log scale, so these are actual means on t

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd