Give the coordinates of the ''outliers''

Assignment Help Basic Computer Science
Reference no: EM13165453

Programs must be running, and would be demonstrated to the instructor. There is 30% weight for writing a detailed high level algorithm and 70% weight for the running code. No credit for not running code even if you have correct algorithm. All material should be neatly typed on the computer. There is a 20% bonus points for neatness of presentation.

Problem # 1. Write a K-means high level algorithm and program for clustering the N-dimensional data point in your own language. The algorithm should be able to read the data from a data file that has data in the following form. Note: you can use software library for sorting if needed.

Dimensions = <integer>

<Tuple 1> <disease>

<tuple 2> <disease>

....

<tuple N> <disease>

<Tuple> will be given as comma separated dimension coordinates starting from dimension 1. The dimension value will be given from 1..100. The <disease> will be a word for example 'diabetes', 'kidney problem', 'acidity' etc. If the <disease> column is missing, it would mean no disease is associated with that value.

The distance measure will be Euclidean that means it is square root of squares of difference of coordinate and centeroids. Your program should display the coordinates of the centroid on the screen, the threshold value you gave, and the maximum distance from the centroid to the farthest point in a cluster for all the clusters. It should also give the coordinates of the 'Outliers' in a separate output file. Outliers are those points that do not belong to any cluster.

Problem #2. Write a minimum spanning tree high level algorithm and program for clustering the N-dimensional data point in your own language. The algorithm should be able to read the data from a data file that has data in the following form. Note: Greedy algorithms are good for MST. You can use software library for sorting if needed.

Dimensions = <integer>

<Tuple 1> <disease>

<tuple 2> <disease>

....

<tuple N> <disease>

<Tuple> will be given as comma separated dimension coordinates starting from dimension 1. The dimension value will be given from 1..100. The <disease> will be a word for example 'diabetes', 'kidney problem', 'acidity' etc. If the <disease> column is missing, it would mean no disease is associated with that value.

The distance measure will be Euclidean that means it is square root of squares of difference of coordinate and centeroids. Your program should display the coordinates of the centroid on the screen, the threshold value you gave, and the maximum distance from the centroid to the farthest point in a cluster for all the clusters. It should also give the coordinates of the 'Outliers' in a separate output file. Outliers are those points that do not belong to any cluster.

Reference no: EM13165453

Questions Cloud

Once the user enters a 0 : Once the user enters a 0 you will exit the loop, close the file and execute the code as previously designed until you have displayed all of the scores and the average handicap.
Determine the number of whole units : Determine the number of whole units to be accounted for and to be assigned costs and the equivalent units of production for the Drawing Department.
Create a class account that represents a banking account. : Create a class Account that represents a banking account.
Find the number of automobiles : The mass of an average blueberry is 0.72 g and the mass of an automobile is 1900 kg. Find the number of automobiles whose total mass is the same as 1.0 {rm mol} blueberries.
Give the coordinates of the ''outliers'' : The threshold value you gave, and the maximum distance from the centroid to the farthest point in a cluster for all the clusters. It should also give the coordinates of the 'Outliers' in a separate output file. Outliers are those points that do no..
Analyze the purity of aspirin : although aspirin solutions do not absorb visible light, we are still able to use visible light in an experiment to analyze the purity of aspirin. Why?
Prepare a common-size income statement and balance sheet : Prepare a common-size income statement and balance sheet for McDonough Products. The first column of each statement should present McDonough Products common-size statement, and the second column should show the industry averages.
What is the total password population : A phonetic password generator picks two segments randomly for each six-letter password. the form of each segment is consonant, voul, consonant, where V= and C= (V)
How many grams mtbe must be present in each gasoline : If fuel mixtures are required to contain 3.0 oxygen by mass, how many grams MTBE must be present in each 110 gasoline.

Reviews

Write a Review

Basic Computer Science Questions & Answers

  Explain kind of system real-time statistics

Permits customers to see real-time statistics like views and click-throughs about their current banner ads. Which kind of system will most efficiently give a solution.

  Compute expected payback percentage of machine

Compute the expected "payback" percentage of the machine. In other words, for each coin played, what is the expected coin return?

  Explaining threat category

An individual threat can be represented in more than one threat category. If a hacker hacks into a network, copies a few files.

  Calculate present value of future earnings

Why do we need to calculate the present value of future earnings? A company can invest $100,000 to develop a new system, or it can put that amount into a second best alternative investment getting 10 percent.

  Find companies that specialize in computer forensics

What needs clarified? it's plainly stated use google to find 3 companies that specialize in computer forensics of those 3 companies write 2 or 3 paragraphs comparing what each company does.

  What type of hardware is needed to support t-1 connection

What kind of hardware is needed to support a T-1 connection to your business? You want to write a song and apply a digital signature to it, so that you can later prove that it is your song.

  Set of strings of balanced parentheses

Show that the set of strings of balanced parentheses is not defined by any regular expression. Hint: The proof is similar to the proof for the language E above. Suppose that the set of balanced strings had a deterministic finite automaton of m states

  Calculation for programming and machine independency

Assume f is a function that returns the result of reversing string of symbols given as its input, and g is a function that returns concatenation of the 2-strings given as its input. If x is the string abcd,

  Converting value stored in register to string representation

For this part of lab exercise, determine problem of converting value stored in a register to string representation of that value in decimal form.

  Identify and explore challenges and opportunities

LO2 Identify and explore contemporary challenges and opportunities in information systems and to formulate an opinion or judgement and offer possible solutions.

  Tools or tactics for risks for computing infrastructure

As part of project to assess security risks for computing infrastructure, you have found that other managers often have different idea. List any tools or tactics that could be used.

  Modify the addressing properties of workstations

Is ther a way you could modify the addressing properties of the workstations at each small office remotely, without having to visit those offices? Why or Why not?

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd