List of the aliments and their cluster membership

Reference no: EM131058156

Question 1

Get the dataset "food.txt" from GauchoSpace and read it with R. Alternatively, you can load this data set from the package cluster.datasets with the following code:

library(cluster.datasets)
data(nutrients.meat.fish.fowl.1959)
The data set contains the quantity of Energy, Protein, Fat, Calcium and Iron in 27 different aliments.

The task here is to find meaningful clusters in the data. To this end, perform the following:
1. Find clusters using a K-means algorithm. Try out different values of K and determine your best solution. The number of clusters you choose should be based on appropriate measures of fit, for example SSE as defined in the book IDM, and on the interpretability of the results. For each value of K that you try out, provide:

a. the centroids
b. the size of each cluster and a list of the aliments and their cluster membership
c. the ratio between-SS/total-SS
d. a meaning (use your imagination) for each cluster formed, e.g. what are the summarizing characteristics of the aliments in group 1?
e. to answer part d above, you might find it useful to draw a parallel coordinate plot of the centroids
2. Apply hierarchical clustering using min, max and average distances (respectively single, complete and average methods in R).
a. For each method produce a dendrogram with the labels of the aliments
b. What are the differences, if any, between the three distance measures?
c. Can you identify clusters similar to those obtained by K-means clustering?
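
The steps in Question 1 could be sketched in R roughly as follows. This is a minimal sketch, not a reference solution: the helper names kmeans_summary and hclust_all are made up for illustration, standardizing the variables with scale() is an assumption (the variables are measured in different units), and the code assumes that the first column of the cluster.datasets data frame holds the aliment names while the remaining columns are numeric.

```r
# Illustrative helpers for Question 1 (names are hypothetical).

# Summarise one K-means fit: centroids, cluster sizes, membership list,
# the between-SS / total-SS ratio, and the total within-cluster SSE.
kmeans_summary <- function(x, k, labels = rownames(x), nstart = 25) {
  fit <- kmeans(x, centers = k, nstart = nstart)
  list(
    centroids  = fit$centers,
    sizes      = fit$size,
    membership = split(labels, fit$cluster),
    ss_ratio   = fit$betweenss / fit$totss,
    sse        = fit$tot.withinss
  )
}

# Hierarchical clustering on Euclidean distances with single (min),
# complete (max) and average linkage; returns a named list of fits.
hclust_all <- function(x, methods = c("single", "complete", "average")) {
  d <- dist(x)
  fits <- lapply(methods, function(m) hclust(d, method = m))
  names(fits) <- methods
  fits
}

# Usage on the data set named in the question (runs only if the package
# is installed).
if (requireNamespace("cluster.datasets", quietly = TRUE)) {
  data("nutrients.meat.fish.fowl.1959", package = "cluster.datasets")
  food <- nutrients.meat.fish.fowl.1959

  # Assumption: column 1 holds the names; all other columns are numeric.
  x <- scale(food[sapply(food, is.numeric)])
  rownames(x) <- make.unique(as.character(food[[1]]))

  set.seed(1)  # K-means starts from random centroids
  for (k in 2:6) {
    s <- kmeans_summary(x, k)
    cat("K =", k, " between-SS/total-SS =", round(s$ss_ratio, 3),
        " SSE =", round(s$sse, 2), "\n")
  }

  # Parallel coordinate plot of the centroids for one candidate K
  s <- kmeans_summary(x, 3)
  MASS::parcoord(s$centroids, col = seq_len(nrow(s$centroids)))

  # One labelled dendrogram per linkage method
  fits <- hclust_all(x)
  for (m in names(fits)) plot(fits[[m]], main = paste(m, "linkage"))
}
```

Printing s$centroids, s$sizes and s$membership for each K gives the items asked for in parts a and b; the range 2:6 and the choice K = 3 for the plot are arbitrary starting points.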

Additional exercises for PStat 231
Question 2
Perform PCA of the food.txt data and use a biplot to visualize the first two PCs and the variables. Based on the biplot, one could still identify groups (clusters) of aliments with similar characteristics.

a. Is the grouping obtained by PCA similar or different from that obtained by the clustering algorithms above? Explain with some detail.
b. Which technique do you find most useful in describing the data set? Why?
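
One way to set up the PCA and biplot for Question 2 is sketched below. It uses prcomp and biplot from base R; the wrapper name pca_fit is hypothetical, scaling the variables to unit variance is an assumption, and the column layout of the data frame (names first, numeric variables after) is assumed as well.

```r
# Illustrative wrapper: PCA on centred variables scaled to unit variance.
pca_fit <- function(x) prcomp(x, center = TRUE, scale. = TRUE)

# Usage on the data set named in Question 1 (runs only if the package
# is installed).
if (requireNamespace("cluster.datasets", quietly = TRUE)) {
  data("nutrients.meat.fish.fowl.1959", package = "cluster.datasets")
  food <- nutrients.meat.fish.fowl.1959

  # Assumption: column 1 holds the names; all other columns are numeric.
  x <- food[sapply(food, is.numeric)]
  rownames(x) <- make.unique(as.character(food[[1]]))

  pca <- pca_fit(x)
  print(summary(pca))                    # variance explained per component
  biplot(pca, choices = 1:2, cex = 0.7)  # first two PCs plus variable arrows
}
```

Aliments that fall close together in the biplot can then be compared with the K-means and hierarchical cluster memberships.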
Question 3
Suppose that we have four observations, for which we compute a dissimilarity matrix, given by

 -     0.3   0.4   0.7
0.3     -    0.5   0.8
0.4    0.5    -    0.45
0.7    0.8   0.45   -
For instance, the dissimilarity between the first and second observations is 0.3, and the dissimilarity between the second and fourth observations is 0.8.
a. On the basis of this dissimilarity matrix, sketch the dendrogram that results from hierarchically clustering these four observations using complete linkage. Be sure to indicate on the plot the height at which each fusion occurs, as well as the observations corresponding to each leaf in the dendrogram.

b. Suppose that we cut the dendrogram obtained in (a) such that two clusters result. Which observations are in each cluster?
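
A hand-drawn dendrogram for Question 3 can be checked in R with a short sketch like the one below. It enters the dissimilarities exactly as given, assumes a diagonal of 0 (each observation's dissimilarity with itself), and uses hclust with complete linkage; the variable names are illustrative.

```r
# Dissimilarity matrix from the question; the diagonal is assumed to be 0.
obs <- as.character(1:4)
d <- matrix(c(0,   0.3, 0.4,  0.7,
              0.3, 0,   0.5,  0.8,
              0.4, 0.5, 0,    0.45,
              0.7, 0.8, 0.45, 0),
            nrow = 4, byrow = TRUE, dimnames = list(obs, obs))

# Complete linkage: the distance between two clusters is the largest
# pairwise dissimilarity between their members.
hc <- hclust(as.dist(d), method = "complete")

print(hc$merge)   # which observations or clusters are fused at each step
print(hc$height)  # fusion heights
plot(hc, main = "Complete linkage", xlab = "Observation", sub = "")

# Cut the tree so that two clusters result
clusters <- cutree(hc, k = 2)
print(clusters)
```

With these inputs, observations 1 and 2 fuse at height 0.3, observations 3 and 4 fuse at 0.45, and the two pairs fuse at 0.8, so cutting into two clusters gives {1, 2} and {3, 4}.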

