Reference no: EM132269084
Assignment 1 -
Answer the following questions based on the attached web file named FBS.
Please submit both the Word document containing the relevant screenshots (tables generated after executing k-means clustering) and the Excel worksheets containing the generated tables.
1. The Football Bowl Subdivision (FBS) level of the National Collegiate Athletic Association (NCAA) consists of over 100 schools. Most of these schools belong to one of several conferences, or collections of schools, that compete with each other on a regular basis in collegiate sports. Suppose the NCAA has commissioned a study that will propose the formation of conferences based on the similarities of the constituent schools. The file FBS contains data on schools that belong to the Football Bowl Subdivision (FBS). Each row in this file contains information on a school. The variables include football stadium capacity, latitude, longitude, athletic department revenue, endowment, and undergraduate enrollment.
a. Apply k-means clustering with k = 10 using football stadium capacity, latitude, longitude, endowment, and enrollment as variables. Be sure to Normalize Input Data and specify 10 iterations and 10 random starts in Step 2 of the k-Means Clustering procedure. Take screenshots of the three tables generated in the KMC_Output worksheet: Cluster Centers, Inter-Cluster Distances, and Cluster Summary. Analyze the resultant clusters. What is the size of the smallest cluster? What is the average distance in the least dense cluster? What makes the least dense cluster so diverse?
(Tips: The least dense cluster in k-means is the one with the highest average distance in the cluster. For the question "What makes the least dense cluster so diverse", you need to 1) describe the most unique characteristic of the least dense cluster, by referring to the table of Cluster Centers in the KMC_Output worksheet; 2) compare the inter-cluster distances, by referring to the table of Inter-Cluster Distances in the KMC_Output worksheet. What is the nearest distance between this cluster and the others?)
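The quantities asked for in part a (cluster sizes, average distance within each cluster, and distances between cluster centers) can be illustrated with a minimal Python sketch. This is not the Excel add-in's procedure, only a plain k-means under stated assumptions: the school rows are randomly generated placeholders (not the FBS file), k is 3 rather than 10 to keep the toy data meaningful, and the helper names `zscore`, `kmeans`, and `cluster_summary` are hypothetical.

```python
import math
import random

def zscore(rows):
    """Rescale every column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: dist(point, centers[j]))

def kmeans(rows, k, iterations, starts, seed=0):
    """Run k-means from several random starts; keep the lowest-cost result."""
    rng = random.Random(seed)
    best = None
    for _ in range(starts):
        centers = [list(r) for r in rng.sample(rows, k)]
        for _ in range(iterations):
            labels = [nearest(r, centers) for r in rows]
            for j in range(k):
                members = [r for r, lab in zip(rows, labels) if lab == j]
                if members:  # an empty cluster keeps its previous center
                    centers[j] = [sum(c) / len(members) for c in zip(*members)]
        labels = [nearest(r, centers) for r in rows]
        cost = sum(dist(r, centers[lab]) ** 2 for r, lab in zip(rows, labels))
        if best is None or cost < best[0]:
            best = (cost, centers, labels)
    return best[1], best[2]

def cluster_summary(rows, centers, labels):
    """Map cluster index -> (size, average distance of members to center)."""
    summary = {}
    for j, center in enumerate(centers):
        members = [r for r, lab in zip(rows, labels) if lab == j]
        if members:
            avg = sum(dist(r, center) for r in members) / len(members)
            summary[j] = (len(members), avg)
    return summary

# Made-up rows (NOT the FBS data): capacity, latitude, longitude,
# endowment, enrollment.
data_rng = random.Random(42)
schools = [[data_rng.uniform(15000, 110000), data_rng.uniform(25, 48),
            data_rng.uniform(-123, -71), data_rng.uniform(5e7, 1e10),
            data_rng.uniform(4000, 55000)] for _ in range(30)]

normalized = zscore(schools)
centers, labels = kmeans(normalized, k=3, iterations=10, starts=10)
summary = cluster_summary(normalized, centers, labels)

smallest = min(summary, key=lambda j: summary[j][0])
least_dense = max(summary, key=lambda j: summary[j][1])
# Distance from the least dense cluster's center to its nearest neighbor.
nearest_gap = min(dist(centers[least_dense], centers[j])
                  for j in summary if j != least_dense)

print("smallest cluster:", smallest, summary[smallest])
print("least dense cluster:", least_dense, summary[least_dense])
print("nearest other center:", round(nearest_gap, 3))
```

The `summary` dictionary plays the role of the Cluster Summary table, and `nearest_gap` corresponds to the smallest entry in the least dense cluster's row of the Inter-Cluster Distances table. The numbers themselves will not match the add-in's output, since both the data and the random starts differ.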
b. What problems do you see with the plan of defining the school membership of the 10 conferences directly with the 10 clusters? (Tip: Consider the sizes of the clusters.)
c. Repeat part a, but this time do not Normalize Input Data in Step 2 of the k-Means Clustering procedure. Take screenshots of the three tables generated in the KMC_Output1 worksheet: Cluster Centers, Inter-Cluster Distances, and Cluster Summary. Analyze the resultant clusters. Do they look quite different from those in part a? Identify the dominating factor(s) in the formation of these new clusters.
(Tips: Dominating factor is the variable which makes the non-normalized clustering different than the normalized clustering. You can confirm it by clustering the schools solely on the basis of the dominating factor and then noting the similarity of the resulting clusters to the clusters based on all (non-normalized) variables.)
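Why one variable can dominate non-normalized clustering can be seen from a short calculation. The average squared Euclidean distance between two rows equals twice the sum of the per-variable sample variances, so a variable's share of the total variance is also its share of the average squared distance that k-means works with. The Python sketch below illustrates this on six made-up rows; the values, and therefore the resulting dominant variable, are placeholders and say nothing about the real FBS file. The function names are hypothetical.

```python
import statistics

names = ["capacity", "latitude", "longitude", "endowment", "enrollment"]

# Made-up rows (NOT the FBS data), one per hypothetical school.
schools = [
    [100000, 33.0, -88.0, 6.0e8, 26000],
    [50000, 40.0, -105.0, 1.0e9, 27000],
    [30000, 42.0, -72.0, 4.0e8, 18000],
    [80000, 30.0, -84.0, 5.0e8, 32000],
    [60000, 37.0, -122.0, 2.0e10, 7000],
    [45000, 35.0, -107.0, 4.0e8, 21000],
]

def variance_shares(rows):
    """Share of the total variance contributed by each column."""
    variances = [statistics.variance(col) for col in zip(*rows)]
    total = sum(variances)
    return {name: v / total for name, v in zip(names, variances)}

def zscore(rows):
    """Rescale every column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

raw_shares = variance_shares(schools)
norm_shares = variance_shares(zscore(schools))
dominant = max(raw_shares, key=raw_shares.get)

print("dominant variable without normalization:", dominant)
print("shares after normalization:", norm_shares)
```

In this toy data the column measured on the largest scale accounts for nearly all of the raw variance, while after normalization every column contributes an equal one-fifth. Running the same check on the actual worksheet columns is one way to form a hypothesis about the dominating factor before confirming it with the single-variable clustering described in the tip.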
Assignment 2 -
Answer the following questions based on the web file named FBS.
Please submit both the Word document containing your answers and the Excel worksheets containing the relevant tables and the dendrogram generated after executing hierarchical clustering.
Refer to the clustering problem involving the file FBS described in Problem 1. Apply hierarchical clustering with 10 clusters using football stadium capacity, latitude, longitude, endowment, and enrollment as variables. Be sure to Normalize Input Data in Step 2 of the Hierarchical Clustering procedure. Use Ward's method as the clustering method. Please create the dendrogram on the HC_Dendrogram worksheet. Copy the dendrogram from Excel to Word, and then:
a) Draw a horizontal line at a distance of 22 and indicate the composition of the cluster segments.
b) Draw a horizontal line at a distance of 14 and indicate the composition of the cluster segments.
Tips: Please read pages 260 to 262 of the textbook (complete version). You need to draw a horizontal line and indicate the composition of the cluster segments, as in the example on page 261 of your textbook. If you use a customized version of the textbook, please read pages 77-79 instead.
Steps to draw a horizontal line on the dendrogram: after the dendrogram is generated in Excel, right-click it and select Copy, then paste it into your Word document. Insert a line using the "Shapes" menu on the "Insert" tab.
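What a horizontal line across a dendrogram means can be shown with a small Python sketch of Ward's method. It merges, at each step, the two clusters whose union increases the within-cluster sum of squares the least, and stops once the cheapest merge would cost more than the cut height; the clusters left at that point are the segments below the line. Everything here is an assumption for illustration: the seven two-dimensional points are made up, normalization is skipped, the function names are hypothetical, and the height is measured as the increase in sum of squares. Tools scale dendrogram heights differently, so the cut values 22 and 14 from the assignment do not carry over to this sketch.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(c) / len(points) for c in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_cut(points, height):
    """Agglomerate with Ward's method; stop when the next merge exceeds height."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[p] for p in clusters[i]],
                                 [points[p] for p in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        if cost > height:  # the horizontal line sits below this merge
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Made-up points (NOT the FBS data), indexed 0..6.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]

low_cut = ward_cut(points, height=2)
high_cut = ward_cut(points, height=300)
print("low cut:", low_cut)
print("high cut:", high_cut)
```

On this toy data the lower cut leaves three groups and the higher cut leaves two, which mirrors how moving the line from 14 up to 22 on the real dendrogram can only keep or reduce the number of cluster segments.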
Note - Two files are attached: the Excel assignment file and a supporting document that explains how to download the add-on.
Attachment:- Assignment Files.rar