Reference no: EM132903130
Question: Use hierarchical clustering on the "Colleges and Universities" data set. Use "School" as a label. Use Average, Centroid, and Ward methods and run each with the data standardized.
a. How many clusters does each algorithm produce? Answer based on JMP's recommendation. Copy the dendrogram for each method, with the clusters colored and marked.
b. Copy the cluster means table for each method and characterize the clusters.
c. Use k-means clustering with k = 2, 3, 4, 5, 6, 7, and 8. What is the optimal number of clusters among these values? Using the optimal k, create parallel plots of the clusters and discuss whether the clusters make sense.
d. Again, use k-means clustering with k = 2, 3, 4, 5, 6, 7, and 8. Find the sum of squared distances (SSE) for each clustering and display the values on an overlay plot (SSEs on the y-axis, number of clusters on the x-axis).
What is the optimal number of clusters based on the overlay plot?
e. Based on the clustering result with the optimal k from part c, are there any outliers in your data set? If yes, which clusters could be outliers, and why?
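
For readers who want to see what part d computes, the following is a minimal sketch in Python rather than JMP. It standardizes the columns, runs a plain Lloyd's k-means for k = 2 through 8, and records the within-cluster sum of squared distances (SSE) for each k. The data here are synthetic placeholder points, not the "Colleges and Universities" data set, and the function names (standardize, kmeans_sse) are hypothetical; JMP's own k-means implementation and its recommended k may differ.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column (mean 0, standard deviation 1)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_sse(rows, k, n_starts=10, n_iter=100, seed=0):
    """Run Lloyd's algorithm from several random starts; return the lowest SSE."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_starts):
        centers = rng.sample(rows, k)  # pick k distinct rows as initial centers
        for _ in range(n_iter):
            # Assignment step: attach each row to its nearest center.
            groups = [[] for _ in range(k)]
            for r in rows:
                j = min(range(k), key=lambda i: sq_dist(r, centers[i]))
                groups[j].append(r)
            # Update step: move each center to the mean of its group
            # (an empty group keeps its previous center).
            new_centers = [
                [mean(c) for c in zip(*g)] if g else centers[i]
                for i, g in enumerate(groups)
            ]
            if new_centers == centers:
                break
            centers = new_centers
        sse = sum(min(sq_dist(r, c) for c in centers) for r in rows)
        best = min(best, sse)
    return best

# Synthetic placeholder data: three loose groups of 20 points in two dimensions.
rng = random.Random(42)
toy = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
       for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(20)]

z = standardize(toy)
sse_by_k = {k: kmeans_sse(z, k) for k in range(2, 9)}
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

Plotting sse_by_k with k on the x-axis and SSE on the y-axis gives the kind of overlay plot part d asks for; the k at which the curve stops dropping sharply (the "elbow") is one common, if informal, way to pick the number of clusters.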