Explain the contents of some or all of the given clusters

Reference no: EM131926012

Problem

University Rankings. The dataset on American College and University Rankings contains information on 1302 American colleges and universities offering an undergraduate program. For each university, there are 17 measurements, including continuous measurements (such as tuition and graduation rate) and categorical measurements (such as location by state and whether it is a private or public school).

Note that many records are missing some measurements. Our first goal is to estimate these missing values from "similar" records. This will be done by clustering the complete records and then finding the closest cluster for each of the partial records. The missing values will be imputed from the information in that cluster.
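The imputation idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the assignment's required solution: the "distance to a cluster" is taken to be the distance to the cluster centroid, measurements are z-scored with the mean and standard deviation of the complete records, and the function name `impute_from_nearest_cluster` and the toy numbers are hypothetical.

```python
import numpy as np

def impute_from_nearest_cluster(record, complete, labels):
    """Fill NaN entries of `record` with the mean of its nearest cluster.

    record   : 1-D array, NaN marks a missing measurement
    complete : 2-D array of complete records (one row each), original units
    labels   : cluster label for each row of `complete`
    """
    record = np.asarray(record, dtype=float)
    complete = np.asarray(complete, dtype=float)
    labels = np.asarray(labels)

    # Assumption: normalize with statistics of the complete records
    mu = complete.mean(axis=0)
    sigma = complete.std(axis=0, ddof=1)
    observed = ~np.isnan(record)
    z_record = (record[observed] - mu[observed]) / sigma[observed]

    best_label, best_dist = None, np.inf
    for k in np.unique(labels):
        # Assumption: a cluster is represented by its centroid
        centroid = complete[labels == k].mean(axis=0)
        z_centroid = (centroid[observed] - mu[observed]) / sigma[observed]
        # Euclidean distance over the observed measurements only
        dist = np.sqrt(np.sum((z_record - z_centroid) ** 2))
        if dist < best_dist:
            best_label, best_dist = k, dist

    # Impute missing entries with the nearest cluster's average
    filled = record.copy()
    cluster_mean = complete[labels == best_label].mean(axis=0)
    filled[~observed] = cluster_mean[~observed]
    return best_label, best_dist, filled

# Toy data (made up): tuition, graduation rate, acceptance rate
complete = np.array([
    [30000.0, 90.0, 20.0],
    [32000.0, 88.0, 25.0],
    [8000.0, 55.0, 80.0],
    [9000.0, 60.0, 75.0],
])
labels = np.array([1, 1, 2, 2])
partial = np.array([31000.0, np.nan, 22.0])

label, dist, filled = impute_from_nearest_cluster(partial, complete, labels)
print(label, filled)
```

In the toy example the partial record sits next to the high-tuition cluster, so its missing graduation rate is replaced by that cluster's average. Other choices, such as the distance to the nearest cluster member, would also fit the wording of the problem.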

a. Remove all records with missing measurements from the dataset.

b. For all the continuous measurements, run hierarchical clustering using complete linkage and Euclidean distance. Make sure to normalize the measurements. From the dendrogram: How many clusters seem reasonable for describing these data?

c. Compare the summary statistics for each cluster and describe each cluster in this context (e.g., "Universities with high tuition, low acceptance rate...").

d. Use the categorical measurements that were not used in the analysis (State and Private/Public) to characterize the different clusters. Is there any relationship between the clusters and the categorical information?

e. What other external information can explain the contents of some or all of these clusters?

f. Consider Tufts University, which is missing some information. Compute the Euclidean distance of this record from each of the clusters that you found above (using only the measurements that you have). Which cluster is it closest to? Impute the missing values for Tufts by taking the average of the cluster on those measurements.
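Steps a through c can be sketched with NumPy and SciPy as follows. This is a hedged illustration on synthetic data, not the actual rankings dataset: the two columns, the group means, and the choice of two clusters are all assumptions made for the example. In practice the number of clusters would be read off the dendrogram.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Synthetic stand-in for the continuous measurements (NOT the real dataset):
# two loose groups with columns (tuition, graduation rate).
group_a = rng.normal(loc=[30000.0, 85.0], scale=[2000.0, 4.0], size=(20, 2))
group_b = rng.normal(loc=[9000.0, 55.0], scale=[1500.0, 5.0], size=(20, 2))
data = np.vstack([group_a, group_b])
data[3, 1] = np.nan    # simulate records with missing measurements
data[25, 0] = np.nan

# Step a: keep only records with no missing measurements
complete = data[~np.isnan(data).any(axis=1)]

# Step b: z-score each measurement, then run hierarchical clustering
# with complete linkage and Euclidean distance
z = (complete - complete.mean(axis=0)) / complete.std(axis=0, ddof=1)
tree = linkage(z, method="complete", metric="euclidean")

# Cut the tree into a chosen number of clusters (assumed here to be 2)
n_clusters = 2
labels = fcluster(tree, t=n_clusters, criterion="maxclust")

# Step c: summary statistics per cluster, in the original units
for k in np.unique(labels):
    members = complete[labels == k]
    print(f"cluster {k}: n={len(members)}, means={members.mean(axis=0).round(1)}")
```

To inspect the tree itself, `scipy.cluster.hierarchy.dendrogram(tree)` draws the dendrogram when matplotlib is installed. For step d, the same `labels` array can be tabulated against the State and Private/Public columns that were left out of the clustering.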






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd