How would single link or DBSCAN handle such data?

Reference no: EM132730304

Question: 1. In CLIQUE, the threshold used to find cluster density remains constant, even as the number of dimensions increases. This is a potential problem since density drops as dimensionality increases; i.e., to find clusters in higher dimensions the threshold has to be set at a level that may well result in the merging of low-dimensional clusters. Comment on whether you feel this is truly a problem and, if so, how you might modify CLIQUE to address this problem.

2. Name at least one situation in which you would not want to use clustering based on SNN similarity or density.

3. Give an example of a set of clusters in which merging based on the closeness of clusters leads to a more natural set of clusters than merging based on the strength of connection (interconnectedness) of clusters.

4. We take a sample of adults and measure their heights. If we record the gender of each person, we can calculate the average height and the variance of the height, separately, for men and women. Suppose, however, that this information was not recorded. Would it be possible to still obtain this information? Explain.

5. Explain the difference between likelihood and probability.

6. Traditional K-means has a number of limitations, such as sensitivity to outliers and difficulty in handling clusters of different sizes and densities, or with non-globular shapes. Comment on the ability of fuzzy c-means to handle these situations.

7. Clusters of documents can be summarized by finding the top terms (words) for the documents in the cluster, e.g., by taking the most frequent k terms, where k is a constant, say 10, or by taking all terms that occur more frequently than a specified threshold. Suppose that K-means is used to find clusters of both documents and words for a document data set.

(a) How might a set of term clusters defined by the top terms in a document cluster differ from the word clusters found by clustering the terms with K-means?

(b) How could term clustering be used to define clusters of documents?

8. Suppose we find K clusters using Ward's method, bisecting K-means, and ordinary K-means. Which of these solutions represents a local or global minimum? Explain.

9. You are given a data set with 100 records and are asked to cluster the data. You use K-means to cluster the data, but for all values of K, 1 ≤ K ≤ 100, the K-means algorithm returns only one non-empty cluster. You then apply an incremental version of K-means, but obtain exactly the same result. How is this possible? How would single link or DBSCAN handle such data?
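One scenario consistent with question 9 is a data set in which all 100 records are identical. The sketch below illustrates that case only; it is not presented as the intended solution. It uses a plain Lloyd's K-means and a minimal DBSCAN written for 1-D data. The function names, the value 5.0, and the parameters `eps=0.5` and `min_pts=4` are illustrative assumptions, not part of the question.

```python
import random


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's K-means on 1-D data (illustrative sketch)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: min() returns the first centroid among equals,
        # so ties always go to the lowest-index centroid.
        labels = [
            min(range(k), key=lambda j: (p - centroids[j]) ** 2)
            for p in points
        ]
        # Update step: an empty cluster keeps its previous centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return labels


def dbscan_labels(points, eps, min_pts):
    """Minimal DBSCAN on 1-D data; label -1 marks noise (illustrative sketch)."""
    n = len(points)
    neighbors = [
        [j for j in range(n) if abs(points[i] - points[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        # Start a new cluster only from an unlabelled core point.
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue
        labels[i] = cluster
        frontier = [i]
        while frontier:
            p = frontier.pop()
            if len(neighbors[p]) < min_pts:
                continue  # border point: labelled but not expanded
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    frontier.append(q)
        cluster += 1
    return labels


# Assumed data: 100 identical records (the value 5.0 is arbitrary).
points = [5.0] * 100

for k in (1, 10, 100):
    non_empty = len(set(kmeans(points, k)))
    print(f"K={k}: non-empty clusters = {non_empty}")

print("DBSCAN labels:", set(dbscan_labels(points, eps=0.5, min_pts=4)))
```

Under this assumption every centroid coincides with every point, so each point ties across all centroids and lands in the same cluster. Single link would see only zero pairwise distances, so cutting its dendrogram at any positive height also gives one cluster.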




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd