K-means cluster analysis, Advanced Statistics

K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, if appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Many such clustering criteria have been suggested, but the most commonly used arise from considering properties of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria derived from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
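As a rough illustration of the two criteria, the Python sketch below computes W, B and T for a given partition and then makes repeated passes over the observations, moving each one to whichever cluster lowers the chosen criterion. It is only a sketch under stated assumptions: the function names (scatter_matrices, criterion, reassign_pass) are hypothetical, the data are synthetic, and the criterion is naively recomputed from scratch for every candidate move rather than updated incrementally. It relies on the standard identity T = W + B, where W sums the cross products of deviations from each group mean and B sums the size-weighted cross products of group means about the grand mean.

```python
import numpy as np


def scatter_matrices(X, labels, k):
    """Return (W, B, T) for a partition of the rows of X into k groups."""
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered              # total sums of squares and cross products
    p = X.shape[1]
    W = np.zeros((p, p))                   # within-groups matrix
    B = np.zeros((p, p))                   # between-groups matrix
    for g in range(k):
        members = X[labels == g]
        if len(members) == 0:
            continue
        group_mean = members.mean(axis=0)
        dev = members - group_mean
        W += dev.T @ dev
        shift = (group_mean - grand_mean)[:, None]
        B += len(members) * (shift @ shift.T)
    return W, B, T


def criterion(X, labels, k, kind="trace"):
    """Value to minimize: trace(W) if kind == 'trace', otherwise det(W)."""
    W, _, _ = scatter_matrices(X, labels, k)
    return np.trace(W) if kind == "trace" else np.linalg.det(W)


def reassign_pass(X, labels, k, kind="trace"):
    """Visit each observation in turn and move it if that lowers the criterion."""
    labels = labels.copy()
    for i in range(len(X)):
        current = labels[i]
        if np.sum(labels == current) <= 1:
            continue                       # assumption: never empty a cluster
        best_label = current
        best_value = criterion(X, labels, k, kind)
        for g in range(k):
            if g == current:
                continue
            labels[i] = g                  # try the move
            value = criterion(X, labels, k, kind)
            if value < best_value:
                best_label, best_value = g, value
        labels[i] = best_label             # keep the best move, or restore
    return labels


# Synthetic example (assumed data): two groups of 20 points in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(8.0, 1.0, (20, 2))])
labels = np.arange(len(X)) % 2             # deliberately poor initial partition

W, B, T = scatter_matrices(X, labels, 2)
before = np.trace(W)
for _ in range(20):                        # repeat passes until nothing moves
    new_labels = reassign_pass(X, labels, 2, kind="trace")
    if np.array_equal(new_labels, labels):
        break
    labels = new_labels
after = criterion(X, labels, 2, kind="trace")
print("trace(W) before:", before, "after:", after)
```

Passing kind="det" to reassign_pass and criterion switches the sketch to the determinant criterion. Because a move is accepted only when it strictly lowers the criterion, the value never increases from one pass to the next, but the procedure may still stop at a local rather than a global optimum.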

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd