K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and reassigned, if appropriate, to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross-products (W, B and T, with T = W + B), which can be defined for every partition of the observations into a given number of groups. The two most common clustering criteria derived from these matrices are as follows:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
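The reassignment procedure described above can be illustrated with a minimal Python sketch. It is not a reference implementation: the function names (within_scatter, reassign) are hypothetical, observations are assumed to be examined in index order, a move is accepted only when it strictly lowers the criterion, and a cluster is never emptied. The criterion is computed from the pooled within-groups matrix W, with trace W as the default.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]


def within_scatter(data: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> Matrix:
    """Pooled within-groups sums of squares and cross-products matrix W."""
    p = len(data[0])
    w = [[0.0] * p for _ in range(p)]
    for g in range(k):
        members = [x for x, lab in zip(data, labels) if lab == g]
        if not members:
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        for x in members:
            d = [xi - mi for xi, mi in zip(x, mean)]
            for a in range(p):
                for b in range(p):
                    w[a][b] += d[a] * d[b]
    return w


def trace(m: Matrix) -> float:
    """Sum of the diagonal elements (the trace W criterion)."""
    return sum(m[i][i] for i in range(len(m)))


def determinant(m: Matrix) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return det


def reassign(data: Sequence[Sequence[float]], labels: Sequence[int], k: int,
             criterion: Callable[[Matrix], float] = trace,
             max_passes: int = 100) -> Tuple[List[int], float]:
    """Move observations one at a time while the criterion keeps decreasing."""
    labels = list(labels)
    best = criterion(within_scatter(data, labels, k))
    for _ in range(max_passes):
        moved = False
        for i in range(len(data)):
            current = labels[i]
            if labels.count(current) == 1:
                continue  # assumption: never leave a cluster empty
            for g in range(k):
                if g == current:
                    continue
                labels[i] = g  # tentatively move observation i to cluster g
                value = criterion(within_scatter(data, labels, k))
                if value < best - 1e-12:
                    best, current, moved = value, g, True
            labels[i] = current  # keep the best cluster found for i
        if not moved:
            break  # a full pass with no reassignment: local optimum
    return labels, best


# Made-up data: two tight groups and a deliberately poor initial partition.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
initial = [0, 1, 0, 1, 0, 1]
final_labels, value = reassign(data, initial, k=2)
print(final_labels, value)
```

Passing criterion=determinant switches the sketch to the determinant W criterion; that choice assumes each cluster holds enough observations for W to be nonsingular, since a zero determinant cannot be improved on. Recomputing W from scratch for every tentative move keeps the code short but is slow for large data sets, and the result is only a local optimum that depends on the initial partition.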

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd