K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, if appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering the properties of the within-groups, between-groups and total matrices of sums of squares and cross-products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria derived from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
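The matrices and the reassignment step described above can be illustrated with a short Python sketch. It is only one possible reading of the description, not a definitive implementation: the function names scatter_matrices and reassign_pass, the greedy one-observation-at-a-time sweep, the rule that a cluster is never emptied, and the simulated data are all assumptions made for illustration. NumPy is assumed to be available.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Return (W, B, T) for data X (n x p) and one cluster label per row."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered                # total sums of squares and cross-products
    p = X.shape[1]
    W = np.zeros((p, p))                     # within-groups matrix
    B = np.zeros((p, p))                     # between-groups matrix
    for k in np.unique(labels):
        Xk = X[labels == k]
        mean_k = Xk.mean(axis=0)
        dev = Xk - mean_k
        W += dev.T @ dev
        diff = (mean_k - grand_mean)[:, None]
        B += len(Xk) * (diff @ diff.T)
    return W, B, T

def reassign_pass(X, labels, criterion):
    """One sweep over the observations (illustrative, brute force).

    Each observation is moved to whichever cluster gives the smallest
    criterion(W); it stays put if no move lowers the criterion.
    """
    labels = np.array(labels, copy=True)
    clusters = np.unique(labels)
    for i in range(len(X)):
        current = labels[i]
        if np.sum(labels == current) == 1:
            continue                         # assumption: never empty a cluster
        best_k = current
        best_val = criterion(scatter_matrices(X, labels)[0])
        for k in clusters:
            if k == current:
                continue
            labels[i] = k                    # try the move
            val = criterion(scatter_matrices(X, labels)[0])
            if val < best_val:
                best_k, best_val = k, val
        labels[i] = best_k                   # keep the best assignment found
    return labels

# Simulated data: two groups of 20 points in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(8, 1, (20, 2))])
labels = np.arange(40) % 2                   # deliberately poor initial partition

W0, B0, T0 = scatter_matrices(X, labels)
new_labels = reassign_pass(X, labels, np.trace)
W1, B1, T1 = scatter_matrices(X, new_labels)
print("trace W before:", np.trace(W0), "after:", np.trace(W1))
```

Passing np.linalg.det instead of np.trace as the criterion argument switches the sweep to the determinant criterion. Because T does not depend on the partition and T = W + B, lowering trace W is equivalent to raising trace B. Recomputing W from scratch for every trial move is slow and is done here only to keep the sketch short.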

 


