K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, if appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria derived from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
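The relocation idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: it assumes NumPy is available, takes W to be the pooled within-groups matrix (the sum over clusters of the centered cross-product matrices), and recomputes the criterion from scratch after every trial move, which is simple but slow. The function names (`within_scatter`, `trace_criterion`, `det_criterion`, `relocate`) and the `max_passes` parameter are hypothetical names introduced here.

```python
import numpy as np

def within_scatter(X, labels, k):
    """Pooled within-groups sums of squares and cross-products matrix W."""
    p = X.shape[1]
    W = np.zeros((p, p))
    for g in range(k):
        members = X[labels == g]
        if len(members) == 0:
            continue  # an empty cluster contributes nothing
        centered = members - members.mean(axis=0)
        W += centered.T @ centered
    return W

def trace_criterion(X, labels, k):
    """First criterion: trace of W (to be minimized)."""
    return np.trace(within_scatter(X, labels, k))

def det_criterion(X, labels, k):
    """Second criterion: determinant of W (to be minimized)."""
    return np.linalg.det(within_scatter(X, labels, k))

def relocate(X, labels, k, criterion=trace_criterion, max_passes=20):
    """Examine each observation in turn and move it to another cluster
    whenever that lowers the chosen criterion (assumed stopping rule:
    a full pass with no moves, or max_passes passes)."""
    labels = labels.copy()
    best = criterion(X, labels, k)
    for _ in range(max_passes):
        moved = False
        for i in range(len(X)):
            current = labels[i]
            if np.sum(labels == current) == 1:
                continue  # assumption: never empty a cluster
            for g in range(k):
                if g == current:
                    continue
                labels[i] = g  # trial move
                value = criterion(X, labels, k)
                if value < best - 1e-12:
                    best, current, moved = value, g, True
            labels[i] = current  # keep the best cluster found for i
        if not moved:
            break
    return labels, best

# Usage: two well-separated groups, deliberately poor starting partition.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
start = np.array([0, 1, 0, 1, 0, 1])
final_labels, final_value = relocate(X, start, k=2)
```

Passing `criterion=det_criterion` switches to the second criterion. Note that the determinant is only informative when W is nonsingular, which requires at least as many observations beyond the number of clusters as there are variables. Like any relocation scheme of this kind, the sketch can stop at a local optimum that depends on the initial partition.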

 



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd