K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, if appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B, T), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria derived from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape will not necessarily be spherical.
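The relocation idea described above can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions, not a reference implementation: the function names `scatter_matrices` and `relocate`, the toy data and the stopping rule are all hypothetical choices made for this example. The sketch recomputes W from scratch after every trial move, which is simple but slow, and it relies on the identity T = W + B for the three matrices.

```python
import numpy as np

def scatter_matrices(X, labels, k):
    """Return (W, B, T): within-groups, between-groups and total
    sums-of-squares-and-cross-products matrices for a partition."""
    X = np.asarray(X, dtype=float)
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in range(k):
        members = X[labels == g]
        if len(members) == 0:
            continue
        group_mean = members.mean(axis=0)
        d = members - group_mean
        W += d.T @ d
        diff = (group_mean - grand_mean).reshape(-1, 1)
        B += len(members) * (diff @ diff.T)
    return W, B, T

def relocate(X, labels, k, criterion=np.trace, max_sweeps=100):
    """Visit each observation in turn and move it to another cluster
    whenever that lowers criterion(W). Hypothetical helper for this sketch."""
    labels = np.array(labels).copy()
    best = criterion(scatter_matrices(X, labels, k)[0])
    for _ in range(max_sweeps):
        moved = False
        for i in range(len(X)):
            current = labels[i]
            # Assumption of this sketch: never empty a cluster.
            if np.sum(labels == current) == 1:
                continue
            for g in range(k):
                if g == current:
                    continue
                labels[i] = g  # trial move
                value = criterion(scatter_matrices(X, labels, k)[0])
                if value < best - 1e-12:
                    best, current, moved = value, g, True
            labels[i] = current  # keep the best cluster found
        if not moved:
            break  # a full sweep with no reassignment: stop
    return labels, best

# Toy data (made up for illustration): two unit squares far apart.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
initial = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # deliberately poor start

W, B, T = scatter_matrices(X, initial, 2)
labels, value = relocate(X, initial, k=2)
print(labels, value)
```

On this toy data the two squares end up in separate clusters. Passing `criterion=np.linalg.det` switches the sketch to the determinant criterion; in that case W must be non-singular for the comparison to be meaningful, which requires enough observations relative to the number of variables.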

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd