K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, where appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B and T respectively, with T = W + B), which can be defined for every partition of the observations into a given number of groups. The two most common clustering criteria arising from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape is not necessarily spherical.
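The reassignment procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`within_scatter`, `reassign_until_stable`), the toy data and the rule that a cluster is never emptied are all choices made for this example. It uses the trace of W as the default criterion; another criterion on W, such as its determinant, could be passed in through the hypothetical `criterion` parameter.

```python
def centroid(rows):
    """Mean vector of a non-empty list of equal-length points."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def within_scatter(points, labels, k):
    """Within-groups matrix W of sums of squares and cross products."""
    p = len(points[0])
    W = [[0.0] * p for _ in range(p)]
    for g in range(k):
        members = [x for x, lab in zip(points, labels) if lab == g]
        if not members:
            continue
        c = centroid(members)
        for x in members:
            d = [xi - ci for xi, ci in zip(x, c)]
            for i in range(p):
                for j in range(p):
                    W[i][j] += d[i] * d[j]
    return W


def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))


def reassign_until_stable(points, labels, k, criterion=trace, max_passes=100):
    """Visit each observation in turn and move it to another cluster
    whenever that lowers criterion(W); stop when a full pass moves nothing."""
    labels = list(labels)
    best = criterion(within_scatter(points, labels, k))
    for _ in range(max_passes):
        moved = False
        for i in range(len(points)):
            current = labels[i]
            if labels.count(current) == 1:
                continue  # assumption of this sketch: never empty a cluster
            for g in range(k):
                if g == current:
                    continue
                labels[i] = g  # tentatively move observation i to cluster g
                value = criterion(within_scatter(points, labels, k))
                if value < best - 1e-12:
                    best, current, moved = value, g, True
                labels[i] = current  # keep the move only if it improved
        if not moved:
            break
    return labels, best


# Toy data (invented for illustration): two well-separated groups in 2D,
# with a deliberately poor initial partition into K = 2 clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
start = [0, 1, 0, 1, 0, 1]

initial_value = trace(within_scatter(points, start, 2))
final_labels, final_value = reassign_until_stable(points, start, 2)
print(final_labels, round(final_value, 4))
```

Recomputing W from scratch after every tentative move keeps the sketch short but is wasteful; practical implementations update the criterion incrementally. Like any local search of this kind, the result depends on the initial partition and may be a local rather than a global optimum.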

 


