K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and, if appropriate, reassigned to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Several such clustering criteria have been suggested, but the most commonly used arise from considering properties of the within-groups, between-groups and total matrices of sums of squares and cross products (W, B and T, with T = W + B), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria arising from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape need not be spherical.
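The reassignment idea and the two criteria can be illustrated with a minimal sketch in plain Python. It is one possible reading of the description above, not a reference implementation: the function names (`within_matrix`, `reassign_pass` and so on), the toy data and the rule of never emptying a cluster are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def mean(points: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def within_matrix(data, labels, k: int) -> Matrix:
    """W: pooled within-group sums of squares and cross products."""
    p = len(data[0])
    w = [[0.0] * p for _ in range(p)]
    for g in range(k):
        members = [x for x, lab in zip(data, labels) if lab == g]
        if not members:
            continue
        centre = mean(members)
        for x in members:
            d = [xi - ci for xi, ci in zip(x, centre)]
            for i in range(p):
                for j in range(p):
                    w[i][j] += d[i] * d[j]
    return w


def trace(m: Matrix) -> float:
    return sum(m[i][i] for i in range(len(m)))


def det(m: Matrix) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return result


def reassign_pass(data, labels, k: int,
                  criterion: Callable[[Matrix], float]) -> List[int]:
    """One sweep: move each observation to the cluster that most lowers
    the criterion (assumption: a cluster is never emptied)."""
    labels = list(labels)
    for idx in range(len(data)):
        original = labels[idx]
        if labels.count(original) == 1:
            continue  # moving this point would empty its cluster
        best_label = original
        best_value = criterion(within_matrix(data, labels, k))
        for g in range(k):
            if g == original:
                continue
            labels[idx] = g  # trial move
            value = criterion(within_matrix(data, labels, k))
            if value < best_value - 1e-12:
                best_label, best_value = g, value
        labels[idx] = best_label  # keep the best move, or undo the trial
    return labels


# Toy data (made up): two tight groups, with the point (10, 10) started
# in the wrong cluster.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
start = [0, 0, 0, 0, 1, 1]
final = reassign_pass(data, start, k=2, criterion=trace)
print(final, trace(within_matrix(data, final, 2)))
```

On this toy data a single sweep with the trace criterion moves the misplaced point, and trace W falls from 142.5 to 8/3. Passing `criterion=det` instead applies the determinant criterion. In practice the sweep would be repeated until no observation changes cluster.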

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd