K-means cluster analysis, Advanced Statistics


K-means cluster analysis is a method of cluster analysis in which, starting from an initial partition of the observations into K clusters, each observation in turn is examined and reassigned, if appropriate, to a different cluster in an attempt to optimize a predefined numerical criterion that measures the 'quality' of the cluster solution. Many such clustering criteria have been suggested, but the most commonly used arise from considering features of the within-groups, between-groups and total matrices of sums of squares and cross-products (W, B and T respectively, with T = W + B), which can be defined for every partition of the observations into a particular number of groups. The two most common clustering criteria arising from these matrices are:

minimization of trace W

minimization of determinant W

The first of these tends to produce 'spherical' clusters; the second tends to produce clusters that all have the same shape, although that shape need not be spherical.
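As a rough illustration of the ideas above, the following Python sketch (using NumPy) computes W, B and T for a given partition and then reassigns observations one at a time whenever a move lowers the chosen criterion. The function names (`scatter_matrices`, `trace_criterion`, `det_criterion`, `reassign`), the toy data and the rule that a cluster is never emptied are all assumptions made for this example; it is a naive sketch under those assumptions, not a reference implementation of any particular K-means variant.

```python
import numpy as np

def scatter_matrices(X, labels, k):
    """Return (W, B, T): within-groups, between-groups and total
    sums-of-squares-and-cross-products matrices for a partition."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    centred = X - grand_mean
    T = centred.T @ centred
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in range(k):
        members = X[labels == g]
        if len(members) == 0:
            continue  # an empty group contributes nothing
        mean_g = members.mean(axis=0)
        dev = members - mean_g
        W += dev.T @ dev
        d = (mean_g - grand_mean)[:, None]
        B += len(members) * (d @ d.T)
    return W, B, T

def trace_criterion(X, labels, k):
    """Criterion 1: trace of W (to be minimized)."""
    return np.trace(scatter_matrices(X, labels, k)[0])

def det_criterion(X, labels, k):
    """Criterion 2: determinant of W (to be minimized)."""
    return np.linalg.det(scatter_matrices(X, labels, k)[0])

def reassign(X, labels, k, criterion=trace_criterion, max_passes=20):
    """Visit each observation in turn and move it to the cluster that
    most lowers the criterion; stop when a full pass makes no move."""
    labels = np.array(labels, copy=True)
    best = criterion(X, labels, k)
    for _ in range(max_passes):
        moved = False
        for i in range(len(labels)):
            current = labels[i]
            if np.sum(labels == current) == 1:
                continue  # assumption: never empty a cluster
            for g in range(k):
                if g == current:
                    continue
                labels[i] = g  # trial move
                value = criterion(X, labels, k)
                if value < best - 1e-12:
                    best, current, moved = value, g, True
            labels[i] = current  # keep the best assignment found
        if not moved:
            break
    return labels, best

# Toy data (hypothetical): two well-separated groups, poor initial partition.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
initial = [0, 1, 0, 1, 0, 1]
labels, value = reassign(X, initial, k=2)
W, B, T = scatter_matrices(X, labels, 2)
```

Passing `criterion=det_criterion` to `reassign` swaps in the second criterion. Recomputing W from scratch for every trial move is wasteful and is done here only to keep the sketch short; like any such reassignment scheme, it can stop at a local optimum that depends on the initial partition.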

 




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd