Outliers - reasons for screening data, Advanced Statistics

Outliers arise from data entry errors, from a subject who is not a member of the population that the sample is meant to represent, or from a subject who genuinely differs from the rest of the sample. Many statistical tests are quite sensitive to outliers, so they should be identified and addressed before the analysis is run.

Univariate outliers are easy to detect with z-scores, box plots, histograms, etc. Cases with standard scores beyond +/-3 are treated as outliers (consider a cutoff of 4 if n>100, or 2.5 if n<10).
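A minimal sketch of the z-score screen described above, using only the Python standard library. The function name `z_score_outliers` and the sample scores are hypothetical and chosen purely for illustration; the cutoff defaults to 3 and can be raised or lowered for large or small samples as noted above.

```python
from statistics import mean, stdev

def z_score_outliers(values, cutoff=3.0):
    """Return (index, value, z) for every case whose |z| exceeds the cutoff."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    flagged = []
    for i, x in enumerate(values):
        z = (x - m) / s
        if abs(z) > cutoff:
            flagged.append((i, x, z))
    return flagged

# Hypothetical data: 19 typical scores plus one likely data-entry error
# (500 keyed in where something near 50 was intended).
scores = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
          50, 48, 52, 51, 49, 50, 47, 53, 50, 500]

outliers = z_score_outliers(scores)
for index, value, z in outliers:
    print(f"case {index}: value={value}, z={z:.2f}")
```

Note that the outlier itself inflates the mean and standard deviation used to compute z, which is one reason a single extreme case in a very small sample may fail to exceed a cutoff of 3.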

Multivariate outliers are difficult to detect. Mahalanobis distance is one powerful technique to use in this case (discussed later). The (squared) distance of each case is evaluated as a chi-square statistic with degrees of freedom equal to the number of variables in the analysis. Cases whose chi-square value is significant at the p<0.001 level are treated as outliers.
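The Mahalanobis screen can be sketched for the simplest case of two variables, again with the standard library only. This is an illustration under stated assumptions, not a general implementation: it handles exactly two variables, so the covariance matrix can be inverted in closed form and the chi-square p-value with 2 degrees of freedom reduces to exp(-D²/2). The function names and the data are hypothetical. For more variables, a numerical library would be the usual choice.

```python
import math
from statistics import mean

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) case from the sample centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Entries of the sample covariance matrix (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^-1 (dx, dy)^T, using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

def flag_multivariate_outliers(points, alpha=0.001):
    """Return (index, D^2, p) for cases significant beyond alpha (2 variables only)."""
    result = []
    for i, d2 in enumerate(mahalanobis_sq_2d(points)):
        p = math.exp(-d2 / 2)  # chi-square upper-tail probability for df = 2
        if p < alpha:
            result.append((i, d2, p))
    return result

# Hypothetical data: 24 cases where y tracks x closely, plus one case (18, 7)
# that is unremarkable on x and on y separately but breaks the joint pattern.
points = [(i, i + (1 if i % 2 == 0 else -1)) for i in range(1, 25)] + [(18, 7)]

flagged = flag_multivariate_outliers(points)
for index, d2, p in flagged:
    print(f"case {index}: D^2={d2:.2f}, p={p:.5f}")
```

The flagged case has ordinary values on each variable taken alone, which is why univariate screening would miss it and a joint measure is needed.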

In most cases, it is acceptable to drop the outlying case from the sample. If the researcher decides to keep such values in the analysis, steps can be taken to reduce their relative influence.
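The text does not name a specific technique for reducing influence. As one possible illustration only, the sketch below winsorizes a variable, that is, it pulls extreme values in to a chosen percentile of the observed data instead of deleting the case. The function name `winsorize`, the 5th/95th percentile bounds, and the data are all assumptions made for this example.

```python
def winsorize(values, lower_pct=0.05, upper_pct=0.95):
    """Clip values to the observed scores at the given lower and upper percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    # Simple index-based percentiles (no interpolation); an assumption of this sketch.
    low = ordered[int(lower_pct * (n - 1))]
    high = ordered[int(upper_pct * (n - 1))]
    return [min(max(x, low), high) for x in values]

# Hypothetical data: the last score is an extreme case the researcher wants to keep.
raw_scores = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
              50, 48, 52, 51, 49, 50, 47, 53, 50, 500]

adjusted = winsorize(raw_scores)
print(adjusted[-1])  # the extreme score is replaced by the upper bound
```

The case stays in the sample, so n is unchanged, but its pull on the mean and variance is much smaller.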

