Calculating cutoff values and analyzing histograms, Advanced Statistics

Assignment Help:

1. You are interested in investigating if being above or below the median income (medloinc) impacts ACT means (act94) for schools. Complete the necessary steps to examine univariate grouped data in order to respond to the questions below. Although deletions and/or transformations may be implied from your examination, all steps will examine original variables.

a. How many subjects have missing values for medloinc and act94?

b. Is there a severe split in frequencies between groups?

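According to the descriptive analysis, no severe split is detected: the Tests of Normality table reports df = 32 for each group, so the cases divide evenly between the two categories. This is also reflected in the skewness value, which is lower than .5.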
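The checks behind questions a and b can be sketched in a few lines of Python. This is only an illustration: the records below are hypothetical (not the assignment's data file), `None` stands in for a missing value, and the 90/10 threshold for a "severe" split is a common rule of thumb assumed here rather than something stated in the assignment.

```python
from collections import Counter

# Hypothetical records; None marks a missing value. Not the assignment's data.
schools = [
    {"medloinc": 0, "act94": 15.2},
    {"medloinc": 1, "act94": 17.8},
    {"medloinc": 1, "act94": None},
    {"medloinc": None, "act94": 16.1},
    {"medloinc": 0, "act94": 14.7},
    {"medloinc": 1, "act94": 19.6},
]

def count_missing(records, variable):
    """Number of records whose value for `variable` is missing (None)."""
    return sum(1 for r in records if r[variable] is None)

def group_split(records, group_var, outcome_var):
    """Share of complete cases that falls in each group."""
    complete = [r for r in records
                if r[group_var] is not None and r[outcome_var] is not None]
    counts = Counter(r[group_var] for r in complete)
    return {g: c / len(complete) for g, c in counts.items()}

def is_severe_split(shares, threshold=0.90):
    """True when one group holds at least `threshold` of the cases (assumed rule)."""
    return max(shares.values()) >= threshold

missing = {v: count_missing(schools, v) for v in ("medloinc", "act94")}
shares = group_split(schools, "medloinc", "act94")
print(missing, shares, is_severe_split(shares))
```

With real data, the same two counts correspond to the "Missing" column and the group frequencies in the descriptive output.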

c. What are the cutoff values for outliers in each group?

d. Which outlying cases should be deleted for each group?

Average ACT score 1994 Stem-and-Leaf Plot for
medloinc= below the median for low inc % 1993

 Frequency    Stem &  Leaf

     7.00       14 .  1223789
     9.00       15 .  234478888
     5.00       16 .  12788
     4.00       17 .  1378
     2.00       18 .  09
     1.00       19 .  6
     3.00       20 .  069
     1.00 Extremes    (>=22.5)

 Stem width:      1.0
 Each leaf:       1 case(s)
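One way to arrive at cutoff values is the boxplot convention of placing fences 1.5 hinge spreads beyond the hinges. The sketch below applies that rule to the scores read off the stem-and-leaf plot for the below-median group. Two assumptions are worth flagging: the 1.5 multiplier is the usual boxplot convention and is not stated in the assignment, and the plot lists the extreme case only as ">= 22.5", so 22.5 is used as a stand-in for it. The function name `tukey_fences` is illustrative.

```python
from statistics import median

# Scores read off the stem-and-leaf plot (stem width 1.0, one case per leaf).
# The extreme case is reported only as ">= 22.5"; 22.5 is a stand-in value.
below_median_act = [
    14.1, 14.2, 14.2, 14.3, 14.7, 14.8, 14.9,
    15.2, 15.3, 15.4, 15.4, 15.7, 15.8, 15.8, 15.8, 15.8,
    16.1, 16.2, 16.7, 16.8, 16.8,
    17.1, 17.3, 17.7, 17.8,
    18.0, 18.9,
    19.6,
    20.0, 20.6, 20.9,
    22.5,
]

def tukey_fences(values, k=1.5):
    """Return (lower, upper) cutoffs: hinges minus/plus k times the hinge spread."""
    s = sorted(values)
    half = (len(s) + 1) // 2          # each half includes the median when n is odd
    lower_hinge = median(s[:half])
    upper_hinge = median(s[-half:])
    spread = upper_hinge - lower_hinge
    return lower_hinge - k * spread, upper_hinge + k * spread

low, high = tukey_fences(below_median_act)
outliers = [x for x in below_median_act if x < low or x > high]
print(f"cutoffs: {low:.2f} to {high:.2f}; flagged: {outliers}")
```

With these values the cutoffs come out at about 11.5 and 21.5, so only the case listed under "Extremes" is flagged. Because the hinges depend on rank positions rather than on the size of the largest score, the stand-in value does not change the cutoffs.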

e. Analyzing histograms, normal Q-Q plots, and tests of normality, what is your conclusion regarding normality? If a transformation is necessary, which one would you use?

Tests of Normality (average ACT score 1994)

                                       Kolmogorov-Smirnov       Shapiro-Wilk
above or below median loinc            Statistic  df  Sig.      Statistic  df  Sig.
below the median for low inc % 1993    .162       32  .032      .903       32  .007
above the median for low inc % 1993    .166       32  .025      .921       32  .023

 

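According to the tests of normality, act94 departs from a normal distribution in both groups: every Sig. value for Kolmogorov-Smirnov and Shapiro-Wilk is below .05. The stem-and-leaf plot for the below-median group also shows a tail toward higher scores (positive skew). Therefore, for the transformation, we would select 'Square Root'.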
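Reading the Sig. columns amounts to comparing each value with a significance level. The sketch below does that for the four values in the Tests of Normality table, assuming the conventional alpha of .05 (the assignment does not state one). A Sig. value below alpha rejects the null hypothesis that the scores are normally distributed.

```python
ALPHA = 0.05  # conventional significance level (assumed; the source states none)

# Sig. values copied from the Tests of Normality table.
normality_sig = {
    "below the median": {"Kolmogorov-Smirnov": 0.032, "Shapiro-Wilk": 0.007},
    "above the median": {"Kolmogorov-Smirnov": 0.025, "Shapiro-Wilk": 0.023},
}

def rejects_normality(p_value, alpha=ALPHA):
    """A significant result (p < alpha) rejects the null hypothesis of normality."""
    return p_value < alpha

decisions = {
    group: {test: rejects_normality(p) for test, p in tests.items()}
    for group, tests in normality_sig.items()
}
for group, tests in decisions.items():
    print(group, tests)
```

All four comparisons come out significant at .05, which is what motivates considering a transformation.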

 

 

f. Do the results from Levene's Test of Equal Variances indicate homogeneity of variance? Explain.

Levene's test was not significant, meaning the variances of act94 do not differ significantly between the two categories. Therefore, we can assume homogeneity of variance.
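For readers who want to see what Levene's test measures, here is a minimal sketch of the W statistic in its mean-centered form: a one-way ANOVA F computed on the absolute deviations of each score from its group mean. The two score lists are hypothetical, not the assignment's data, and the sketch stops at the statistic; the Sig. value comes from comparing W with an F distribution on (k - 1, N - k) degrees of freedom.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_bar_i = [mean(zi) for zi in z]                 # mean deviation per group
    z_bar = sum(sum(zi) for zi in z) / n_total       # overall mean deviation
    between = sum(len(zi) * (zb - z_bar) ** 2 for zi, zb in zip(z, z_bar_i))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, z_bar_i) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical scores, not the assignment's data.
group_a = [14.1, 15.3, 15.8, 16.7, 17.8, 20.0]
group_b = [16.0, 16.4, 16.9, 17.1, 17.5, 18.0]
w = levene_w(group_a, group_b)
print(f"W = {w:.3f} on (1, {len(group_a) + len(group_b) - 2}) degrees of freedom")
```

A small W (a non-significant Sig.) means the typical deviation from the group mean is similar across groups, which is the sense in which the variances are homogeneous.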

2. Examination of the variable scienc93 indicates a substantial to severe positively skewed distribution. Transform this variable using the two most appropriate methods. After examining the distributions of the transformed variables, which transformation produced the best alteration?
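For substantial to severe positive skew, the transformations commonly recommended are the base-10 logarithm (substantial) and the inverse (severe). The sketch below applies both and compares skewness before and after. The scores are hypothetical and are not the scienc93 data; the skewness function uses the adjusted Fisher-Pearson formula, which is assumed here to match what statistical packages typically report.

```python
import math
from statistics import mean, stdev

def skewness(values):
    """Adjusted Fisher-Pearson sample skewness (0 for a symmetric sample)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

# Hypothetical positively skewed scores, not the scienc93 data.
# Both transformations require every value to be greater than zero.
raw = [1, 1, 2, 2, 2, 3, 3, 4, 5, 8, 13, 30]

log_scores = [math.log10(x) for x in raw]      # for substantial positive skew
inverse_scores = [1 / x for x in raw]          # for severe positive skew

for name, scores in [("raw", raw), ("log10", log_scores), ("inverse", inverse_scores)]:
    print(f"{name:8s} skewness = {skewness(scores):.3f}")
```

The transformation whose skewness lands closest to zero, with histograms and Q-Q plots to confirm, is the one that "produced the best alteration". Note that the inverse reverses the ordering of the scores, which matters when interpreting later results.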


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd