Best subsets regression, Advanced Statistics

The time series plot and the scatter plots showed many clearly visible outliers. These observations were removed to check whether they were influential or had high leverage, and to see whether the assumptions of the multiple regression model are met without them.

Below are the row numbers of the 17 outliers that I removed from the 1519 observations:

77, 674, 448, 757, 317, 549, 1187, 1198, 26, 456, 405, 307, 1205, 1348, 611, 368, 309
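Whether a removed row had high leverage or was influential can also be checked numerically rather than by eye. The sketch below is only an illustration on synthetic data, not the assignment's data set: the function name `leverage_and_cooks`, the planted outlier, and the cutoffs 2p/n (leverage) and 4/n (Cook's distance) are assumptions based on common rules of thumb, not criteria taken from the analysis above.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Return (leverage, Cook's distance) for an OLS fit of y on X plus an intercept."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept column
    p = A.shape[1]                             # number of fitted parameters
    Q, _ = np.linalg.qr(A)                     # thin QR, so the hat matrix is Q Q^T
    h = np.sum(Q**2, axis=1)                   # leverage = diagonal of the hat matrix
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    s2 = resid @ resid / (n - p)               # residual variance estimate
    cooks = resid**2 / (p * s2) * h / (1 - h)**2
    return h, cooks

# Synthetic data (assumption): two predictors and one planted outlier in row 0.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=n)
X[0] = [8.0, 8.0]   # far from the other predictor values: high leverage
y[0] = 10.0         # far from the fitted plane: large residual

h, cooks = leverage_and_cooks(X, y)
p = X.shape[1] + 1
high_leverage = np.flatnonzero(h > 2 * p / n)   # rule-of-thumb cutoff (assumption)
influential = np.flatnonzero(cooks > 4 / n)     # rule-of-thumb cutoff (assumption)
print("high leverage rows:", high_leverage)
print("influential rows:", influential)
```

Rows flagged by both lists are the ones whose removal would be expected to change the fitted coefficients most, which is the question the outlier removal above is trying to answer.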

Best Subsets Regression: wfood versus totexp, income, age, nk

Response is wfood

Vars  R-Sq  R-Sq(adj)  Mallows Cp         S  totexp  income  age  nk
   1  22.9       22.9        67.4  0.092326    X
   1   5.5        5.4       424.9   0.10222            X
   2  24.8       24.7        31.3  0.091236    X                  X
   2  24.2       24.1        42.7  0.091572    X              X
   3  26.1       26.0         6.1  0.090461    X              X   X
   3  24.8       24.7        32.3  0.091239    X       X          X
   4  26.3       26.1         5.0  0.090397    X       X      X   X

Best subsets regression is a way of identifying which combination of the independent variables (totexp, income, age and nk) is best suited to the regression model. In the results above, the model containing only income has the highest Mallows' Cp (424.9) and the lowest R-squared (5.5%) of the models listed. Leaving income out of the full model also lowers R-squared only slightly, from 26.3% to 26.1%. Income is therefore the variable that will be dropped, to see whether the reduced model still fits the data.
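The columns of the best subsets output can be reproduced by fitting every subset of predictors by least squares. The sketch below is a rough illustration under stated assumptions: the data are synthetic (only the variable names and the sample size of 1502 = 1519 - 17 rows are borrowed from above), the coefficients are made up, and `best_subsets` is a hypothetical helper, not Minitab's routine. Mallows' Cp is computed as SSE_p / MSE_full - (n - 2p), where p counts the intercept, so the full model always has Cp = p.

```python
from itertools import combinations
import numpy as np

def best_subsets(X, y, names):
    """Fit every non-empty subset of the columns of X and return summary rows."""
    n, k = X.shape

    def sse(cols):
        # Residual sum of squares for an OLS fit with intercept on the given columns.
        A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ beta
        return float(r @ r)

    sst = float(np.sum((y - y.mean()) ** 2))
    mse_full = sse(range(k)) / (n - k - 1)       # error variance from the full model
    rows = []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            e = sse(cols)
            p = size + 1                         # parameters including the intercept
            rows.append({
                "vars": [names[j] for j in cols],
                "r_sq": 100 * (1 - e / sst),
                "r_sq_adj": 100 * (1 - (e / (n - p)) / (sst / (n - 1))),
                "cp": e / mse_full - (n - 2 * p),
                "s": (e / (n - p)) ** 0.5,
            })
    return rows

# Synthetic stand-in data (assumption): made-up coefficients, not the real survey.
rng = np.random.default_rng(1)
names = ["totexp", "income", "age", "nk"]
n = 1502
X = rng.normal(size=(n, 4))
y = 0.35 - 0.10 * X[:, 0] + 0.03 * X[:, 2] + 0.04 * X[:, 3] + rng.normal(scale=0.09, size=n)

rows = best_subsets(X, y, names)
print("Vars  R-Sq  R-Sq(adj)       Cp         S  Variables")
for size in range(1, len(names) + 1):
    # Like the output above, show the two best models of each size by R-Sq.
    top = sorted((r for r in rows if len(r["vars"]) == size), key=lambda r: -r["r_sq"])[:2]
    for r in top:
        print(f'{size:4d} {r["r_sq"]:5.1f} {r["r_sq_adj"]:10.1f} {r["cp"]:8.1f} {r["s"]:9.6f}  {", ".join(r["vars"])}')
```

With four predictors the search covers only 15 models, so exhaustive fitting is cheap. A candidate model is usually judged adequate when its Cp is small and close to its own p, alongside a high adjusted R-squared and a low S.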

