Calculate descriptive statistics for your variables

Reference no: EM131151264

Data Exploration and Descriptive Statistics.

For the final you will pick 4 variables to work with. At least one of them has to be an interval-ratio variable; please consult me if you are having trouble finding one. Your other variables can be nominal, ordinal, or interval-ratio.

Part. Data Exploration

a) Run a histogram for each interval-ratio variable. Cut and paste those into your Word file. Briefly describe each distribution, noting its overall shape and any outliers.
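As a rough illustration of what a histogram summarizes, the Python sketch below counts values in equal-width bins. The ages are made-up data and `bin_counts` is a hypothetical helper, not a command from the course software. A lone count in a distant bin is the kind of outlier worth mentioning in your description.

```python
# Hypothetical ages (made-up data); 60 sits far from the rest.
ages = [21, 22, 22, 23, 24, 25, 25, 26, 27, 60]

def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum lands exactly on the upper edge, so clamp it into the last bin.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

counts = bin_counts(ages, 4)
for i, c in enumerate(counts):
    print(f"bin {i + 1}: {'#' * c}")
```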

b) Build a scatter plot if you have two or more interval-ratio variables. What type of relationship, if any, can you observe between the variables?

Now, turn to your categorical variables (if you have any).

c) Run a frequency distribution for each of your categorical variables; you can use either the tab or the fre command. Cut and paste the output into your Word file and briefly describe the distribution of the variable. Note which category has the most observations and which categories have very few.
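For reference, a frequency distribution is just a count and a percentage per category. The Python sketch below shows the arithmetic on made-up responses; the variable and the `frequency_table` helper are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical responses for one categorical variable (made-up data).
marital = ["married", "single", "married", "divorced", "married",
           "single", "widowed", "married", "single", "married"]

def frequency_table(values):
    """Return (category, count, percent) rows, largest category first."""
    counts = Counter(values)
    n = len(values)
    return [(cat, k, 100.0 * k / n) for cat, k in counts.most_common()]

table = frequency_table(marital)
for cat, k, pct in table:
    print(f"{cat:<10}{k:>4}{pct:>8.1f}%")
```

The first row is the modal category; rows with only one or two observations are the sparse categories to flag.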

d) If you have two or more categorical variables, run a cross-tab to examine the relationship between two of them using the tab command and either column or row percentages. Briefly describe the relationship between the variables. Cut and paste your output into your Word file.
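Column percentages divide each cell by its column total, so you compare the row variable's distribution across columns. A minimal Python sketch, using made-up pairs and a hypothetical `column_percentages` helper:

```python
from collections import Counter

# Hypothetical paired observations: (row variable, column variable).
pairs = [("yes", "urban"), ("yes", "urban"), ("no", "urban"), ("yes", "urban"),
         ("no", "rural"), ("no", "rural"), ("yes", "rural"), ("no", "rural")]

def column_percentages(pairs):
    """Return {(row, col): percent of that column's total}."""
    cell = Counter(pairs)
    col_total = Counter(col for _, col in pairs)
    return {(r, c): 100.0 * k / col_total[c] for (r, c), k in cell.items()}

pct = column_percentages(pairs)
for (r, c), p in sorted(pct.items()):
    print(f"{r:<4}{c:<6}{p:6.1f}%")
```

If the percentages for a given row differ noticeably between columns, the two variables appear related.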

Part. Descriptive Statistics

Now, calculate descriptive statistics for your variables using the sum command. You can do all four variables at once.

a) Make sure you describe the mean of each variable. If the mean is not a good measure of central tendency for a particular variable, please explain why. Do the same for the standard deviation and the range.

Of course, the type of correlation that you should calculate depends upon the type of variables you are working with. At a minimum, you should calculate three correlations, but you must be very careful to use the right type of correlation coefficient for your data.
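One common reason the mean misleads is a strong outlier. The Python sketch below uses made-up incomes (in thousands) to show the mean being pulled well above the median, while the standard deviation and range are inflated by the same value.

```python
import statistics

# Hypothetical incomes in thousands (made-up data); 250 is an outlier.
incomes = [28, 31, 33, 35, 36, 38, 40, 250]

mean = statistics.mean(incomes)
median = statistics.median(incomes)
sd = statistics.stdev(incomes)            # sample standard deviation (n - 1)
value_range = max(incomes) - min(incomes)

print(f"mean={mean:.2f} median={median:.2f} sd={sd:.2f} range={value_range}")
```

The mean is also not meaningful for a nominal variable, because its numeric codes are arbitrary labels.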

Just a few reminders:

1) To correlate interval-ratio variables, use Pearson's r. Make sure you display a scatter plot before you run your correlation.

2) To correlate ordinal variables, use Spearman's rho. Make sure you display a cross-tab before you run your correlation.

3) To correlate nominal variables, use lambda. Make sure you display a cross-tab before you run your correlation.
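To make the first two reminders concrete, here is a Python sketch of Pearson's r and of Spearman's rho computed as Pearson's r on average ranks. The data and function names are assumptions for illustration, and the sketch covers size and direction only; statistical significance should come from your software's output.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical years of education and income (made-up data).
educ = [8, 10, 12, 12, 14, 16]
income = [20, 24, 30, 28, 35, 60]
print(round(pearson_r(educ, income), 3), round(spearman_rho(educ, income), 3))
```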

For each of your three correlations make sure you describe the size, statistical significance and, when applicable, the direction of the relationship.
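Direction applies only "when applicable" because a measure such as lambda has no sign. The Python sketch below computes the asymmetric Goodman-Kruskal lambda, the proportional reduction in errors when predicting y from x, on made-up data. The function name is hypothetical, and your software may report a symmetric version as well.

```python
from collections import Counter

def goodman_kruskal_lambda(x, y):
    """Asymmetric lambda: share of prediction errors for y removed by knowing x."""
    n = len(y)
    # Errors when always guessing the overall modal category of y.
    errors_without_x = n - max(Counter(y).values())
    if errors_without_x == 0:
        return 0.0  # y never varies, so there is nothing to improve on
    # Errors when guessing the modal category of y within each category of x.
    best_within_x = Counter()
    for (xv, _), k in Counter(zip(x, y)).items():
        best_within_x[xv] = max(best_within_x[xv], k)
    errors_with_x = n - sum(best_within_x.values())
    return (errors_without_x - errors_with_x) / errors_without_x

# Hypothetical nominal variables (made-up data).
region = ["north"] * 5 + ["south"] * 5
party = ["A", "A", "A", "A", "B", "B", "B", "B", "A", "B"]
lam = goodman_kruskal_lambda(region, party)
print(lam)
```

Lambda runs from 0 (knowing x does not reduce errors) to 1 (knowing x removes all errors).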

Notice that the three correlations you report could be very different depending on the types of variables you are working with.

Regression and Multiple Regression

Now you will build a series of regression models. Before you begin keep the following in mind:

-Your outcome variable MUST be interval-ratio.
-The interpretation of the regression coefficients depends upon the type of variable you are using and its coding.
-For categorical predictors you might want to do some recoding. If you recode any variables, make sure that you SAVE your data and that you describe how you recoded in your homework.

Part 1.
Estimate a regression model with a single predictor. Interpret the regression coefficient and its p-value, the intercept, and the R².
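As a reminder of where these numbers come from, the Python sketch below fits a one-predictor least-squares line on made-up data. `simple_ols` is a hypothetical helper; the p-value is left to your software because it requires the t distribution.

```python
def simple_ols(x, y):
    """Return (slope, intercept, r_squared) for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r2 = simple_ols(x, y)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r2:.2f}")
```

Here the slope is the expected change in y for a one-unit increase in x, the intercept is the expected y when x is 0, and R² is the share of the variation in y that the model accounts for.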

Part 2.
Add another predictor variable to the model you estimated in Part 1. Describe any change in the coefficient of the original variable (and its associated p-value) and interpret the coefficient and p-value of the new variable. Note any change in the R².

Part 3.
Add a third predictor variable. Describe any changes in the coefficients and p-values for the variables you entered into the previous models. Interpret the coefficient and p-value for your new variable. Note any change in the R².
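The Python sketch below illustrates why coefficients shift as predictors are added. It solves the least-squares normal equations for a one-predictor and a two-predictor model on made-up data where the predictors are correlated. The `ols` and `r_squared` helpers are assumptions for illustration, and p-values are not computed.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented system (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(p):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][p] for r in range(p)]

def r_squared(columns, y, coefs):
    """Share of the variation in y accounted for by the fitted values."""
    n = len(y)
    fitted = [coefs[0] + sum(b * col[i] for b, col in zip(coefs[1:], columns))
              for i in range(n)]
    my = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Made-up data: x1 and x2 are positively correlated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [9, 8, 20, 18, 28, 29]

b_one = ols([x1], y)
b_two = ols([x1, x2], y)
r2_one = r_squared([x1], y, b_one)
r2_two = r_squared([x1, x2], y, b_two)
print("x1 alone:", [round(b, 2) for b in b_one], round(r2_one, 3))
print("x1 + x2: ", [round(b, 2) for b in b_two], round(r2_two, 3))
```

When a new predictor is correlated with one already in the model, the earlier coefficient changes because it no longer absorbs the shared association. R² cannot decrease when a predictor is added.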

Effect size, prediction, and diagnostics

There are multiple ways to think about "effect size" in a multiple regression context. In this section:

1) Use the listcoef command after your regression models to obtain standardized coefficients. Briefly interpret the standardized coefficients in 2-3 sentences.
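For intuition, a fully standardized coefficient rescales the raw coefficient by the ratio of standard deviations: b times sd(x) divided by sd(y). The Python sketch below applies that formula to made-up data; it illustrates the arithmetic and does not reproduce the listcoef output.

```python
import statistics

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def slope(x, y):
    """Raw least-squares slope of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def standardized_coefficient(b, x, y):
    """Expected change in y, in SDs of y, per one-SD increase in x."""
    return b * statistics.stdev(x) / statistics.stdev(y)

b = slope(x, y)
beta = standardized_coefficient(b, x, y)
print(round(b, 3), round(beta, 3))
```

With a single predictor the standardized coefficient equals Pearson's r, which is a handy check.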

2) Now, calculate some "effect sizes" as I show in the video and the notes. The way that you do this section will depend upon the types of variables that you have. For categorical variables it probably makes most sense to calculate an effect at the mode. For interval-ratio variables you might want to use the 25th and 75th percentiles. Do whatever makes sense for the type of data you are working with.

Take a few sentences to describe what you have calculated. Which variable appears to be the most important now?
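The video and notes define the exact procedure. One common reading for an interval-ratio predictor is the predicted change in the outcome when the predictor moves from its 25th to its 75th percentile, as in this Python sketch. The coefficient value and data are made up, and percentile definitions differ slightly between packages.

```python
import statistics

def interquartile_effect(b, x):
    """Predicted change in the outcome as x moves from its 25th to 75th percentile."""
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    return b * (q3 - q1)

# Hypothetical raw coefficient and predictor values (made-up).
b = 0.6
x = [1, 2, 3, 4, 5]
effect = interquartile_effect(b, x)
print(round(effect, 2))
```

Putting each predictor's effect on this common footing makes the variables easier to compare than their raw coefficients.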

Predicted values:

1) Create 2-4 "archetypes" or "representative cases" and calculate predicted values for those cases. How you do this will depend upon what types of variables you are working with. Please show your work and explain your archetypes.
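A predicted value is the intercept plus each coefficient times the archetype's value on that variable. The Python sketch below shows the arithmetic; the coefficients, variable names, and archetypes are all hypothetical.

```python
# Hypothetical coefficients from a fitted model (made-up values).
coefs = {"_cons": 10.0, "educ": 2.0, "female": -3.0}

def predict(coefs, case):
    """Intercept plus the sum of coefficient * value over the case's variables."""
    return coefs["_cons"] + sum(coefs[name] * value for name, value in case.items())

# Two hypothetical archetypes.
archetypes = {
    "woman, 12 years of education": {"educ": 12, "female": 1},
    "man, 16 years of education": {"educ": 16, "female": 0},
}
for label, case in archetypes.items():
    print(f"{label}: {predict(coefs, case):.1f}")
```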

Residuals

1) Calculate the residuals from your final regression model. Plot those residuals on a histogram and cut and paste the histogram into your Word file. Do the residuals follow a normal distribution?
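A residual is the observed value minus the fitted value. Alongside the histogram, a skewness figure near zero is one rough sign of symmetry, as in this Python sketch. The data and helper names are made up, and skewness alone does not establish normality.

```python
def residuals(y, fitted):
    """Observed minus fitted, one residual per case."""
    return [obs - fit for obs, fit in zip(y, fitted)]

def skewness(values):
    """Third standardized moment; about 0 for a symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical observed and fitted values (made-up data).
y = [2, 4, 5, 4, 5]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
res = residuals(y, fitted)
print([round(e, 2) for e in res], round(skewness(res), 3))
```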

Heteroscedasticity

1) Plot your residuals against your fitted values using a scatter plot (paste the scatter plot into your Word file). Do you see visual evidence of heteroscedasticity?

2) Test for heteroscedasticity using the Breusch-Pagan/Cook-Weisberg test. What does this test tell you? Make sure you paste the output of the test into your Word file.
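To see what the test measures, the Python sketch below follows the original normal-errors form of the Breusch-Pagan statistic using fitted values: scale the squared residuals by their mean, regress them on the fitted values, and take half the explained sum of squares as a chi-square with 1 degree of freedom. This is an illustration on made-up numbers under that assumption; rely on your software's output for the reported result.

```python
import math

def breusch_pagan(residuals, fitted):
    """Return (chi2 statistic, p-value) with 1 df, using fitted values as the predictor."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    g = [e * e / sigma2 for e in residuals]        # scaled squared residuals
    mf, mg = sum(fitted) / n, sum(g) / n
    sxx = sum((f - mf) ** 2 for f in fitted)
    sxy = sum((f - mf) * (gi - mg) for f, gi in zip(fitted, g))
    explained_ss = sxy * sxy / sxx                 # explained SS of g regressed on fitted
    stat = explained_ss / 2
    p_value = math.erfc(math.sqrt(stat / 2))       # upper tail of chi-square with 1 df
    return stat, p_value

# Hypothetical residuals that fan out as the fitted values grow (made-up data).
fitted = [1, 2, 3, 4, 5, 6]
resid = [0.1, -0.2, 0.5, -1.0, 2.0, -3.0]
stat, p_value = breusch_pagan(resid, fitted)
print(round(stat, 3), round(p_value, 3))
```

The null hypothesis is constant error variance, so a small p-value is evidence of heteroscedasticity.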

Multicollinearity

1) Calculate VIFs for your model and paste the output into your Word file. Does your model have multicollinearity problems?
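A VIF is 1 / (1 - R²), where R² comes from regressing one predictor on all the other predictors. The Python sketch below covers only the two-predictor case, where that R² is the squared correlation between the two; the data and helper names are made up.

```python
def r_squared_between(x1, x2):
    """Squared Pearson correlation between two predictors."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    return s12 * s12 / (s11 * s22)

def vif(r2):
    """Variance inflation factor for a predictor with auxiliary R-squared r2."""
    return 1.0 / (1.0 - r2)

# Hypothetical correlated predictors (made-up data).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
vif_value = vif(r_squared_between(x1, x2))
print(round(vif_value, 2))
```

A VIF of 1 means the predictor is unrelated to the others. A commonly cited rule of thumb treats values above about 10 as a concern, but use the threshold from the course notes.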
