Reference no: EM132372197
Online SPSS Project Chapters 1 - 5
SPSS Statistics is a software package used for statistical analysis. It is menu driven so you will find it easy to use and very user friendly. Before starting this assignment, make sure that you are in a computer lab on JMU campus (search "SPSS" on the computer) or have installed SPSS on your computer through JMU Libraries. If you encounter problems in downloading/installing, please call the JMU IT Help line at (540) 568-3555.
Once you are in a computer lab or have installed SPSS on your computer (as per the instructions given at the beginning of the semester), go to the Canvas Files area and open the Employee Satisfaction Data. SPSS can take a while to start, so be patient.
The SPSS drop-down menu runs horizontally across the very top of your desktop once SPSS is open. You will see "SPSS Statistics File Edit View Data Transform Analyze ... Help". The graphs and tables can be copied or cut and then pasted into a Word document.
1. Notice that it looks like an Excel spreadsheet. Column sizes can be adjusted the same way, by clicking, holding, and dragging. This is a data set of 15,000 employees of a company. You will find the SPSS editor has 15,000 lines of information, and by clicking Variable view (found in the lower center), you will find the list of variables contained in the data set. You should see 12 variables. In Variable view, under Label, you will find a description of each variable. These descriptions will help you summarize the variable. SPSS opens a separate window for all output tables and graphs.
2. Perform the operations indicated. Copy each item from the SPSS output and paste it into the appropriate space in this Word document. Describe/summarize the graphs in complete contextual sentences. Make sure to use the unit of measurement of the variable.
First: READ both documents (their titles contain "SPSS Project 1") found in the Files area. One summarizes how to describe the distribution of a random variable and the other gives examples of describing distributions. Remember that a distribution of a random variable is a graph, table, or formula that specifies the probability (or proportion) for all possible values of that random variable. So, you will be describing the graph.
Read the documents carefully.
If it is a quantitative variable, mention the shape (skewed right, skewed left or symmetric [with respect to the y-axis]), center (use the median if the shape is skewed and use the mean if the shape is symmetric) and spread (mention the min, max, and the first and third quartile if the shape is skewed and use the standard deviation if the shape is symmetric).
If it is a categorical/qualitative variable, mention the overall characteristic of the graph or the characteristic of the graph that stands out to you. For example, if the bars are all about the same size (they need not be exactly equal), mention that (see example below). Or perhaps a couple of bars are about the same height and the others are not. If every bar height is very different from the others, mention the categories that have the highest and lowest frequency. DO NOT mention each category and the corresponding percentage.
And for both types of variables, write the overall message the graph conveys. These summaries should be written in context and with proper units of measurement. It should read like an article on the Internet that is describing a graph. Remember, DO NOT include your opinion or speculate about what might have happened or why the graph looks the way it does. Just facts. Again, write in the context of the problem. Succinct and simple is key.
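The guidance above for a quantitative variable (mean and standard deviation when the shape is symmetric; median, min, max, and quartiles when it is skewed) can be sketched in Python as a rough cross-check. This is only an illustration: the function name `summarize`, the `shape` argument, and the sample values are hypothetical, the shape is still something you read off the graph yourself, and Python's quartile rule may differ slightly from the one SPSS uses.

```python
import statistics

def summarize(values, shape):
    """Return the center and spread to report for a quantitative variable.

    `shape` is the reader's own judgment from the graph:
    "symmetric", "skewed right", or "skewed left". It is not computed here.
    """
    if shape == "symmetric":
        return {
            "center (mean)": statistics.mean(values),
            "spread (standard deviation)": statistics.stdev(values),
        }
    # Skewed shape: report the median plus min, quartiles, and max.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "center (median)": median,
        "spread": {"min": min(values), "Q1": q1, "Q3": q3, "max": max(values)},
    }

# Hypothetical values in years, NOT taken from the Employee Satisfaction Data.
years = [2, 2, 3, 3, 3, 4, 4, 5, 6, 10]
print(summarize(years, "skewed right"))
```

The numbers you report in the assignment should still come from the SPSS output.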
3. Below are the SPSS instructions that you can use. If you want additional instructions, there is always the HELP tab in SPSS.
4. Remember it does not hurt to explore and discover other things. There are different ways to make a graph or get numerical values. SPSS is very forgiving: if an option would cause an error or cannot be done, it is grayed out.
Project Assignment Directions
a. Use SPSS to get the bar chart of the variable department and the numerical summary (Numerical Summaries>Categorical Variables) of each category. Copy and paste the graph and table here, then write the description below. Make sure to follow the description as stated in the documents you downloaded.
b. Use SPSS to get the pie chart for the variable work_accident and the numerical summary (Numerical Summaries>Categorical Variables) of each category. Copy and paste the graph here then write the description below. Make sure to follow the description as stated in the documents you downloaded.
c. Use SPSS to get the side-by-side boxplot of the variables satisfaction level and Last_eval_score. Copy and paste the graph only here. No need to describe. A side-by-side boxplot is used to compare two quantitative variables.
d. Use SPSS to get the Histogram for the variable Years_Spent_at_Company.
e. Use SPSS to get the Dot plot for the variable Years_Spent_at_Company.
f. Use SPSS to get the Stem plot for the variable Years_Spent_at_Company.
g. Use SPSS to get the Boxplot for the variable Years_Spent_at_Company. Copy and paste the graph below. Outliers are graphed as an asterisk (extreme) or a circle (mild). Mention them when summarizing, if there are any. The numbers beside them are the line numbers where they can be found in the editor. Find the actual values for the outliers. The number of outliers is the number of line numbers you see in the boxplot. The values of the outliers are the values of the variable that are considered outliers. Sometimes more than one employee shares the same outlier value.
h. Use SPSS to get the numerical summaries for the variable Years_Spent_at_Company. (Follow the instructions above: Numerical Summaries>Quantitative Variable>number 2.) Check off the following numerical summaries: mean, median, standard deviation, quartiles, minimum, maximum, 38th percentile, and 89th percentile. Copy and paste the table here.
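As a rough way to see how mild and extreme outliers relate to the box, the sketch below flags values lying more than 1.5 box lengths (IQRs) beyond the quartiles as mild and more than 3 box lengths as extreme, which is the usual boxplot convention. The function name `classify_outliers` and the sample values are hypothetical, and the quartile rule used by Python may not match SPSS exactly, so treat the SPSS boxplot as the authority.

```python
import statistics

def classify_outliers(values):
    """Split values into mild and extreme outliers using IQR fences.

    Returns two lists of (row, value) pairs, where row is the 1-based
    position in the list, mimicking the editor line number.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    mild, extreme = [], []
    for row, v in enumerate(values, start=1):
        distance = max(q1 - v, v - q3)  # how far v lies outside the box
        if distance > 3 * iqr:
            extreme.append((row, v))
        elif distance > 1.5 * iqr:
            mild.append((row, v))
    return mild, extreme

# Hypothetical values in years, NOT the assignment data.
years = [2, 2, 3, 3, 3, 4, 4, 4, 5, 9, 12]
mild, extreme = classify_outliers(years)
print("mild:", mild)
print("extreme:", extreme)
# Several rows can share one outlier value, so list the distinct values too.
print("distinct outlier values:", sorted({v for _, v in mild + extreme}))
```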
Write the description below. Make use of the graphs d - g above and use the appropriate numerical summaries from the table in h. Make sure to follow the description as stated in the documents you downloaded.
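For readers unsure what the 38th and 89th percentiles in item h represent, here is a minimal sketch using Python's standard library. The helper name `percentile` and the sample values are hypothetical; the interpolation rule shown (`method="exclusive"`) is one common choice and SPSS may use a different one, so small differences from the SPSS table are possible.

```python
import statistics

def percentile(values, p):
    """Return the p-th percentile of values, for an integer p from 1 to 99."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cut_points = statistics.quantiles(values, n=100, method="exclusive")
    return cut_points[p - 1]

# Hypothetical values in years, NOT the assignment data.
years = [2, 2, 3, 3, 3, 4, 4, 5, 6, 10]
print("38th percentile:", percentile(years, 38))
print("89th percentile:", percentile(years, 89))
```

Roughly 38% of the values fall at or below the 38th percentile, and roughly 89% fall at or below the 89th.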
i. Now let us examine the relationship of two quantitative variables, more specifically how well number of years with the company predicts the satisfaction level score.
Let us first use SPSS to graphically display the relationship by getting the scatter plot.
Use SPSS to draw a scatter plot of Years_Spent_at_Company and Satisfaction_Level (Graphs>Legacy Dialogs>Scatter/Dot). Click Simple Scatter and then Define. Move the Years_Spent_at_Company variable to the x-axis and Satisfaction_Level to the y-axis by clicking on the variable and then the purple/blue arrow. Click on Titles (right-hand corner of the window) and type the title of your graph. Cut and paste the scatter plot below:
What a mess! It is because we have 15,000 data points.
j. Find the correlation between the two variables (Analyze>Correlate>Bivariate). You must move at least 2 variables into the Variables box. Correlation is a quantity that measures the strength and direction of the linear relationship between two quantitative variables. If you put more than two variables in the Variables box, SPSS will calculate the correlation of all possible pairs of variables. In the output table, the correlation of any pair of variables is found at the intersection of one variable's row and the other variable's column. What is the correlation coefficient value? ______________.
A generic interpretation: There is a positive/negative, strong/moderate/weak (see RULE below), linear relationship between ___var 1____ and ____var 2____.
RULE: Strong if the correlation is between -1 and -.8 or between .8 and 1. Moderate if the correlation is between -.8 and -.5 or between .5 and .8. Weak if the correlation is between -.5 and .5. This rule is for classroom purposes only. It is not really used in the real world, as different areas of study have their own interpretation of what is weak, moderate, and strong.
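The RULE above can be sketched as a small Python helper, together with a hand computation of the correlation coefficient. The function names `pearson_r` and `describe` and the sample data are hypothetical. The rule does not say which label applies exactly at ±.5 and ±.8, so assigning those boundary values to the stronger label is an assumption made here.

```python
import math

def pearson_r(xs, ys):
    """Compute the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r):
    """Label direction and strength using the classroom RULE.

    Assumption: boundary values (|r| exactly .5 or .8) get the stronger label.
    """
    if r == 0:
        return "no linear relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size >= 0.8:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{direction}, {strength}"

# Hypothetical (years, satisfaction) pairs, NOT the assignment data.
years = [2, 3, 4, 5, 6]
satisfaction = [0.9, 0.7, 0.8, 0.5, 0.4]
r = pearson_r(years, satisfaction)
print(round(r, 3), "->", describe(r))
```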
Interpret the correlation coefficient in terms of the problem: (Use the actual variable names in the interpretation.)
Type your interpretation here:
k. The coefficient of determination is often preferred for gauging the strength of the linear relationship between two quantitative variables because it gives an actual numerical value, leaving the reader to judge the strength. It is calculated as r² (the correlation squared) × 100%. What is the coefficient of determination value? _____________.
Look up the definition of the coefficient of determination and write the paraphrased definition here. Cite your source.
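The arithmetic in item k is short enough to check by hand, but a minimal Python sketch may help. The function name and the value of r below are hypothetical and are not the answer for this data set.

```python
def coefficient_of_determination(r):
    """Return r squared expressed as a percentage (r² × 100%)."""
    return r ** 2 * 100

r = -0.40  # hypothetical correlation, NOT the value from the assignment data
print(f"{coefficient_of_determination(r):.1f}%")
```

Because r is squared, the sign of the correlation is lost: r = -0.40 and r = 0.40 give the same percentage.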
l. We would like to predict the satisfaction level score (Y) based on the number of years an employee has been with the company (X). Use SPSS to find the least squares line (Analyze>Regression>Linear). Move the variables to their corresponding role as Independent or Dependent variable. Click OK. In the output, the Coefficients Table (last table) contains the slope and y-intercept. The entry under column B in the row Constant is the y-intercept, and the entry under column B in the row "variable name" (you should have defined this early on) is the slope. Cut and paste the Coefficients Table here.
Type the Least Squares Line here in the form: y-hat = a + b*X
The sum of squared errors (SSE), also known as the residual sum of squares, can be found in the third table, called ANOVA. The value is at the intersection of the Sum of Squares column and the Residual row. Write the value of the sum of squared errors here: ______________________.
m. Interpret the slope value of the least squares line you found in l, in the context of the problem.
A generic interpretation: On average, for every one unit increase in X, the Y variable is expected to increase (if slope > 0) or decrease (if slope < 0) by b (the slope) units. Use the actual variable names and their units of measurement in your interpretation. Satisfaction level has no units, but of course the unit for number of years with the company is years. There are different ways to write this, so write it in a way that makes sense and follows the general idea of the statement above. Do not mention the y-intercept value in the interpretation.
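To see where the y-intercept, slope, and SSE come from, here is a minimal sketch of the standard least squares formulas in Python. The function names and the sample data are hypothetical; the values you report must come from the SPSS Coefficients and ANOVA tables.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y-intercept
    return a, b

def sum_squared_errors(xs, ys, a, b):
    """Return SSE: the sum of squared residuals (y minus y-hat)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical (years, satisfaction) pairs, NOT the assignment data.
years = [2, 3, 4, 5, 6]
satisfaction = [0.9, 0.7, 0.8, 0.5, 0.4]
a, b = least_squares(years, satisfaction)
print(f"y-hat = {a:.2f} + ({b:.2f})*X")
print(f"SSE = {sum_squared_errors(years, satisfaction, a, b):.3f}")
```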
Type your interpretation here:
n. What is the predicted satisfaction level score for an employee who has been with the company for 10 years? Show work.
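Showing work here means substituting X = 10 into the least squares line y-hat = a + b*X. The sketch below illustrates the substitution with hypothetical coefficients; replace them with the values from your own Coefficients Table.

```python
def predict(a, b, x):
    """Return the predicted Y from the least squares line y-hat = a + b*X."""
    return a + b * x

# Hypothetical coefficients, NOT the values from the assignment data.
a = 0.70   # y-intercept
b = -0.02  # slope, in satisfaction score per year
x = 10     # years with the company
print(f"y-hat = {a} + ({b})*{x} = {predict(a, b, x):.2f}")
```

It is also worth checking whether 10 years lies within the range of years observed in the data, since predictions outside that range are less trustworthy.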