Simple linear regression

Reference no: EM13534472


Dataset needed: IQ

1. Relationship Between Eighth Grade IQ and Ninth Grade Math Score

For a statistics class project, students examined the relationship between x = 8th grade IQ and y = 9th grade math scores for 20 students. The data are displayed below.

Student  Math Score  IQ   Abstract Reas
1        33          95   28
2        31          100  24
3        35          100  29
4        38          102  30
5        41          103  33
6        37          105  32
7        37          106  34
8        39          106  36
9        43          106  38
10       40          109  39
11       41          110  40
12       44          110  43
13       40          111  41
14       45          112  42
15       48          112  46
16       45          114  44
17       31          114  41
18       47          115  47
19       43          117  42
20       48          118  49

Open the dataset IQ found in the Datasets folder in ANGEL. Perform a linear regression with math score as the Response (dependent variable) and IQ as the Predictor (independent variable). Store/Save the (unstandardized) Residuals and Fitted (Predicted) values. These will be stored in the fourth and fifth columns of the data worksheet. The output should look as follows:

SPSS: Regression Analysis: Math Score versus IQ


a. Explain this equation. Discuss the slope as the change in Y per unit change in X, in the context of the variables used in this problem.
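The equation in part a comes from a least-squares fit. A minimal Python sketch of that fit follows, using only the standard library and the 20 data pairs from the table above. The helper name least_squares is an illustrative choice, not something from the assignment, SPSS, or Minitab, and the software output remains the reference for the exact coefficients.

```python
# IQ (x) and math score (y) for the 20 students in the table above.
iq = [95, 100, 100, 102, 103, 105, 106, 106, 106, 109,
      110, 110, 111, 112, 112, 114, 114, 115, 117, 118]
math_score = [33, 31, 35, 38, 41, 37, 37, 39, 43, 40,
              41, 44, 40, 45, 48, 45, 31, 47, 43, 48]


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = least_squares(iq, math_score)
print(f"predicted math score = {b0:.3f} + {b1:.3f} * IQ")
```

Here b1 is the estimated change in predicted math score for a one-point increase in IQ. The intercept b0 is the predicted score at IQ = 0, which lies far outside the observed IQ range, so it has no practical interpretation.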



b. Create a scatter plot of the measurements by selecting Math Score for the y-axis (response) and IQ for the x-axis (predictor). Describe the relationship between math score and IQ. Minitab Users: Graph > Scatter Plot > Simple. SPSS Users: Graphs > Legacy Dialogs > Scatter/Dot > Simple Scatter.



c. One of the students with a high IQ (number 17) appears to be an outlier. With a sample size of only 20, this can affect our normality assumption. The constant variance assumption could also be compromised. We can visually check for constant variance using a Residual Plot and test for normality using a Probability Plot (or Q-Q plot). To get a residual plot, simply create a Scatterplot using the Residuals as the y-variable and the Fitted (Predicted) Values as the x-variable. (Remember, these should have been stored/saved when you first performed the regression per the instructions above. If not, re-run the regression, click store/save, and check the boxes for unstandardized residuals and fits (predicted) values.)

Now create a probability plot (Q-Q plot if using SPSS) of the residuals. We are provided the results of a test of the null hypothesis that the data follows a normal distribution. Based on these two graphs and what you have learned about hypothesis testing, what interpretations do you come to regarding the assumptions of constant variance and normality?




Minitab Users: For the probability plot, go to Graph > Probability Plot > Single and select Residuals.

SPSS Users: For the Q-Q plot with normality test, go to Analyze > Descriptive Statistics > Explore, enter Unstandardized Residuals in the Dependent List, click Plots, and select the box for Normality plots with tests.
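The quantities behind the two plots in part c can be sketched in Python as follows. This is a rough illustration with the standard library only. The plotting positions (i - 0.5) / n are one common convention and are an assumption here; SPSS and Minitab may use a different formula, so their plots can differ slightly.

```python
from statistics import NormalDist, mean

iq = [95, 100, 100, 102, 103, 105, 106, 106, 106, 109,
      110, 110, 111, 112, 112, 114, 114, 115, 117, 118]
math_score = [33, 31, 35, 38, 41, 37, 37, 39, 43, 40,
              41, 44, 40, 45, 48, 45, 31, 47, 43, 48]

# Least-squares fit of math score on IQ.
x_bar, y_bar = mean(iq), mean(math_score)
sxx = sum((x - x_bar) ** 2 for x in iq)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(iq, math_score))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Fitted (predicted) values and unstandardized residuals.
# A residual plot puts `fitted` on the x-axis and `residuals` on the y-axis.
fitted = [intercept + slope * x for x in iq]
residuals = [y - f for y, f in zip(math_score, fitted)]

# Index of the observation with the largest absolute residual.
worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
print(f"largest |residual|: student {worst + 1}, residual {residuals[worst]:.2f}")

# Q-Q pairs: sorted residuals against normal quantiles with mean 0 and
# standard deviation equal to the residual standard error.
n = len(residuals)
s = (sum(r ** 2 for r in residuals) / (n - 2)) ** 0.5
normal = NormalDist(0, s)
theoretical = [normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
qq_pairs = list(zip(theoretical, sorted(residuals)))
```

If the residuals are roughly normal, the points in qq_pairs fall close to a straight line; a point far below the line at the low end corresponds to a large negative residual.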


d. The least squares regression line for predicting math score from IQ is given in the above output. What is the fitted regression line (i.e. regression equation)?



e. What do the Fitted (predicted) values and Residuals represent?



f. Based on the output, what is the test of the slope for this regression equation? That is, provide the null and alternative hypotheses, the test statistic, p-value of the test, and state your decision and conclusion.
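The test of the slope in part f can be sketched by hand as below. The hypotheses are H0: slope = 0 versus Ha: slope != 0. The 5% significance level is an assumption (the problem does not state one), and the critical value 2.101 is the two-sided 5% value for 18 degrees of freedom from a standard t table. The p-value itself should be read from the software output.

```python
import math

iq = [95, 100, 100, 102, 103, 105, 106, 106, 106, 109,
      110, 110, 111, 112, 112, 114, 114, 115, 117, 118]
math_score = [33, 31, 35, 38, 41, 37, 37, 39, 43, 40,
              41, 44, 40, 45, 48, 45, 31, 47, 43, 48]

n = len(iq)
x_bar = sum(iq) / n
y_bar = sum(math_score) / n
sxx = sum((x - x_bar) ** 2 for x in iq)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(iq, math_score))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual standard error: sqrt(SSE / (n - 2)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(iq, math_score))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic for H0: slope = 0.
se_b1 = s / math.sqrt(sxx)
t_stat = b1 / se_b1
df = n - 2

T_CRIT = 2.101  # assumed 5% two-sided critical value for t with 18 df
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f} on {df} df; reject H0 at 5%: {reject_h0}")
```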




2. Although outliers should never be deleted without a reason, there are several reasons why it may be legitimate to conduct an analysis without them. Delete the IQ data point for row 17 and re-calculate the regression line for the remainder of the data. You should obtain the following output:



SPSS:
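The following is not the SPSS output but a minimal Python sketch of the same refit, comparing the least-squares line with and without row 17. The function name least_squares is illustrative only.

```python
iq = [95, 100, 100, 102, 103, 105, 106, 106, 106, 109,
      110, 110, 111, 112, 112, 114, 114, 115, 117, 118]
math_score = [33, 31, 35, 38, 41, 37, 37, 39, 43, 40,
              41, 44, 40, 45, 48, 45, 31, 47, 43, 48]


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Fit on all 20 students.
b0_all, b1_all = least_squares(iq, math_score)

# Drop student 17 (list index 16) and refit on the remaining 19.
iq_19 = iq[:16] + iq[17:]
math_19 = math_score[:16] + math_score[17:]
b0_19, b1_19 = least_squares(iq_19, math_19)

print(f"all 20 students:    y = {b0_all:.3f} + {b1_all:.3f} * IQ")
print(f"without student 17: y = {b0_19:.3f} + {b1_19:.3f} * IQ")
```

Student 17 has a high IQ and a low math score, so removing that row steepens the fitted slope.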

1.) Directions: Read the following problem. In your drop box submission, please include the following information:
a. Determine if the problem is either a test of hypothesis, a confidence interval or something else and specify the 'key words' found in the problem that demonstrate your choice.
b. Determine the procedure name and parameters involved for each problem (use the Stat 200 Formulas and Techniques Summary document.) Specify the 'key words' found in the problem that lead you to this choice.
c. If the problem is a hypothesis test, indicate whether it has a lower-tail, upper-tail, or two-tail alternative hypothesis, as well as the test statistic formula for the test. If the problem is a confidence interval, then indicate the formula used for the margin of error.
You DO NOT need to do the ACTUAL test of hypothesis, confidence interval, etc. in order to submit in the dropbox, just answer parts a, b, and c for the problem.

A large organization is being investigated to determine if its recruitment is sex-biased. Tables 1 and 2, respectively, show the classification of applicants for sales and for secretarial positions according to gender and result of interview. Table 3 is an aggregation of the corresponding entries of Table 1 and Table 2.
a. According to the data in Tables 1 and 2, does there seem to be an association between gender and hiring status at the 5% significance level?
b. Do the data in Table 3 indicate the same result?

Table 1 Sales Positions

         Offered  Denied  Total
Male     25       50      75
Female   75       150     225
Total    100      200     300


Table 2 Secretarial Positions

         Offered  Denied  Total
Male     150      50      200
Female   75       25      100
Total    225      75      300


Table 3 Secretarial and Sales Positions

         Offered  Denied  Total
Male     175      100     275
Female   150      175     325
Total    325      275     600
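One way to check an answer to this problem is a chi-square test of independence on each 2x2 table. The sketch below computes the Pearson chi-square statistic without a continuity correction; that choice, the function name, and the use of the 5% critical value 3.841 for 1 degree of freedom (from a standard chi-square table) are assumptions for illustration, not part of the required submission.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of gender and hiring status.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: Male, Female. Columns: Offered, Denied (totals are left out).
tables = {
    "Table 1 (sales)": [[25, 50], [75, 150]],
    "Table 2 (secretarial)": [[150, 50], [75, 25]],
    "Table 3 (combined)": [[175, 100], [150, 175]],
}

CHI2_CRIT = 3.841  # 5% critical value for chi-square with 1 df

results = {name: chi_square_statistic(t) for name, t in tables.items()}
for name, stat in results.items():
    print(f"{name}: chi-square = {stat:.3f}, exceeds 3.841: {stat > CHI2_CRIT}")
```

In Tables 1 and 2 the offer rate is identical for men and women, so the statistic is zero, while the aggregated Table 3 gives a large statistic. Aggregating tables with different overall offer rates can create an association that neither table shows on its own.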





2.) Directions: Read the following problem. In your drop box submission, please include the following information:
a. Determine if the problem is either a test of hypothesis, a confidence interval or something else and specify the 'key words' found in the problem that demonstrate your choice.
b. Determine the procedure name and parameters involved for each problem (use the Stat 200 Formulas and Techniques Summary document.) Specify the 'key words' found in the problem that lead you to this choice.
c. If the problem is a hypothesis test, indicate whether it has a lower-tail, upper-tail, or two-tail alternative hypothesis, as well as the test statistic formula for the test. If the problem is a confidence interval, then indicate the formula used for the margin of error.
You DO NOT need to do the ACTUAL test of hypothesis, confidence interval, etc. in order to submit in the dropbox, just answer parts a, b, and c for the problem.


An instructor at Arizona State University asked a random sample of eight students to record their study times in an elementary statistics course. She then made a table of total hours studied over 2 weeks and test scores at the end of the 2 weeks. Here are the data:

Study time 10 15 12 20 8 16 14 22
Test Scores 92 75 86 76 92 80 84 81

Assuming that a linear association exists, how are the data correlated? Determine the exact linear relationship between study time and test scores and use it to estimate the predicted test score for a student who studies 12 hours.
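A minimal sketch of the correlation and regression calculation for these eight data pairs, using only the Python standard library. The variable names are illustrative; the formulas are the usual Pearson correlation and least-squares line.

```python
import math

study_time = [10, 15, 12, 20, 8, 16, 14, 22]     # hours over 2 weeks
test_score = [92, 75, 86, 76, 92, 80, 84, 81]

n = len(study_time)
x_bar = sum(study_time) / n
y_bar = sum(test_score) / n

sxx = sum((x - x_bar) ** 2 for x in study_time)
syy = sum((y - y_bar) ** 2 for y in test_score)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(study_time, test_score))

# Pearson correlation coefficient: its sign gives the direction of association.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line: test score = intercept + slope * study time.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Predicted test score for a student who studies 12 hours.
predicted_12 = intercept + slope * 12

print(f"r = {r:.3f}")
print(f"test score = {intercept:.2f} + ({slope:.3f}) * study time")
print(f"predicted score at 12 hours = {predicted_12:.1f}")
```

A negative r means that, in this sample, longer study times go with lower test scores.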





3.) Directions: Read the following problem. In your drop box submission, please include the following information:
a. Determine if the problem is either a test of hypothesis, a confidence interval or something else and specify the 'key words' found in the problem that demonstrate your choice.
b. Determine the procedure name and parameters involved for each problem (use the Stat 200 Formulas and Techniques Summary document.) Specify the 'key words' found in the problem that lead you to this choice.
c. If the problem is a hypothesis test, indicate whether it has a lower-tail, upper-tail, or two-tail alternative hypothesis, as well as the test statistic formula for the test. If the problem is a confidence interval, then indicate the formula used for the margin of error.
You DO NOT need to do the ACTUAL test of hypothesis, confidence interval, etc. in order to submit in the dropbox, just answer parts a, b, and c for the problem.

A company sells a strong commercial floor cleaner and claims that the flashpoint (the lowest temperature at which the vapor of a combustible liquid can be ignited in air) exceeds 200ºF. A random sample of cleaner was obtained and the flashpoint of each was measured. The sample mean was 198.2ºF and the sample standard deviation was 10ºF. At the 1% significance level, is there sufficient evidence to support the company's claim?


4.) Directions: Read the following problem. In your drop box submission, please include the following information:
a. Determine if the problem is either a test of hypothesis, a confidence interval or something else and specify the 'key words' found in the problem that demonstrate your choice.
b. Determine the procedure name and parameters involved for each problem (use the Stat 200 Formulas and Techniques Summary document.) Specify the 'key words' found in the problem that lead you to this choice.
c. If the problem is a hypothesis test, indicate whether it has a lower-tail, upper-tail, or two-tail alternative hypothesis, as well as the test statistic formula for the test. If the problem is a confidence interval, then indicate the formula used for the margin of error.
You DO NOT need to do the ACTUAL test of hypothesis, confidence interval, etc. in order to submit in the dropbox, just answer parts a, b, and c for the problem.

A survey of 500 households found that the family room was the primary television location in 415 homes. Is there evidence that the true population proportion of households having the family room as the primary television location is actually less than 0.85 at the 5% significance level?
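Although the submission only asks for the procedure to be identified, the wording "less than 0.85" points to a lower-tail one-proportion z-test, which can be sketched as follows. This sketch assumes the normal approximation is adequate (both n * p0 and n * (1 - p0) are well above 10 here); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 500            # households surveyed
successes = 415    # family room is the primary television location
p0 = 0.85          # hypothesized proportion under H0
alpha = 0.05       # significance level from the problem

# H0: p = 0.85 versus Ha: p < 0.85 (lower-tail test).
p_hat = successes / n
standard_error = math.sqrt(p0 * (1 - p0) / n)   # uses p0, not p_hat
z = (p_hat - p0) / standard_error

# Lower-tail p-value from the standard normal distribution.
p_value = NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
print(f"reject H0 at the 5% level: {reject_h0}")
```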


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd