Reference no: EM132858923
GEOG 2700 Geographical Data and Analysis - University of Lethbridge
1) First, a basic review question. These are five measurements.
Standardize the following sample data to z values (subtract mean and divide by s). You will have a new standardized value for each of these five values. Give the five new standardized values.
11.1
15.4
19.4
16.6
10.0
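As a minimal sketch of the standardization described in question 1, the snippet below subtracts the sample mean and divides by the sample standard deviation s. It assumes s uses the n - 1 denominator, as is usual for sample data; the variable names are illustrative only.

```python
import statistics

# The five measurements listed in question 1.
measurements = [11.1, 15.4, 19.4, 16.6, 10.0]

mean = statistics.mean(measurements)
# statistics.stdev is the sample standard deviation (n - 1 denominator).
s = statistics.stdev(measurements)

# z = (x - mean) / s for each measurement.
z_values = [(x - mean) / s for x in measurements]

for x, z in zip(measurements, z_values):
    print(f"{x:5.1f} -> z = {z:+.3f}")
```

A quick check on any standardized sample: the z values should sum to zero and have a sample standard deviation of one.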
Sample statistics: the following figures are the distances (km) from six villages to the source of water that they need to collect and bring back for village use.
9.3
3.9
10.5
5.5
11.1
11.2
Determine the following:
2) mean distance and median distance:
3) sample variance:
4) standard deviation:
5) standard error of the mean:
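One way to check questions 2 to 5 is sketched below using Python's standard library. It uses the six distance values listed above and assumes the sample (n - 1) formulas for variance and standard deviation; the standard error is taken as s divided by the square root of n.

```python
import math
import statistics

# Distances (km) as listed above.
distances_km = [9.3, 3.9, 10.5, 5.5, 11.1, 11.2]
n = len(distances_km)

mean_km = statistics.mean(distances_km)
median_km = statistics.median(distances_km)
variance = statistics.variance(distances_km)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)                 # sample standard deviation
std_error = std_dev / math.sqrt(n)            # standard error of the mean

print(f"n = {n}")
print(f"mean = {mean_km:.3f} km, median = {median_km:.3f} km")
print(f"variance = {variance:.3f}, s = {std_dev:.3f}, SE = {std_error:.3f}")
```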
Binomial distribution probabilities
Suppose we need to irrigate a farm. We are planning for the coming 6 years. In the past 30 years, 9 years were dry enough to require irrigation, so we will use that experience as our baseline probability of needing irrigation in any given year. During the next 6 years, we want to estimate the following things so we can decide whether we should buy equipment.
6) The probability that irrigation will be needed in exactly two of the next 6 years.
Show the value of p, n, and X, and your calculated P(X).
7) The probability that irrigation will be needed in more than two of the next 6 years.
Show the value of p, n, and X, and your calculated P(X).
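The binomial questions above can be sketched as follows, assuming years are independent and p = 9/30 from the 30-year record. The helper name binomial_pmf is illustrative; "more than two" is computed as one minus the probability of zero, one, or two years.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p = 9 / 30  # baseline probability of needing irrigation in a given year
n = 6       # planning horizon in years

# Question 6: exactly two of the next six years.
p_exactly_two = binomial_pmf(2, n, p)

# Question 7: more than two years, i.e. 1 - P(X <= 2).
p_more_than_two = 1 - sum(binomial_pmf(x, n, p) for x in range(0, 3))

print(f"P(X = 2) = {p_exactly_two:.4f}")
print(f"P(X > 2) = {p_more_than_two:.4f}")
```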
8) Choose the best statistical method that would apply. We collect twenty jars of water from the surface (top 0.5 m) at random locations near the middle of a small lake. We send them for chemical analysis of a certain herbicide. We want to know if the concentration of herbicide in the lake exceeds 5.0 parts per million. The twenty quantities, one for each jar, with jar labels, come back to us from the lab in an email.
a) analysis of variance
b) chi-squared analysis
c) regression
d) correlation
e) one-sample t test
f) two-sample t test
g) combinations and permutations
9) Choose the best statistical method that would apply. We collect twenty jars of water from the surface (top 0.5 m) starting at the shore and moving out, with a jar of water collected roughly every 30 m out from the shore, then 60 m, and so on, towards the middle of the lake. We record jar label number and distance from the shore, for each. We send them for chemical analysis of a certain herbicide. We want to know if the concentration of herbicide in the lake exceeds 5.0 parts per million, and whatever else we can conclude. The twenty quantities, one for each jar, with jar labels, come back to us from the lab in an email.
a) analysis of variance
b) chi-squared analysis
c) regression
d) correlation
e) one-sample t test
f) two-sample t test
g) combinations and permutations
10) Choose the best statistical method that would apply. We go to the middle of a lake and collect eight jars of water from the surface (top 0.5 m) and then use a sampler device on a rope to collect eight jars from 10 m deep, from a lake that has no inflow. We then do the same for another lake in the area with no inflow, and the same for two lakes that do have streams flowing in. We label the jars from the four lakes and two depths, and send all the samples for chemical analysis of a certain herbicide. We want to know if the concentration of herbicide in the lake exceeds 5.0 parts per million, and whatever else we can conclude regarding inflow. The data for the list of jars, with labels, come back to us from the lab in an email.
a) analysis of variance
b) chi-squared analysis
c) regression
d) correlation
e) one-sample t test
f) two-sample t test
g) combinations and permutations
11) Choose the best statistical method that would apply. We collect information on the algae status of lakes in a certain county. We record whether the algae levels of each lake are negligible, low, medium, or high. We also have data on whether soil erosion around each lake is negligible, low, moderate, high, very high, or extreme, based on field assessments and observations. We would like to know if there might be some connection between algae growing in a lake and the soil erosion around it.
a) analysis of variance
b) chi-squared analysis
c) regression
d) correlation
e) one-sample t test
f) two-sample t test
g) combinations and permutations
Suppose one of your collaborators tells you they did a small study of weights collected in certain recycling programs, in some geographic regions (which they call groups), and they just want to know if the differences were significant, so they can decide how to plan a larger future study. Help them interpret the results from the SPSS ANOVA table below. The study included three groups, with equal numbers of measurements in each.
12) How many measurements did they have in total (numbers in the dataset)? How many were in each group?
13) Show how the value of F was calculated.
14) Does the F test indicate a significant difference among the group means or not? What does "Sig." mean?
15) Write the null hypothesis. Do we accept or reject it?
We have been provided field records and information on severity of erosion of land, assessed by experts in soil and vegetation cover. They recorded the severity for 350 parcels of land, in three land cover types: grassland, scrubland, or forest. Severity of erosion was rated as Severe, Moderate, Slight, or None, as shown below. The parcels of land were added up to make the table below. These numbers are counts. For example, 26 parcels of grassland had erosion category "severe". We would like to determine whether there was an overall significant difference in severity of erosion among the land cover classes.
You will use a chi-squared test to answer the question "Does erosion seem to differ in severity in different land cover classes?" Show your sums and expected values.
16) What is the null hypothesis?
17) Calculate the row totals, column totals, and expected values for each of the 12 categories in the table. Present the expected values in a table here.
18) Compute the value of the chi-squared statistic.
19) Which critical value in the table below would you compare to your calculated value?
20) Make a conclusion. Do you reject the null hypothesis or not? State your conclusion in plain language.
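The chi-squared steps in questions 17 to 19 can be sketched as below. The counts table is not reproduced in this text, so the numbers in placeholder_counts are HYPOTHETICAL and only show the layout (3 land cover rows by 4 severity columns); substitute the real counts. The function names are illustrative. Each expected value is (row total x column total) / grand total, and the degrees of freedom are (rows - 1) x (columns - 1).

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1), used to pick the critical value."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# HYPOTHETICAL counts, not the assignment's data.
# Rows: grassland, scrubland, forest. Columns: Severe, Moderate, Slight, None.
placeholder_counts = [
    [12, 18, 20, 10],
    [8, 15, 25, 12],
    [5, 10, 30, 35],
]

print("expected:", expected_counts(placeholder_counts))
print("chi-squared:", round(chi_squared(placeholder_counts), 3))
print("df:", degrees_of_freedom(placeholder_counts))
```

For a 3 by 4 table the degrees of freedom come to 6, which identifies the row of the critical-value table to use at the chosen significance level.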
Someone brings you this partially completed ANOVA table. They obtained data on wheat yield from fields that had sandy soil, clay soil, or loam soil. The study had equal numbers of fields in each soil type. You are trying to help them understand it and conduct a test to see if wheat yield differed according to soil type.
21) What is the null hypothesis?
22) Complete the calculations of the Sum of squares column (write in the value of SSE).
23) Calculate the MS for Soil type and the MS for Error. What does MS stand for?
24) Calculate F.
25) The critical F from a table for 2, 24 df is 3.40. Compare it to your F value. Do you accept or reject the null hypothesis?
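The arithmetic behind questions 22 to 25 can be sketched as below. The partially completed table is not reproduced in this text, so the sums of squares used here are HYPOTHETICAL placeholders; only the degrees of freedom (2 and 24) and the critical value 3.40 come from question 25. The function name and dictionary keys are illustrative.

```python
def complete_anova(ss_treatment, ss_total, df_treatment, df_error):
    """Fill in a one-way ANOVA table from the treatment and total sums of squares."""
    ss_error = ss_total - ss_treatment        # SSE = SS total - SS treatment
    ms_treatment = ss_treatment / df_treatment  # mean square = SS / df
    ms_error = ss_error / df_error
    return {
        "SSE": ss_error,
        "MS_treatment": ms_treatment,
        "MS_error": ms_error,
        "F": ms_treatment / ms_error,
    }


# HYPOTHETICAL sums of squares; replace with the values from the table.
table = complete_anova(ss_treatment=100.0, ss_total=340.0,
                       df_treatment=2, df_error=24)

critical_f = 3.40  # given in question 25 for 2 and 24 df
reject_null = table["F"] > critical_f

print(table)
print("reject null hypothesis:", reject_null)
```

With 2 and 24 degrees of freedom there are 3 soil types and 27 fields in total, which is consistent with equal group sizes of 9.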
We are told that the density of a plant (stems per unit area) might vary as a function of distance from the river. Farther from the river, the plants could grow more densely or less densely. We are given data to see if there appears to be a relationship between density and distance in this case. The data are shown, with a table from simple linear regression of these x and y variables. This output is from a different kind of software (JMP), but you need to see if you can read and interpret it in a similar way to SPSS output.
26) Does the density decline with distance, increase with distance, or stay about the same?
27) Calculate r-squared: how much of the variability in density is explained by distance?
28) What is the null hypothesis?
29) Accept or reject the null hypothesis.
30) What is the regression equation that gives a fitted line predicting density from distance?
density =
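For readers who want to see where the slope, intercept, and r-squared in such output come from, here is a minimal least-squares sketch. The density and distance data are not reproduced in this text, so the values below are HYPOTHETICAL placeholders, and fit_line is an illustrative name rather than anything from JMP or SPSS.

```python
def fit_line(x, y):
    """Simple linear regression of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy**2 / (sxx * syy)  # share of variability in y explained by x
    return slope, intercept, r_squared


# HYPOTHETICAL data, not the assignment's values.
distance = [0, 10, 20, 30, 40]
density = [12.0, 10.5, 9.0, 8.5, 6.0]

slope, intercept, r_squared = fit_line(distance, density)
print(f"density = {intercept:.3f} + ({slope:.3f}) * distance")
print(f"r-squared = {r_squared:.3f}")
```

A negative slope means density declines with distance; a positive slope means it increases.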
31) Calculate the correlation coefficient r for these two variables, which were posted on an archery clubhouse wall after eight people participated in a contest.
hours of practice    score in archery contest
15                   290
30                   240
6                    188
25                   294
20                   296
10                   250
30                   288
8                    200
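The correlation coefficient for the archery data can be checked with the sketch below, which uses the definition r = Sxy / sqrt(Sxx * Syy) and the eight pairs listed above. The function name pearson_r is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient from sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hours of practice and contest scores for the eight participants.
hours = [15, 30, 6, 25, 20, 10, 30, 8]
scores = [290, 240, 188, 294, 296, 250, 288, 200]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```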
32) Give an example of a case in which we would consider spatial autocorrelation (short answer).
33) We compiled a class forum of news items regarding data or geographic variables. In a few lines, describe one that you found interesting, with reference to the data that was mentioned or described.
34) How many independent variables are used in a multiple regression, and how many dependent variables?
35) In the graphs below, guess at the correlation.
Which one is most likely to have r = 0.5?
Which one is most likely to have r = -0.5?
Which one is most likely to have r = 0.1?
Which one is most likely to have r = 0.98?
36) Bonus point: which data analysis method did you feel was most likely to be useful or interesting to you after this course?
Attachment: Geographical Data and Analysis.rar