Reference no: EM132291812
Assignment -
You should include relevant statistical (R and Stata) code showing how you answered the question.
Please provide output for every question. Do not copy and paste raw R output; instead, build proper tables and present them neatly. Please use web resources to answer the questions, and state each source by providing a web link or a proper reference.
Provide the code for each graph alongside the graph itself. If you have used code to define value labels and variable labels, do not include it in the main text of the results; provide it in the appendix. If you are presenting data summaries (e.g. cross tabulation), provide them with variable labels and value labels as shown in the output.
Question 1 - Answer all parts of Question 1 using Stata.
Please see the attached file, which describes the Magpie Recovery dataset (Source: EURING). Use the data set titled "Magpie.dta" to answer the questions below.
a) Read the data into Stata. Perform appropriate checks on the data and produce a summary detailing your findings. Include in the summary appropriate figures (graphs) to assess the distribution of the continuous variables (Recovery time in days, recovery latitude, and recovery longitude).
Having assessed the data, detail the changes you made to 'clean the dataset' and apply required corrections.
b) Compute the following summary statistics for time to recovery in days by survival status (conditionaliveordead) and recovery country: count, mean, median, standard deviation, minimum and maximum. Show the code used for these computations and also the data set created. Make clear any assumptions you have made in undertaking this calculation. Present the code with appropriate comments and in a clearly legible font size.
c) Graph how the mean time to recovery in days varies by survival status for each country. Include 95% confidence intervals around the means in this graph. Briefly interpret your results. Add appropriate titles, labels, legend and colours to your graph. Produce a single graph.
d) Produce a single graph which compares boxplots of recovery latitude and recovery longitude by survival status (use only Dead and Alive) and recovery country (use only Finland and Sweden) for the recovery year 1979. Use graphical options in Stata to ensure that each boxplot is displayed in a different colour. Do not use the default colours. Present your code. Present the plots with an appropriate title, legend and font size.
e) For each recovery country, produce a single scatter plot comparing recovery latitude and longitude by survival status. Add proper titles and legends. Using graphical options, change the scatter plot marker symbols to diamonds, circles and squares, and use a medium font size.
f) Do causes of death vary between months of the year? Are there national or regional differences in causes of death? Show your chosen method of graphical display to answer this question. Use proper titles, labels and legends if any.
g) Are there differences in recovery circumstances (cause of death) between younger and older birds (i.e. birds recovered within one year of ringing and those recovered after more than a year)? Show the code and method of analysis, and display any summary statistics or graphs used in answering this question. Present them neatly, as if ready to be published in the journal Nature.
Question 2 - Answer all parts of Question 2 using R programming.
Characteristics of the data set -
These records have been collected in the same garden in the East Midlands over 25 years. A few birds, when first trapped in the garden, already have a ring. These include those with foreign rings and a few with British rings. Unringed birds are ringed on their first capture in the garden. The earliest capture in the garden of any individual (should that be relevant to any analysis) can be found by examination of dates of captures.
Catching effort has not been the same from year to year.
Some individual birds have been trapped repeatedly and some recapture information may need to be discarded for some analyses.
Wing length is very difficult to measure without error, so recorded wing lengths for any individual may be expected to vary by a millimetre or so about the mean. In addition, wing lengths gradually reduce from late summer to the next summer through abrasion, meaning a Blackbird wing of, say, 120 mm in autumn may be only 118 mm before the moult in the following summer. Note that where feathers have clearly lost their tips through severe abrasion or, perhaps, a close encounter with a predator, the wing length will not be measured.
Feathers on juveniles are also shorter than those on adults but, after the first full moult, when the bird moves from First Year to Adult plumage, the wing length is unlikely to increase further in subsequent moults. In addition to all these possible sources of variation, it is always possible that a wing length has been measured correctly but recorded inaccurately. Examples of errors include transposition of digits (so that 114 becomes 141) or misreading the position on the rule by 10 mm. In the summer moult, new feathers are grown which may be slightly longer than the feathers of the previous year. Birds which are moulting, with feathers not fully grown, will not have the wing measured.
Before doing any analysis using wing lengths it could be useful to look at individual birds which have been trapped more than once and remove any wing lengths which are clearly incorrect. It might also be useful to assign an average wing length to each individual which has been caught more than once in any age class.
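The per-bird check described above can be sketched as follows. This is only an illustration of the logic, written in Python; the assignment itself asks for R. The records, the name `clean_wings` and the 5 mm tolerance are all hypothetical assumptions, not part of the data set or brief. Each measurement is compared with the median for the same bird and age class, values far from that median are flagged, and the remaining values are averaged.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (ring number, age class, wing length in mm).
# The 142 mimics a digit transposition of 124.
records = [
    ("AB123", "A", 124), ("AB123", "A", 125), ("AB123", "A", 142),
    ("CD456", "F", 125), ("CD456", "F", 126),
]

TOLERANCE_MM = 5  # assumed cut-off; choose a value that suits the real data


def clean_wings(records, tol=TOLERANCE_MM):
    """Flag implausible wing lengths and average the rest per bird and age class."""
    by_bird = defaultdict(list)
    for ring, age, wing in records:
        by_bird[(ring, age)].append(wing)

    result = {}
    for key, wings in by_bird.items():
        centre = median(wings)
        kept = [w for w in wings if abs(w - centre) <= tol]
        flagged = [w for w in wings if abs(w - centre) > tol]
        # If every value was flagged, no average can be assigned.
        result[key] = {
            "mean_wing": mean(kept) if kept else None,
            "flagged": flagged,
        }
    return result


result = clean_wings(records)
for key, summary in result.items():
    print(key, summary)
```

A median is used as the reference point because a single mis-recorded value shifts it far less than it would shift a mean. With only two captures that disagree strongly, both values are flagged and would need checking by hand.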
For any analysis which involves time of day, note that times in summer are BST so will need one hour subtracting to bring them into line with the winter GMT.
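A minimal sketch of this time correction is given below, again in Python for illustration only (the assignment asks for R). It assumes that BST runs from the last Sunday in March to the last Sunday in October, which is the current rule and may not match every year in a 25-year record. It also ignores the exact changeover hour. The function names are hypothetical.

```python
from datetime import date, timedelta


def last_sunday(year, month):
    """Return the date of the last Sunday in the given month."""
    # Find the last day of the month by stepping back from the next month.
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    last_day = first_of_next - timedelta(days=1)
    # weekday(): Monday is 0 and Sunday is 6.
    offset = (last_day.weekday() - 6) % 7
    return last_day - timedelta(days=offset)


def to_gmt_hour(year, month, day, hour):
    """Convert a recorded local hour to GMT under the assumed BST rule."""
    d = date(year, month, day)
    in_bst = last_sunday(year, 3) <= d < last_sunday(year, 10)
    # The modulo keeps the result in 0-23; an hour of 0 in BST would
    # really belong to the previous day, which is not tracked here.
    return (hour - 1) % 24 if in_bst else hour


print(to_gmt_hour(2019, 7, 15, 9))  # summer capture: one hour subtracted
print(to_gmt_hour(2019, 1, 15, 9))  # winter capture: unchanged
```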
Note that in some cases wing length or weight may be missing, and age or sex may not be recorded either. In some cases, age and sex can be established from other captures of the same bird. Clearly, sex should be the same throughout the life of the bird, but age will change with time.
All these various points can give considerable opportunities for practice in data preparation.
Data Set information
| Field name | Description | Code(s) and units |
|---|---|---|
| Scheme | The ringing scheme | GBT are British rings; all others are continental. |
| Ring number | Alphanumeric to identify individuals | |
| Age | Three age classes are usually distinguishable although sometimes the class cannot be determined and is recorded as Unknown. | J - Juvenile; F - First year; A - Adult; U - Unknown |
| Sex | Sex can be determined for almost all individuals, others recorded as Unknown. | M - Male; F - Female; U - Unknown |
| Wing | Length from carpal joint to wing tip. | Millimetres to nearest mm |
| Weight | Total weight | Grams to nearest 1 g |
| Day | Numeric 1-31 | |
| Month | Numeric 1-12 | |
| Year | Numeric, 4 digit | |
| Time | Nearest hour. Local time. | Numeric |
Answer all parts of Question 2 using R. Use the data set titled "blackbird_new.csv" to answer the questions below.
a) Go through each variable in the provided data and check for any inconsistencies (e.g. incorrect entries or misplaced entries) in the data set. Detail clearly the inconsistencies found and the modifications made. Present the details of the methods used for correcting these inconsistencies. Plot histograms of weight and wing length by Age. Show the cuts used for creating groups of wing length. Use Google to find out how to plot a 3-dimensional histogram in R. Now use the 3D histogram to plot the frequencies of the Age and Sex cross tabulation. Briefly (in a couple of sentences) explain what you think of this plot.
b) Produce a scatter plot of weight against wing length for winter birds (November to February inclusive). Add the line of best fit to the scatter plot. Add proper titles, axis titles that state the units used, and legends. Change the appearance of the markers from circles to diamonds.
c) Give summaries of the wing length for the following four age-sex classes: adult males, adult females, first year males and first year females. Summaries could include either medians and quartiles or means and standard deviations, with appropriate graphical summaries. Add error bars too if means are used.
d) Using an appropriate graphical display, show the distributions of wing length by Age. Look for individuals who have been measured both as Juveniles or First year birds and as Adults. Using appropriate summary statistics, detail how much longer adult wings are than Juvenile/First Year wings.
e) By looking at the average wing lengths throughout the year (perhaps using day number of the year to measure time of year or else using time in half-month intervals), suggest when the longer winged continental birds arrive and depart. Use appropriate graphical display.
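The two time-of-year measures mentioned in part e) can be sketched as below. This is illustrative Python only (the assignment asks for R). The helper names, the sample rows and the choice to split each month at day 15 are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def day_of_year(year, month, day):
    """Day number within the year, where 1 January is day 1."""
    return date(year, month, day).timetuple().tm_yday


def half_month(month, day):
    """Half-month index from 1 to 24; days 1-15 form the first half (assumed split)."""
    return (month - 1) * 2 + (1 if day <= 15 else 2)


def mean_wing_by_half_month(rows):
    """Average wing length (mm) for each half-month interval."""
    groups = defaultdict(list)
    for month, day, wing in rows:
        groups[half_month(month, day)].append(wing)
    return {k: mean(v) for k, v in sorted(groups.items())}


# Hypothetical rows: (month, day, wing length in mm).
rows = [(1, 10, 130), (1, 12, 132), (1, 20, 129), (6, 16, 126)]
averages = mean_wing_by_half_month(rows)
print(averages)
```

Plotting such averages against the half-month index would show whether mean wing length rises and falls during the year, which is the pattern the question asks about.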
Attachment:- Assignment Files.rar