Reference no: EM132756131
BUS105 Computing Assignment - Math Questions
Overview - You need to submit a Word file with the answers to 10 questions. The first 8 are about the datasets; the last question is a paraphrasing task (refer to pages 3 to 6).
You will use your dataset and the automatic dataset summarizer to get the descriptive statistics used in questions 1 to 5 and the inferential statistics used in questions 6 to 8. To check that you have obtained your dataset correctly, confirm that both p-values are correct when you investigate both categorical variables (questions 6 to 8).
The word count can be less than 1500 words if your answers demonstrate that you have understood the material.
Summary of the datasets (questions 1 to 8, given on pages 3 to 6, are about the datasets)
Dataset 1
University XYZ gives out a survey to students in a statistics course
The survey questions were
Do you think the course is useful and do you understand why?
How many videos have you watched?
The questions and the students' answers are a dataset
Dataset 2
University XYZ gives out a survey to students in a statistics course
The survey questions were
What style of YouTube video do you prefer, chatty or direct?
Are you scared of maths?
How many videos did you watch?
The questions and the students' answers are a dataset
Dataset 3
Business XYZ is using videos to replace meetings to maintain social distancing
The duration of the video (in seconds) and the engagement score are recorded for many videos
The engagement score is low if people only watch the first part of the video.
Question 1 -
Paste dataset 1 into the dataset summarizer
a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables "course useful?" and "number of videos watched?" using the sample.
b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):
Difference between sample means x̄₁ - x̄₂
Difference between sample proportions p̂₁ - p̂₂
correlation coefficient r
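As an illustration of the first statistic in the list, here is a minimal sketch of a difference between sample means. The numbers are hypothetical and are NOT the assignment data; the variable names are assumptions chosen for this example.

```python
from statistics import mean

# Hypothetical survey answers (NOT the assignment data):
# number of videos watched, grouped by the answer to "course useful?".
videos_useful_yes = [6, 8, 5, 9, 7]
videos_useful_no = [3, 4, 2, 5, 1]

xbar_1 = mean(videos_useful_yes)  # sample mean for the "yes" group
xbar_2 = mean(videos_useful_no)   # sample mean for the "no" group
diff_means = xbar_1 - xbar_2      # difference between sample means

print(f"xbar1 = {xbar_1}, xbar2 = {xbar_2}, xbar1 - xbar2 = {diff_means}")
```

A difference far from zero suggests that, in the sample, the two groups watched different numbers of videos on average.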
Question 2 -
Paste dataset 2 into the dataset summarizer
a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables "Preferred style?" and "Scared of maths?" using the sample.
b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):
Difference between sample means x̄₁ - x̄₂
Difference between sample proportions p̂₁ - p̂₂
correlation coefficient r
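As an illustration of the second statistic in the list, here is a minimal sketch of a difference between sample proportions. The counts are hypothetical and are NOT the assignment data; the variable names are assumptions chosen for this example.

```python
# Hypothetical counts (NOT the assignment data):
# how many students in each style group said they are scared of maths.
scared_chatty, n_chatty = 12, 30
scared_direct, n_direct = 9, 45

p_hat_1 = scared_chatty / n_chatty  # sample proportion, chatty group
p_hat_2 = scared_direct / n_direct  # sample proportion, direct group
diff_props = p_hat_1 - p_hat_2      # difference between sample proportions

print(f"p1 = {p_hat_1:.2f}, p2 = {p_hat_2:.2f}, p1 - p2 = {diff_props:.2f}")
```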
Question 3 -
Paste dataset 3 into the dataset summarizer
a) Paste the descriptive sample statistics and the scatterplot into the Word file. The descriptive statistics let you investigate the relationship between the variables "Duration?" and "Engagement score?" using the sample.
b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics:
Difference between sample means x̄₁ - x̄₂
Difference between sample proportions p̂₁ - p̂₂
correlation coefficient r
c) Predict the engagement score of a video with a duration of 600 seconds.
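One way to make such a prediction is a least-squares line fitted to the sample. The sketch below assumes a straight-line fit is appropriate and uses hypothetical data (NOT the assignment data). The function name predict_engagement is made up for this example, and the dataset summarizer may report the line in a different form.

```python
from math import sqrt

# Hypothetical (duration in seconds, engagement score) pairs, NOT the assignment data.
durations = [120, 300, 480, 720, 900]
engagement = [90, 80, 65, 50, 40]

n = len(durations)
mean_x = sum(durations) / n
mean_y = sum(engagement) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(durations, engagement))
sxx = sum((x - mean_x) ** 2 for x in durations)
syy = sum((y - mean_y) ** 2 for y in engagement)

r = sxy / sqrt(sxx * syy)            # sample correlation coefficient
slope = sxy / sxx                    # least-squares slope
intercept = mean_y - slope * mean_x  # least-squares intercept


def predict_engagement(duration_seconds):
    """Predicted engagement score from the fitted line (hypothetical helper)."""
    return intercept + slope * duration_seconds


print(f"r = {r:.3f}")
print(f"predicted engagement at 600 seconds = {predict_engagement(600):.1f}")
```

With these made-up numbers r is negative, so the fitted line predicts lower engagement for longer videos.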
Question 4 -
Use the output for question 1a
Considering only the people who do not find the course useful, find the z-score of the sample mean, assuming the population mean is µ=5 and the population standard deviation is σ=3.
Considering only the people who do find the course useful, find the z-score of the sample mean, assuming the population mean is µ=5 and the population standard deviation is σ=3.
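The z-score of a sample mean is usually computed as z = (x̄ - µ) / (σ / √n). The sketch below assumes that is the intended formula. It uses µ=5 and σ=3 from the question, but the sample mean and sample size are hypothetical placeholders, NOT values from the assignment output.

```python
from math import sqrt

mu = 5     # assumed population mean (from the question)
sigma = 3  # assumed population standard deviation (from the question)

# Hypothetical group summary (NOT the assignment output).
sample_mean = 6.2
n = 36

standard_error = sigma / sqrt(n)          # standard deviation of the sample mean
z_score = (sample_mean - mu) / standard_error

print(f"standard error = {standard_error:.3f}, z = {z_score:.2f}")
```

Note that the divisor is σ / √n, not σ, because the question asks about the sample mean and not a single observation.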
Question 5 -
Considering only the people who prefer the chatty style of video, find a 90% confidence interval for the proportion of people who are scared of maths.
Considering only the people who prefer the direct style of video, find a 90% confidence interval for the proportion of people who are scared of maths.
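A common large-sample interval for a proportion is p̂ ± z* √(p̂(1 - p̂)/n). The sketch below assumes that formula is the one expected in the course; the counts are hypothetical and are NOT the assignment data.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical counts for one style group (NOT the assignment data).
scared, n = 12, 30
p_hat = scared / n

confidence = 0.90
# Critical value that leaves 5% in each tail of the standard normal.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"z* = {z_star:.3f}")
print(f"90% CI for the proportion: ({lower:.3f}, {upper:.3f})")
```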
Question 6 -
Paste dataset 1 into the dataset summarizer
a) Paste in inferential statistics that measure evidence for the claim that there is a relationship between the variables "course useful?" and "number of videos watched?" if you consider the whole population
b) Make suitable comments about the output in part (a)
c) Go back to the dataset summarizer and scroll down. Paste in the output for question 6c, given below the inferential statistics, and fill in the blank: replace the blank with a number that would make the p-value lower than the p-value in question 6a.
Question 7 -
Paste dataset 2 into the dataset summarizer
a) Paste in computer output that measures evidence for the claim that there is a relationship between the variables "Preferred style?" and "Scared of maths?" if you consider the whole population
Hint: inferential statistics measure evidence for a claim.
b) Make suitable comments about the output in part (a)
c) Go back to the dataset summarizer and scroll down. Paste in the output for question 7c, given below the inferential statistics, and fill in the blanks: replace the blanks with numbers that give a smaller p-value than the p-value in question 7a. Note that the total of the blanks must also agree with the existing total.
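To see how the counts affect the p-value while the total stays fixed, here is a minimal sketch using a pooled two-proportion z-test. This is an assumption: the dataset summarizer may use a different test (for example a chi-square test). The counts are hypothetical, and the function name two_proportion_p_value is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts (NOT the assignment data): 21 students are scared of maths in total.
p_original = two_proportion_p_value(12, 30, 9, 45)

# Same total of 21, but split more unevenly between the two groups.
p_more_extreme = two_proportion_p_value(15, 30, 6, 45)

print(f"original split:     p-value = {p_original:.4f}")
print(f"more uneven split:  p-value = {p_more_extreme:.4f}")
```

In this sketch, moving the two sample proportions further apart while keeping the same total gives a smaller p-value.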
Question 8 -
Paste dataset 3 into the dataset summarizer
a) Paste in computer output that measures evidence for the claim that there is a relationship between the variables "Duration?" and "Engagement score?" if you consider the whole population
Hint: inferential statistics measure evidence for a claim.
b) Make suitable comments about the output in part (a)
c) If another sample had a higher correlation, would you expect the p-value to be lower or higher?
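The link between the size of r and the p-value can be explored with a rough sketch. It uses the Fisher z-transformation with a normal approximation, which is an assumption and probably not the exact test the dataset summarizer runs. The sample size and r values are hypothetical, and the function name approx_p_value_for_r is made up for this example.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_value_for_r(r, n):
    """Approximate two-sided p-value for H0: no correlation (Fisher z, hypothetical helper)."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


n = 20  # hypothetical sample size, kept the same for both samples
for r in (0.3, 0.6):
    print(f"r = {r}: approximate p-value = {approx_p_value_for_r(r, n):.4f}")
```

The comparison only makes sense when both samples have the same size, because the p-value depends on n as well as on r.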
Question 9 -
Briefly discuss the sample report given in the link below; 300 words is enough. In particular, discuss the dataset, how the data was analysed, and the main message of the report.
Question 10 -
Give a quick comment about the discussion of p-values given in the link below; 300 words is enough.
For each case, discuss the relationship between the p-value and the percentile of the p-value (in other words, discuss the distribution of the p-value). Note that in the first case discussed there is a large difference in population means, so there is a strong relationship in the population. How would you describe the distribution of the p-value? In the second case there is almost no difference between the means, so there is almost no relationship. How would you describe the distribution of the p-value?
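The two cases can be explored with a small simulation. This is only a sketch under stated assumptions and does not reproduce the linked discussion: it draws two normal groups with a known standard deviation, applies a two-sample z-test, and repeats this many times. The group size, standard deviation, true differences and function names are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_p_values(true_difference, n_per_group=30, sigma=1.0, n_sims=2000, seed=0):
    """Simulate two-sided z-test p-values for a given true difference in means."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = sigma * sqrt(2 / n_per_group)  # standard error of the difference in means
    p_values = []
    for _ in range(n_sims):
        group_1 = [rng.gauss(true_difference, sigma) for _ in range(n_per_group)]
        group_2 = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        diff = sum(group_1) / n_per_group - sum(group_2) / n_per_group
        p_values.append(2 * (1 - std_normal.cdf(abs(diff / se))))
    return p_values


def share_below(p_values, cutoff):
    """Fraction of simulated p-values below the cutoff (an empirical percentile)."""
    return sum(p < cutoff for p in p_values) / len(p_values)


large_difference = simulate_p_values(true_difference=1.0)
no_difference = simulate_p_values(true_difference=0.0)

for cutoff in (0.05, 0.25, 0.50):
    print(f"cutoff {cutoff:.2f}: "
          f"large difference {share_below(large_difference, cutoff):.2f}, "
          f"no difference {share_below(no_difference, cutoff):.2f}")
```

When there is no true difference, the share of p-values below each cutoff is roughly equal to the cutoff itself, so the p-values are spread evenly between 0 and 1. When the true difference is large, almost all p-values pile up near 0.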