Reference no: EM132411769
OPRE6301 - Statistics and Data Analysis Assignment - The University of Texas at Dallas, USA
1. Use the following data to compute the unweighted mean suicide rate and the mean suicide rate weighted by population size for the seven countries in Asia. You should search online for the most recent population data for those countries.
| Country | Suicides per 100,000 people | Population size |
|---|---|---|
| South Korea | 28.9 | |
| Sri Lanka | 28.8 | |
| Nepal | 24.9 | |
| Kazakhstan | 23.8 | |
| India | 21.1 | |
| Japan | 18.5 | |
| Bhutan | 17.8 | |
Are the unweighted and weighted means similar? Why or why not?
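As a minimal sketch of the two calculations in question 1, the Python below uses the rates from the table. The function names are illustrative, and the population figures are deliberately not filled in because the question asks you to look them up; `weighted_mean` expects them as a dictionary keyed by country.

```python
# Suicide rates per 100,000 people, copied from the table in question 1.
rates = {
    "South Korea": 28.9,
    "Sri Lanka": 28.8,
    "Nepal": 24.9,
    "Kazakhstan": 23.8,
    "India": 21.1,
    "Japan": 18.5,
    "Bhutan": 17.8,
}


def unweighted_mean(rates):
    """Simple average: every country counts equally."""
    return sum(rates.values()) / len(rates)


def weighted_mean(rates, populations):
    """Population-weighted average.

    `populations` maps country name -> population size. These values are
    NOT given in the assignment and must be looked up by the reader.
    """
    total_population = sum(populations[c] for c in rates)
    return sum(rates[c] * populations[c] for c in rates) / total_population


print(round(unweighted_mean(rates), 1))  # 23.4
```

With equal weights the weighted mean reduces to the unweighted one, so any gap between the two comes entirely from differences in population size.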
2. Create 1,000 random numbers from a normal distribution with a mean of 45 and a standard deviation of 10.5. Assuming these data are the ages of 1,000 new patients at the UT Southeastern Medical Center during the year 2015, use the data to answer the following questions:
(1) Find the percentage of patients who are younger than 30.
(2) Find the age that bounds the oldest 15 percent of the patient distribution.
(3) Suppose Donald was 75 years old. What is the percentile of his age in the entire distribution?
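One way to approach question 2 is sketched below, using only the Python standard library. The seed value is an arbitrary assumption made for repeatability, so the empirical answers will differ with another seed or in Excel. The theoretical values from `statistics.NormalDist` are included as a cross-check.

```python
import random
from statistics import NormalDist, quantiles

MEAN, SD, N = 45, 10.5, 1000

# Fixed seed (an arbitrary choice) so the simulated sample is repeatable.
rng = random.Random(2015)
ages = [rng.gauss(MEAN, SD) for _ in range(N)]

# (1) Percentage of simulated patients younger than 30.
pct_under_30 = 100 * sum(a < 30 for a in ages) / N

# (2) Age bounding the oldest 15 percent, i.e. the 85th percentile.
#     quantiles(..., n=100) returns 99 cut points; index 84 is the 85th.
cutoff_oldest_15 = quantiles(ages, n=100)[84]

# (3) Percentile of a 75-year-old within the simulated sample.
donald_percentile = 100 * sum(a <= 75 for a in ages) / N

# Theoretical counterparts from the normal distribution itself.
dist = NormalDist(MEAN, SD)
theory_under_30 = 100 * dist.cdf(30)   # about 7.7 percent
theory_cutoff = dist.inv_cdf(0.85)     # about 55.9 years
theory_donald = 100 * dist.cdf(75)     # about 99.8th percentile

print(pct_under_30, cutoff_oldest_15, donald_percentile)
```

The simulated results should land near the theoretical ones but will not match them exactly, since 1,000 draws are only a sample from the distribution.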
3. The state of Texas is considering a mass immunization campaign using newly developed vaccines against West Nile virus but is concerned about potential side effects from vaccination. On the basis of past experience, 15% of those who were vaccinated with a new vaccine had side effects. If 10 vaccinated people are chosen at random, what is the probability that more than five will have side effects?
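Question 3 can be read as a binomial problem with n = 10 trials and success probability p = 0.15, where "more than five" means X = 6, 7, 8, 9 or 10. The sketch below sums those terms directly; the helper name `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.15

# "More than five" excludes 5 itself, so sum k = 6 .. 10.
p_more_than_five = sum(binom_pmf(k, n, p) for k in range(6, n + 1))

print(f"{p_more_than_five:.5f}")  # 0.00138
```

In other words, under these assumptions the event is rare: roughly 1.4 samples in 1,000 would show more than five people with side effects.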
4. Using the "pain_medication.xls" data posted to eLearning, test whether the average time for the medication to take effect ("time") is equal between male ("gender"=1) and female ("gender"=0) patients at the 99% confidence level (significance level α = 0.01). Interpret your test results using appropriate statistical terms, along with their practical implications.
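A two-sample t-test is one natural choice for question 4. The sketch below computes Welch's version (which does not assume equal variances) with the standard library; whether to pool variances instead is a judgment call that depends on the data. The sample lists are made-up illustration values, not taken from "pain_medication.xls", and `welch_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of group a's mean
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


# Made-up values for illustration only (NOT from the assignment file).
male_time = [1, 2, 3, 4, 5]
female_time = [2, 3, 4, 5, 6]

t, df = welch_t(male_time, female_time)
print(t, df)  # -1.0 8.0
```

To finish the test, compare |t| with the two-tailed critical value of the t distribution for the computed degrees of freedom at α = 0.01, or obtain the p-value from a statistics package.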
5. Use the "pain_medication.xls" data to create a contingency table using "gender" (1: male, 0: female) as the independent variable and general health condition "health" (1: poor, 2: fair, 3: good) as the dependent variable, and then perform a chi-square test of whether general health conditions differ between male and female respondents. Interpret the results.
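The chi-square test of independence in question 5 compares each observed cell count with the count expected if gender and health were unrelated. A minimal sketch follows; `chi_square` is an illustrative helper, and the 2x2 table in the usage line is made up purely to show the arithmetic.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. one row per gender
    and one column per health category.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Made-up counts for illustration only.
stat, df = chi_square([[10, 20], [20, 10]])
print(round(stat, 3), df)  # 6.667 1
```

For the 2 x 3 gender-by-health table the degrees of freedom are (2 - 1)(3 - 1) = 2, so the statistic would be compared with the chi-square critical value for 2 degrees of freedom (for example 5.991 at α = 0.05).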
6. Use the "pain_medication.xls" data to assess whether the time for the medication to take effect ("time") is affected by the age of the patients. Which type of analysis would be more appropriate? Perform the analysis you choose and interpret all major statistical results along with their practical implications.
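Because question 6 asks whether one numeric variable is affected by another, one reasonable reading is a simple linear regression with "time" as the dependent variable and age as the independent variable. The sketch below computes the slope, intercept and R-squared by hand; the function name and the sample values are assumptions for illustration, not data from the file.

```python
from statistics import mean


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy**2 / (sxx * syy)  # share of variance in y explained by x
    return slope, intercept, r_squared


# Made-up, perfectly linear values used only to check the arithmetic.
slope, intercept, r2 = simple_ols([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r2)  # 2.0 1.0 1.0
```

In an interpretation, the slope gives the estimated change in time per additional year of age, and R-squared indicates how much of the variation in time the age variable accounts for.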
7. Using the "customer_database.xls" data posted to eLearning, test whether the years of education for respondents ("edu") and those for their spouses ("spousedu") are different at the 95% confidence level (significance level α = 0.05). Interpret your test results using appropriate statistical terms, along with their practical implications.
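Since each respondent in question 7 is matched with his or her own spouse, a paired t-test on the within-couple differences is a plausible choice. The following sketch shows the calculation; `paired_t` is a hypothetical helper and the values are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Made-up years of education for three couples (NOT from the data file).
edu = [16, 12, 14]
spousedu = [15, 10, 11]

t, df = paired_t(edu, spousedu)
print(round(t, 2), df)  # 3.46 2
```

The statistic is then compared with the two-tailed critical t value for n - 1 degrees of freedom at α = 0.05.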
8. Using the "customer_database.xls" data posted to eLearning, create a frequency distribution table for the Job Satisfaction variable ("jobsat") including the columns for frequency, percentage, cumulative frequency and cumulative percentage. Refer to the "Data Dictionary" sheet for the variable description. Use the table to identify the mode and the median for the variable.
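The table requested in question 8 can be sketched as follows. The code assumes "jobsat" is coded as ordered integers (check the "Data Dictionary" sheet for the actual coding); the function names and sample values are illustrative only.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (category, frequency, percent, cumulative frequency, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows, cumulative = [], 0
    for category in sorted(counts):
        cumulative += counts[category]
        rows.append((category, counts[category], 100 * counts[category] / n,
                     cumulative, 100 * cumulative / n))
    return rows


def mode_from_table(rows):
    """Category with the highest frequency."""
    return max(rows, key=lambda row: row[1])[0]


def median_from_table(rows):
    """First category whose cumulative percentage reaches 50."""
    return next(row[0] for row in rows if row[4] >= 50)


# Made-up ordinal responses for illustration only.
rows = frequency_table([1, 2, 2, 3, 3, 3, 4])
print(mode_from_table(rows), median_from_table(rows))  # 3 3
```

Reading the median off the cumulative percentage column works for ordinal variables because the categories have a meaningful order even though the gaps between them do not.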
9. Using the "customer_database.xls" data posted to eLearning, compare the age variability between male and female customers. Refer to the "Data Dictionary" sheet for the variable description. Compute the appropriate statistics in Excel and interpret the results.
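For question 9, variability is usually summarized with the sample variance, the standard deviation and, when group means differ, the coefficient of variation. The sketch below mirrors what Excel's `VAR.S` and `STDEV.S` functions compute; the helper names and the numbers are assumptions for illustration.

```python
from statistics import mean, stdev, variance


def spread(values):
    """Sample variance, standard deviation and coefficient of variation."""
    sd = stdev(values)
    return {"variance": variance(values), "sd": sd, "cv": sd / mean(values)}


def variance_ratio(a, b):
    """Larger sample variance divided by the smaller (an F-type ratio)."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


# Made-up ages for illustration only (NOT from the data file).
male_age = [2, 4, 6]
female_age = [1, 4, 7]

print(spread(male_age))                      # variance 4, sd 2.0, cv 0.5
print(variance_ratio(male_age, female_age))  # 2.25
```

A ratio close to 1 suggests similar variability in the two groups; a formal comparison would test the ratio against the F distribution.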
10. Search online for a raw dataset containing more than one variable. Use the data to consider the causal structure of any two variables in the dataset and define a hypothesis for the relationship between them. Specify which variable serves as the dependent variable (DV) and which as the independent variable (IV) in your hypothesis. Also illustrate the relationship between these two variables visually using an appropriate chart type, and briefly describe your findings from the output.
Attachment: Statistics and Data Analysis Assignment Files.rar