Reference no: EM132367453
Assignment
Answer all questions specified in each problem and include a discussion of how your results answered/addressed the question.
Please do the following problems from the textbook (the R Handbook), as stated.
1. Question 1.1, pg. 23 in the Handbook. This question will require you to make some assumptions. List your assumptions and how you interpreted the question.
Ex. 1.1 Calculate the median profit for the companies in the US and the median profit for the companies in the UK, France, and Germany.
2. Question 1.2,
Ex. 1.2 Find all German companies with negative profit.
3. Question 1.3,
Ex. 1.3 To which business category do most of the Bermuda island companies belong?
4. Question 1.4,
Ex. 1.4 For the 50 companies in the Forbes data set with the highest profits, plot sales against assets (or some suitable transformation of each variable), labeling each point with the appropriate country name which may need to be abbreviated (using abbreviate) to avoid making the plot look too 'messy'.
5. Question 1.5,
Ex. 1.5 Find the average value of sales for the companies in each country in the Forbes data set, and find the number of companies in each country with profits above 5 billion US dollars.
6. Question 2.1,
Ex. 2.1 The data in Table 2.3 are part of a data set collected from a survey of household expenditure and give the expenditure of 20 single men and 20 single women on four commodity groups. The units of expenditure are Hong Kong dollars, and the four commodity groups are:
• housing: housing, including fuel and light;
• food: foodstuffs, including alcohol and tobacco;
• goods: other goods, including clothing, footwear, and durable goods;
• service: services, including transport and vehicles.
The aim of the survey was to investigate how the division of household expenditure between the four commodity groups depends on total expenditure and to find out whether this relationship differs for men and women. Use appropriate graphical methods to answer these questions and state your conclusions.
Table 2.3: household data. Household expenditure for single men and women.
housing | food | goods | service | gender
820 | 114 | 183 | 154 | female
184 | 74 | 6 | 20 | female
921 | 66 | 1686 | 455 | female
488 | 80 | 103 | 115 | female
721 | 83 | 176 | 104 | female
614 | 55 | 441 | 193 | female
801 | 56 | 357 | 214 | female
396 | 59 | 61 | 80 | female
864 | 65 | 1618 | 352 | female
845 | 64 | 1935 | 414 | female
404 | 97 | 33 | 47 | female
7. Question 2.3,
Ex. 2.3 Mortality rates per 100,000 from male suicides for a number of age groups and a number of countries are given in Table 2.5. Construct side-by-side box plots for the data from different age groups, and comment on what the graphic tells us about the data.
Table 2.5: suicides2 data. Mortality rates per 100,000 from male suicides.
Country | A25.34 | A35.44 | A45.54 | A55.64 | A65.74
Canada | 22 | 27 | 31 | 34 | 24
Israel | 9 | 19 | 10 | 14 | 27
Japan | 22 | 19 | 21 | 31 | 49
Austria | 29 | 40 | 52 | 53 | 69
France | 16 | 25 | 36 | 47 | 56
Germany | 28 | 35 | 41 | 49 | 52
Hungary | 48 | 65 | 84 | 81 | 107
Italy | 7 | 8 | 11 | 18 | 27
Netherlands | 8 | 11 | 18 | 20 | 28
Poland | 26 | 29 | 36 | 32 | 28
Spain | 4 | 7 | 10 | 16 | 22
Sweden | 28 | 41 | 46 | 51 | 35
Switzerland | 22 | 34 | 41 | 50 | 51
UK | 10 | 13 | 15 | 17 | 22
USA | 20 | 22 | 28 | 33 | 37
8. Using a single R expression, calculate the median absolute deviation, 1.4826 × median(|x − µ|), where µ is the sample median. Use the dataset chickwts. Use the R function mad() to verify your answer.
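One way the formula in problem 8 might be written is sketched below. It assumes that x is the numeric weight column of chickwts (the problem does not name the variable); the object names mad_manual and mad_builtin are illustrative only.

```r
# Sketch only. Assumption: x is the numeric `weight` column of chickwts.
# The scaled median absolute deviation, written as one expression:
mad_manual <- 1.4826 * median(abs(chickwts$weight - median(chickwts$weight)))

# mad() uses the same constant (1.4826) and centre (the median) by default,
# so the two values should agree up to floating-point tolerance.
mad_builtin <- mad(chickwts$weight)
print(c(manual = mad_manual, builtin = mad_builtin))
print(all.equal(mad_manual, mad_builtin))
```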
9. Using the data matrix state.x77, find the state with the minimum per capita income in the New England region as defined by the factor state.division. Use the vector state.name to get the state name.
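A possible approach to problem 9 is to build a logical index from state.division and reuse it on both state.x77 and state.name, which works because the three objects list the states in the same order. The names in_ne, ne_income, and poorest_ne are illustrative, not part of the assignment.

```r
# Sketch only: the helper names below are illustrative.
# Logical index of the states in the New England division.
in_ne <- state.division == "New England"

# Per capita income for those states only.
ne_income <- state.x77[in_ne, "Income"]

# state.name is ordered like the rows of state.x77,
# so the same index selects the matching names.
poorest_ne <- state.name[in_ne][which.min(ne_income)]
print(poorest_ne)
```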
10. Use subscripting operations on the dataset Cars93 to find the vehicles with highway mileage of less than 25 miles per gallon (variable MPG.highway) and weight (variable Weight) over 3500 lbs. Print the model name, the price range (low, high), highway mileage, and the weight of the cars that satisfy these conditions.
11. Form a matrix object named mycars from the variables Min.Price, Max.Price, MPG.city, MPG.highway, EngineSize, Length, Weight from the Cars93 dataframe from the MASS package. Use it to create a list object named cars.stats containing named components as follows:
a) A vector of means, named Cars.Means
b) A vector of standard errors of the means, named Cars.Std.Errors
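A minimal sketch of how the list in problem 11 could be assembled follows. It assumes the MASS package is installed and takes the standard error of a mean to be sd / sqrt(n); the helper name vars is illustrative.

```r
# Sketch only. Assumption: the MASS package is available.
data(Cars93, package = "MASS")

# Illustrative helper holding the seven column names from the problem.
vars <- c("Min.Price", "Max.Price", "MPG.city", "MPG.highway",
          "EngineSize", "Length", "Weight")
mycars <- as.matrix(Cars93[, vars])

# Standard error of each mean taken as sd / sqrt(n).
cars.stats <- list(
  Cars.Means      = colMeans(mycars),
  Cars.Std.Errors = apply(mycars, 2, sd) / sqrt(nrow(mycars))
)
print(cars.stats)
```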
12. Use the apply() function on the three-dimensional array iris3 to compute:
a) Sample means of the variables Sepal Length, Sepal Width, Petal Length, Petal Width, for each of the three species Setosa, Versicolor, Virginica
b) Sample means of the variables Sepal Length, Sepal Width, Petal Width for the entire data set.
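For problem 12, the margin argument of apply() decides which array dimensions are kept. The sketch below assumes the usual layout of iris3 (50 observations × 4 variables × 3 species) and selects the three variables named in part (b) by position; species_means and overall_means are illustrative names.

```r
# Sketch only. Assumption: iris3 is laid out as
# observations x variables x species (50 x 4 x 3).

# (a) Keep dimensions 2 (variable) and 3 (species); average over observations.
species_means <- apply(iris3, c(2, 3), mean)
print(species_means)

# (b) Keep dimension 2 only, averaging over observations and species.
# Columns 1, 2 and 4 are taken to be Sepal Length, Sepal Width and
# Petal Width, following the variable order listed in part (a).
overall_means <- apply(iris3[, c(1, 2, 4), ], 2, mean)
print(overall_means)
```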
13. Use the data matrix state.x77 and the tapply() function to obtain:
a) The mean per capita income of the states in each of the four regions defined by the factor state.region
b) The maximum illiteracy rates for states in each of the nine divisions defined by the factor state.division
c) The number of states in each region
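The three parts of problem 13 follow the same tapply(values, groups, function) pattern. A hedged sketch, with illustrative object names:

```r
# Sketch only: object names are illustrative.

# (a) Mean per capita income within each of the four regions.
income_by_region <- tapply(state.x77[, "Income"], state.region, mean)

# (b) Maximum illiteracy rate within each of the nine divisions.
max_illit_by_division <- tapply(state.x77[, "Illiteracy"], state.division, max)

# (c) Number of states per region: count the names falling in each group.
n_by_region <- tapply(state.name, state.region, length)

print(income_by_region)
print(max_illit_by_division)
print(n_by_region)
```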
14. Using the dataframe mtcars, produce a scatter plot matrix of the variables mpg, disp, hp, drat, qsec. Use different colors to identify cars belonging to each of the categories defined by the carsize variable:
carsize <- cut(mtcars[, "wt"], breaks = c(0, 2.5, 3.5, 5.5),
               labels = c("Compact", "Midsize", "Large"))
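One way to draw the scatter plot matrix is with pairs(), indexing a colour vector by the integer codes of the carsize factor. The colours and the name size_cols are arbitrary illustrative choices, not specified by the problem.

```r
# Sketch only: colour choices and helper names are illustrative.
carsize <- cut(mtcars[, "wt"], breaks = c(0, 2.5, 3.5, 5.5),
               labels = c("Compact", "Midsize", "Large"))

# One colour per factor level, in level order.
size_cols <- c("red", "blue", "darkgreen")

# as.integer(carsize) gives 1, 2 or 3, which picks each car's colour.
pairs(mtcars[, c("mpg", "disp", "hp", "drat", "qsec")],
      col = size_cols[as.integer(carsize)], pch = 19,
      main = "mtcars by car size")
```

A legend can be added separately if the mapping between colours and categories needs to be shown on the plot.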
15. Use the function aov() to perform a one-way analysis of variance on the chickwts data with feed as the treatment factor. Assign the result to an object named chick.aov and use it to print an ANOVA table.
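A minimal sketch for problem 15, assuming weight is the response variable in chickwts:

```r
# Sketch only. Assumption: weight is the response, feed the treatment factor.
chick.aov <- aov(weight ~ feed, data = chickwts)

# summary() on an aov object prints the ANOVA table
# (degrees of freedom, sums of squares, F statistic, p-value).
print(summary(chick.aov))
```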
16. Write an R function named ttest() for conducting a one-sample t-test. Return a list object containing the two components:
• the t-statistic named T;
• the two-sided p-value named P.
Use this function to test the hypothesis that the mean of the weight variable (in the chickwts dataset) is equal to 240 against the two-sided alternative. For this problem, please show the code of the function you created as well as the output.
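The function in problem 16 could be sketched as follows. The argument names x and mu are assumptions, since the problem fixes only the function name and the two returned components. The result is compared against the built-in t.test() as a sanity check.

```r
# Sketch only. Assumption: argument names x (data) and mu (hypothesised mean)
# are illustrative; the problem specifies only ttest(), T and P.
ttest <- function(x, mu = 0) {
  n <- length(x)
  # t = (sample mean - hypothesised mean) / standard error of the mean
  t_stat <- (mean(x) - mu) / (sd(x) / sqrt(n))
  # Two-sided p-value from the t distribution with n - 1 degrees of freedom.
  p_val <- 2 * pt(-abs(t_stat), df = n - 1)
  list(T = t_stat, P = p_val)
}

res <- ttest(chickwts$weight, mu = 240)
print(res)

# Sanity check against the built-in one-sample test.
ref <- t.test(chickwts$weight, mu = 240)
print(all.equal(res$T, unname(ref$statistic)))
print(all.equal(res$P, ref$p.value))
```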