Write R code to create the barplot


Reference no: EM132189223

Assignment -

The PDF document should include all numerical results, plots, and textual responses, together with all the R code necessary to reach the correct answer for each part. The R code sections should be set in a fixed-width font such as Courier.

1. Write R code to create the barplot below of the expression levels of the "Ras-Like Protein Tc4" gene from the golub dataset. Hint: the gene names are listed in golub.gnames.

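if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

BiocManager::install("hopach")  # biocLite() has been retired; BiocManager replaces it

library(hopach)

data(golub)  # loads golub, golub.cl and golub.gnames

View(golub)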

Test the hypothesis that the expression levels of this gene differ between ALL and AML patients at a significance level α = 0.05. Explain all of the steps in your testing procedure, i.e. what is your null hypothesis, why did you choose that particular test, what is your conclusion and why.
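One possible starting point for part 1 is sketched below. It is not a model answer: the helper find_gene_rows is a hypothetical name, the case-insensitive match on the second column of golub.gnames is an assumption about how the gene description is stored, and Welch's two-sample t-test is only one defensible choice of test (a rank-based test would be an alternative if normality is doubtful). The package-dependent part runs only when hopach is installed.

```r
# Hypothetical helper: row indices whose description (column 2) matches a pattern.
find_gene_rows <- function(gnames, pattern) {
  grep(pattern, gnames[, 2], ignore.case = TRUE)
}

if (requireNamespace("hopach", quietly = TRUE)) {
  data(golub, package = "hopach")  # loads golub, golub.cl and golub.gnames

  idx <- find_gene_rows(golub.gnames, "Ras-Like Protein Tc4")
  if (length(idx) >= 1) {
    expr <- golub[idx[1], ]  # assumption: take the first matching row
    # In golub.cl, 0 codes ALL and 1 codes AML
    cls <- factor(golub.cl, levels = 0:1, labels = c("ALL", "AML"))

    # One bar per patient, coloured by tumour class
    barplot(expr,
            col = c("steelblue", "tomato")[as.integer(cls)],
            xlab = "Patient", ylab = "Expression level",
            main = golub.gnames[idx[1], 2])
    legend("topright", legend = levels(cls), fill = c("steelblue", "tomato"))

    # H0: mean expression is equal in ALL and AML; Welch test, alpha = 0.05
    print(t.test(expr ~ cls))
  }
}
```

If the reported p-value is below 0.05, the null hypothesis of equal mean expression would be rejected at the stated significance level.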

2. Set up a for-loop that takes 1000 random samples of size k = 20 from a normal distribution with 'true' mean = 10 and standard deviation = 5. In the loop, calculate the mean of each sample and store the sample means in a vector sample_mean.

Plot the sample mean values for sample size k = 20 on a plot. Indicate the true mean. (Hint: you can use the function abline to plot a straight line at the true mean value.)
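The loop and plot described above could look roughly like the following sketch. The fixed seed and the variable names other than sample_mean are assumptions added for reproducibility and readability.

```r
set.seed(1)  # assumption: fixed seed so the run is reproducible

n_samples <- 1000
k <- 20
true_mean <- 10
true_sd <- 5

# Pre-allocate the result vector, then fill it one sample at a time
sample_mean <- numeric(n_samples)
for (i in seq_len(n_samples)) {
  sample_mean[i] <- mean(rnorm(k, mean = true_mean, sd = true_sd))
}

plot(sample_mean, xlab = "Sample index", ylab = "Sample mean (k = 20)")
abline(h = true_mean, col = "red", lwd = 2)  # horizontal line at the true mean
```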

Now create a function that calculates the standard deviation of sample_mean. Put everything in a function that returns the standard deviation of 1000 means for user-specifiable sample size k, true mean m and standard deviation s.

sd_sample_mean<-function(k,m,s){...}

Using this function, calculate the standard deviation of the sample mean for k = 3 and k = 100. Have a look at the lecture notes: what should be the theoretical standard deviation of the mean of samples of size 3 and 100 taken from this distribution?

Plot the standard deviation of the sample mean from your function for random sample sizes from 3 to 100. Add a line showing the theoretical expectation for the standard error of the mean.
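A minimal sketch of such a function and the comparison plot follows. The extra argument n_samples (defaulting to 1000) is an assumption that is not in the brief, and the theoretical curve uses the standard result that the standard error of the mean is s / sqrt(k), which gives about 2.89 for k = 3 and 0.5 for k = 100 when s = 5.

```r
# Standard deviation of n_samples sample means, each from a sample of size k
# drawn from Normal(mean = m, sd = s). n_samples is an added, assumed argument.
sd_sample_mean <- function(k, m, s, n_samples = 1000) {
  sample_mean <- numeric(n_samples)
  for (i in seq_len(n_samples)) {
    sample_mean[i] <- mean(rnorm(k, mean = m, sd = s))
  }
  sd(sample_mean)
}

set.seed(1)  # assumption: fixed seed for reproducibility
sd_k3 <- sd_sample_mean(3, 10, 5)
sd_k100 <- sd_sample_mean(100, 10, 5)
theory <- 5 / sqrt(c(3, 100))  # theoretical standard errors for k = 3 and k = 100

# Simulated versus theoretical standard error for k = 3, ..., 100
ks <- 3:100
observed <- sapply(ks, sd_sample_mean, m = 10, s = 5)
plot(ks, observed, xlab = "Sample size k", ylab = "SD of sample mean")
lines(ks, 5 / sqrt(ks), col = "red", lwd = 2)  # s / sqrt(k)
```

The simulated points should scatter closely around the red curve, with the scatter shrinking as k grows.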

3. Mass spectrometry measurements of the proteome that quantitate the amount of each protein present are known to be NOT normally distributed about the true quantity of protein present. A series of experiments is carried out on wild-type and knockout mutant cell lines. A transcription factor has been deleted in the knockout cell line. For each of the cell lines, 20 replicate measurements of the concentration of a protein X are carried out by mass spectrometry.

wildtype<-c(560,968,3297,1200,858,646,992,2507,2037,546,2929,1171,1389,1958,3149,1165,2257,2120,65,1571)

knockout<-c(589,232,983,2597,827,1363,634,12,643,1889,2840,1291,939,811,3290,525,90,543,2400,3012)

The researchers wish to report the results of these experiments and to determine if the measurements support the idea that deletion of the transcription factor changes the median concentration of protein X present in the cell. Why is it better to report the results of these experiments in terms of the median value of the measurements rather than the mean? Calculate a 95% confidence interval for the median protein X concentration of each cell line. Use a bootstrap approach to test if the medians of the two cell lines differ.
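One way to approach the interval and the comparison is a percentile bootstrap, sketched below. The number of resamples (10000), the fixed seed, and the helper name boot_median are assumptions; other bootstrap variants (for example a permutation-style test under a pooled null) would also be reasonable.

```r
wildtype <- c(560, 968, 3297, 1200, 858, 646, 992, 2507, 2037, 546,
              2929, 1171, 1389, 1958, 3149, 1165, 2257, 2120, 65, 1571)
knockout <- c(589, 232, 983, 2597, 827, 1363, 634, 12, 643, 1889,
              2840, 1291, 939, 811, 3290, 525, 90, 543, 2400, 3012)

set.seed(1)      # assumption: fixed seed for reproducibility
n_boot <- 10000  # assumption: number of bootstrap resamples

# Hypothetical helper: medians of n_boot resamples drawn with replacement
boot_median <- function(x, n_boot) {
  replicate(n_boot, median(sample(x, replace = TRUE)))
}

wt_boot <- boot_median(wildtype, n_boot)
ko_boot <- boot_median(knockout, n_boot)

# Percentile 95% confidence intervals for each median
ci_wt <- quantile(wt_boot, c(0.025, 0.975))
ci_ko <- quantile(ko_boot, c(0.025, 0.975))
ci_wt
ci_ko

# Bootstrap distribution of the difference in medians
diff_boot <- wt_boot - ko_boot
ci_diff <- quantile(diff_boot, c(0.025, 0.975))
ci_diff
```

Under this approach, a 95% interval for the difference that contains 0 would mean the data do not show a difference in medians at the 5% level.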

4. It is suspected that mutations in gene X are involved in the response of cancer patients to a drug treatment. Of 236 patients diagnosed with a particular form of cancer, 82 have a mutation in gene X and the remainder have the normal version of the gene. All of the patients take the drug for one year; 87 of them die within that year and the rest survive. Of the survivors, 42 have a mutation in gene X.

(a) Organize the data into a contingency table and formulate a null hypothesis to test for the dependence of survival on mutation of gene X.

(b) Perform an appropriate test to determine whether the null can be rejected.
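The cell counts for the table follow from the totals given in the question, as in this sketch. The variable names are illustrative, and the chi-squared test of independence (with Fisher's exact test as a cross-check) is one suitable choice, not the only one. The null hypothesis is that survival is independent of the mutation status of gene X.

```r
n_total <- 236
n_mut <- 82
n_died <- 87
n_surv_mut <- 42

# Derive the remaining cells from the totals stated in the question
n_norm <- n_total - n_mut           # patients with the normal gene
n_surv <- n_total - n_died          # survivors
n_surv_norm <- n_surv - n_surv_mut  # survivors with the normal gene
n_died_mut <- n_mut - n_surv_mut    # deaths among mutation carriers
n_died_norm <- n_norm - n_surv_norm # deaths among normal-gene patients

tab <- matrix(c(n_surv_mut, n_died_mut,
                n_surv_norm, n_died_norm),
              nrow = 2, byrow = TRUE,
              dimnames = list(gene = c("mutated", "normal"),
                              outcome = c("survived", "died")))
tab

chisq.test(tab)   # chi-squared test with Yates' continuity correction
fisher.test(tab)  # exact test as a cross-check
```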

5. The following data describe the levels of a cellular enzyme and a metabolite in a set of 20 experiments.

enzyme <- c(0.114, 0.510, 0.722, 1.276, 1.928, 2.150, 2.238, 2.732, 2.758, 3.015, 3.616, 3.951, 4.281, 5.315, 6.693, 6.964, 7.056, 8.162, 8.216, 8.410)

metabolite <- c(56.1, 60.6, 67.2, 72.7, 80.5, 83.2, 82.2, 88.9, 89.5, 90.6, 94.9, 95.2, 97.1, 96.3, 77.6, 71.6, 69.3, 37.2, 36.0, 26.9)

Show, with appropriate statistical test(s), that the level of the metabolite is dependent on the level of the enzyme.

Find the best-fit polynomial equation that describes this dependence.
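A possible way to explore this is to fit nested polynomial models and compare them with F-tests, as sketched below. Stopping at degree 3 and using raw (non-orthogonal) polynomial terms are assumptions made so that the coefficients read directly as an equation in the enzyme level; the model names fit1 to fit3 are illustrative.

```r
enzyme <- c(0.114, 0.510, 0.722, 1.276, 1.928, 2.150, 2.238, 2.732, 2.758, 3.015,
            3.616, 3.951, 4.281, 5.315, 6.693, 6.964, 7.056, 8.162, 8.216, 8.410)
metabolite <- c(56.1, 60.6, 67.2, 72.7, 80.5, 83.2, 82.2, 88.9, 89.5, 90.6,
                94.9, 95.2, 97.1, 96.3, 77.6, 71.6, 69.3, 37.2, 36.0, 26.9)

# Nested polynomial models of increasing degree
fit1 <- lm(metabolite ~ enzyme)
fit2 <- lm(metabolite ~ poly(enzyme, 2, raw = TRUE))
fit3 <- lm(metabolite ~ poly(enzyme, 3, raw = TRUE))

# Sequential F-tests: does each extra term significantly improve the fit?
anova(fit1, fit2, fit3)

summary(fit2)  # overall F-test and coefficient estimates for the quadratic
coef(fit2)     # intercept, linear and quadratic coefficients

# Data with the fitted quadratic curve overlaid
plot(enzyme, metabolite, xlab = "Enzyme level", ylab = "Metabolite level")
grid_x <- seq(min(enzyme), max(enzyme), length.out = 200)
lines(grid_x, predict(fit2, newdata = data.frame(enzyme = grid_x)), col = "red")
```

The degree to report would be the lowest one after which the anova table shows no further significant improvement.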

