Define zero-order Markov model for sequence

Reference no: EM131008526

Question 1. Profile HMMs for sequence families

a) Define the match (M), insert (I), and delete (D) states for the multiple sequence alignment (MSA) shown in Figure 1.

b) Derive the parameters of the profile HMM for the MSA given in Figure 1:
I. Emission counts for match states
II. Emission counts for insert states
III. Counts of transitions between states
IV. Emission probabilities for match, insert, and hidden states

Figure 1. Multiple sequence alignment of five DNA sequences

T--CT-
-AA-TA
T--CTA
TC-G-A
C-CGAC
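One way to approach parts a) and b) I-II is sketched below in Python; the course itself works in R, so treat this purely as an illustration of the logic. It assumes the common heuristic that a column is a match column when fewer than half of its rows are gaps. The assignment does not state a rule, so `max_gap_fraction`, the pseudocount option, and all function names are assumptions. Gaps in match columns correspond to delete states, and residues in the remaining columns are attributed to the insert state that follows the preceding match state.

```python
from collections import Counter

# Alignment copied from Figure 1 (one string per sequence, '-' is a gap).
MSA = ["T--CT-", "-AA-TA", "T--CTA", "TC-G-A", "C-CGAC"]


def classify_columns(msa, max_gap_fraction=0.5):
    """Label each alignment column 'M' (match) or 'I' (insert).

    Assumption: a column is a match column when its gap fraction is
    below max_gap_fraction. The assignment does not fix this rule.
    """
    n_rows = len(msa)
    labels = []
    for column in zip(*msa):
        gap_fraction = column.count("-") / n_rows
        labels.append("M" if gap_fraction < max_gap_fraction else "I")
    return labels


def emission_counts(msa, labels):
    """Return (match_counts, insert_counts), keyed by state index.

    Match state k is the k-th match column (1-based). Insert state k
    pools residues from insert columns that follow match column k.
    """
    match_counts, insert_counts = {}, {}
    k = 0
    for column, label in zip(zip(*msa), labels):
        residues = [symbol for symbol in column if symbol != "-"]
        if label == "M":
            k += 1
            match_counts[k] = Counter(residues)
        else:
            insert_counts.setdefault(k, Counter()).update(residues)
    return match_counts, insert_counts


def to_probabilities(counts, alphabet="ACGT", pseudocount=0):
    """Convert counts to probabilities; pseudocount=1 is add-one smoothing."""
    total = sum(counts[s] for s in alphabet) + pseudocount * len(alphabet)
    return {s: (counts[s] + pseudocount) / total for s in alphabet}


labels = classify_columns(MSA)
match_counts, insert_counts = emission_counts(MSA, labels)

print("columns:", "".join(labels))
for k, counts in match_counts.items():
    print(f"M{k}", dict(counts), to_probabilities(counts, pseudocount=1))
for k, counts in insert_counts.items():
    print(f"I{k}", dict(counts), to_probabilities(counts, pseudocount=1))
```

Under this heuristic, columns 1, 4, 5, and 6 become match states M1 to M4, and columns 2 and 3 both feed insert state I1. A different gap threshold would give a different model, so the choice should be stated in the answer.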

Feel free to use Durbin's Figure 5.7c format
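For part b) III, transition counts can be obtained by converting each aligned sequence into a path of states and counting consecutive pairs. The Python sketch below is an illustration only (the course works in R). It assumes the same gap-fraction heuristic for match columns, which the assignment does not specify, and the state labels `B` (begin) and `E` (end) as well as the function names are choices made here.

```python
from collections import Counter

# Alignment copied from Figure 1 (one string per sequence, '-' is a gap).
MSA = ["T--CT-", "-AA-TA", "T--CTA", "TC-G-A", "C-CGAC"]


def match_columns(msa, max_gap_fraction=0.5):
    """True for columns treated as match columns (assumed heuristic)."""
    n_rows = len(msa)
    return [col.count("-") / n_rows < max_gap_fraction for col in zip(*msa)]


def state_path(row, is_match):
    """Translate one aligned row into its profile-HMM state path.

    Match column + residue -> M_k, match column + gap -> D_k,
    insert column + residue -> I_k, insert column + gap -> no state.
    """
    path = ["B"]
    k = 0  # number of match columns seen so far
    for symbol, match in zip(row, is_match):
        if match:
            k += 1
            path.append(f"M{k}" if symbol != "-" else f"D{k}")
        elif symbol != "-":
            path.append(f"I{k}")
    path.append("E")
    return path


def transition_counts(msa):
    """Count transitions between consecutive states over all rows."""
    is_match = match_columns(msa)
    counts = Counter()
    for row in msa:
        path = state_path(row, is_match)
        counts.update(zip(path, path[1:]))
    return counts


counts = transition_counts(MSA)
for (source, target), n in sorted(counts.items()):
    print(f"{source} -> {target}: {n}")
```

Dividing each count by the total number of transitions leaving the same source state (optionally after adding pseudocounts) gives the transition probabilities.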

Question 2. Provide a 1-1.5 page review of the paper "Genome-wide genetic marker discovery and genotyping using next-generation sequencing", available under this week's course content.

Some guidelines:
- Underline the main points of the paper.
- Keep your work structured.
- While focusing on the big picture, keep in mind that our class is about statistical processes.

Question 3. Use the <HW3_solution_reviewN.pptx> file, available in this week's course content, to write and submit an R script that will:

a. Define the HMM for Q4 in Homework 3.
b. Parse the Homework 3 Q4 sequence with the Viterbi algorithm to show the sequence of hidden states.

Homework 3 Solution, Question 4:

a) Define a zero-order Markov model for sequence2_A2, which represents a portion of non-coding sequence of Mycobacterium tuberculosis (refer to course content).

Zero-order model for sequence2_A2:
Symbol  Count  Probability
P(A)    107    0.195255474
P(C)    156    0.284671533
P(G)    183    0.333941606
P(T)    102    0.186131387
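A zero-order Markov model is simply the relative frequency of each base, so the probabilities above follow directly from the counts. The short Python sketch below reproduces them; it is an illustration only (the course works in R), and the function name `zero_order_model` is made up here.

```python
from collections import Counter

# Base counts for sequence2_A2 as listed in the solution.
COUNTS = {"A": 107, "C": 156, "G": 183, "T": 102}


def zero_order_model(counts):
    """Return P(base) = count(base) / total number of bases."""
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


def zero_order_from_sequence(sequence):
    """Same model, estimated directly from a raw DNA string."""
    return zero_order_model(Counter(sequence.upper()))


model = zero_order_model(COUNTS)
for base, p in model.items():
    print(f"P({base}) = {p:.9f}")
```

The total is 548 bases, and each probability is the count divided by 548, for example P(A) = 107 / 548.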

b) Use the zero-order Markov models defined for sequence1_A2 and sequence2_A2, and apply the Viterbi algorithm to find the most likely path for the sequence CGCGTTACTTCAATG, without taking reading frame into consideration.

Assume:
Initial transition probabilities:
a0c = a0n = 0.5
State transition probabilities:
acc = 0.55
acn = 0.45
ann = 0.5
anc = 0.5

where aij is the probability of a transition from state i to state j, c = coding, and n = non-coding.

Sequence:              CGCGTTACTTCAATG
Path of hidden states: CCCCNNCCNNCCCCC
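The Viterbi recursion behind such a path can be sketched as follows. This is a Python illustration of the logic, not the R script the assignment asks for. The transition and initial probabilities and the non-coding emissions come from the text above, but the coding emissions (the sequence1_A2 model) are not given here, so `CODING_PLACEHOLDER` is a hypothetical uniform stand-in and the printed path should not be expected to match the solution until the real values are substituted.

```python
import math

STATES = ("C", "N")  # C = coding, N = non-coding
INITIAL = {"C": 0.5, "N": 0.5}  # a0c, a0n
TRANSITION = {
    "C": {"C": 0.55, "N": 0.45},  # acc, acn
    "N": {"C": 0.50, "N": 0.50},  # anc, ann
}
# Zero-order model for sequence2_A2 (non-coding), from the counts above.
NONCODING = {"A": 107 / 548, "C": 156 / 548, "G": 183 / 548, "T": 102 / 548}
# HYPOTHETICAL placeholder: the sequence1_A2 (coding) model is not
# given in this text. Replace with the real probabilities.
CODING_PLACEHOLDER = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def viterbi(sequence, emission, initial=INITIAL, transition=TRANSITION):
    """Return the most likely state path, computed in log space."""
    score = {
        s: math.log(initial[s]) + math.log(emission[s][sequence[0]])
        for s in STATES
    }
    backpointers = []
    for symbol in sequence[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            # Best predecessor for state s at this position.
            prev = max(STATES, key=lambda p: score[p] + math.log(transition[p][s]))
            pointer[s] = prev
            new_score[s] = (
                score[prev]
                + math.log(transition[prev][s])
                + math.log(emission[s][symbol])
            )
        backpointers.append(pointer)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return "".join(reversed(path))


emission = {"C": CODING_PLACEHOLDER, "N": NONCODING}
print(viterbi("CGCGTTACTTCAATG", emission))
```

Working with log probabilities avoids numerical underflow on longer sequences; for a 15-base sequence, plain products would also work.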

Attachment:- post.xlsx






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd