Characteristic of the transition and transversion mutations

Reference no: EM13981457

PART A: Term Papers # A.1 & A.2

NEEDLEMAN-WUNSCH (NW) & SMITH-WATERMAN (SW) ALGORITHMS

Each student is assigned a sequence pair (1 & 2) as in Table TP-NW/SW Exercises. Develop a computer algorithm and write code (MatLab or C/C++) to perform the NW/SW algorithm-based computations indicated for each student, based on a comparison between the two given sequences. Elucidate the optimal pathway (that is, the optimal global/local alignment).

If you are not familiar with any programming, you may do hand calculation and give your step-by-step results. Due credit will be given.

Table TP-NW/SW Exercises

Pairs   Sequence pairs for NW/SW exercise   NW/SW (A.1 & A.2)
1       F F T E E Q S D I E D N C Q T       NW
2       D F T Q E T E D I E D N C Q Q
1       F R F Q N T I L D G G A E E G       SW
2       F Q F Q N T I S Y Y G G E L D

General Format of TPs:

You are required to supplement your answers (term papers) with appropriate and relevant (state-of-the-art) details, plus further particulars as needed. Each of your term papers should include the following:

• A one-page Executive Summary

• An elaborate description of the assigned topic with relevant references. You may supplement your answers and augment your concepts with appropriate cross-references as necessary. All such references should be clearly identified and listed in a standard format such as the IEEE journal publication format. Any Web page reference can be shown by its title and website address. You are encouraged to append hard copies of such references to your solutions.

Term Paper # B.1: Descriptive Study Projects

EXERCISE B.1.D

Use the following link to find the sequence of yeast clone #71020.

https://genome-www4.stanford.edu/cgi-bin/SGD/getSeq?map=a3map&seq=71020&flankl=&flankr=&rev=

Use Genscan to find the ORFs in this sequence using "Vertebrate" as your organism.

How many complete genes are there?
How many of the complete genes have introns?
How many amino acids are there in ORF #3?

Copy the predicted protein sequence from ORF #3 and use that sequence to perform an appropriate search to determine the identity of the protein and the gene that encodes it.

What is the name of the gene that encodes this protein?

Based on the gene acronym (and other information that you might have already found), in what molecular process do you suppose this gene is involved?

Locate the DNA sequence of the gene. There are many ways to do this, but all of them should get you to the same answer. As a hint to be sure you're on the right track, the first few bases of the ORF are ATGGCAAAAACG.

PAIRWISE SEQUENCE COMPARISON: Dot-plot, Needleman-Wunsch (NW) and Smith-Waterman (SW) Algorithms

Using Dotlet Program

The reason why we wrote dotlet is that we needed a diagonal plot tool for the December 1998 practical sessions in bioinformatics at the Institute of Biochemistry. Since we had decided to base all the practical sessions on the World-Wide Web, we needed a program that would run in a web browser. To our knowledge, there was none, so we wrote it.

Reference: T. Junier and M. Pagni, "Dotlet: diagonal plots in a Web browser," Bioinformatics (Applications Note), vol. 16, no. 2, 2000, pp. 178-179.

Problem B.1 mutational changes

Construct a matrix of the set {A, C, T, G} to illustrate the characteristic of the transition and transversion mutations.

(Hint: You may use a score of 100% to depict the elements of the matrix pertinent to no mutation and use prorated percentages to represent the other elements illustrating the characteristic as above. The ratio of transitions to transversions among spontaneous base substitutions is approximately 2:1. Therefore each transition should have a probability of 2/3 and each transversion 1/3.)
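As a rough illustration of the hint, the sketch below classifies every ordered pair of bases as no mutation, transition (purine to purine or pyrimidine to pyrimidine) or transversion, and fills the matrix with the prorated percentages. The function names and the choice to use the hint's 2/3 and 1/3 literally as relative weights are assumptions. Because each base has one transition partner and two transversion partners, rows built this way do not sum to 100%; a row-normalised variant would split the 1/3 between the two transversions.

```python
PURINES = {"A", "G"}          # C and T are the pyrimidines
BASES = ["A", "C", "T", "G"]  # same order as the set given in the problem


def mutation_class(a, b):
    """Classify the substitution a -> b."""
    if a == b:
        return "none"
    # Same chemical family (both purines or both pyrimidines) means transition.
    return "transition" if (a in PURINES) == (b in PURINES) else "transversion"


def mutation_matrix(weights=None):
    """Build a 4x4 matrix of percentages; the default weights follow the hint literally."""
    if weights is None:
        weights = {"none": 100.0, "transition": 100 * 2 / 3, "transversion": 100 * 1 / 3}
    return {a: {b: weights[mutation_class(a, b)] for b in BASES} for a in BASES}


matrix = mutation_matrix()
print("    " + "".join(f"{b:>8}" for b in BASES))
for a in BASES:
    print(f"{a:>4}" + "".join(f"{matrix[a][b]:8.1f}" for b in BASES))
```

Passing a different `weights` dictionary (for example 16.7 for each transversion) gives the row-normalised reading without changing the classification logic.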

Problem # C.1

For the two binary sequences X and Y indicated above in Problem C.13, plot the Kullback-Leibler (KL) measure between the strings. Hence confirm the most common substring locations between them, as decided via the HD measure in the previous problem.

(Hint: Again, select a window of size 4. For the given sequence in each window, calculate the KL measure. Plot window # versus KL = KL1 + KL2 for each string, where

KL1 = [p(0) log_e(p(0)/q(1))]_(window #1) + ...

KL2 = [q(1) log_e(q(1)/p(0))]_(window #1) + ...

p(0): Probability of 0 in that window; q(1): Probability of 1 in that window)
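The per-window quantity in the hint can be sketched as follows. This is one possible reading, not a definitive one: p(0) and q(1) are taken as the fractions of 0s and 1s inside a single window of a single string, windows are assumed non-overlapping, and probabilities are clipped by a small `eps` (an assumption added here) so that the logarithm stays finite when a window contains only 0s or only 1s. The bit string is a made-up example, since the sequences of Problem C.13 are not reproduced here.

```python
import math


def window_kl(window, eps=1e-9):
    """KL1 + KL2 for one window, with p(0) and q(1) estimated from that window."""
    p0 = window.count("0") / len(window)
    q1 = window.count("1") / len(window)
    # Clip away exact zeros (assumption) so the log terms remain finite.
    p0 = min(max(p0, eps), 1 - eps)
    q1 = min(max(q1, eps), 1 - eps)
    kl1 = p0 * math.log(p0 / q1)
    kl2 = q1 * math.log(q1 / p0)
    return kl1 + kl2


def kl_profile(bits, size=4):
    """KL value for each consecutive, non-overlapping window of the string."""
    return [window_kl(bits[i:i + size]) for i in range(0, len(bits) - size + 1, size)]


example = "0011000111110101"  # hypothetical binary string
for number, value in enumerate(kl_profile(example), start=1):
    print(f"window {number}: KL = {value:.4f}")
```

Running `kl_profile` on each of the two strings and plotting the two lists against the window number gives the curves the problem asks for.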

Problem D.1

Construct a dot-plot for the following pair of sequences using the matrix methods described in the example:

x:         G T G A C C G C T A A C C T C

y:         G T T G C G A C T G C G G C G T
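A minimal sketch of the matrix method, assuming the simplest rule (a dot wherever the two residues are identical, with no window or threshold filtering). The function names are illustrative only.

```python
def dot_matrix(x, y):
    """Return a len(x) by len(y) matrix with 1 where x[i] == y[j], else 0."""
    return [[1 if a == b else 0 for b in y] for a in x]


def render(x, y):
    """Text rendering: y runs across the top, x runs down the side."""
    rows = ["  " + " ".join(y)]
    for a, row in zip(x, dot_matrix(x, y)):
        rows.append(a + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(rows)


x = "GTGACCGCTAACCTC"
y = "GTTGCGACTGCGGCGT"
print(render(x, y))
```

Runs of `*` along a diagonal mark stretches where the two sequences agree; off-diagonal scatter is the expected background for a four-letter alphabet.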

Problem D.1(A):

Construct a dot-plot for the following pair of sequences using Dotlet or any other compatible open-source program.

x:         G T G A C C G C T A A C C T C A C G T T A C

y:         T T T G C G A C T G C G G C G T C C C T A A G C

-----------------------------------------------------------------------------------------------------------------------

Problem D.2

Assigned is a pair of amino acid sequences (S and T). Determine the best global alignment.

S: C U U A C G C A

T: A U G A G A A C U U

Problem D.3

Given a sequence pair, X and Y, as indicated below, determine the best global alignment via trace-back using the NW algorithm.

X:        G A G C A

Y:        G A T T C A
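The NW fill and trace-back steps can be sketched as below. The scoring scheme (match +1, mismatch -1, linear gap -1) is an assumption, since the problem statement does not give one; for protein sequences a substitution matrix would normally replace the match/mismatch pair. When several alignments tie, this sketch returns only one of them.

```python
def needleman_wunsch(x, y, match=1, mismatch=-1, gap=-1):
    """Global alignment; returns (score, aligned_x, aligned_y)."""
    n, m = len(x), len(y)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # first column: x aligned against gaps
        score[i][0] = i * gap
    for j in range(1, m + 1):          # first row: y aligned against gaps
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # align x[i-1] with y[j-1]
                              score[i - 1][j] + gap,     # gap in y
                              score[i][j - 1] + gap)     # gap in x
    # Trace back from the bottom-right corner to the origin.
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and x[i - 1] == y[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return score[n][m], "".join(reversed(ax)), "".join(reversed(ay))


best, ax, ay = needleman_wunsch("GAGCA", "GATTCA")
print(best)
print(ax)
print(ay)
```

The same function applies unchanged to the other global-alignment exercises by swapping in their sequences and, if required, a different scoring scheme.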

Problem D.4

Given a sequence pair, U and V, as indicated below, determine the best global alignment via trace-back using the NW algorithm.

U:        C T C G T

V:        C T A A G T

Problem D.5

Via hand calculations, perform an NW-algorithm based comparison between the two given sequences indicated below and elucidate the maximum pathway:

M A V R K L S L E G

M S T A L P G L G S

Problem D.5(A):

Via hand calculations, perform NW-algorithm based comparison between the two given sequences and elucidate the maximum path-way:

Sequence Pair

W F G Q E T S A I S

S F T Q F S E D A I

Problem D.6

Given a sequence pair, X and Y, as shown below, determine the best local alignment via trace-back using the SW algorithm.

X:        W R N D C Q E G S A

Y:        W G Q E G S I E A
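A local-alignment sketch in the same spirit: scores are floored at zero, and the trace-back starts at the highest-scoring cell and stops at the first zero. The scoring values (match +1, mismatch -1, linear gap -1) are again assumptions; a substitution matrix such as BLOSUM62 is the more usual choice for proteins and may give a different answer.

```python
def smith_waterman(x, y, match=1, mismatch=-1, gap=-1):
    """Local alignment; returns (score, aligned_x, aligned_y)."""
    n, m = len(x), len(y)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            h[i][j] = max(0,                      # restart the local alignment
                          h[i - 1][j - 1] + s,
                          h[i - 1][j] + gap,
                          h[i][j - 1] + gap)
            if h[i][j] > best:                    # remember the first maximum found
                best, best_pos = h[i][j], (i, j)
    # Trace back from the maximum until a zero cell is reached.
    ax, ay = [], []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        s = match if x[i - 1] == y[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


best, ax, ay = smith_waterman("WRNDCQEGSA", "WGQEGSIEA")
print(best, ax, ay)
```

Under these assumed scores the shared stretch Q E G S is the best local alignment for this pair.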

Problem D.6(A) :

Given a sequence pair, U and V, as shown below, determine the best local alignment via trace-back using the SW algorithm.

U:        AASTHECWCTWH

V:        AASRNPSCWTTWHT

Problem D.6(B):

Via hand calculation, perform an SW-algorithm based comparison between the two given sequences and elucidate the common regions of similarity.

Sequence Pair

W Y G Q E Q S Y I Q

W Y T Q E T S D I Q

Problem E.1

Translate the following regular expressions:

(a) [GA]-T-{C, G}(2)-X-[TGC]-G(3)-[TC]

(b) [TCG]-{A, C}(3)-P-x-[ATG]-x-[VIL]-[IVT]-x-[GS]-G-Y-S-[QL]-A

(c) [TAG]-XXAG-V-X(4)-{AEGD}-[AC]-x-V-x(4)-{ED}

(d) Write a regular expression to match each string in the C terminus:

V or L, any (two to four times), A, T, any but D or E
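The patterns above follow PROSITE-style conventions: square brackets list allowed residues, curly braces list excluded residues, x stands for any residue, and a number in parentheses is a repetition count. The sketch below converts such a pattern into a Python regular expression. The function name is hypothetical, and treating both `x` and `X` as the wildcard and a trailing `>` as the C-terminal anchor are assumptions about the intended notation.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip()
    anchor_end = pattern.endswith(">")       # '>' taken to mark the C terminus
    pattern = pattern.rstrip(">")
    parts = []
    for element in pattern.split("-"):
        # Separate the body from an optional repetition such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element.strip())
        body, low, high = m.groups()
        letters = re.sub(r"[^A-Za-z]", "", body)  # drop brackets, commas, spaces
        if body.startswith("["):
            unit = "[" + letters + "]"            # any one of the listed residues
        elif body.startswith("{"):
            unit = "[^" + letters + "]"           # any residue except those listed
        else:
            unit = "".join("." if ch in "xX" else ch for ch in letters)
            if len(unit) > 1 and low:
                unit = "(?:" + unit + ")"         # repeat a multi-letter run as a group
        if low and high:
            unit += "{%s,%s}" % (low, high)
        elif low:
            unit += "{%s}" % low
        parts.append(unit)
    return "".join(parts) + ("$" if anchor_end else "")


print(prosite_to_regex("[GA]-T-{C, G}(2)-X-[TGC]-G(3)-[TC]"))
print(prosite_to_regex("[VL]-x(2,4)-A-T-{DE}>"))
```

The second call encodes the description in part (d) as a pattern first; the resulting expression can be tried against test strings with `re.search`.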

Problem E.2

(a) For the following multiple sequence alignment, construct the regular expression and expand it in terms of the 3-letter code for amino acids:

T E C V L A R T I
N G P V L A R T I
N G P T I T R T I
N G A V M M R T I
A E C V I C R T I
K E C V I C R T I
A E C T I C R T S
N P C V I A R T T
K E E V M M R T I
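One simple way to derive a regular expression from an ungapped alignment is to collect, column by column, the set of residues observed. The sketch below does this for three rows only, so it illustrates the mechanics rather than answering the exercise; the function name and the alphabetical ordering inside each bracket are choices made here.

```python
def msa_to_regex(rows):
    """Build a regex with one element per alignment column."""
    parts = []
    for column in zip(*rows):                 # iterate over columns
        residues = sorted(set(column))        # distinct residues in this column
        if len(residues) == 1:
            parts.append(residues[0])         # fully conserved position
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


rows = ["TECVLARTI", "NGPVLARTI", "NGPTITRTI"]
print(msa_to_regex(rows))
```

Conserved columns come out as single letters and variable columns as bracketed classes, which can then be rewritten in 3-letter codes by hand or with a lookup table.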

(b) For the following multiple sequence alignment, construct the regular expression and expand it in terms of the relevant nucleotide bases:

T C C T G A C A G
T G C G G A T A G
C C G T C T C T C
A G C G G A C T G
G T G T G A T G A
A C C T G A C T G
C G C T A A C T G
A G C G G A C T G
A C C G G G T T G

Problem F.1

Using the UPGMA concept, construct an evolutionary tree for the data on pairwise species differences indicated in the following table:

OTU    A    B    C    D    E
A      0    5   30   45   35
B           0   28   42   32
C                0   10   15
D                     0   20
E                          0
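The UPGMA clustering loop can be sketched as follows: repeatedly merge the two clusters with the smallest average pairwise distance and place the new node at half that distance. The data structures and function name are illustrative, and the average is recomputed from the original leaf distances each time, which is simple rather than efficient.

```python
from itertools import combinations


def upgma(labels, dist):
    """dist maps frozenset({a, b}) to the distance between leaves a and b.

    Returns a list of merges as (cluster_1, cluster_2, node_height).
    """
    clusters = [(label,) for label in labels]
    merges = []

    def average(c1, c2):
        # Unweighted mean over all leaf pairs drawn from the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        merges.append((c1, c2, average(c1, c2) / 2))   # node sits at half the distance
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


labels = ["A", "B", "C", "D", "E"]
upper = {"A": [5, 30, 45, 35], "B": [28, 42, 32], "C": [10, 15], "D": [20]}
dist = {frozenset((row, labels[labels.index(row) + 1 + k])): value
        for row, values in upper.items() for k, value in enumerate(values)}

merges = upgma(labels, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Each printed line is one internal node of the rooted tree, listed in the order the clusters are joined.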

Problem F.2

Using the UPGMA concept, construct an evolutionary tree for the data on pairwise species differences indicated in the following table:

OTU    A    B    C
B      4
C      4    2
D      8    8    6

Problem F.3

OTU    H    C    G    O    A
H      0   95  110  185  205
C           0  118  195  220
G                0  190  215
O                     0  215
A                          0

Using the data above, construct an unrooted tree and determine the lengths of the branches from the common ancestral node.

Problem F.4

Neighbor-Joining Method

Given the following set of evolutionary distances, create a distance matrix on the resulting taxa.

OTU    A    B    C    D    E
B      5
C     10   20
D     15   25   35
E     45   55   60   65
F     70   75   80   85   90

Hint:

Calculate the new distance matrix (M) for each pair of nodes.

M(i, j) = d(i, j) - [r(i) + r(j)]/(N - 2), where N is the number of taxa and r(i) is the sum of the distances from taxon i to all other taxa.
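A sketch of the first neighbor-joining step, assuming the standard rate-corrected form M(i, j) = d(i, j) - [r(i) + r(j)]/(N - 2) with r(i) the sum of distances from taxon i to all others. Only the corrected matrix and the first pair to join are computed; the subsequent branch-length and matrix-reduction steps are left out. The variable names are illustrative.

```python
from itertools import combinations


def nj_matrix(labels, dist):
    """Return (r, M): net divergences and the rate-corrected distance matrix."""
    n = len(labels)
    r = {a: sum(dist[frozenset((a, b))] for b in labels if b != a) for a in labels}
    m = {(a, b): dist[frozenset((a, b))] - (r[a] + r[b]) / (n - 2)
         for a, b in combinations(labels, 2)}
    return r, m


labels = ["A", "B", "C", "D", "E", "F"]
# Lower-triangular rows as laid out in the table.
lower = {"B": [5], "C": [10, 20], "D": [15, 25, 35],
         "E": [45, 55, 60, 65], "F": [70, 75, 80, 85, 90]}
dist = {frozenset((row, labels[k])): value
        for row, values in lower.items() for k, value in enumerate(values)}

r, m = nj_matrix(labels, dist)
first_pair = min(m, key=m.get)     # the pair with the smallest M is joined first
print(r)
print(first_pair, m[first_pair])
```

After joining the selected pair into a new node, the same calculation is repeated on the reduced matrix until only two nodes remain.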

Problem F.5

Trace the path for each sequence in HMM for the given MSA

AB-CDE

ABGCDE

AB-C-E
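For a profile HMM, the path of each aligned sequence can be read off once the columns are assigned to match or insert states. The sketch below assumes a common convention that is not stated in the problem: a column is a match column when at least half of the rows carry a residue there, and the Begin and End states are omitted from the output. State labels M, I and D denote match, insert and delete.

```python
def hmm_paths(msa, gap="-"):
    """Return the list of profile-HMM state paths, one per aligned row."""
    n_rows, n_cols = len(msa), len(msa[0])
    # Assumed rule: match column if at least half of the rows have a residue.
    is_match = [sum(row[c] != gap for row in msa) * 2 >= n_rows
                for c in range(n_cols)]
    paths = []
    for row in msa:
        path, k = [], 0                  # k = index of the last match column passed
        for c in range(n_cols):
            if is_match[c]:
                k += 1
                path.append(f"M{k}" if row[c] != gap else f"D{k}")
            elif row[c] != gap:
                path.append(f"I{k}")     # residue in a non-match column is an insertion
        paths.append(path)
    return paths


msa = ["AB-CDE", "ABGCDE", "AB-C-E"]
for row, path in zip(msa, hmm_paths(msa)):
    print(row, " ".join(path))
```

Under this rule the third column becomes an insert state, so only the second sequence visits it, and the third sequence passes through a delete state where it lacks the D.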

Hint:

HMMs and their variants have been used in gene prediction, pairwise and multiple sequence alignment, base-calling, modeling DNA sequencing errors, protein secondary structure prediction, ncRNA identification, RNA structural alignment, acceleration of RNA folding and alignment, fast noncoding RNA annotation, and many others.

Simulates a multiple sequence alignment of specified length. Deals with base-substitution only, not indels.
