Examine multiple variation parameters for a genomic region, Biology

Assignment Help:
  1. Determine SNP variation among the aligned DNA sequences for a genomic region. See below for how to count SNP variation. The output file (Your_name_snp.txt) should have two columns of numbers. The first column gives the total number of SNP sites per species, and the second gives the percentage of sequences/species that have that number of variant nucleotides.
  2. Determine in-del variation among the aligned DNA sequences for a genomic region. The output file (Your_name_in_del.txt) should have two columns of numbers. The first column gives the total number of in-del sites per species, and the second gives the percentage of sequences/species that have that number of in-dels.
  3. Determine overall variation (SNPs and in-dels) among the aligned DNA sequences for a genomic region. The output file (Your_name_both.txt) should have two columns of numbers. The first column gives the total number of variant sites (SNP and in-del) per species, and the second gives the percentage of sequences/species that have that number of variant sites. This will generate the same data used for the figure on page 3.
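The two-column output described in the tasks above can be sketched as a simple tally. This is a minimal sketch under stated assumptions: the function names `tally_distribution` and `write_distribution` are hypothetical, the input counts are illustrative rather than taken from real data, and the tab-separated layout is a guess, since the assignment only asks for "two columns of numbers".

```python
from collections import Counter


def tally_distribution(counts):
    """Turn a list of variant-site counts into sorted (count, percent) rows.

    `counts` holds one number per sequence/species (or per comparison);
    the percent column is the share of entries that have that exact count.
    """
    total = len(counts)
    if total == 0:
        return []
    frequency = Counter(counts)
    return [(n, 100.0 * k / total) for n, k in sorted(frequency.items())]


def write_distribution(rows, path):
    """Write the rows as two tab-separated columns (assumed layout)."""
    with open(path, "w") as handle:
        for n, percent in rows:
            handle.write(f"{n}\t{percent:.2f}\n")


# Illustrative counts only: five entries, two of which share the value 1.
rows = tally_distribution([0, 1, 1, 3, 4])
# rows == [(0, 20.0), (1, 40.0), (3, 20.0), (4, 20.0)]
```

Calling `write_distribution(rows, "Your_name_snp.txt")` would then produce the first output file; the same two functions can serve all three tasks if they are fed SNP counts, in-del counts, or combined counts.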

Sample Alignment: 48 bases,  differences are highlighted

Seq1      ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC
Seq2      AAAAATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC
Seq3      AAAAATGCATGCATGCA-GCATGCATGCATGCATGCATGCATGCATGC
Seq4      AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCATGCATGCATGC
Seq5      AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCAT-CATGCATGC

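Computation: Compare Seq1 to Seq2, Seq3, Seq4, and Seq5 and count the differences (SNPs and in-dels). Each mismatched base counts as one SNP, and each gap position (-) counts as one in-del.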

Seq1:Seq1 = 0 changes
Seq1:Seq2 = 3 changes
Seq1:Seq3 = 4 changes
Seq1:Seq4 = 7 changes
Seq1:Seq5 = 8 changes
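The counts above can be reproduced by walking two aligned sequences column by column. The following is a minimal sketch, not a prescribed solution: the function name `count_differences` is hypothetical, and it assumes that `-` is the only gap symbol, that every gap column counts as one in-del site, and that the sequences are already upper case.

```python
def count_differences(ref, other, gap="-"):
    """Return (snps, indels) between two aligned, equal-length sequences.

    A column is an in-del if exactly one sequence has a gap there,
    and a SNP if both have a base and the bases differ.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    snps = indels = 0
    for a, b in zip(ref, other):
        if a == b:
            continue  # identical column (including gap vs. gap)
        if a == gap or b == gap:
            indels += 1
        else:
            snps += 1
    return snps, indels


# Seq1 and Seq5 from the sample alignment.
seq1 = "ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC"
seq5 = "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCAT-CATGCATGC"

snps, indels = count_differences(seq1, seq5)
# snps == 6 and indels == 2, which together give the 8 changes listed for Seq1:Seq5
```

Keeping the SNP and in-del counts separate means one pass over each pair can feed all three output files.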

Repeat using each of the other sequences as the basis for comparison:

Seq2:Seq1 = 3 changes                  Seq3:Seq1 = 4 changes
Seq2:Seq2 = 0 changes                  Seq3:Seq2 = 1 change
Seq2:Seq3 = 1 change                   Seq3:Seq3 = 0 changes
Seq2:Seq4 = 4 changes                  Seq3:Seq4 = 3 changes
Seq2:Seq5 = 5 changes                  Seq3:Seq5 = 4 changes

Seq4:Seq1 = 7 changes                  Seq5:Seq1 = 8 changes
Seq4:Seq2 = 4 changes                  Seq5:Seq2 = 5 changes
Seq4:Seq3 = 3 changes                  Seq5:Seq3 = 4 changes
Seq4:Seq4 = 0 changes                  Seq5:Seq4 = 1 change
Seq4:Seq5 = 1 change                   Seq5:Seq5 = 0 changes
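The full set of comparisons can be generated with a nested loop over the alignment. The sketch below is one possible arrangement, with hypothetical names (`count_changes`, `all_versus_all`); it counts SNPs and in-dels together, as in the table above, and assumes the sequences are stored in a dictionary in input order.

```python
def count_changes(a, b):
    """Number of alignment columns in which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def all_versus_all(seqs):
    """Map each reference name to its list of change counts against every sequence."""
    names = list(seqs)
    return {
        ref: [count_changes(seqs[ref], seqs[other]) for other in names]
        for ref in names
    }


# The sample alignment from this assignment.
alignment = {
    "Seq1": "ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
    "Seq2": "AAAAATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC",
    "Seq3": "AAAAATGCATGCATGCA-GCATGCATGCATGCATGCATGCATGCATGC",
    "Seq4": "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCATGCATGCATGC",
    "Seq5": "AAAAATGCATGCATGCA-GCATGCATGCATTTTTGCAT-CATGCATGC",
}

matrix = all_versus_all(alignment)
for name, row in matrix.items():
    print(name, row)
```

Each comparison is computed twice (Seq2:Seq4 and Seq4:Seq2, for example), which mirrors the table. For a large alignment, computing only one triangle of the matrix and mirroring it would halve the work.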

 

Our input file is a FASTA-format file containing all sequences/species, which has already been aligned and trimmed. The file contains some odd characters, so the program will have to handle them before counting.
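Reading such a file might look like the sketch below. The assignment does not say what the odd characters are, so this is only an assumption: anything other than A, C, G, T, N, and the gap symbol `-` is treated as noise and dropped after upper-casing. The function name `read_aligned_fasta` and the `allowed` parameter are hypothetical. If the odd characters actually occupy alignment columns, dropping them would shift positions, so the sketch checks that all cleaned sequences still have the same length.

```python
def read_aligned_fasta(lines, allowed="ACGTN-"):
    """Parse FASTA lines into {name: sequence}, dropping unexpected characters.

    `lines` can be an open file handle or any iterable of strings.
    Characters outside `allowed` are discarded (an assumption about the noise).
    """
    parts = {}
    name = None
    for raw in lines:
        line = raw.strip()  # also removes stray carriage returns
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            cleaned = "".join(ch for ch in line.upper() if ch in allowed)
            parts[name].append(cleaned)
    seqs = {n: "".join(chunks) for n, chunks in parts.items()}
    if len({len(s) for s in seqs.values()}) > 1:
        raise ValueError("sequences differ in length after cleaning")
    return seqs


# Small made-up example with a carriage return, lower case, '*' and a digit.
example = [">Seq1 \r\n", "ATGC*atgc\r\n", ">Seq2\n", "AT-C\n", "AT1GC\n"]
records = read_aligned_fasta(example)
# records == {"Seq1": "ATGCATGC", "Seq2": "AT-CATGC"}
```

With a real file, the call would be wrapped in `with open(path) as handle:` and the handle passed in place of the list.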


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd