Write Perl code to process and analyze the sequence data

Reference no: EM131469714

1. Download and decompress the sequence data of chromosome 22.
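
If you prefer to do the decompression from Perl rather than with a command-line tool, a minimal sketch using the core module IO::Uncompress::Gunzip might look like this. The helper name `decompress_gz` is hypothetical, and the archive name is taken from the download link on this page; the demo part only runs when that file is already in the current directory.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: decompress a .gz file into a plain file.
sub decompress_gz {
    my ($gz_file, $out_file) = @_;
    gunzip($gz_file => $out_file)
        or die "gunzip failed for $gz_file: $GunzipError\n";
    return $out_file;
}

# Assumption: the archive was saved under its original name.
my $gz = 'hs_ref_GRCh38.p7_chr22.fa.gz';
if (-e $gz) {
    (my $fa = $gz) =~ s/\.gz$//;   # output name without the .gz suffix
    decompress_gz($gz, $fa);
    print "Wrote $fa\n";
}
```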

2. Write Perl code to process and analyze the sequence data file downloaded.
 a. Read in the data from the file
 b. Use a regular expression to extract the sequence from the file.
 c. Remove non-ATGC characters from the sequence
 d. Extract all the open reading frames (ORFs) from the whole sequence.
     - An ORF is a part of a DNA sequence that has the potential to be translated.
     - Its length should be a multiple of 3.
     - ORFs are defined as those subsequences which begin with the start codon 'ATG' and end with any one of the three stop codons 'TAA', 'TAG' and 'TGA'. Each codon includes three nucleotides.
     - In addition to the start and stop codons, ORFs extracted here should have 30-90 nucleotides.
     - Only one stop codon is allowed in each ORF.
 e. Print a message to the screen showing how many open reading frames were found.
 f. Translate each ORF into an amino acid sequence using the subroutines provided.
 g. Write all found ORFs to a new data file.
 h. Write all translated amino acid sequences to another new data file.
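
Steps 2.a to 2.e can be sketched as follows. This is one possible reading of the requirements, not a reference solution. The names `read_sequence` and `find_orfs` are hypothetical. The sketch assumes that lowercase bases should be upper-cased before cleaning, that only the forward strand is scanned, and that ORFs starting at different 'ATG' positions may overlap, which is why the pattern sits inside a lookahead. The demo reads from an in-memory string instead of the real chromosome file.

```perl
use strict;
use warnings;

# Steps a-c: read FASTA text from a filehandle and return the cleaned sequence.
sub read_sequence {
    my ($fh) = @_;
    my $seq = '';
    while (my $line = <$fh>) {
        next if $line =~ /^>/;   # skip FASTA header lines
        $seq .= uc $line;        # assumption: treat lowercase bases as uppercase
    }
    $seq =~ tr/ACGT//cd;         # delete everything that is not A, C, G or T
    return $seq;
}

# Step d: ATG, then 10-30 codons that are not stop codons, then one stop codon.
sub find_orfs {
    my ($seq) = @_;
    my $stop = qr/TAA|TAG|TGA/;
    # The lookahead lets the search restart one base after each ATG,
    # so overlapping ORFs are reported as well.
    my @orfs = $seq =~ /(?=(ATG(?:(?!$stop)[ACGT]{3}){10,30}(?:$stop)))/g;
    return @orfs;
}

# Demo on a tiny made-up record (not real chromosome data).
my $fasta = ">demo header\nttnnATG" . ('gct' x 10) . "TAAcc\n";
open(my $fh, '<', \$fasta) or die "Cannot open in-memory file: $!";
my $seq = read_sequence($fh);
close $fh;

my @orfs = find_orfs($seq);
print "Number of ORFs found: ", scalar(@orfs), "\n";   # step e
```

Because no stop codon is allowed between the start and the final stop, each 'ATG' position can begin at most one ORF, so the lookahead form returns at most one match per start position.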

Important notes:
     2.a - 2.i should be done in one .pl file.
     Please use the subroutines provided to perform the translation.
     When I run your script, I expect to enter the data file name at the screen prompt. An example is shown below.

2073_Figure.jpg
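
Reading the file name from the screen could be handled along these lines. This is only a sketch: the prompt wording and the helper name `clean_filename` are assumptions, and the prompt is skipped when the script is not attached to a terminal.

```perl
use strict;
use warnings;

# Hypothetical helper: tidy up a line typed at the prompt.
sub clean_filename {
    my ($input) = @_;
    $input = '' unless defined $input;
    $input =~ s/^\s+|\s+$//g;   # strip surrounding whitespace, including the newline
    return $input;
}

# Only prompt when standard input is an interactive terminal.
if (-t STDIN) {
    print "Please enter the name of the sequence file: ";
    my $file = clean_filename(scalar <STDIN>);
    open(my $in, '<', $file) or die "Cannot open '$file': $!";
    print "Reading from $file\n";
    close $in;
}
```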

Your output files in steps 2.h and 2.i should be created automatically. Below is an example of the 2.h output file.

1417_Figure1.jpg
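
The provided translation subroutines are not reproduced on this page. For testing the rest of the script without them, a hypothetical stand-in based on the standard genetic code could look like the sketch below. The names `%codon_table` and `translate_orf` are illustrative, not the provided ones, and stopping at the stop codon without printing a symbol for it is an assumption about the expected output.

```perl
use strict;
use warnings;

# Build the standard genetic code: codons in T, C, A, G order,
# one amino acid letter per codon ('*' marks a stop codon).
my @bases = qw(T C A G);
my $amino = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($amino, $i++, 1);
        }
    }
}

# Hypothetical stand-in for the provided subroutines:
# translate codon by codon and stop at the first stop codon.
sub translate_orf {
    my ($dna) = @_;
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length($dna); $pos += 3) {
        my $residue = $codon_table{ substr($dna, $pos, 3) };
        $residue = 'X' unless defined $residue;   # unknown codon
        last if $residue eq '*';
        $protein .= $residue;
    }
    return $protein;
}

print translate_orf('ATGGCTTGGTAA'), "\n";   # prints MAW
```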

Support information:

1. To write an array to a file with one entry per line, use the following command:

print MFILE "$_\n" for @ORFs;

MFILE is the handle for the output file.
@ORFs is the array.
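
In context, the command needs the handle to be opened for writing first. A minimal sketch, where the output file name and the two array entries are placeholders for illustration only:

```perl
use strict;
use warnings;

my @ORFs     = ('ATGAAATAA', 'ATGCCCTGA');   # placeholder entries, not real results
my $out_name = 'orfs_demo.txt';              # hypothetical output file name

# '>' creates the file if it does not exist and overwrites it otherwise.
open(MFILE, '>', $out_name) or die "Cannot create $out_name: $!";
print MFILE "$_\n" for @ORFs;
close(MFILE) or die "Cannot close $out_name: $!";
```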

2. If you have groups in your pattern:
my $string = "TTTATGTGCTGCTAAAAA";
my @matches = $string =~ m/(ATG).*(TAA)/g;

=> With parentheses () surrounding the subpatterns "ATG" and "TAA", the match in list context returns the substrings captured by those groups rather than the whole match.
=> The values in @matches will be ("ATG", "TAA").
=> Add ?: at the front of each group, as in m/(?:ATG).*(?:TAA)/g, and the substrings matching "ATG" and "TAA" will not be returned separately; the whole match is returned instead.
=> In this case the output will be ("ATGTGCTGCTAA").
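
A quick way to check how capturing and non-capturing groups behave in list context with /g is to run the variants side by side. The patterns here are written without ^ and $ so that they can match inside the string; the third variant, with an extra outer group, is an addition for comparison.

```perl
use strict;
use warnings;

my $string = "TTTATGTGCTGCTAAAAA";

# Capturing groups: list context returns the captured substrings.
my @groups = $string =~ m/(ATG).*(TAA)/g;

# Non-capturing groups: list context returns the whole match.
my @whole = $string =~ m/(?:ATG).*(?:TAA)/g;

# An extra outer group returns the whole match followed by the inner groups.
my @both = $string =~ m/((ATG).*(TAA))/g;

print "groups: @groups\n";   # groups: ATG TAA
print "whole:  @whole\n";    # whole:  ATGTGCTGCTAA
print "both:   @both\n";     # both:   ATGTGCTGCTAA ATG TAA
```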

Download Sequence data of chromosome 22

https://www.dropbox.com/s/6v5whj22boa3kur/hs_ref_GRCh38.p7_chr22.fa.gz?dl=0

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd