Reference no: EM132211198
Question:
Write a program that will read a DNA or RNA sequence in FASTA format and determine the count of each nucleotide in the sequence. Your task is to write such a program as a Perl script.
The Perl script should:
- Work with FASTA files containing only one sequence.
- Accept the name of the FASTA file on the command line when the script is invoked. If no file name is specified on the command line, the script reads the sequence information (in FASTA format) from standard input.
- Confirm that the record is in FASTA format. If it is not, issue an error message and terminate.
- Accept sequence information in upper or lower case.
- Preface the output information with the sequence identifier from the FASTA header.
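The input-handling requirements above can be sketched as follows. This is a minimal illustration, not a reference solution: the sub names `open_input` and `read_fasta` are hypothetical, and treating the text between `>` and the first whitespace as the sequence identifier, rejecting a second `>` line, and rejecting an empty sequence are all assumptions about how the requirements should be read.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a filehandle for the named file, or STDIN when no name is given.
sub open_input {
    my ($name) = @_;
    return \*STDIN unless defined $name;
    open( my $fh, '<', $name ) or die "Error: cannot open '$name': $!\n";
    return $fh;
}

# Read one FASTA record from a filehandle.
# Returns (identifier, sequence); dies with a message if the input is malformed.
sub read_fasta {
    my ($fh) = @_;
    my $header = <$fh>;
    die "Error: input is empty\n" unless defined $header;
    $header =~ s/[\r\n]+\z//;

    # Assumption: the identifier runs from '>' to the first whitespace.
    my ($id) = $header =~ /^>(\S+)/
        or die "Error: input is not in FASTA format (no '>' header line)\n";

    my $seq = '';
    while ( my $line = <$fh> ) {
        $line =~ s/[\r\n]+\z//;    # drop line endings so they are never counted
        # Assumption: a second header means the file holds more than one sequence.
        die "Error: more than one sequence in input\n" if $line =~ /^>/;
        $seq .= $line;
    }
    die "Error: no sequence data after header\n" unless length $seq;
    return ( $id, $seq );
}

# Demonstration on an in-memory file, so the sketch runs without any input.
my $demo = ">seq1 demo record\nACGT\nacgu\n";
open( my $demo_fh, '<', \$demo ) or die "Error: cannot open in-memory file: $!\n";
my ( $demo_id, $demo_seq ) = read_fasta($demo_fh);
print "id=$demo_id sequence=$demo_seq\n";    # prints: id=seq1 sequence=ACGTacgu
```

In a complete script, the call would presumably be `read_fasta( open_input( $ARGV[0] ) )`, which falls back to standard input when no file name is given.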
Notes:
To loop through the characters of a string, from the last character to the first and consuming the string as it goes, you can use a construction such as
while( $c = ( chop $str ) )
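A minimal sketch of how such a `chop` loop might drive the counting is shown below. The sub name `count_nucleotides` and the hash keys are hypothetical. The loop condition here tests `length $str` rather than the value returned by `chop`, which is a small variation on the hint: a plain `while( $c = ( chop $str ) )` also stops early if the character removed is `0`, because Perl treats the string "0" as false.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count nucleotides in a string, ignoring case and line endings.
# Returns a hash with the keys A, C, G, T/U and other.
sub count_nucleotides {
    my ($str) = @_;    # work on a copy, because chop consumes the string
    my %count = ( A => 0, C => 0, G => 0, 'T/U' => 0, other => 0 );

    while ( length $str ) {
        my $c = uc( chop $str );                   # last character, upper-cased
        next if $c eq "\n" or $c eq "\r";          # never count line endings
        if    ( $c eq 'A' )               { $count{A}++ }
        elsif ( $c eq 'C' )               { $count{C}++ }
        elsif ( $c eq 'G' )               { $count{G}++ }
        elsif ( $c eq 'T' or $c eq 'U' )  { $count{'T/U'}++ }
        else                              { $count{other}++ }
    }
    return %count;
}

my %n = count_nucleotides("ACGTacguNN\n");
printf "A: %d C: %d G: %d T/U: %d other: %d\n",
    @n{ 'A', 'C', 'G', 'T/U', 'other' };
# prints: A: 2 C: 2 G: 2 T/U: 2 other: 2
```

Because the order in which characters are visited does not affect the totals, walking the string from the end is harmless here.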
For example, given a FASTA file with content such as
>441E-590 Nov_16c/Nov16c/441E-590.SEQ trimmed vector-stripped
GCGTCGACGTTCTACGACACGTCGTGCCCCAGGGCTCTGGCCACCATCAAGAGCGGCGTG
GCGGCAGCCGTGAGCAGCGATCCCCGCATGGGCGCCTCCCTGCTCAGGCTGCACTTCCAC
GACTGCTTTGTCCAAGGCTGTGACGCGTCTGTTCTTCTGTCCGGCATGGAACAAAACGCG
GGCAAACAACCAAACCTTGGNCGNAACCNTGGGAACACNNANCGAACGCCCCCAAANGGC
GTTTTCNGAACAAAACGGCCCTTNACTTNCAACCCAAAACCCTTCCCTTGGTCAACAAAA
AAAAAANGGGGGNTTCCCCTTGGGNAACTTCGNGGAAANCCAAANGGNGGGGTTTCTTTT
AAAAACAAAC
your output should look like
inventory for '441E-590' : A: 89 C: 111 G: 89 T/U: 64 other characters: 17
Do not include newline or carriage-return characters in any of your counts.
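The report format and the rule about line endings might be sketched like this. The sub name `inventory_line` is hypothetical, and the single-line layout simply mirrors the sample output as it is shown here. Note that this sketch counts with Perl's `tr///` operator instead of the `chop` loop from the notes. That is a swapped-in technique, used only to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the one-line inventory report for a sequence identifier and sequence.
sub inventory_line {
    my ( $id, $seq ) = @_;
    $seq =~ tr/\r\n//d;    # remove line endings so they are never counted

    # tr/// with an empty replacement list counts matches without changing $seq.
    my $n_a  = ( $seq =~ tr/Aa// );
    my $n_c  = ( $seq =~ tr/Cc// );
    my $n_g  = ( $seq =~ tr/Gg// );
    my $n_tu = ( $seq =~ tr/TtUu// );
    my $n_other = length($seq) - $n_a - $n_c - $n_g - $n_tu;

    return sprintf
        "inventory for '%s' : A: %d C: %d G: %d T/U: %d other characters: %d",
        $id, $n_a, $n_c, $n_g, $n_tu, $n_other;
}

print inventory_line( 'demo', "ACGTacguNN-\n" ), "\n";
# prints: inventory for 'demo' : A: 2 C: 2 G: 2 T/U: 2 other characters: 3
```

As a quick sanity check on the sample above, the five reported counts sum to 370, which matches six full 60-character sequence lines plus a final line of 10 characters.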