List of some important libc functions that are used

Reference no: EM131903870

Assignment: DNA sequencer

The biology department at UC Davis is looking for an application that can decode sequences of DNA, by locating genes and transcribing the sequence of corresponding proteins.

Genes are substrings of DNA which code for proteins and carry the heritable information from our parents. Genes start with the sequence of three letters ATG, called the start codon, and end with one of the three sequences TGA, TAA, or TAG, called stop codons. The stretch of sequence between the start codon and any of the stop codons is a potential gene.

Each codon codes for an amino acid, represented by a letter of the alphabet. There are 20 standard amino acids in total. Strung together, amino acids form proteins. A substring of a DNA sequence is a translatable sequence if:

• it has a length that is a multiple of three,
• it starts with a start codon and ends with a stop codon, and
• it can be translated into an amino acid sequence.

For example, DNA sequence AATTAAGATGGGGCTCTAAAAT contains such a translatable sequence, starting at the 8th position and of length 12 (ATGGGGCTCTAA), thus consisting of 4 codons. Using a codon table, this sequence can be translated into the amino acid sequence MGL, which has length three.

Note that the start codon codes for amino acid M while the stop codons don't code for any amino acids.

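On the other hand, DNA sequence AATGAATCTAGT contains no translatable sequence: its only start codon (at the 2nd position) is never followed by a stop codon in the same reading frame.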
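The definition above can be turned into a search routine. What follows is only a minimal sketch in C, not the reference solution. It assumes the sequence has already been converted to upper case, and it assumes that a translatable sequence ends at the first stop codon found in the reading frame of its start codon (an interpretation that agrees with the sample run in this assignment). The names is_stop and longest_translatable are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the three characters at p form a stop codon. */
static int is_stop(const char *p)
{
    return strncmp(p, "TGA", 3) == 0 ||
           strncmp(p, "TAA", 3) == 0 ||
           strncmp(p, "TAG", 3) == 0;
}

/*
 * Searches dna (upper case, NUL-terminated) for the longest translatable
 * sub-sequence. Returns its length in characters, stop codon included,
 * and stores its 0-based start index in *start. Returns 0 (leaving
 * *start untouched) if there is none.
 * Assumption: a candidate ends at the FIRST in-frame stop codon.
 */
size_t longest_translatable(const char *dna, size_t *start)
{
    size_t n = strlen(dna);
    size_t best = 0;

    for (size_t i = 0; i + 3 <= n; i++) {
        if (strncmp(dna + i, "ATG", 3) != 0)
            continue;                       /* not a start codon */

        /* Walk codon by codon, staying in the frame of the start codon. */
        for (size_t j = i + 3; j + 3 <= n; j += 3) {
            if (is_stop(dna + j)) {
                size_t len = j + 3 - i;
                if (len > best) {           /* ties keep the earliest match */
                    best = len;
                    *start = i;
                }
                break;
            }
        }
    }
    return best;
}
```

For AATTAAGATGGGGCTCTAAAAT this returns 12 and sets *start to 7, which is the 8th position when counting from one. For AATGAATCTAGT it returns 0.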

Write a program, dna_translate.c, that takes two command-line arguments: an input file name, containing DNA sequences, and an output file name, in which you will store the translated protein sequences. For each sequence, the program should identify the longest possible translatable sub-sequence, if one exists, and translate it into a protein using the codon table given in the file codeoflife.txt. See the example below.
$ cat codeoflife.txt
I ATT
I ATC
I ATA
...
R CGT
x TAA
x TAG
x TGA
$ cat dna_seqs.txt
aaATttaTggattagcaagcag
ACGATGATGATGGGGCCCTAATAGTGATAAAAAACT
AAAATAATTTGGA
ATGAAATGGTAGATGAAACCCGGGATATGATAG
$ ./dna_translate dna_seqs.txt prot_seqs.txt
MD
MMMGP
none
MKPGI
$
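Judging from the listing above, each line of codeoflife.txt holds one amino-acid letter followed by a three-letter codon, with x marking the stop codons. Under that assumption, the codon table could be held in a singly linked list roughly as sketched below. The type and function names (codon_node, codon_push, codon_lookup, codon_free) are made up for illustration and are not taken from the reference program.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One codon-table entry (hypothetical layout). */
struct codon_node {
    char codon[4];              /* three letters plus the terminating NUL */
    char amino;                 /* amino-acid letter, or 'x' for a stop codon */
    struct codon_node *next;
};

/*
 * Parses one line such as "I ATT" and pushes it on the front of the list.
 * Returns the new head, or the unchanged head if the line is malformed
 * or the allocation fails.
 */
struct codon_node *codon_push(struct codon_node *head, const char *line)
{
    char amino;
    char codon[4];

    if (sscanf(line, " %c %3s", &amino, codon) != 2 || strlen(codon) != 3)
        return head;

    struct codon_node *node = malloc(sizeof *node);
    if (node == NULL)
        return head;

    strncpy(node->codon, codon, sizeof node->codon);
    node->amino = amino;
    node->next = head;
    return node;
}

/* Returns the letter stored for the codon at p, or '\0' if it is unknown. */
char codon_lookup(const struct codon_node *head, const char *p)
{
    for (; head != NULL; head = head->next)
        if (strncmp(head->codon, p, 3) == 0)
            return head->amino;
    return '\0';
}

/* Frees every node of the list. */
void codon_free(struct codon_node *head)
{
    while (head != NULL) {
        struct codon_node *next = head->next;
        free(head);
        head = next;
    }
}
```

Pushing at the front is enough here because the order of the table entries does not affect lookups. A caller would typically feed codon_push one line at a time from fgets().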

Here is a list of requirements, assumptions and hints:

• This program shall contain no global variables.

• All dynamically allocated memory must be properly freed by the end of the program.

• The translated sequences, in the output file, must be in the same order as the DNA sequences.

• If no translatable sequence is found, the program should output none.

• We assume that the maximum number of characters a DNA sequence can contain is

• We assume that the DNA sequence file contains only proper sequences (i.e., strings over {A, C, G, T, a, c, g, t}).

• You are expected to use a linked list to represent the codon table (as read from file codeoflife.txt).

• You are expected to use a linked list to represent the list of DNA sequences (as read from the input file).

• You will probably need to split the problem into a few principal functions, such as:

    • A function that builds the linked list of codons, as read from codeoflife.txt.

    • A function that builds the linked list of DNA sequences, as read from the input file.

        • You will probably need to think about the order of insertion, so that the resulting protein sequences are output in the same order as the input.

    • A function that iterates through all the DNA sequences and, for each one, finds the longest translatable sequence and writes the corresponding protein sequence to the output file (or none if no translatable sequence was found).

    • Two functions that iterate through the two linked lists and free every dynamically allocated item and any dynamically allocated objects it might contain.

• List of some important libc functions that are used in the reference program: fopen(), fgets(), fprintf(), fclose(), sscanf(), strncpy(), strncmp(), etc.
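The hints about insertion order and about freeing memory can be illustrated with a short sketch in C. It is one possible approach rather than the reference program: it appends at the tail so that list order matches file order, stores an upper-case copy of each sequence, and frees both the nodes and the strings they own. The names seq_node, seq_append, seq_read and seq_free are hypothetical, and MAX_LINE is a placeholder value, since the real maximum sequence length must come from the assignment.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 1024   /* placeholder only: use the limit from the assignment */

struct seq_node {
    char *dna;                  /* heap-allocated, upper-case sequence */
    struct seq_node *next;
};

/*
 * Appends an upper-case copy of text at *tail, the link to fill next.
 * Returns the address of the new last link, or NULL if allocation fails.
 */
struct seq_node **seq_append(struct seq_node **tail, const char *text)
{
    size_t len = strlen(text);
    struct seq_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;

    node->dna = malloc(len + 1);
    if (node->dna == NULL) {
        free(node);
        return NULL;
    }
    for (size_t i = 0; i <= len; i++)       /* copies the NUL as well */
        node->dna[i] = (char)toupper((unsigned char)text[i]);

    node->next = NULL;
    *tail = node;
    return &node->next;
}

/* Reads one sequence per line from fp and returns the list in file order. */
struct seq_node *seq_read(FILE *fp)
{
    struct seq_node *head = NULL;
    struct seq_node **tail = &head;
    char line[MAX_LINE];

    while (tail != NULL && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
        if (line[0] != '\0')                /* ignore blank lines */
            tail = seq_append(tail, line);
    }
    return head;
}

/* Frees every node together with the string it owns. */
void seq_free(struct seq_node *head)
{
    while (head != NULL) {
        struct seq_node *next = head->next;
        free(head->dna);
        free(head);
        head = next;
    }
}
```

Keeping a pointer to the last link makes each append constant time and avoids having to reverse the list before writing the output file.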
