Reference no: EM133354291
Question
1. Your lab discovers a new tiny, tiny organism with a tiny, tiny genome. You decide to use high-throughput sequencing to assemble the genome of this microorganism.
You perform a low-throughput sequencing experiment on a small region of the genome and get back the following fragments:
GTTCGAG, TTATCAG, CGAGACG, TCAGGTC
Unfortunately, several of the bases in the reads had low quality scores. What is the most likely original sequence? Draw a graph that uses the reads as nodes and the alignment of one read to the next as edges. After constructing the sequence, you notice that it does not perfectly match the database. What could be a potential cause of the discrepancy?
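One way to find candidate edges for the read graph described above is to compute suffix-prefix overlaps between every ordered pair of reads, tolerating a few mismatches to allow for low-quality bases. The Python sketch below is only an illustration: the function names, the minimum overlap length of 3, and the mismatch budget are assumptions, not part of the question.

```python
from itertools import permutations

def overlap(a, b, min_len=3, max_mismatches=0):
    """Length of the longest suffix of `a` that aligns to a prefix of `b`
    with at most `max_mismatches` substitutions; 0 if none reaches min_len.
    (min_len and max_mismatches are illustrative, assumed parameters.)"""
    for n in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        mismatches = sum(x != y for x, y in zip(a[-n:], b[:n]))
        if mismatches <= max_mismatches:
            return n
    return 0

def overlap_graph(reads, min_len=3, max_mismatches=0):
    """Map each ordered pair (a, b) with a qualifying overlap to its length."""
    graph = {}
    for a, b in permutations(reads, 2):
        n = overlap(a, b, min_len, max_mismatches)
        if n > 0:
            graph[(a, b)] = n
    return graph

reads = ["GTTCGAG", "TTATCAG", "CGAGACG", "TCAGGTC"]
exact_edges = overlap_graph(reads)                       # exact overlaps only
tolerant_edges = overlap_graph(reads, max_mismatches=1)  # allow one miscalled base
```

Comparing the two results shows which edges exist only when a mismatch is permitted. A nonzero mismatch budget can also admit spurious short overlaps, so the extra edges should be checked by hand.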
2. In the next experiment, you decide to be more algorithmically savvy and use a de Bruijn graph to assemble your sequencing reads. Assemble the following set of sequencing reads using a de Bruijn graph where the nodes are k-mers of size 3. Label the edges as well as the nodes.
GAGCAT, CATGCC, AATGAG
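The de Bruijn construction asked for above can be sketched in Python as follows. This is a minimal illustration under one stated assumption: nodes are 3-mers, and each edge joins two consecutive 3-mers in a read and is labeled with the 4-mer that spans them. Some courses instead label edges with k-mers and use (k-1)-mers as nodes. The function name `de_bruijn_edges` is hypothetical.

```python
def de_bruijn_edges(reads, k=3):
    """Return a set of (source, target, label) edges.
    Nodes are k-mers; each (k+1)-mer found in a read gives one edge from its
    first k bases to its last k bases, labeled with the (k+1)-mer itself
    (an assumed labeling convention)."""
    edges = set()
    for read in reads:
        for i in range(len(read) - k):
            label = read[i:i + k + 1]
            edges.add((label[:-1], label[1:], label))
    return edges

reads = ["GAGCAT", "CATGCC", "AATGAG"]
edges = de_bruijn_edges(reads, k=3)
nodes = {n for src, dst, _ in edges for n in (src, dst)}

# Print the edge list in a stable order for drawing the graph by hand.
for src, dst, label in sorted(edges):
    print(f"{src} -> {dst}  [{label}]")
```

Storing edges in a set collapses repeated k-mers, which is enough for drawing the graph. Keeping multiplicities (for example with a counter) would be needed to reason about coverage.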