Bioinformatics for representing sequence annotation

Reference no: EM1389771

QUESTION 1

For each of the following tasks, use the 'xxxxxxxx' database. For each task, provide only the query or command you ran within MySQL.

a. How many columns are in the cv table? (The command here doesn't need to return a number, just show the columns so you can count them.)

b. Which ID (cv_id) corresponds to the GO ontology stored in the database?

c. How many controlled vocabulary terms (cvterm table) are linked to the GO ontology?

d. How many entries in the feature table are linked to any of the GO terms? (see the feature_cvterm linking table.)
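
The four tasks above can be sketched against a toy stand-in for the database. This is only an illustration under stated assumptions: the table layouts below are cut down to a few columns, the vocabulary name 'GO' and every inserted row are invented, and the sketch runs on an in-memory SQLite database rather than MySQL. In MySQL itself, part (a) would normally be `DESCRIBE cv;` or `SHOW COLUMNS FROM cv;`, and the SELECT statements for parts (b) to (d) are plain SQL that should carry over once the `?` placeholder is replaced by the cv_id found in part (b).

```python
import sqlite3

# Toy stand-in for the course database. The column sets, the cv name 'GO'
# and all rows are assumptions made for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT, definition TEXT);
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, cv_id INTEGER, name TEXT);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, uniquename TEXT);
CREATE TABLE feature_cvterm (feature_cvterm_id INTEGER PRIMARY KEY,
                             feature_id INTEGER, cvterm_id INTEGER);
INSERT INTO cv VALUES (1, 'relationship', NULL), (2, 'GO', NULL);
INSERT INTO cvterm VALUES (10, 1, 'part_of'), (20, 2, 'kinase activity'),
                          (21, 2, 'nucleus'), (22, 2, 'transport');
INSERT INTO feature VALUES (100, 'YAR003W'), (101, 'YAL001C'), (102, 'YAL002W');
INSERT INTO feature_cvterm VALUES (1, 100, 20), (2, 100, 21), (3, 101, 21);
""")

# a. Show the columns of cv (SQLite spelling; MySQL would use DESCRIBE cv;).
cv_columns = [row[1] for row in conn.execute("PRAGMA table_info(cv)")]

# b. Find the cv_id of the GO vocabulary (the name 'GO' is an assumption).
GO_ID_SQL = "SELECT cv_id FROM cv WHERE name = 'GO'"
go_cv_id = conn.execute(GO_ID_SQL).fetchone()[0]

# c. Count the cvterm rows that belong to that vocabulary.
TERM_COUNT_SQL = "SELECT COUNT(*) FROM cvterm WHERE cv_id = ?"
go_term_count = conn.execute(TERM_COUNT_SQL, (go_cv_id,)).fetchone()[0]

# d. Count distinct features linked to any GO term via feature_cvterm.
FEATURE_COUNT_SQL = """
SELECT COUNT(DISTINCT fc.feature_id)
FROM feature_cvterm AS fc
JOIN cvterm AS t ON t.cvterm_id = fc.cvterm_id
WHERE t.cv_id = ?
"""
go_feature_count = conn.execute(FEATURE_COUNT_SQL, (go_cv_id,)).fetchone()[0]

print(cv_columns, go_cv_id, go_term_count, go_feature_count)
```

Part (d) uses COUNT(DISTINCT ...) so that a feature annotated with several GO terms is counted once; COUNT(*) would instead count the links in feature_cvterm. Which reading the question intends is a judgment call.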

QUESTION 2

The GFF3 format is commonly used in bioinformatics to represent sequence annotation. The specification is available here:

sequenceontology.org/gff3.shtml

An example GFF3 file, opened in the vi editor, begins as shown below:
______________________________________________________________________________
##gff-version 3
#date Tue Feb 8 19:50:12 2011
#
# Saccharomyces cerevisiae S288C genome
#
# Features from the 16 nuclear chromosomes labeled chrI to chrXVI,
# plus the mitochondrial genome labeled chrMito and the 2-micron plasmid.
#
# Created by Saccharomyces Genome Database
#
# Weekly updates of this file are available via Anonymous FTP from:
# ftp.yeastgenome.org/yeast/data_download/chromosomal_feature/saccharomyces_cerevisiae.gff
#

#

____________________________________________________________________________

Within the feature table, another column of note is the 9th, which stores any key=value pairs relevant to that row's feature, such as ID, Ontology_term or Note.
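
As a rough illustration of how the 9th column can be taken apart, the sketch below splits it into a dictionary. It is written in Python rather than the Perl implied by the assignment, the function name `parse_attributes` is hypothetical, and the example attribute string (including its GO identifiers and note) is invented. It relies on the GFF3 conventions that pairs are separated by semicolons, multiple values by commas, and reserved characters are percent-encoded.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Parse a GFF3 attributes column into {key: [value, ...]}."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        # A single tag may carry several comma-separated values.
        attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
    return attributes


# Invented example line content, for illustration only.
example = "ID=YAR003W;Ontology_term=GO:0005634,GO:0006355;Note=Example%20note"
parsed = parse_attributes(example)
print(parsed["ID"])
print(parsed["Ontology_term"])
print(parsed["Note"])
```

Keeping every value as a list, even for single-valued tags such as ID, means callers do not have to special-case tags like Ontology_term that often hold more than one value.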

Your task is to write a GFF3 feature exporter. A user should be able to run your script like this:

$ export_gff3_feature.pl /path/to/some.gff3 gene ID YAR003W

The four arguments are the path to a GFF3 file, followed by three values that correspond to GFF3 columns. In this case, your script should read the GFF3 file at the given path and find any gene (column 3) that has ID=YAR003W in column 9. When it finds one, it should use that feature's coordinates (columns 4, 5 and 7: start, end and strand) together with the FASTA sequence at the end of the document to return the feature's FASTA sequence.
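
The core of that lookup can be sketched as follows. This is an illustration in Python, not the Perl script the assignment asks for, and it makes several assumptions: the helper names (`read_gff3`, `find_features`, `feature_sequence`) are hypothetical, the toy GFF3 text is invented, attribute matching is a plain key=value comparison without comma or percent-encoding handling, and a feature on the minus strand is returned reverse-complemented.

```python
import io

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_gff3(handle):
    """Split a GFF3 stream into feature rows and a {seqid: sequence} dict."""
    features, parts, in_fasta, current = [], {}, False, None
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            in_fasta = True  # everything after this directive is FASTA
        elif in_fasta:
            if line.startswith(">"):
                current = line[1:].split()[0]
                parts[current] = []
            elif current is not None:
                parts[current].append(line.strip())
        elif line and not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) == 9:
                features.append(cols)
    return features, {name: "".join(p) for name, p in parts.items()}


def find_features(features, ftype, key, value):
    """Rows whose type (column 3) and key=value attribute (column 9) match."""
    wanted = f"{key}={value}"
    return [c for c in features if c[2] == ftype and wanted in c[8].split(";")]


def feature_sequence(cols, sequences):
    """Bases for one feature; reverse-complemented on the minus strand."""
    start, end, strand = int(cols[3]), int(cols[4]), cols[6]
    seq = sequences[cols[0]][start - 1:end]  # GFF3 is 1-based, inclusive
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


# Invented toy file: two genes on a 12-base sequence.
TOY_GFF3 = "\n".join([
    "##gff-version 3",
    "\t".join(["chrT", ".", "gene", "3", "8", ".", "+", ".", "ID=geneA;Note=toy"]),
    "\t".join(["chrT", ".", "gene", "5", "10", ".", "-", ".", "ID=geneB"]),
    "##FASTA",
    ">chrT",
    "AACCGGTTAC",
    "GT",
]) + "\n"

features, sequences = read_gff3(io.StringIO(TOY_GFF3))
gene_a = feature_sequence(find_features(features, "gene", "ID", "geneA")[0], sequences)
gene_b = feature_sequence(find_features(features, "gene", "ID", "geneB")[0], sequences)
print(f">geneA\n{gene_a}")
print(f">geneB\n{gene_b}")
```

Column 1 (the sequence ID) is used to pick the right FASTA record, which matters for a file with several chromosomes such as the yeast annotation shown earlier.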

Your script should work regardless of the parameters passed, warning the user if no features match the query. It should also check for, and warn about, more than one feature matching the query.
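
One way to handle the two warning cases is sketched below in Python. The function name `report_matches`, the warning wording and the choice of return codes are all assumptions; the assignment only says the user should be warned. Warnings go to standard error here so that they do not mix with the sequence output.

```python
import io
import sys


def report_matches(matches, query, err=sys.stderr):
    """Warn on zero or multiple matches; return a suggested exit status."""
    if not matches:
        print(f"Warning: no features matched {query}", file=err)
        return 1
    if len(matches) > 1:
        print(f"Warning: {len(matches)} features matched {query}", file=err)
    return 0


# Capture the warnings in a buffer to show what would be printed.
buffer = io.StringIO()
status_none = report_matches([], "gene ID=YAR003W", err=buffer)
status_many = report_matches(["row1", "row2"], "gene ID=YAR003W", err=buffer)
status_one = report_matches(["row1"], "gene ID=YAR003W", err=buffer)
print(buffer.getvalue(), end="")
```

Whether the script should still print sequences when several features match is not specified; this sketch returns 0 in that case so the caller can go on to print each match.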

The output should simply be printed to STDOUT; writing to a file is not necessary.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd