DNA sequences, Computer Engineering


The dataset provided in this assignment contains a collection of real DNA sequences. The number of true binding sites is quite limited, which makes the problem challenging. In the machine learning community, this is known as the imbalanced dataset problem. Techniques for imbalanced data classification, such as sampling or filtering, can be applied to the biological data. It is a good idea to consult relevant publications to see how effective classifiers for motif recognition can be built.
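As one illustration of the sampling idea mentioned above, the following is a minimal sketch of random undersampling of the majority (non-binding) class. The function name `undersample`, the `(kmer, label)` data layout, and the `ratio` and `seed` parameters are assumptions made for this example and are not part of the assignment; other sampling or filtering schemes from the literature may work better.

```python
import random


def undersample(examples, ratio=1.0, seed=0):
    """Randomly undersample the negative class.

    examples: list of (kmer, label) pairs, label 1 = binding site, 0 = not.
    ratio:    target number of negatives per positive (assumed parameter).
    Keeps every positive and at most ratio * n_positives negatives.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positives = [e for e in examples if e[1] == 1]
    negatives = [e for e in examples if e[1] == 0]
    n_keep = min(len(negatives), int(ratio * len(positives)))
    balanced = positives + rng.sample(negatives, n_keep)
    rng.shuffle(balanced)  # avoid all positives appearing first
    return balanced


if __name__ == "__main__":
    # Toy data: 2 positives and 10 negatives (made-up k-mers).
    data = [("ACACGGGA", 1), ("CACGGGAg", 1)] + [("tttttttt", 0)] * 10
    balanced = undersample(data, ratio=1.0)
    print(sum(label for _, label in balanced), "positives of", len(balanced))
```

Undersampling should normally be applied to the training portion only, so that the test data keeps the original class distribution.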

The whole dataset should be partitioned into a training dataset, used to build the learner models, and a test dataset, used to evaluate the generalization capability of the classification systems. System performance will be evaluated by the recall, precision, F-measure and recognition rate on both the training dataset and the test dataset.
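The partitioning and the four measures can be sketched as follows. This is only an illustration under stated assumptions: "recognition rate" is taken here to mean overall accuracy, the split is stratified by class so that the rare binding sites appear in both parts, and the names `stratified_split` and `evaluate` and the 30% test fraction are hypothetical choices, not requirements of the assignment.

```python
import random


def stratified_split(examples, test_fraction=0.3, seed=0):
    """Split (kmer, label) pairs into train/test, preserving class balance."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [e for e in examples if e[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def evaluate(tp, fp, fn, tn):
    """Compute the four measures from confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0 (a convention
    chosen for this sketch).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    total = tp + fp + fn + tn
    recognition_rate = (tp + tn) / total if total else 0.0  # assumed = accuracy
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "recognition_rate": recognition_rate,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in evaluate(tp=8, fp=2, fn=2, tn=88).items():
        print(f"{name}: {value:.3f}")
```

On imbalanced data the recognition rate can look high even when few binding sites are found, which is why precision, recall and F-measure are reported alongside it.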

It is very important to note that, unlike the traditional way of evaluating a classifier's performance, here a kmer is counted as a motif instance if its location has at least 50% overlap with a true binding site in the DNA sequences. For example, consider the two occurrences of the true binding site ACACGGGA (shown in upper case) in the following DNA sequence.

ccttacacaaACACGGGAgaattaatACACGGGAtcagatcaataaa (1)

Suppose that the 8mers acaaACAC and ACGGGAtc are classified as binding sites by a learner model. Then we count them as correct predictions because they have 50% and 75% overlap with the true binding sites in sequence (1), respectively. Conversely, if a classifier labels them as non-binding sites, we count them as incorrect predictions because they have at least 50% overlap with the true binding sites. Take another 8mer, GAgaatta, in (1). If a learner model classifies it as a binding site, it is counted as a misclassification because it has only 25% overlap with the true binding site ACACGGGA.
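The overlap rule can be checked against sequence (1) with a short sketch. It assumes that true binding sites are located by an exact, case-sensitive match of the upper-case string ACACGGGA, and that overlap is measured as the fraction of the kmer's positions lying inside a single true site; the function names are hypothetical.

```python
SEQ = "ccttacacaaACACGGGAgaattaatACACGGGAtcagatcaataaa"  # sequence (1)
SITE = "ACACGGGA"  # true binding site, upper case in the sequence


def site_intervals(seq, site):
    """Return half-open (start, end) intervals of every exact match of site."""
    intervals = []
    pos = seq.find(site)
    while pos != -1:
        intervals.append((pos, pos + len(site)))
        pos = seq.find(site, pos + 1)
    return intervals


def overlap_fraction(start, k, intervals):
    """Largest fraction of the kmer [start, start + k) covered by one site."""
    best = 0
    for s, e in intervals:
        shared = min(start + k, e) - max(start, s)
        best = max(best, shared)  # negative values mean no overlap
    return best / k


def outcome(start, k, intervals, predicted_site, threshold=0.5):
    """Label a prediction TP, FP, FN or TN under the 50% overlap rule."""
    is_site = overlap_fraction(start, k, intervals) >= threshold
    if predicted_site:
        return "TP" if is_site else "FP"
    return "FN" if is_site else "TN"


if __name__ == "__main__":
    sites = site_intervals(SEQ, SITE)
    for kmer in ("acaaACAC", "ACGGGAtc", "GAgaatta"):
        start = SEQ.find(kmer)
        frac = overlap_fraction(start, len(kmer), sites)
        print(kmer, f"{frac:.0%}", outcome(start, len(kmer), sites, True))
```

For the three 8mers in the example this reproduces the 50%, 75% and 25% overlaps, so the first two count as correct positive predictions and the third as a false positive.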

