16S rRNA gene bioinformatics
Once a 16S PCR product has been amplified and sequenced it must be placed in the context of its phylogenetic relationships with other sequences. A preliminary idea of its close relatives can be gained by aligning it to one or more known sequences. A program such as BLAST can do this, but it is extremely limited in detail and will only give information relating to one other sequence at a time. To gain a true idea of phylogeny the 16S rRNA gene sequence should be compared to as many other sequences as possible simultaneously. Only ten years ago this was impossible to achieve on anything but a supercomputer; however, recent advances in computing power now mean that most personal computers can carry out some or all of this procedure. Several web-based programs also allow free access to the more powerful computers that may be required.
To start with, the newly acquired sequence must be aligned with all or some of the sequences obtained in the past. As there is some variation in the length of 16S rRNA genes, gaps must be inserted to achieve a perfect alignment; this can be done through programs such as CLUSTAL. The aligned sequences are then clipped so that the 5’ and 3’ ends are equivalent bases and the alignment is sent to a program capable of generating phylogenetic trees.
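The clipping step can be illustrated with a minimal Python sketch. It assumes the alignment has already been produced (for example by CLUSTAL) and is held as a dictionary of equal-length strings with "-" marking gaps. The function name clip_alignment and the example sequences are hypothetical and chosen only for illustration.

```python
def clip_alignment(aligned):
    """Trim the alignment to the region spanned by every sequence.

    Columns before the last sequence starts, and after the first
    sequence ends, are removed so that no sequence begins or ends
    with a run of terminal gaps. Internal gaps are left untouched.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    # Number of leading gap characters in the latest-starting sequence.
    start = max(len(seq) - len(seq.lstrip("-")) for seq in aligned.values())
    # Position just past the last base of the earliest-ending sequence.
    end = min(len(seq.rstrip("-")) for seq in aligned.values())
    if start >= end:
        raise ValueError("the sequences share no common region")

    return {name: seq[start:end] for name, seq in aligned.items()}


# Made-up example alignment (not real 16S data).
example = {
    "new":  "--ACGT-ACGT--",
    "ref1": "TTACGTTACGTAA",
    "ref2": "-TACG--ACGTA-",
}
for name, seq in clip_alignment(example).items():
    print(name, seq)
```

Only the overhanging ends are removed here; gaps inside the shared region are kept, because they carry the information about length variation that the alignment step introduced.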
An ideal representation of phylogeny would be multidimensional, but given the constraints of our 3-dimensional universe in general and the scientific predilection for presentation in 2-dimensional form on paper in particular, the ‘tree’ is a good compromise. Two major algorithms are used: maximum parsimony and neighbor joining. Neighbor joining is an evolutionary distance technique based on a matrix of differences in the dataset. The resulting tree has branches of lengths proportional to evolutionary distance, statistically corrected for back mutation. Maximum parsimony is a more difficult concept to grasp, in which the resulting tree has branches whose length is proportional to the minimum amount of sequence change necessary to enable the creation of a new branch.
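The matrix of differences used by distance methods can be sketched as follows. The text does not name a particular correction for back mutation; this sketch assumes the Jukes-Cantor model, one common choice, which converts the observed proportion of differing sites p into an estimated distance d = -(3/4) ln(1 - 4p/3). The function names are hypothetical, and the sketch stops at the matrix; it does not build the neighbor-joining tree itself.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance (an assumed model, not named in the text)."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(aligned):
    """Corrected distance for every pair of sequences, keyed by (name, name)."""
    return {
        (m, n): jukes_cantor(p_distance(aligned[m], aligned[n]))
        for m, n in combinations(sorted(aligned), 2)
    }
```

The corrected distance is always at least as large as the raw proportion of differences, because some sites will have changed more than once or reverted to their original base and so go uncounted in a simple comparison.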
For both techniques it is possible to generate trees differing in details such as the number of branches from one dataset. A process known as bootstrapping is applied to get an idea of the sum of all the possible trees, and this gives a confidence value for the presence of each branch. In addition, maximum parsimony and neighbor joining trees generated from the same dataset can give quite different results, and to date neither technique is considered to be more correct than the other. Therefore any tree should be considered as the best possible result with the data available and should not necessarily overrule any other information.
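The resampling idea behind bootstrapping can be sketched in Python. In a real analysis a full tree is rebuilt from every resampled alignment and the presence of each branch is counted. As a deliberately simplified stand-in, this sketch only asks how often a chosen pair of sequences is the closest pair in the resampled data. All function names, the replicate count and the seed are assumptions made for illustration.

```python
import random
from itertools import combinations


def mismatches(a, b):
    """Number of differing sites, ignoring columns with a gap in either sequence."""
    return sum(x != y for x, y in zip(a, b) if x != "-" and y != "-")


def resample_columns(aligned, rng):
    """Draw alignment columns with replacement to form a pseudo-replicate."""
    length = len(next(iter(aligned.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in aligned.items()}


def closest_pair(aligned):
    """Pair of names with the fewest mismatches (ties go to the first pair in sorted order)."""
    return min(
        combinations(sorted(aligned), 2),
        key=lambda pair: mismatches(aligned[pair[0]], aligned[pair[1]]),
    )


def bootstrap_support(aligned, pair, replicates=100, seed=0):
    """Fraction of pseudo-replicates in which `pair` is the closest pair.

    Simplification: a real bootstrap rebuilds the whole tree per replicate.
    """
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = sum(
        closest_pair(resample_columns(aligned, rng)) == target
        for _ in range(replicates)
    )
    return hits / replicates
```

A value near 1 means the grouping is recovered in almost every resampled dataset, while a low value means it depends on a small number of alignment columns and should be treated with caution.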