Saturday, February 6, 2010

Bioinformatics: How to prepare your sequences for a phylogenetic tree

To make a phylogenetic tree, you first have to build a multiple sequence alignment, because you can't get an accurate tree without an accurate alignment.

To learn how to build a multiple sequence alignment, you can watch this video tutorial HERE.

To build a multiple sequence alignment and then a phylogenetic tree, you have to prepare your sequences with these factors in mind:

1- Avoid using sequence fragments: align complete sequences rather than fragments. If you do need to align fragments, use the corresponding fragment from every sequence in the alignment.

2- Avoid using too many sequences: a very large dataset can make your phylogenetic tree less accurate, because many programs, especially the ones you run online, handle large datasets poorly and take a long time to finish.

3- Avoid aligning xenologs: xenologs are genes acquired by lateral (horizontal) gene transfer, for example through a virus or a bacterium, so they don't reflect the original history of your gene. If you want more information about xenologs, you can read this post HERE.

4- Avoid recombinant sequences: a recombinant sequence combines material from two sources (possibly two very distinct species), and phylogenetic tree builders can't represent two distinct histories within a single sequence at the same time.

5- Add a distant sequence (an outgroup) to your alignment: it has to be similar to your sequences but to have diverged from them a long time ago, because it is used to root your phylogenetic tree and shows where the common ancestor of your other sequences sits.

6- Don't depend on guide trees: on the EBI server, for example, when you make a multiple sequence alignment with ClustalW, a guide tree is included in the results. Don't use this tree, because it's not a phylogenetic tree; it's only the tree ClustalW uses to decide the order in which sequences are added to the alignment. If you use it in place of a phylogenetic tree, it will give you false results.
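
Points 1 and 2 are easy to check before you submit anything for alignment. Here is a minimal Python sketch, using only the standard library, that reads FASTA text, flags sequences that look like fragments, and warns when the dataset is large. The function names and both thresholds (80% of the median length, 50 sequences) are assumptions picked for illustration, not rules from any particular program, so adjust them to your own data.

```python
from statistics import median


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def flag_fragments(records, min_fraction=0.8):
    """Return names of sequences shorter than min_fraction of the median length.

    min_fraction is an illustrative threshold, not an established standard.
    """
    cutoff = min_fraction * median(len(seq) for seq in records.values())
    return [name for name, seq in records.items() if len(seq) < cutoff]


def too_many_sequences(records, max_sequences=50):
    """Return True if the dataset exceeds max_sequences (an arbitrary example limit)."""
    return len(records) > max_sequences


example = """>seqA
MKTAYIAKQR
>seqB
MKTAYIGKQR
>seqC
MKTA
"""

records = parse_fasta(example)
print(flag_fragments(records))      # ['seqC']
print(too_many_sequences(records))  # False
```

A sequence flagged this way is not necessarily wrong; it is only a candidate to look at by hand, and then either replace with the complete sequence or trim the others to the same region.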

If you have any questions, you're welcome to ask.
