Bioinformatics made easy<br />
Bioinformatics lessons, how-to tutorials, useful bioinformatics resources, books, articles, genomics, proteomics, phylogeny, microarrays, gene expression, bioinformatics applications, bioinformatics tools and software, analysis, research, biomedical.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Multiple sequence alignment different formats</span></b></div><i>Posted 2010-04-24</i><br />
People are often confused by the different multiple sequence alignment formats (which one to use with which program?). The confusion comes from the variety of programs that handle multiple sequence alignments: one program uses FASTA format, another uses MSF (Multiple Sequence Format), and so on.<br />
<br />
There are so many formats because each one was created by specialists in a particular field; for example, specialists in phylogeny use the Phylip format.<br />
<br />
So before you use any format, ask yourself a few questions: is this format supported by the program I'm running, is it easy for me to modify, is it widely accepted, and so on.<br />
<br />
<b><span style="font-size: small;"><span style="color: #cfe2f3;">Some of the most popular multiple sequence alignment formats:</span></span></b><br />
<br />
<b>1- FASTA:</b> a text format that's widely accepted and easy to read and modify.<br />
<b>2- MSF:</b> (Multiple Sequence Format), the most popular, supported by most programs, easy to read but difficult to modify.<br />
<b>3- ALN:</b> produced by ClustalW, easy to read and widely supported.<br />
<b>4- Phylip:</b> text format, supported by most phylogenetic packages.<br />
<br />
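As a rough illustration of how two of these formats relate, here is a minimal Python sketch that reads an aligned FASTA text and writes it in a simple sequential PHYLIP layout. The function names (parse_fasta, to_phylip) and the example sequences are made up for this illustration, and the 10-character name column is the classic strict PHYLIP convention; some programs expect a relaxed or interleaved variant instead.<br />

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].strip()
            name = header.split()[0] if header else "unnamed"
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records):
    """Format aligned (name, sequence) pairs as sequential PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("all sequences in an alignment must have the same length")
    width = lengths.pop()
    # Header line: number of sequences, then alignment length.
    lines = [f" {len(records)} {width}"]
    for name, seq in records:
        # Names are truncated or padded to exactly 10 characters.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-sequence alignment, already aligned (gaps as "-").
    example = ">seq1 human\nMKT-AY\nLL\n>seq2 mouse\nMKTQAYLL\n"
    print(to_phylip(parse_fasta(example)))
```

The sketch only works on sequences that are already aligned (all the same length); it does not perform the alignment itself.<br />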
If you have any questions, you're welcome to ask.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics careers: Bioinformatics Systems Analyst Job Offer</span></b></div><i>Posted 2010-03-23</i><br />
The <b>DOE</b> Joint Genome Institute (<b>JGI</b>) in Walnut Creek, CA has a job offer for an experienced Bioinformatics Analyst to support the Plant Genome Assembly Analytics Group.<br />
<br />
The main responsibilities include obtaining genome data, analyzing it with several software packages, storing and organizing it, and so on.<br />
<br />
For more information about this offer (job requirements and how to apply), you can visit this link <a href="http://careers.crijob.com/lbnlcareers/detailsRedirect.asp?jid=23370"><b>HERE</b></a>.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: NCBI releases a Database of Genomic Structural Variation (dbVar)</span></b></div><i>Posted 2010-03-23</i><br />
<b>NCBI</b> has released a new database, the <b>Database of Genomic Structural Variation (dbVar)</b>. It contains data from the analysis of genomic variations and their relationship with phenotype information.<br />
<br />
<a href="http://www.ncbi.nlm.nih.gov/dbvar/"><b>The dbVar homepage</b></a> contains links that help you understand what Genomic Structural Variations are, FAQs, and submission of information.<br />
<br />
The database also includes an <i>RSS feed</i> to let you know about any updates.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Job Offer at University of Copenhagen</span></b></div><i>Posted 2010-03-22</i><br />
The Center for non-coding <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-genomics-different-types.html">RNA</a> in Technology and Health at Bioinformatics, Faculty of Life Sciences, University of Copenhagen in Denmark, is a newly established center that specializes in studying non-coding <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-genomics-different-types.html">RNAs</a>. The center has an open position for a PhD fellow in bioinformatics, starting May 1st or soon thereafter. The duration is three years.<br />
<br />
This project concentrates on studying ncRNAs or non-coding RNAs, their role, structure...etc<br />
<br />
The project will be in collaboration with Prof. Henrik Nielsen, University of Copenhagen and others. <br />
<br />
For more on the job description and qualification requirements, you can read the full article <a href="http://www.scholar-guide.com/phd-fellow-in-bioinformatics-faculty-of-life-sciences-university-of-copenhagen/">HERE</a>.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Protein sequence databases names for use with BLAST</span></b></div><i>Posted 2010-03-19</i><br />
In the "<a href="http://bioinformatics-made-easy.blogspot.com/2010/03/bioinformatics-nucleotide-sequence.html"><b>Nucleotide sequence databases names for use with BLAST</b></a>" post, we saw the nucleotide sequence database names that we can use in a command line <b>BLAST</b> search. The same thing applies to protein sequence databases when it comes to using command line <i>BLAST</i>.<br />
<br />
Here are some of the protein databases names that we can use with <b>BLAST</b>:<br />
<br />
<span style="font-size: small;"><b>1- nr:</b></span> Non-redundant merge of SWISS-PROT, PIR, PRF, and proteins derived from GenBank coding sequences and PDB atomic coordinates<br />
<br />
<b>2- swissprot:</b> The SWISS-PROT database<br />
<br />
<b>3- pdb:</b> Amino acid sequences parsed from atomic coordinates of three-dimensional structures<br />
<br />
<b>4- ecoli:</b> All proteins encoded by the <i>E. coli</i> genome<br />
<br />
<b>5- yeast:</b> All proteins encoded by the <i>S. cerevisiae</i> genome<br />
<br />
<b>6- drosoph:</b> All proteins encoded by the <i>D. melanogaster</i> genome<br />
<br />
These are some of the abbreviations used in a command line BLAST search; if you want more, you can read the command line BLAST documentation on the internet.<br />
<br />
<br />
As a bioinformatician, learning to use command line BLAST on Linux is very important, because it makes parsing files and looking for specific information very easy: what takes one minute as an automated task can take half an hour by hand, and the gap grows with the amount of data you want to retrieve.<br />
<br />
If you have any questions, you're welcome to ask :-).<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: video tutorial: Using PHYLIP to build phylogenetic trees</span></b></div><i>Posted 2010-03-14</i><br />
As you know <b><a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformatics-phylogeny-inference.html">PHYLIP</a></b> or (<b><a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformatics-phylogeny-inference.html">PHYlogeny Inference Package</a></b>) is a set of programs that can construct <b><a href="http://bioinformatics-made-easy.blogspot.com/2009/11/bioinformatics-phylogenetic-trees.html">phylogenetic trees</a></b>.<br />
<br />
<br />
To understand what <b>phylogenetic trees</b> can do for you, you can read this post <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsphylogeny-what.html">HERE</a>.<br />
<br />
In order to build <b>phylogenetic trees</b> you have to <a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-how-to-prepare-your.html"><b>prepare a set of sequences in a multiple sequence alignment</b>.</a><br />
<br />
In this video tutorial I'm going to use one of the <b><a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-different-methods-used.html">3 methods in building phylogenetic trees</a></b>, the distance method, with a program included in PHYLIP called protdist.<br />
<br />
This video tutorial has 2 parts:<br />
<br />
<a href="http://www.youtube.com/watch?v=wayiGqcMAyE">Part 1:</a><br />
<br />
<object height="385" width="480"><param name="movie" value="http://www.youtube.com/v/wayiGqcMAyE&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/wayiGqcMAyE&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object><br />
<br />
<a href="http://www.youtube.com/watch?v=7VvPiud9-Bk">Part 2:</a><br />
<br />
<object height="344" width="425"><param name="movie" value="http://www.youtube.com/v/7VvPiud9-Bk&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/7VvPiud9-Bk&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
<br />
If you have any questions, leave a comment.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Nucleotide sequence databases names for use with BLAST</span></b></div><i>Posted 2010-03-07</i><br />
In most cases people like to use <b>BLAST</b> that is hosted on servers like <b>NCBI</b>, but sometimes you would like to use a <b>command line</b> <b>BLAST</b> already installed on your computer on a <b>Windows</b> or <b>Linux</b> operating system.<br />
<br />
In order to do that, you have to know the database names that you can use with the command line <b>BLAST</b>.<br />
<br />
Here are some of the nucleotide database names that you can use with <b>BLAST</b>:<br />
<br />
<b style="color: #cfe2f3;">1- nr :</b> Nonredundant GenBank, a database that provides comprehensive collections of both amino acid and nucleotide sequence data, with redundancy reduced by merging sequences that are completely identical.<br />
<br />
<b style="color: #cfe2f3;">2- est :</b> expressed sequence tags.<br />
<br />
<b style="color: #cfe2f3;">3- sts :</b> sequence tagged sites.<br />
<br />
<b><span style="color: #cfe2f3;">4- htgs :</span></b> high-throughput genomic sequences.<br />
<br />
<b style="color: #cfe2f3;">5- ecoli :</b> Complete genomic sequence of <i>E. coli</i>.<br />
<br />
<b style="color: #cfe2f3;">6- yeast :</b> Complete genomic sequence of <i>S. cerevisiae.</i><br />
<i><br />
</i><b style="color: #cfe2f3;">7- drosoph :</b> Complete genomic sequence of <i>D. melanogaster</i>.<br />
<br />
<b style="color: #cfe2f3;">8- mito :</b> Complete genomic sequences of vertebrate mitochondria.<br />
<br />
<b><span style="color: #cfe2f3;">9- vector :</span></b> Collection of popular cloning vectors.<br />
<br />
These are some of the most used nucleotide databases names in a <b>BLAST</b> search.<br />
<br />
This is an example of a <b>BLAST</b> command:<br />
<br />
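<b style="color: #cfe2f3;">blastall -p blastn -i blast.in -d nr -o blast.out</b><br />
<br />
blastall: program name.<br />
<br />
-p : the BLAST program to run (here blastn, which compares a nucleotide query against a nucleotide database); blastall requires this option.<br />
<br />
-i : input (query) file.<br />
<br />
-d : database.<br />
<br />
-o : output file.<br />
<br />
blast.in & blast.out : the input file containing the query sequence, and the output file that receives the results.<br />
<br />
nr : Nonredundant (the name must match a database that is installed and formatted on your computer).<br />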
<br />
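When you automate many searches, it helps to build the command from its parts instead of typing it each time. The Python sketch below only assembles and prints a blastall command line; build_blastall_command is a hypothetical helper name, and the -p option it adds selects the BLAST program, which the legacy blastall tool requires. The file and database names are the ones used in the example above.<br />

```python
import shlex


def build_blastall_command(program, database, query_file, output_file):
    """Return the argument list for a legacy blastall search.

    program     -- BLAST program for -p (for example "blastn" or "blastp")
    database    -- database name for -d (must exist on your machine)
    query_file  -- file holding the query sequence, for -i
    output_file -- file that will receive the results, for -o
    """
    return [
        "blastall",
        "-p", program,
        "-d", database,
        "-i", query_file,
        "-o", output_file,
    ]


cmd = build_blastall_command("blastn", "nr", "blast.in", "blast.out")
# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
```

To actually run the search you could pass the list to subprocess.run(cmd, check=True), assuming blastall and the database are installed on your computer.<br />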
You can read (<a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformatics-using-blast-to-search.html">using BLAST to search for similarities</a>) post to learn how to run a BLAST search against a database.<br />
<br />
You can read (<a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-different-blast-programs.html">Different Blast Programs</a>) to learn about different BLAST programs.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Different methods used to build phylogenetic trees</span></b></div><i>Posted 2010-02-27</i><br />
In <b>Bioinformatics</b> there are <b>three</b> major methods used in building <a href="http://bioinformatics-made-easy.blogspot.com/2009/11/bioinformatics-phylogenetic-trees.html"><b>phylogenetic trees</b></a>. Each of these methods has its own strengths and weaknesses, as is the case with every bioinformatics program or method.<br />
<br />
<span style="color: #cfe2f3; font-size: large;"><b>These methods are:</b></span><br />
<br />
<b style="color: #cfe2f3;">1- Distance methods:</b> The algorithm takes the data (sequences) and constructs a distance matrix holding the distance between each pair of sequences. The sequences are then grouped according to their relative distances, and the last step is to construct a tree that matches this data.<br />
<br />
<b style="color: #cfe2f3;">2- Parsimony methods:</b> This method searches among all possible phylogenetic trees for the one that needs the minimum number of substitutions of nucleotides or amino acids (mutations), so the best tree is the one that has the minimum number of mutations.<br />
<br />
<b style="color: #cfe2f3;">3- Likelihood methods:</b> Here the best estimate of a parameter (in this case the tree) is the one giving the highest probability that the observed set of measurements (the sequences) will be obtained.<br />
<br />
<b>Bioinformaticians</b> generally consider <b>Likelihood methods</b> the most accurate, and many researchers use them, but the problem is that they run very slowly because they are computationally intensive.<br />
<br />
<b>Parsimony methods</b> give great results, but they probably share the same drawback as Likelihood methods.<br />
<br />
<b>Distance methods</b> or distance based trees are easy to set up, and you can apply them in most situations, but they aren't necessarily the most accurate.<br />
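To make the first step of a distance method concrete, here is a minimal Python sketch that builds a pairwise distance matrix from aligned sequences. It uses the simple p-distance (the fraction of compared positions that differ), which is only one possible distance measure and is an assumption of this sketch; programs like protdist use more elaborate substitution models. The function names and the toy alignment in the comment are invented for the illustration.<br />

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions that differ, ignoring gap positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


def distance_matrix(alignment):
    """Build a symmetric distance matrix from a {name: aligned_sequence} dict."""
    names = list(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for first, second in combinations(names, 2):
        d = p_distance(alignment[first], alignment[second])
        matrix[first][second] = d
        matrix[second][first] = d
    return matrix


# Example use with a hypothetical three-sequence alignment:
# distance_matrix({"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACG-TCGA"})
```

A tree-building step (for example neighbor joining) would then group the sequences using this matrix; that step is not shown here.<br />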
<br />
<br />
<br />
<a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-how-to-prepare-your.html"><span style="font-size: small;"><b>How to prepare your sequences for a phylogenetic tree</b></span></a><br />
<h3 class="post-title" style="font-weight: normal;"></h3><br />
<b><a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsphylogeny-what.html">What Phylogenetic Trees can do for you?</a></b><br />
<br />
<br />
<br />
If you have any questions, leave a comment.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics:Video Tutorial:using Genomescan to parse genomes (find exons)</span></b></div><i>Posted 2010-02-24</i><br />
In this video tutorial we are going to see how to use <b>Genomescan</b> to parse large <b>DNA</b> sequences and find coding regions or <b>Exons</b>.<br />
<br />
As you know, the genes of higher organisms such as vertebrates are more complex than others, because they contain coding regions called <b>Exons</b>, and between these <b>Exons</b> we find non-coding regions called <b>Introns</b>.<br />
<br />
To predict these genes, which contain several <b>Exons</b>, you have to use very sophisticated algorithms that can locate <b>Exons</b> and <b>Introns</b> and thereby locate genes.<br />
<br />
You can read this post (<a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsopen-reading-frame-orf.html"><b>Open Reading Frame (ORF)</b></a>) to understand what <b>ORF</b>s are.<br />
<br />
You can read this post (<a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-using-orf-finder-to.html"><b>Using ORF Finder to locate open reading frames</b></a>) for a basic software that can find <b>ORF</b>s.<br />
<br />
You can read this post (<a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-sophisticated-orf.html"><b>Sophisticated ORF prediction with GeneMark</b></a>) for a more sophisticated <b>ORF</b> prediction software.<br />
<br />
<object height="344" width="425"><param name="movie" value="http://www.youtube.com/v/P2umkEo1Uxk&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/P2umkEo1Uxk&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: What Is PSI-BLAST?</span></b></div><i>Posted 2010-02-21</i><br />
<b>PSI-BLAST (Position-Specific Iterative BLAST)</b> is a program designed for proteins; it is a <b>BLAST</b> search that uses a <b>PSSM (position-specific scoring matrix)</b>.<br />
<br />
<b style="color: #cfe2f3;">What is PSSM?</b><br />
<br />
<b><a href="http://en.wikipedia.org/wiki/Position-specific_scoring_matrix">PSSM (position-specific scoring matrix)</a></b> is a <b>matrix</b> used for biological data, and its main role in PSI-BLAST search is to increase the sensitivity of results.<br />
<br />
A <b>PSI-BLAST</b> search uses a <b>PSSM</b> as the query instead of an individual sequence. The matrix is constructed from a multiple sequence alignment, so each position of the alignment has its own position-specific scores.<br />
<br />
<b style="color: #cfe2f3;">How does PSI-BLAST work?</b><br />
<br />
It begins with a normal <b>BLAST</b> search (the better the match, the higher the score). A regular <b>BLAST</b> search will probably miss more distant, and possibly interesting, homologs, so <b>PSI-BLAST</b> next constructs a <b>PSSM (position-specific scoring matrix)</b> from the significant hits and repeats the search with it, iterating until no new matches are found. This finds distant sequences that you may be interested in.<br />
<br />
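To give an idea of what a position-specific scoring matrix looks like, here is a simplified Python sketch that builds one from a small ungapped alignment and scores a sequence against it. It is not the PSI-BLAST algorithm itself: the uniform background frequencies, the fixed pseudocount, and the function names (build_pssm, score) are assumptions made for this illustration, while the real program weights sequences and uses substitution-matrix-based pseudocounts.<br />

```python
import math


def build_pssm(aligned_seqs, alphabet, pseudocount=1.0):
    """Return a list with one {residue: log-odds score} dict per alignment column."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(alphabet)  # assumption: uniform background frequencies
    pssm = []
    for pos in range(length):
        column = [seq[pos] for seq in aligned_seqs]
        total = len(column) + pseudocount * len(alphabet)
        scores = {}
        for residue in alphabet:
            # Pseudocounts keep unseen residues from getting a score of -infinity.
            freq = (column.count(residue) + pseudocount) / total
            scores[residue] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def score(pssm, sequence):
    """Sum the position-specific scores of a sequence as long as the PSSM."""
    return sum(column[residue] for column, residue in zip(pssm, sequence))


# Example use with a hypothetical DNA-like toy alignment:
# pssm = build_pssm(["ACGT", "ACGA", "ACGT", "TCGT"], "ACGT")
# score(pssm, "ACGT") is higher than score(pssm, "TTTT")
```

A sequence that matches the conserved columns gets a high score even if it differs from every individual sequence in the alignment, which is why a PSSM can pick up more distant relatives than a single query sequence.<br />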
You can read this post (<a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-different-blast-programs.html">Different Blast Programs</a>) to understand all types of BLAST programs, including PSI-BLAST, and what each one does.<br />
<br />
You can access PSI-BLAST from the EBI website <a href="http://www.ebi.ac.uk/Tools/psiblast/">HERE</a>.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Perl and BioPerl</span></b></div><i>Posted 2010-02-19</i><br />
As you all know, there are 2 types of <b>Bioinformaticians</b>:<br />
<br />
1- Those who use ready-made software to analyse biological data.<br />
<br />
2- Those who design new software for themselves or for other <b>Bioinformaticians</b>.<br />
<br />
As we discussed in an earlier post about <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/best-bioinformatics-programming.html">The best programming language for bioinformatics HERE</a>, <i><b>Perl</b></i> (<b>Practical Extraction and Report Language</b>) is the most powerful because:<br />
<br />
1- It is installed or included in almost every Linux distribution.<br />
<br />
2- Scripts written in Perl don't require a separate compilation step, so they are portable from one system to another.<br />
<br />
3- It supports regular expressions (very powerful control and manipulation of strings).<br />
<br />
4- It has built-in support for hashes, or hash tables (association of values with keys), which makes many bioinformatics tasks easy to write.<br />
<br />
5- A very large number of ready-made modules that anybody can use are available on the internet.<br />
<br />
6- It is available also for Windows. <br />
<br />
You can read this post about the best book to begin programming with Perl for bioinformatics called <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/books-beginning-perl-for-bioinformatics.html">Beginning Perl for Bioinformatics</a>.<br />
<br />
<span style="color: #cfe2f3; font-size: large;"><b>What is BioPerl?</b></span><br />
<br />
<b>BioPerl</b> is a project developed by <b>Open Bioinformatics Foundation</b> and is a collection of modules that you can use to easily construct <b>Perl</b> scripts to automate tasks for <b>bioinformatics</b>.<br />
<br />
With <b>BioPerl</b> you don't have to do everything from scratch; you use ready-made modules that suit your needs (what more could you want?).<br />
<br />
In my opinion <b>Perl</b> is the best programming language for <b>bioinformatics</b>; if you have a different point of view, you can suggest it in the comments.<br />
<br />
<div style="background-color: orange; color: #d9ead3; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Sophisticated ORF prediction with GeneMark</span></b></div><i>Posted 2010-02-16</i><br />
<i><b>ORF</b></i> prediction programs are key to locating <b>ORF</b>s (<b>Open Reading Frames</b>), and if we locate <b>ORF</b>s we have an approximate idea of the location of a gene that codes for a protein.<br />
<br />
To read about <b>ORF</b>s or <b>Open Reading Frames</b> click <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsopen-reading-frame-orf.html">HERE.</a> <br />
<br />
In the <a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-using-orf-finder-to.html">how to work with <b>ORF finder</b> program to predict ORFs video tutorial</a>, I showed you how to use the <b>ORF Finder</b> program developed by <b>NCBI</b> to locate <b>ORFs</b>. As I said there, this software is very basic, so we can use it only with simple genomes (viral, bacterial, etc.), because this kind of program can identify only about 80 percent of the protein-coding regions that you may be interested in.<br />
<br />
You can see <b>ORF Finder</b> video tutorial <a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformatics-using-orf-finder-to.html">HERE.</a><br />
<br />
In this video tutorial I'm going to show you a more sophisticated approach that can predict <b>ORF</b>s in bacteria, viruses, eukaryotes and more. This software is a family of different programs that use a very sophisticated method.<br />
<br />
<object height="344" width="425"><param name="movie" value="http://www.youtube.com/v/NUpK9FbSBTk&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/NUpK9FbSBTk&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Linux Vs Windows (What's Better For Bioinformatics)?</span></b></div><i>Posted 2010-02-14</i><br />
<b>Bioinformaticians</b>, like everyone else, have 2 big choices when it comes to operating systems, Linux and Windows, but there is a huge difference between these 2 operating systems.<br />
<br />
<i><span style="font-size: large;"><b>Windows:</b></span></i><br />
<br />
<b>Windows</b> is known for its simplicity (anyone with basic knowledge can work with Windows): it is user friendly and has a great interface and great media support, but it is less adapted to bioinformaticians' needs:<br />
<br />
- It's not free.<br />
- Its source is not open to the public.<br />
- Most of its software is not free.<br />
- Automating instructions is less convenient, etc.<br />
<br />
<br />
I'm not saying that <b>Windows</b> isn't good for you, because I work with it most of the time, but if you are a <b>Bioinformatician</b> and you want to program new software or automate some instructions, then Linux is <b><i>definitely</i></b> for you. If you only want to use ready-made software to analyse your data, you can use <b>Windows</b>.<br />
<br />
<i><b><span style="font-size: large;">UNIX (Linux): </span></b></i><br />
<br />
<b>Linux</b> is a very powerful operating system, especially for <b>programmers</b>, because it gives you full control over your machine:<br />
<br />
- It has a lot of programming tools (languages and interfaces).<br />
- It offers other free software (web servers, database management systems, visualisation software, text editors, etc.).<br />
- It has statistical analysis tools (like R).<br />
- Unix is more stable and runs fast.<br />
- It has vast documentation for software (how to use stuff!).<br />
<br />
So if you are a <b>bioinformatician</b> who is more attracted to biology (you use <b>bioinformatics</b> software only for analysing your biological data), then you can use Windows, but if you are a cyber geek who wants to develop new software for bioinformaticians, then you can use Linux. I personally prefer Linux, but in the end it's up to you to decide.<br />
<br />
<br />
If you want to use BioLinux 5.0 you can read a post about it <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsbio-linux-50.html">HERE.</a><br />
<br />
If you want to know how to get BioLinux 5.0 working on your computer you can read this post <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-how-to-install-biolinux.html">HERE.</a><br />
<br />
If you have any questions, put them in a comment.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><span style="font-size: x-large;"><b>Bioinformatics: OMIM (Online Mendelian Inheritance in Man) Overview</b></span></div><i>Posted 2010-02-12</i><br />
<b>Bioinformatics</b> is key to analyzing genes, proteins, genomes, mutations and so on, and researchers use this information to understand genetic diseases, especially in humans. This is where <b>bioinformatics</b> plays a major role in finding, analyzing, and treating these genetic disorders, and for that purpose <a href="http://www.ncbi.nlm.nih.gov/">NCBI</a> has developed OMIM and made it available to the public.<br />
<br />
<b>OMIM</b> (<b>Online Mendelian Inheritance in Man</b>) is a database that contains a catalog of human genes and genetic disorders. The database was developed by <b>NCBI</b> (<b>National Center for Biotechnology Information</b>) and is hosted on their server.<br />
<br />
<b>OMIM</b> contains information about all known genetic disorders, and it links to other resources like <b>MEDLINE</b> (citations and abstracts) and to entries in other <b>NCBI</b> databases for the genes responsible for certain diseases.<br />
<br />
<b>OMIM</b> has three ways to search for genetic disorders or related information:<br />
<br />
<b>1- Through a normal search:</b> by typing a keyword like in the case of most databases.<br />
<br />
<b>2- By using the Gene map:</b> where you can browse a table of genes organized by <b>cytogenetic</b> map location.<br />
<br />
<b>3- By using the Morbid map:</b> which is a table of all alphabetically listed genetic disorders featured in <b>OMIM</b>.<br />
<br />
To access <b>OMIM</b> <a href="http://www.ncbi.nlm.nih.gov/omim/">click HERE</a>.<br />
<br />
If you have any questions, you're welcome to ask.<br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Using ORF Finder to locate open reading frames</span></b></div><i>Posted 2010-02-10</i><br />
In this video tutorial, I'm going to show you how to use the <b>ORF Finder</b> software to find or locate open reading frames (possible protein-coding genes).<br />
<br />
<b>ORF Finder</b> is a software located at the <b>NCBI Website</b> and it is designed to locate <b>open reading frames</b> in a given DNA sequence in all the <b>six reading frames</b>. <br />
<br />
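To show what scanning all six reading frames means in practice, here is a minimal Python sketch. It is not how ORF Finder itself is implemented; it simply looks for stretches that start with ATG and end at the first in-frame stop codon, on both strands. The names find_orfs and reverse_complement, the min_codons parameter, and the coordinate convention (0-based positions on the strand being scanned) are choices made for this illustration.<br />

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=2):
    """Find ATG...stop ORFs in the three forward and three reverse frames.

    Returns tuples (strand, frame, start, end, sequence) where frame is 1-3
    and start/end are 0-based positions on the scanned strand (end exclusive).
    min_codons counts the stop codon as well.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            # Walk through the sequence one codon at a time in this frame.
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame + 1, start, i + 3, seq[start:i + 3]))
                    start = None
    return orfs


# Example use with a hypothetical short sequence:
# find_orfs("CCATGAAATAGCC") finds the ORF ATGAAATAG in forward frame 3.
```

Real genomes need a sensible minimum length (hundreds of bases rather than a few codons), otherwise almost every short ATG...stop stretch is reported.<br />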
To know more about <b>Open Reading Frames</b>,you can read this post <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsopen-reading-frame-orf.html">HERE.</a><br />
<br />
<i><b>Note:</b></i> This software (<b>ORF Finder</b>) is a basic software, so you can use it in the case of non complex genes (Microbial genomes).<br />
<br />
There is more sophisticated software that can handle the complexity of higher organisms' genomes, like <i><b>GeneMark</b></i>.<br />
<br />
<object height="344" width="425"><param name="movie" value="http://www.youtube.com/v/GHh0eX9Oqcg&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/GHh0eX9Oqcg&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
<br />
<div style="background-color: orange; color: white; text-align: center;"><b style="background-color: orange;"><span style="font-size: x-large;">Bioinformatics: What is MEDLINE and PubMed?</span></b></div><i>Posted 2010-02-08</i><br />
Citations are very important in research in any field, and in <b>bioinformatics</b> they are indispensable. For this reason the United States National Library of Medicine (NLM) provides biomedical literature to researchers and students online.<br />
<br />
Since 1879, the NLM has published the <i><b>Index Medicus</b></i>, which is an index or guide to articles; with the evolution of information technology, Index Medicus became a database now known as <b>MEDLINE</b>.<br />
<br />
<div style="background-color: #666666; color: blue;"><b><span style="background-color: white;">What is MEDLINE: </span></b></div><br />
<b>MEDLINE</b> (Medical Literature Analysis and Retrieval System Online) is a huge bibliographic database of articles from academic journals covering biology, all branches of medicine, health, molecular biology, biochemistry, microbiology, and more. It can be searched free of charge over the internet via PubMed.<br />
<br />
<b><span style="color: #cfe2f3;">What is PubMed?</span></b><br />
<br />
<b>PubMed</b> is part of the <i><b>Entrez</b></i> retrieval system. It is a search engine for <b>MEDLINE</b> citations and abstracts, with links to full-text articles where available. <b>MEDLINE</b> citations make up most of what <b>PubMed</b> returns, but <b>PubMed</b> also provides access to other records, including in-process citations and articles from some life science journals that submit full text to <b>PubMed Central</b> and may not have been recommended for inclusion in <b>MEDLINE</b>.<br />
<br />
As noted above, <b>PubMed</b> is part of the <i><b>Entrez</b></i> retrieval system, which is part of the <b>NCBI</b> website. You can access it <a href="http://www.ncbi.nlm.nih.gov/pubmed/">HERE</a>.<br />
<br />
<br />
You can find more information about <b>MEDLINE</b> and <b>PubMed</b>, including tutorials and quick tours, <a href="http://www.nlm.nih.gov/bsd/disted/pubmedtutorial/">HERE</a>.<br />
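<br />
PubMed can also be queried from a program. The sketch below is an addition that goes beyond what this post covers: it assumes NCBI's E-utilities <b>esearch</b> endpoint and its <b>db</b>, <b>term</b>, <b>retmax</b> and <b>retmode</b> parameters, and the function name <b>pubmed_search_url</b> is hypothetical. It only builds the search URL and does not contact the server.<br />

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here; not mentioned in the post itself).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term, max_results=20):
    """Build an esearch URL that asks PubMed for record IDs matching `term`."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": term,           # the query, written as in the PubMed search box
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(pubmed_search_url("open reading frame"))
```

Opening the printed URL in a browser, or fetching it from a script, should return the matching PubMed IDs. Check NCBI's usage guidelines before sending many requests.<br />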
<br />
Any question, comment.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-231110433343242482010-02-06T06:45:00.000-08:002010-02-06T06:47:09.761-08:00Bioinformatics: How to prepare your sequences for a phylogenetic tree<div style="text-align: center;"><b style="color: #ea9999;"><span style="font-size: x-large;">Bioinformatics: How to prepare your sequences for a phylogenetic tree</span></b></div><div style="text-align: center;"></div><div style="text-align: left;"></div><br />
To build a <a href="http://bioinformatics-made-easy.blogspot.com/2009/11/bioinformatics-phylogenetic-trees.html">phylogenetic tree</a>, you first need a <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsmultiple-sequence.html">multiple sequence alignment</a>, because an accurate tree depends on an accurate alignment.<br />
<br />
To learn how to build a multiple sequence alignment, you can watch this video tutorial <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-tutorials-lessons-using.html">HERE.</a><br />
<br />
To build a multiple sequence alignment and then a phylogenetic tree, you have to prepare your sequences with the following factors in mind:<br />
<br />
<b style="color: #cfe2f3;">1- Avoid using sequence fragments:</b> align complete sequences rather than fragments. If you do want to align fragments, use fragments for every sequence in the alignment.<br />
<br />
<b><span style="color: #cfe2f3;">2- Avoid using a lot of sequences:</span></b> a very large number of sequences can make your phylogenetic tree less accurate. Many programs, especially those run on online servers, handle large datasets poorly: the analysis takes a long time and accuracy can suffer.<br />
<br />
<b style="color: #cfe2f3;">3- Avoid aligning Xenologs:</b> they result from lateral transfer (for example, by a virus or bacterium), so they do not reflect the original history of your gene. For more information about Xenologs, you can read this post <a href="http://bioinformatics-made-easy.blogspot.com/2010/02/bioinformarics-different-types-of.html">HERE.</a><br />
<br />
<b style="color: #cfe2f3;">4- Avoid recombinant sequences:</b> a recombinant sequence combines material from two sources (possibly two very distinct species), and phylogenetic tree builders can't represent two distinct histories at the same time.<br />
<br />
<b style="color: #cfe2f3;">5- Add a distant sequence to your alignment:</b> it should be similar to your sequences but have diverged a long time ago, so that it can serve as a reference point (an outgroup) for rooting your phylogenetic tree.<br />
<br />
<b><span style="color: #cfe2f3;">6- Don't depend on guide trees:</span></b> on the EBI server, for example, a multiple sequence alignment made with ClustalW includes a guide tree in the results. Don't use it as your final tree: it is not a phylogenetic tree but a guide tree that ClustalW uses to assemble the multiple sequence alignment, and using it in place of a phylogenetic tree will give misleading results.<br />
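<br />
The first point in the list above (avoid fragments) can be partly automated. The sketch below is one possible approach, not a standard tool: it reads FASTA text and keeps only sequences that are at least a chosen fraction of the longest one. The function names and the 0.8 threshold are assumptions made for illustration.<br />

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_fragments(records, min_fraction=0.8):
    """Keep sequences whose length is at least min_fraction of the longest one."""
    if not records:
        return []
    longest = max(len(seq) for _, seq in records)
    return [(h, s) for h, s in records if len(s) >= min_fraction * longest]
```

Length alone cannot tell a true fragment from a naturally short homolog, so treat the result as a list of candidates to inspect rather than a final answer.<br />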
<br />
Any question you're welcome.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-33088086406018286462010-02-04T06:13:00.000-08:002010-02-04T06:17:03.965-08:00Bioinformarics: different types of homologous genes<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">Bioinformarics: different types of homologous genes</span></b></div><br />
<br />
<a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformaticsphylogeny-what.html">The main purpose of phylogeny</a> is to pick what we call <b>Homologous genes</b> and compare them to construct a <a href="http://bioinformatics-made-easy.blogspot.com/2009/11/bioinformatics-phylogenetic-trees.html">phylogenetic tree</a> of their history, according to their similarities.<br />
<br />
<span style="color: #3d85c6; font-size: small;"><b>Homologous genes</b></span> are genes that derive from a common ancestor. To understand the types of homologous genes and how they arise, we first need two concepts:<br />
<br />
<i style="color: #9fc5e8;"><b>* Speciation:</b></i> the process by which an ancestral population splits into two subgroups that slowly drift away from their common genetic makeup and become distinct species.<br />
<br />
<i style="color: #9fc5e8;"><b>* Duplication:</b></i> a gene is copied within the genome of a single species. One copy may keep the original function while the other is free to change.<br />
<br />
<br />
<b style="color: #9fc5e8;">Homologous genes</b> come in three types:<br />
<br />
<i style="color: #9fc5e8;"><b>1- Orthologs:</b></i> Orthologs are two genes separated by <i><b>speciation</b></i>: they are found in two different species but descend from a single gene in the common ancestor.<br />
<br />
<i style="color: #9fc5e8;"><b>2- Paralogs:</b></i> Paralogs are two genes separated by <i><b>duplication</b></i>: a single gene in one genome was duplicated into two or more copies.<br />
<br />
<i style="color: #9fc5e8;"><b>3- Xenologs:</b></i> Xenologs result from <i><b>Lateral Transfer</b></i> between two species or organisms, that is, the transfer of DNA from one species to another, such as the transfer of a DNA sequence from a virus or bacterium to another species.<br />
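<br />
The three definitions above boil down to one question: which event separated the two genes? As a small illustration, the lookup below encodes that mapping. The function name <b>homolog_type</b> and the event labels are hypothetical, chosen only for this sketch.<br />

```python
# Map the event that separated two homologous genes to the name of their relationship.
HOMOLOG_TYPES = {
    "speciation": "ortholog",
    "duplication": "paralog",
    "lateral transfer": "xenolog",
}


def homolog_type(event):
    """Return the homolog type for an event; raises KeyError for unknown events."""
    return HOMOLOG_TYPES[event.strip().lower()]
```

In practice the hard part is working out which event actually occurred, which usually takes a phylogenetic analysis; the lookup only names the result.<br />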
<br />
<br />
In bioinformatics, collecting these genes from <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformatics-using-blast-to-search.html">Blast</a> searches and aligning them into a <a href="http://bioinformatics-made-easy.blogspot.com/2010/01/bioinformatics-tutorials-lessons-using.html">multiple sequence alignment</a> is the main route to constructing a phylogenetic tree.<br />
<br />
Any questions, you are welcome.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-70914975085717659442010-02-02T09:10:00.000-08:002010-02-02T09:21:51.168-08:00How studying rRNA can help us studying evolution in Bioinformatics<div style="text-align: center; color: rgb(255, 153, 0);"><span style="font-weight: bold;font-size:180%;" >How studying rRNA can help us studying evolution in Bioinformatics</span><br /></div><br />Many of you ask how scientists built an approximate tree of life that includes almost all discovered species. Here is the answer:<br /><br />In <span style="font-weight: bold;">Evolutionary Bioinformatics</span>, scientists looked for a gene that exists in all living organisms, and the most suitable candidate is the gene coding for <span style="font-weight: bold;">rRNA</span>.<br /><br /><span style="font-weight: bold;">rRNA</span> or <span style="font-weight: bold;">ribosomal RNA</span> is the central component of the ribosome, where proteins are manufactured in all living organisms. It interacts with <span style="font-weight: bold;">tRNA</span> or <span style="font-weight: bold;">transfer RNA</span> to produce a protein from amino acids, following the instructions carried by <span style="font-weight: bold;">mRNA</span> or <span style="font-weight: bold;">messenger RNA</span>.<br /><br />So the main requirement for studying evolution this way is a conserved gene that exists in all living organisms. That is why one of the first things scientists do when they discover a new bacterium, for example, is sequence its <span style="font-weight: bold;">rRNA</span> to identify its taxonomic group and estimate rates of species divergence.<br /><br />Because <span style="font-weight: bold;">rRNAs</span> have played, and still play, a major role in <span style="font-weight: bold;">Evolutionary Bioinformatics</span>, scientists and researchers have built specialized databases like <span style="font-weight: bold;">RDP</span> and the <span 
style="font-weight: bold;">European database </span>that store thousands of <span style="font-weight: bold;">rRNA</span> sequences.<br /><br />Any question, comment.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-9760272858857949602010-01-31T07:44:00.000-08:002010-01-31T07:49:29.222-08:005 Things any Bioinformatician should know<div style="background-color: orange; color: white; text-align: center;"><b><span style="font-size: x-large;">5 Things any Bioinformatician should know</span></b></div><br />
<i style="color: #6fa8dc;"><b><span style="font-size: small;">1- How to work with a computer:</span></b></i> By this I mean knowing how to work with at least one operating system, such as Windows. Many bioinformatics students and researchers prefer Linux because it is open source and most of its software is free, but Windows is not bad at all for bioinformatics, because much of the software designed for Linux is available for Windows too.<br />
<br />
<i style="color: #6fa8dc;"><b>2- How to use internet browsers:</b></i><span style="color: #6fa8dc;"> </span>This is indispensable because the internet is what made bioinformatics move so fast. If you want to be a bioinformatician, you have to know how to work with internet browsers (Internet Explorer, Netscape, Chrome, Firefox). I personally prefer Firefox, which I find very easy and powerful.<br />
<br />
<i style="color: #6fa8dc;"><b>3- How to install new software:</b></i> This is a basic skill you should have. Installing Windows-based software is usually a piece of cake compared to installing Linux-based software.<br />
<br />
<i style="color: #6fa8dc;"><b>4- A little knowledge of Molecular Biology:</b></i> You can't be a bioinformatician without at least a little knowledge of biology, especially molecular biology and genetics. It would be like wanting to play the guitar without knowing what a guitar is!<br />
<br />
<i style="color: #6fa8dc;"><b>5- How to surf the internet:</b></i> This is very important because most bioinformatics operations are carried out online, so you have to know how to open a website, navigate it, download from it, etc.<br />
<br />
The most important part of knowing how to surf the internet is knowing how to use search engines, because they will lead you to almost anything you need.<br />
<br />
These are the basic skills that any Bioinformatics student should have.<br />
<br />
For more suggestions about this, please comment.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-76901463539406955642010-01-29T06:50:00.000-08:002010-01-29T06:50:29.642-08:00List of the most popular and useful Databases in Bioinformatics<div style="background-color: orange; text-align: center;"><span style="color: white; font-size: x-large;"><b>List of the most popular and useful Databases in Bioinformatics</b></span></div><br />
As biological data grows every day, maintaining this huge amount of data has become hard, so here are what I consider the best organized and maintained <b>bioinformatics databases.</b><br />
<br />
<b style="background-color: orange; color: black;">GenBank on NCBI:</b> this resource is among the most powerful in <b>bioinformatics</b> because it is designed for almost everything: proteins, genes, genomes, structures, etc.<br />
To visit NCBI click <a href="http://www.ncbi.nlm.nih.gov/">HERE.</a><br />
<br />
<b style="background-color: orange; color: black;">SwissProt:</b> if your query is a protein sequence, I advise you to use SwissProt, which is located on the ExPASy proteomics server. There you will also find dozens of useful programs that you can use to analyze your sequence.<br />
To visit SwissProt or the ExPASy proteomics server click <a href="http://www.expasy.ch/">HERE.</a><br />
<br />
<b style="background-color: orange; color: black;">Integrated Microbial Genomes:</b> this database is for complete genomes. I like it because it is very well organized and anyone can get used to it in a few minutes.<br />
To visit the Integrated Microbial Genomes click <a href="http://img.jgi.doe.gov/cgi-bin/pub/main.cgi">HERE.</a><br />
<br />
<b style="color: black;"><span style="background-color: orange;">TIGR:</span></b> The Institute for Genomic Research, founded by Craig Venter, is a project for complete bacterial genomes. If you are a microbiologist, this database is exactly for you. In addition to the database, bioinformaticians working on the TIGR project have developed a set of very useful tools for analyzing the genomes in the database, such as GLIMMER and MUMmer.<br />
To visit the TIGR project click <a href="http://www.jcvi.org/">HERE.</a><br />
<br />
<b style="background-color: orange; color: black;">Ensembl:</b> for me it is the best database for complete genomes because it contains a lot of graphical tools for interpreting and analyzing data, so you don't get bored while exploring it. Everything is visual!<br />
To visit Ensembl click <a href="http://www.ensembl.org/index.html">HERE.</a><br />
<br />
There are more databases and projects on the internet, but I have found these databases very helpful in my research.<br />
<br />
If you have more useful databases or projects you can post it in the comment section.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-85195050970201555352010-01-27T08:18:00.000-08:002010-01-27T08:18:30.298-08:00Bioinformatics: Transcriptomics<div style="background-color: orange; text-align: center;"><b><span style="font-size: x-large;">Bioinformatics: Transcriptomics</span></b><br />
</div><br />
In human DNA, less than 5% of the genome codes for proteins. Part of the rest is involved in controlling and regulating the expression of those genes, which helps explain why cellular processes are so precise.<br />
<br />
Now that extensive sequencing projects have covered many genomes, the new challenge is to identify the expression patterns of the genes we have sequenced. That is where <b>Transcriptomics</b> becomes very useful.<br />
<br />
<br />
<br />
<div style="text-align: center;"><b> <span style="background-color: orange; font-size: large;">So what is Transcriptomics?</span><br />
</b><br />
</div><br />
<b>Transcriptomics</b> is the study of the complete set of RNA transcripts produced by the genome (Transcriptome) at a given time.<br />
<br />
<b>Transcriptomics</b>, also called gene expression profiling or genome-wide expression profiling, helps us understand the genes and pathways involved in biological processes. Put simply, it examines the expression levels of mRNAs.<br />
<br />
<br />
<br />
<div style="background-color: orange; text-align: center;"><span style="font-size: large;"><b>So what can transcriptomics do for us?</b></span><br />
</div><br />
As mentioned before, <b>Transcriptomics</b> can tell us which genes are activated, when they are activated, what activates them, etc.<br />
<br />
In <b>Transcriptomics</b>, similarities in expression patterns give us clues that genes may be functionally related and may share the same genetic control mechanism.<br />
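<br />
One common way to put a number on "similar expression pattern" is the Pearson correlation between two genes' expression profiles. This measure is an assumption made for this sketch rather than something the post prescribes, and the function name <b>pearson</b> is illustrative. A value close to 1 means the two genes rise and fall together across the measured conditions.<br />

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and spread, both left unnormalised because the factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (spread_x * spread_y)
```

A high correlation is only a clue, as the text says: co-expressed genes are candidates for shared regulation, and that still has to be confirmed experimentally.<br />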
<br />
<br />
The most common technology used to study expression levels is <a href="http://bioinformatics-made-easy.blogspot.com/2009/11/bioinformaticsgenomicsmicroarrays.html">DNA Microarray</a>. <br />
<br />
To understand what Microarrays are used for or Microarrays main applications, please read <a href="http://bioinformatics-made-easy.blogspot.com/2009/11/bioinformatics-dna-microarrays.html">THIS POST.</a><br />
<br />
Any questions, feel free to comment.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-81081594982318430462010-01-22T04:26:00.000-08:002010-01-26T01:13:10.744-08:00The best Bioinformatics programming language<div style="text-align: center;"><font style="color: rgb(255, 204, 51);" size="5"><font style="font-weight: bold;">The best Bioinformatics programming language</font></font><br /></div><br />As you know, <font style="font-weight: bold;">bioinformatics</font> is the use of computer hardware and software to analyze or interpret biological data. Most <font style="font-weight: bold;">bioinformaticians</font> use ready-made software, and most of this software can give you exactly what you want.<br /><br />But let's say that you want to extract some specific data from database files, for example. What will you do then?<br /><br /><font style="font-weight: bold;">Bioinformatics</font> software is written by specialists in the programming field using programming languages (C, C++, Perl, Python, Java, etc.). I'm not saying that you have to learn them all, but in my view PERL (Practical Extraction and Report Language) is the most powerful and best suited to <font style="font-weight: bold;">Bioinformatics.</font><br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Why exactly PERL:</font><br /><br />You may say that we have a lot of programming languages to choose from, so why <font style="font-weight: bold;">PERL</font>? We have already seen <font style="font-weight: bold;">bioinformatics</font> programs written in other languages (C, Java, Python, FORTRAN, etc.), but <font style="font-weight: bold;">PERL</font> stands out in the field because it is very good at detecting patterns in data, especially in what we call <font style="font-weight: bold;">STRINGs</font> of text. That is why I consider <font style="font-weight: bold;">PERL</font> the best programming language for <font style="font-weight: bold;">bioinformatics.</font><br /><br />We mean 
by <font style="font-weight: bold;">STRINGs</font> characters of DNA/RNA or protein sequences (ATGATCCAGT for example).<br /><br />I found this OREILLY book '<a href="http://bioinformatics-made-easy.blogspot.com/2010/01/books-beginning-perl-for-bioinformatics.html"><span style="font-weight: bold;">Beginning PERL For Bioinformatics</span></a>' very helpful, and i advise that you read it to understand better how to design your own programs that are suited to your needs instead of using others programs.<br /><br />Any question, comment.<br /><br />Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-3172864904300737015.post-57795043399236332112010-01-22T04:07:00.000-08:002010-01-24T08:13:39.206-08:00Books: Beginning Perl for Bioinformatics<div style="text-align: center;"><font style="font-weight: bold; color: rgb(255, 204, 51);" size="5"><font>Beginning Perl for Bioinformatics</font></font><br /></div><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSzHfjYS-lvAzYkeoGsJ15llfqjjOlJfrCxq_6dxRF-botQS3tnTX4VdlReI9m4jvM33BAfJsCr2iF3c2KpDRrnUvd8vjTziiJLUQMX3KsjcoTahSS01-E0b00jyd_Efev7rN_2JKFKgJL/s1600-h/PERL.bmp"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 175px; height: 240px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSzHfjYS-lvAzYkeoGsJ15llfqjjOlJfrCxq_6dxRF-botQS3tnTX4VdlReI9m4jvM33BAfJsCr2iF3c2KpDRrnUvd8vjTziiJLUQMX3KsjcoTahSS01-E0b00jyd_Efev7rN_2JKFKgJL/s320/PERL.bmp" alt="" id="BLOGGER_PHOTO_ID_5429540674627836674" border="0"></a><br /><br /><div style="text-align: center;"><iframe src="http://rcm.amazon.com/e/cm?t=sciencarttech-20&o=1&p=8&l=as1&asins=0596000804&fc1=000000&IS2=1&lt1=_blank&m=amazon&lc1=0000FF&bc1=000000&bg1=FFFFFF&f=ifr" style="width: 120px; height: 240px;" marginwidth="0" marginheight="0" scrolling="no" frameborder="0"></iframe><br /></div><br /><br /><p class="p data"><strong 
class="strong">By: </strong><a target="_blank" href="http://www.oreillynet.com/pub/au/617?x-t=book.view">James Tisdall</a></p><p class="p data"><strong class="strong">Publisher: </strong><font>O'Reilly Media, Inc.</font></p><p class="p data"><font>I found this book very helpful to understand the basics of using PERL to design programs that you need, to extract or manipulate data.<br /></font></p><p class="p data">If you read this book you'll be able to use your own designed programs to parse database files and extract only what you need and even analyze DNA/RNA or protein data.<br /><font></font></p><p class="p data"><font><br /></font></p><font size="4"><font style="font-weight: bold;">Table of Contents</font></font><br /><br />Copyright<br /><br />Preface<br /><br />What Is Bioinformatics?<br /><br />About This Book<br /><br />Who This Book Is For<br /><br />Why Should I Learn to Program?<br /><br />Structure of This Book<br /><br />Conventions Used in This Book<br /><br />Comments and Questions<br /><br />Acknowledgments<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">1. Biology and Computer Science</font><br /><br /><font style="font-weight: bold;">Section 1.1.</font> The Organization of DNA<br /><br /><font style="font-weight: bold;">Section 1.2.</font> The Organization of Proteins<br /><br /><font style="font-weight: bold;">Section 1.3.</font> In Silico<br /><br /><font style="font-weight: bold;">Section 1.4.</font> Limits to Computation<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 2. 
Getting Started with Perl</font><br /><br /><font style="font-weight: bold;">Section 2.1.</font> A Low and Long Learning Curve<br /><br /><font style="font-weight: bold;">Section 2.2.</font> Perl's Benefits<br /><br /><font style="font-weight: bold;">Section 2.3.</font> Installing Perl on Your Computer<br /><br /><font style="font-weight: bold;">Section 2.4.</font> How to Run Perl Programs<br /><br /><font style="font-weight: bold;">Section 2.5.</font> Text Editors<br /><br /><font style="font-weight: bold;">Section 2.6.</font> Finding Help<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 3. The Art of Programming</font><br /><br /><font style="font-weight: bold;">Section 3.1.</font> Individual Approaches to Programming<br /><br /><font style="font-weight: bold;">Section 3.2.</font> Edit—Run—Revise (and Save)<br /><br /><font style="font-weight: bold;">Section 3.3.</font> An Environment of Programs<br /><br /><font style="font-weight: bold;">Section 3.4.</font> Programming Strategies<br /><br /><font style="font-weight: bold;">Section 3.5.</font> The Programming Process<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 4. 
Sequences and Strings</font><br /><br /><font style="font-weight: bold;">Section 4.1.</font> Representing Sequence Data<br /><br /><font style="font-weight: bold;">Section 4.2.</font> A Program to Store a DNA Sequence<br /><br /><font style="font-weight: bold;">Section 4.3.</font> Concatenating DNA Fragments<br /><br /><font style="font-weight: bold;">Section 4.4.</font> Transcription: DNA to RNA<br /><br /><font style="font-weight: bold;">Section 4.5.</font> Using the Perl Documentation<br /><br /><font style="font-weight: bold;">Section 4.6.</font> Calculating the Reverse Complement in Perl<br /><br /><font style="font-weight: bold;">Section 4.7.</font> Proteins, Files, and Arrays<br /><br /><font style="font-weight: bold;">Section 4.8.</font> Reading Proteins in Files<br /><br /><font style="font-weight: bold;">Section 4.9.</font> Arrays<br /><br /><font style="font-weight: bold;">Section 4.10.</font> Scalar and List Context<br /><br /><font style="font-weight: bold;">Section 4.11.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 5. Motifs and Loops</font><br /><br /><font style="font-weight: bold;">Section 5.1.</font> Flow Control<br /><br /><font style="font-weight: bold;">Section 5.2.</font> Code Layout<br /><br /><font style="font-weight: bold;">Section 5.3.</font> Finding Motifs<br /><br /><font style="font-weight: bold;">Section 5.4.</font> Counting Nucleotides<br /><br /><font style="font-weight: bold;">Section 5.5.</font> Exploding Strings into Arrays<br /><br /><font style="font-weight: bold;">Section 5.6.</font> Operating on Strings<br /><br /><font style="font-weight: bold;">Section 5.7.</font> Writing to Files<br /><br /><font style="font-weight: bold;">Section 5.8.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 6. 
Subroutines and Bugs</font><br /><br /><font style="font-weight: bold;">Section 6.1.</font> Subroutines<br /><br /><font style="font-weight: bold;">Section 6.2.</font> Scoping and Subroutines<br /><br /><font style="font-weight: bold;">Section 6.3.</font> Command-Line Arguments and Arrays<br /><br /><font style="font-weight: bold;">Section 6.4.</font> Passing Data to Subroutines<br /><br /><font style="font-weight: bold;">Section 6.5.</font> Modules and Libraries of Subroutines<br /><br /><font style="font-weight: bold;">Section 6.6.</font> Fixing Bugs in Your Code<br /><br /><font style="font-weight: bold;">Section 6.7.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 7. Mutations and Randomization</font><br /><br /><font style="font-weight: bold;">Section 7.1.</font> Random Number Generators<br /><br /><font style="font-weight: bold;">Section 7.2.</font> A Program Using Randomization<br /><br /><font style="font-weight: bold;">Section 7.3.</font> A Program to Simulate DNA Mutation<br /><br /><font style="font-weight: bold;">Section 7.4.</font> Generating Random DNA<br /><br /><font style="font-weight: bold;">Section 7.5.</font> Analyzing DNA<br /><br /><font style="font-weight: bold;">Section 7.6.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 8. 
The Genetic Code</font><br /><br /><font style="font-weight: bold;">Section 8.1.</font> Hashes<br /><br /><font style="font-weight: bold;">Section 8.2.</font> Data Structures and Algorithms for Biology<br /><br /><font style="font-weight: bold;">Section 8.3.</font> The Genetic Code<br /><br /><font style="font-weight: bold;">Section 8.4.</font> Translating DNA into Proteins<br /><br /><font style="font-weight: bold;">Section 8.5.</font> Reading DNA from Files in FASTA Format<br /><br /><font style="font-weight: bold;">Section 8.6.</font> Reading Frames<br /><br /><font style="font-weight: bold;">Section 8.7.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 9. Restriction Maps and Regular Expressions</font><br /><br /><font style="font-weight: bold;">Section 9.1.</font> Regular Expressions<br /><br /><font style="font-weight: bold;">Section 9.2.</font> Restriction Maps and Restriction Enzymes<br /><br /><font style="font-weight: bold;">Section 9.3.</font> Perl Operations<br /><br /><font style="font-weight: bold;">Section 9.4.</font> Exercises<br /><br /><font style="font-weight: bold;">Chapter 10. GenBank</font><br /><br /><font style="font-weight: bold;">Section 10.1.</font> GenBank Files<br /><br /><font style="font-weight: bold;">Section 10.2.</font> GenBank Libraries<br /><br /><font style="font-weight: bold;">Section 10.3.</font> Separating Sequence and Annotation<br /><br /><font style="font-weight: bold;">Section 10.4.</font> Parsing Annotations<br /><br /><font style="font-weight: bold;">Section 10.5.</font> Indexing GenBank with DBM<br /><br /><font style="font-weight: bold;">Section 10.6.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 11. 
Protein Data Bank</font><br /><br /><font style="font-weight: bold;">Section 11.1.</font> Overview of PDB<br /><br /><font style="font-weight: bold;">Section 11.2.</font> Files and Folders<br /><br /><font style="font-weight: bold;">Section 11.3.</font> PDB Files<br /><br /><font style="font-weight: bold;">Section 11.4.</font> Parsing PDB Files<br /><br /><font style="font-weight: bold;">Section 11.5.</font> Controlling Other Programs<br /><br /><font style="font-weight: bold;">Section 11.6.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 12. BLAST</font><br /><br /><font style="font-weight: bold;">Section 12.1.</font> Obtaining BLAST<br /><br /><font style="font-weight: bold;">Section 12.2.</font> String Matching and Homology<br /><br /><font style="font-weight: bold;">Section 12.3.</font> BLAST Output Files<br /><br /><font style="font-weight: bold;">Section 12.4.</font> Parsing BLAST Output<br /><br /><font style="font-weight: bold;">Section 12.5.</font> Presenting Data<br /><br /><font style="font-weight: bold;">Section 12.6.</font> Bioperl<br /><br /><font style="font-weight: bold;">Section 12.7.</font> Exercises<br /><br /><font style="font-weight: bold; color: rgb(255, 204, 51);">Chapter 13. 
Further Topics</font><br /><br /><font style="font-weight: bold;">Section 13.1.</font> The Art of Program Design<br /><br /><font style="font-weight: bold;">Section 13.2.</font> Web Programming<br /><br /><font style="font-weight: bold;">Section 13.3.</font> Algorithms and Sequence Alignment<br /><br /><font style="font-weight: bold;">Section 13.4.</font> Object-Oriented Programming<br /><br /><font style="font-weight: bold;">Section 13.5.</font> Perl Modules<br /><br /><font style="font-weight: bold;">Section 13.6.</font> Complex Data Structures<br /><br /><font style="font-weight: bold;">Section 13.7.</font> Relational Databases<br /><br /><font style="font-weight: bold;">Section 13.8.</font> Microarrays and XML<br /><br /><font style="font-weight: bold;">Section 13.9.</font> Graphics Programming<br /><br /><font style="font-weight: bold;">Section 13.10.</font> Modeling Networks<br /><br /><font style="font-weight: bold;">Section 13.11.</font> DNA Computers<br /><br /><font style="font-weight: bold;">Appendix A.</font> Resources<br /><br /><font style="font-weight: bold;">Section A.1.</font> Perl<br /><br /><font style="font-weight: bold;">Section A.2.</font> Computer Science<br /><br /><font style="font-weight: bold;">Section A.3.</font> Linux<br /><br /><font style="font-weight: bold;">Section A.4.</font> Bioinformatics<br /><br /><font style="font-weight: bold;">Section A.5.</font> Molecular Biology<br /><br /><font style="font-weight: bold;">Appendix B.</font> Perl Summary<br /><br /><font style="font-weight: bold;">Section B.1.</font> Command Interpretation<br /><br /><font style="font-weight: bold;">Section B.2.</font> Comments<br /><br /><font style="font-weight: bold;">Section B.3.</font> Scalar Values and Scalar Variables<br /><br /><font style="font-weight: bold;">Section B.4.</font> Assignment<br /><br /><font style="font-weight: bold;">Section B.5.</font> Statements and Blocks<br /><br /><font style="font-weight: bold;">Section B.6.</font> Arrays<br 
/><br /><font style="font-weight: bold;">Section B.7.</font> Hashes<br /><br /><font style="font-weight: bold;">Section B.8.</font> Operators<br /><br /><font style="font-weight: bold;">Section B.9.</font> Operator Precedence<br /><br /><font style="font-weight: bold;">Section B.10.</font> Basic Operators<br /><br /><font style="font-weight: bold;">Section B.11.</font> Conditionals and Logical Operators<br /><br /><font style="font-weight: bold;">Section B.12.</font> Binding Operators<br /><br /><font style="font-weight: bold;">Section B.13.</font> Loops<br /><br /><font style="font-weight: bold;">Section B.14.</font> Input/Output<br /><br /><font style="font-weight: bold;">Section B.15.</font> Regular Expressions<br /><br /><font style="font-weight: bold;">Section B.16.</font> Scalar and List Context<br /><br /><font style="font-weight: bold;">Section B.17.</font> Subroutines and Modules<br /><br /><font style="font-weight: bold;">Section B.18.</font> Built-in Functions<br /><br /><font style="font-weight: bold;">Index</font><br /><br /><div style="text-align: center;"><iframe src="http://rcm.amazon.com/e/cm?t=sciencarttech-20&o=1&p=8&l=as1&asins=0596000804&fc1=000000&IS2=1&lt1=_blank&m=amazon&lc1=0000FF&bc1=000000&bg1=FFFFFF&f=ifr" style="width: 120px; height: 240px;" marginwidth="0" marginheight="0" scrolling="no" frameborder="0"></iframe><br /></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3172864904300737015.post-34481163127804495992010-01-22T02:11:00.000-08:002010-01-22T02:19:09.050-08:00Bioinformatics: Different Blast Programs<div style="text-align: center;"><span style="font-weight: bold; color: rgb(255, 204, 0);font-size:180%;" >Bioinformatics: Different Blast Programs</span><br /></div><br /><span style="font-weight: bold;">BLAST</span> or (<span style="font-weight: bold;">Basic Local Alignment Search Tool</span>) is a set of programs that search for similar sequences to your query sequence, so you can find hundreds of similar sequences to 
yours in about 20 seconds.<br /><br />BLAST has a set of programs, each with a specific role:<br /><br /><span style="font-weight: bold;">BLASTN:</span> Nucleotide query sequence against a nucleotide sequence database.<br /><br /><span style="font-weight: bold;">BLASTP:</span> Amino acid query sequence against a protein sequence database. You can find it <a href="http://www.expasy.ch/tools/blast/">HERE</a>.<br /><br /><span style="font-weight: bold;">BLASTX:</span> Nucleotide query sequence translated in all six reading frames against a protein sequence database.<br /><br /><span style="font-weight: bold;">TBLASTX:</span> Six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.<br /><br /><span style="font-weight: bold;">TBLASTN:</span> Protein query sequence against a nucleotide sequence database translated in all six reading frames. You can find it <a href="http://www.expasy.ch/tools/blast/">HERE</a>.<br /><br />Or you can find them all at <a href="http://www.ch.embnet.org/software/aBLAST.html">ch.EMBnet.org</a>.<br /><br />You can also find other programs, such as:<br /><br /><span style="font-weight: bold;">1- PSI-BLAST:</span> Position Specific Iterative BLAST detects weak homologs by building a profile from a multiple alignment of the highest scoring hits in an initial BLAST search.<br />Available at <a href="http://www.ncbi.nih.gov/BLAST">NCBI</a>.<br /><br /><span style="font-weight: bold;">2- PHI-BLAST:</span> Pattern-Hit Initiated BLAST combines matching of regular expressions with local alignments surrounding the match.<br />Available at <a href="http://www.ncbi.nih.gov/BLAST">NCBI</a>.<br /><br />To learn how to use BLAST to search for similarities, you can watch this video tutorial <a href="http://bioinformatics-made-easy.blogspot.com/2009/12/bioinformatics-using-blast-to-search.html">HERE</a>.<br /><br />Any questions, you are welcome.Unknownnoreply@blogger.com0