Bioinformatics: Nucleotide sequence database names for use with BLAST
Most people run BLAST on a hosted server such as NCBI, but sometimes you want to use a command-line BLAST installed on your own computer, under Windows or Linux.
To do that, you need to know the database names that the command-line BLAST accepts.
Here are some of the nucleotide database names that you can use with BLAST:
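1- nr : Nonredundant GenBank, a comprehensive collection of both amino acid and nucleotide sequence data, with redundancy reduced by merging sequences that are completely identical. Note that in the preformatted databases NCBI offers for download, the nucleotide collection is named nt, while nr on its own is the protein collection.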
2- est : expressed sequence tags.
3- sts : sequence tagged sites.
4- htgs : high-throughput genomic sequences.
5- ecoli : Complete genomic sequence of E. coli.
6- yeast : Complete genomic sequence of S. cerevisiae.
7- drosoph : Complete genomic sequence of D. melanogaster.
8- mito : Complete genomic sequences of vertebrate mitochondria.
9- vector : Collection of popular cloning vectors.
These are some of the most commonly used nucleotide database names in a BLAST search.
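As a small illustration, the names above can be checked in a script before a search is started. This is only a sketch: the function name is_listed_db is made up for this post, and it knows only the nine names listed here, not every database your local BLAST installation may have.

```shell
#!/bin/sh
# Hypothetical helper (not part of BLAST): succeeds only for the
# database names listed in this post.
is_listed_db() {
    case "$1" in
        nr|est|sts|htgs|ecoli|yeast|drosoph|mito|vector) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use: report whether a name is in the list.
if is_listed_db "est"; then
    echo "est is in the list"
else
    echo "est is not in the list"
fi
```

A name that passes this check can still fail in a real search if the matching database files are not installed on your computer.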
This is an example of a BLAST command:
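blastall -p blastn -i blast.in -d nr -o blast.out
blastall : program name.
-p : the BLAST program to run (blastn compares a nucleotide query against a nucleotide database).
-i : input file.
-o : output file.
blast.in & blast.out : the input file containing the query sequence and the output file that receives the results.
-d : database.
nr : Nonredundant.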
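The pieces of the command can also be assembled in a small script. The sketch below only prints the command line instead of running it, so it works even without BLAST installed. The function name build_blastall_cmd is hypothetical, and it assumes the -p option that blastall uses to select the search program (blastn for a nucleotide query against a nucleotide database).

```shell
#!/bin/sh
# Hypothetical helper: prints a blastall command line; it does not run BLAST.
# Arguments: program, database, input file, output file.
build_blastall_cmd() {
    program=$1
    database=$2
    input=$3
    output=$4
    printf 'blastall -p %s -d %s -i %s -o %s\n' \
        "$program" "$database" "$input" "$output"
}

# Print the command for the example in this post.
build_blastall_cmd blastn nr blast.in blast.out
```

If your computer has the newer BLAST+ package rather than blastall, a comparable call would be blastn -query blast.in -db nr -out blast.out.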
You can read the post (Using BLAST to search for similarities) to learn how to run a BLAST search against a database.
You can read (Different Blast Programs) to learn about the different BLAST programs.