BLAST (Basic Local Alignment Search Tool) is an NCBI tool that finds regions of similarity between sequences by comparing a query sequence to a database of known sequences. There are multiple types of BLAST (read about them here); however, CZ ID currently supports only BLASTN and BLASTX.
Performing a BLAST
When viewing the sample report, you may see a hit that you want to confirm using BLAST. This feature can be accessed from two locations: the taxon row and the coverage visualization.
BLAST from the taxon row
1a. To use this feature, hover over the taxon row and click the BLAST icon.
BLAST from the coverage visualization
1b. In the coverage visualization (which can be accessed by hovering over the taxon row and clicking the coverage visualization icon), click the BLAST icon.
2. Once you click the BLAST icon, a modal will pop up.
BLASTN is the standard BLAST program that searches the nucleotide database using a nucleotide query sequence. It is an integral part of the metagenomic workflow and can be used to:
- Confirm a taxon hit, since BLAST is the most accurate algorithm for identifying sequence homology.
- Gather the information needed for publication, such as the E-value, % identity, and bit score.
- Find the closest relatives of the taxon for creating phylogenetic trees.
- Perform quality control on contigs generated by SPAdes (a de novo assembler). By BLASTing contigs, users can identify chimeras (contigs formed by two or more reads that have been incorrectly joined) and reorient any contigs that need it for downstream analyses.
CZ ID sends up to 3 of the longest contigs that aligned to the NCBI nucleotide (NT) database to BLASTN. If no contigs aligned to the NT database, CZ ID sends up to 5 reads that aligned to the NT database. You can view the number of contigs and the number of reads that aligned to the NT database under the ‘contig’ and ‘r’ columns, respectively. An example is highlighted in orange below.
- If NT contigs are available, up to 3 of the longest contigs will be available for BLASTN.
- You can select which contigs you would like to BLAST by checking the box beside each contig.
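The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the stated limits (up to 3 longest contigs, otherwise up to 5 reads), not CZ ID's actual implementation. The function name `select_for_blast` and the dictionary inputs are hypothetical, and this article does not say which reads are chosen, so taking the first ones is an assumption.

```python
from typing import Dict, List, Tuple

MAX_CONTIGS = 3  # limit stated in this article
MAX_READS = 5    # limit stated in this article


def select_for_blast(contigs: Dict[str, str], reads: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return the kind of sequence chosen ("contigs" or "reads") and the chosen names.

    Both arguments map a sequence name to its nucleotide sequence
    (hypothetical input shape, chosen for illustration).
    """
    if contigs:
        # Contigs are preferred: keep the longest ones, up to the limit.
        longest_first = sorted(contigs, key=lambda name: len(contigs[name]), reverse=True)
        return "contigs", longest_first[:MAX_CONTIGS]
    # No contigs aligned, so fall back to reads. Taking the first reads is an
    # assumption; the article only states the maximum number.
    return "reads", list(reads)[:MAX_READS]


example_contigs = {"c1": "A" * 100, "c2": "A" * 500, "c3": "A" * 300, "c4": "A" * 50}
print(select_for_blast(example_contigs, {}))
```

With four contigs available, only the three longest are returned; with no contigs, the function falls back to reads.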
BLASTX translates the nucleotide sequence into protein sequences (in all six reading frames) and compares them to the protein database. It should be used for:
- Identifying encoded proteins
- Confirming novel viruses
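To make the translation step concrete, here is a minimal sketch of a six-frame translation using the standard genetic code. It only illustrates the idea behind BLASTX and is not the NCBI implementation. The function names are hypothetical, and codons containing ambiguous bases are simply shown as `X`.

```python
from itertools import product
from typing import Dict

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translation(seq: str) -> Dict[str, str]:
    """Translate three frames on the forward strand and three on the reverse strand."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Each of the six translations is a candidate protein; a stop codon appears as `*`.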
CZ ID sends up to 3 of the longest contigs that aligned to the NCBI nucleotide (NT) database or the NR database to BLASTX. If no contigs aligned to the NT database, CZ ID sends up to 5 reads that aligned to the NT database; the same applies to NR hits. You can view the number of contigs and the number of reads that aligned to the NT database under the ‘contig’ and ‘r’ columns, respectively.
If you would like to BLAST additional contigs or reads, you can download either file by hovering over the taxon row and selecting the download icon. From there, you can choose whether you would like the reads or the contigs FASTA file. You can then BLAST the contigs or reads of your choice. Read about additional BLAST types and how to interpret the results here.
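If you work with the downloaded FASTA file outside CZ ID, a small script can help you pick the sequences you want to BLAST. The sketch below uses only the Python standard library. The helper names, the inline example records, and the 10-base length cutoff are hypothetical and only show the pattern.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {name: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the record name.
            name = header[0] if header else f"unnamed_{len(records) + 1}"
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Format records as FASTA text, wrapping sequence lines at `width` characters."""
    out = []
    for name, seq in records.items():
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Hypothetical example input; in practice, read the lines from the downloaded file.
example = """>contig_1 length=8
ACGTACGT
>contig_2 length=12
ACGTAC
GTACGT
"""
records = parse_fasta(example.splitlines())
chosen = {name: seq for name, seq in records.items() if len(seq) >= 10}
print(to_fasta(chosen))
```

The printed FASTA text can be pasted into the query box of the BLAST program of your choice.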
3. Once you are ready to BLAST the reads or contigs, select ‘continue’.
4. You will be notified you are leaving CZ ID. Click continue to send your sequences to NCBI.
5. A BLASTN report will appear with default parameters. To complete the BLASTN, simply click “view report”. If you would like to learn about each parameter, click the blue question mark at the right of the box.
6. Once you have your BLAST results, you can annotate whether the taxon was confirmed, not a hit, or inconclusive using our annotation feature.
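If you want to collect the publication metrics mentioned earlier (E-value, % identity, and bit score) from many hits, a short script can summarize them. The sketch below assumes you have saved your hits in the standard 12-column tab-separated BLAST tabular layout; whether your export matches this layout is an assumption you should check against your file. The function names and the example rows (including the subject names) are made up for illustration.

```python
import csv
from typing import Dict, Iterable, List

# Column order of the standard 12-column BLAST tabular format.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def read_hits(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Parse tab-separated hit lines, skipping blank lines and '#' comment lines."""
    data_lines = (line for line in lines if line.strip() and not line.startswith("#"))
    hits = []
    for row in csv.reader(data_lines, delimiter="\t"):
        hit: Dict[str, object] = dict(zip(COLUMNS, row))
        for field in ("pident", "evalue", "bitscore"):
            hit[field] = float(hit[field])
        hits.append(hit)
    return hits


def best_hit_per_query(hits: List[Dict[str, object]]) -> Dict[str, Dict[str, object]]:
    """Keep the hit with the highest bit score for each query sequence."""
    best: Dict[str, Dict[str, object]] = {}
    for hit in hits:
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Made-up example rows; the subject names are placeholders, not real accessions.
example_lines = [
    "# example hit table",
    "contig_1\tsubject_A\t98.50\t200\t3\t0\t1\t200\t10\t209\t1e-100\t360",
    "contig_1\tsubject_B\t91.00\t180\t16\t0\t1\t180\t5\t184\t2e-60\t230",
]
for query, hit in best_hit_per_query(read_hits(example_lines)).items():
    print(query, hit["sseqid"], hit["pident"], hit["evalue"], hit["bitscore"])
```

Ranking by bit score is one reasonable choice for picking a top hit; you may prefer to sort by E-value or % identity depending on what you report.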