Download Consensus Genome Data

Jump to Section:

Download Data for a Single Consensus Genome
Download Data for One or Multiple Consensus Genomes
Available Intermediate Files
Note on Consensus Genome Data Submission to NCBI

Overview

You can download virus consensus genome data, including consensus genome sequences (FASTA format) and the intermediate files produced throughout the pipeline. This guide outlines the steps for downloading that data.

After reading this guide, you will:

Learn different options for downloading consensus genome data
Become familiar with available intermediate files

Download Data for a Single Consensus Genome

You can download data for a single consensus genome from the Sample Report page. Here you can download the consensus genome sequence and generated intermediate files in a single folder.

To download a folder with consensus genome data:

1. Navigate to the Consensus Genome tab for the sample.

2. If you assembled multiple genomes using reads from the same sample, you can change the displayed consensus genome. To change the consensus genome, select the genome of interest from the dropdown menu.

3. To download all the data associated with the displayed consensus genome, click the "Download All" button on the right-hand side of the page.

Download Data for One or Multiple Consensus Genomes

You can download data for one or more consensus genomes at the same time (bulk download) from the Consensus Genomes tab of the project of interest. From this tab you can download the consensus genome sequence, assembly metrics, sample metadata, and intermediate files.

To download consensus genome files of interest:

1. Navigate to the Consensus Genomes tab found on the Project page of interest.
2. Select genomes of interest.
3. Click the download icon.
4. Select download type: A modal will appear where you can select the download type. Select the file of interest and click the "Start Generating Download" button.
5. Find Downloads page: Some files will download directly to your device. However, most downloads will be available through the Downloads page. To get to the Downloads page, open the dropdown menu by your username and select "Downloads".
6. Check download status: Once you navigate to the Downloads page, check the status of the download. Note that files available through the Downloads page are deleted 7 days after the download is created.
7. Download file: When the download status is "complete", click the Download File link to download the file to your device.

Available Intermediate Files

You can download intermediate files, including the mapping information contained in BAM files, to troubleshoot genome quality issues and to submit data to public repositories. The table below describes the intermediate files available for download through the Sample or Project pages for consensus genomes.

Filename	Description	Use
consensus.fa	Consensus genome sequence (FASTA format)	Assembled consensus genome that can be used for downstream analyses (e.g., phylogenetic tree builds)
depths.png	Image of coverage plot	Visualize genome coverage
report.tsv/report.txt	QUAST report in TSV and TXT format.	Evaluate assembly metrics
aligned_reads.bam	Initial reads that aligned to the reference genome	Can be used in a genome browser to view read-level alignments to the reference sequence and evaluate SNPs, ambiguous bases, etc.
primertrimmed.bam	Aligned reads after soft-clipping primer sequences. Note: The consensus genome pipeline available through the mNGS Sample Report does not include a primer trimming step. Therefore, the “aligned_reads.bam” and “primertrimmed.bam” files are the same for genomes generated through the mNGS Sample Report.	Can be used in a genome browser to view read-level alignments to the reference sequence and evaluate SNPs, ambiguous bases, etc.
primertrimmed.bam.bai	Companion index file for primertrimmed.bam (same as aligned_reads.bam file)	Used with primertrimmed.bam file to view read alignments in genome browser
sample.muscle.out.fasta	MUSCLE pairwise alignment between reference and consensus genome sequences in FASTA format.	Can be used to inspect alignment between reference and assembled consensus genomes
ercc_stats.txt	ERCC spike-in stats	Used for evaluating ERCC spike-in controls. Note that ERCC stats are also computed through the mNGS pipeline and are available in the sample details panel. The metrics may differ slightly due to different calculation methods.
no_host_1.fq.gz and no_host_2.fq.gz	Reads after subtracting host/human sequences (referred to as “non-host” reads)	Non-host reads can be uploaded to the Sequence Read Archive (SRA)
samtools_depth.txt	Text file summarizing read depth at each position of the reference sequence	Can be used to plot coverage
stats.json	Text file summarizing assembly metrics	Secondary quality control check for coverage
variants.vcf.gz	Single nucleotide polymorphism (SNP) data in variant call format (VCF)	Can be viewed in the Integrative Genomics Viewer (IGV) to identify SNP locations and determine the number of variants within a host.
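
As an illustration of how samtools_depth.txt can be used to check coverage, here is a minimal Python sketch. It assumes the file follows the usual `samtools depth` layout of three tab-separated columns (reference name, 1-based position, depth) with no header line; that layout has not been confirmed for this pipeline's output. The function name `summarize_depth`, the `min_depth` threshold of 10, and the optional `ref_length` argument are hypothetical choices made for the example.

```python
def summarize_depth(lines, min_depth=10, ref_length=None):
    """Summarize per-position read depth.

    Assumes each line is "<reference>\t<position>\t<depth>", the usual
    `samtools depth` layout (an assumption about this file).
    `ref_length` is an optional, hypothetical argument: supply it if
    zero-depth positions may be missing from the file.
    """
    depths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth = line.split("\t")[:3]
        depths.append(int(depth))

    # Use the reference length when given, otherwise the listed positions.
    total = ref_length if ref_length is not None else len(depths)
    if total == 0:
        return {"positions": 0, "mean_depth": 0.0, "breadth": 0.0}

    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "positions": total,
        "mean_depth": sum(depths) / total,
        "breadth": covered / total,  # fraction of positions at >= min_depth
    }


# Example use with a downloaded file:
#     with open("samtools_depth.txt") as handle:
#         print(summarize_depth(handle))
```

The threshold of 10 is only a placeholder; choose a value that matches your own quality criteria.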

Note: BAM and VCF files can be viewed using the freely available Integrative Genomics Viewer (IGV), which includes a Web App for analyzing genomes online.
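
If you prefer a quick text listing of SNP positions over a genome browser, the following Python sketch reads variants.vcf.gz using only the standard library. It relies on the standard VCF column order (CHROM, POS, ID, REF, ALT, ...) and treats a record as a SNP when the reference allele and every alternate allele are one base long. The function names `iter_snps` and `read_snps` are hypothetical, and whether the pipeline's VCF contains anything other than SNPs is not stated here.

```python
import gzip


def iter_snps(vcf_lines):
    """Yield (chrom, pos, ref, alts) for single-base substitutions.

    Works on any iterable of VCF text lines; header lines start with '#'.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        alts = alt.split(",")
        # A SNP here means REF and every ALT allele are one base long.
        if len(ref) == 1 and all(len(a) == 1 and a != "." for a in alts):
            yield chrom, pos, ref, alts


def read_snps(path):
    """Read a gzip-compressed VCF file and return its SNP records."""
    with gzip.open(path, "rt") as handle:
        return list(iter_snps(handle))


# Example use with a downloaded file:
#     for chrom, pos, ref, alts in read_snps("variants.vcf.gz"):
#         print(chrom, pos, ref, ",".join(alts))
```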

Note on Consensus Genome Data Submission to NCBI

Downloaded data can be submitted to NCBI’s public repositories for the benefit of the broader scientific community. Consensus genomes can be submitted to GenBank, whereas non-host reads and BAM files can be submitted to the Sequence Read Archive (SRA). NCBI provides an overview of how to submit sequence data.
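
Before submitting a consensus genome, it can be useful to check its length and how many ambiguous bases it contains. The Python sketch below summarizes the first record of a FASTA file such as consensus.fa. It counts only the character N as ambiguous (other IUPAC ambiguity codes are ignored). The function name `consensus_summary` is hypothetical, and the sketch does not reflect any NCBI acceptance criteria.

```python
def consensus_summary(fasta_lines):
    """Summarize the first sequence record in FASTA-formatted lines."""
    name = None
    parts = []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                break  # stop at the second record; only the first is summarized
            name = line[1:]
        elif line:
            parts.append(line.upper())

    sequence = "".join(parts)
    n_count = sequence.count("N")  # only 'N' is treated as ambiguous here
    return {
        "name": name,
        "length": len(sequence),
        "n_count": n_count,
        "n_fraction": n_count / len(sequence) if sequence else 0.0,
    }


# Example use with a downloaded file:
#     with open("consensus.fa") as handle:
#         print(consensus_summary(handle))
```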
