Overview
The number of rows in the sample report table can be overwhelming because CZ ID shows everything detected in a sample by default. Metagenomic next-generation sequencing is a non-targeted approach, so it may pick up contaminant and/or commensal microbes that are often irrelevant when you are looking for disease-causing pathogens. CZ ID's filters make it easier to remove irrelevant results from the sample report table (see Reduce Noise in Sample Report). Here we describe the Category, Threshold, Read Specificity, Annotation, and Pathogen Flag filters found above the sample report table.
Note that applied filter(s) will appear in a blue box above the report table. You can remove applied filter(s) at any point by clicking the "X" next to the filter you wish to remove. The text below the filter boxes will specify how many rows passed the specified filter(s).
Category Filter
The Category Filter gives you the ability to focus on microbial groups of interest (e.g., Archaea, Bacteria, Eukaryotes, or Viruses). Select which group(s) you would like to view in the report table from the Categories dropdown menu.
Threshold Filters
Threshold Filters let you filter out low-confidence taxa using thresholds for abundance and/or alignment metrics. These thresholds are useful for removing spurious matches (false positives) or taxa that are present at very low abundance. Use the Threshold Filters dropdown menu to add thresholds for multiple metrics at once. Thresholds are combined with "AND" logic, so a taxon is shown only if it passes every threshold. Click the Apply button after specifying the desired threshold(s).
The Threshold Filters dropdown menu includes:
- Metric menu: Choose the metric to which the threshold applies.
- Minimum or maximum: Select whether taxa must be "greater than" or "less than" the specified threshold value.
- Threshold value: Enter the threshold value in this box.
- Add threshold: Click here to specify thresholds for additional metrics.
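The "AND" behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not CZ ID's implementation: the row layout, the taxon names, the values, and the `apply_thresholds` helper are all assumptions made for this example. The metric labels (NT rPM, NT L) are the ones used in this article.

```python
import operator

# Hypothetical report rows. The metric labels follow this article,
# but the dictionary layout and the values are made up for illustration.
rows = [
    {"taxon": "Taxon A", "NT rPM": 120.0, "NT L": 142.0},
    {"taxon": "Taxon B", "NT rPM": 3.5, "NT L": 98.0},
    {"taxon": "Taxon C", "NT rPM": 45.0, "NT L": 40.0},
]

# "Minimum or maximum" maps to a greater-than or less-than comparison.
OPERATORS = {">": operator.gt, "<": operator.lt}


def apply_thresholds(rows, thresholds):
    """Keep only rows that satisfy every (metric, op, value) threshold (AND logic)."""
    return [
        row
        for row in rows
        if all(OPERATORS[op](row[metric], value) for metric, op, value in thresholds)
    ]


# Two thresholds applied at once: both must hold for a taxon to be kept.
thresholds = [("NT rPM", ">", 10), ("NT L", ">", 50)]
passed = apply_thresholds(rows, thresholds)
print([row["taxon"] for row in passed])
```

In this made-up data, Taxon B fails the NT rPM threshold and Taxon C fails the NT L threshold, so only Taxon A remains.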
Read Specificity Filter
By default, the sample report table will only display taxa that have been classified to the species level. You can change this using the Read Specificity Filter. Select "Specific Only" (default) to only view taxa that have been classified to the species level or "All" to view all taxa, including those that were classified at the genus level.
Annotation Filter
After annotating detected taxa as "Hit", "Not a hit", or "Inconclusive", you can filter the report based on those annotations. Select the annotation of interest from the Annotation Filter dropdown menu. Click here to learn more about how to annotate taxa in the report.
Pathogen Flag Filter
You can use the Pathogen Flag Filter to limit the report table to known pathogenic taxa. To do this, select the "Known Pathogens" option from the Pathogen Flag Filter dropdown menu. Taxa are flagged as "Known Pathogens" based on CZ ID's pathogen list. Note that this list is not comprehensive and, in many cases, does not distinguish pathogenic strains or subtypes from non-pathogenic ones. Treat CZ ID's pathogen flag as a starting point for quickly identifying potentially pathogenic organisms in samples.
Reduce Noise in Sample Report
Simple ways to reduce noise in your report:
- Add a threshold filter of NT L (alignment length) > 50 bp. The pipeline already filters out short alignments (NT < 36 bp, NR < 10), which removes many false positives, but depending on read length, a 50 bp filter can remove additional false positives.
- Review each category (Virus, Bacteria, Eukaryote) separately to find taxa of interest.
- Add a threshold filter of NT rPM > 10. Note that some viral pathogens may be present at low levels, so you may choose a lower threshold (e.g., NT rPM > 1) for viruses.
- Create a background model for your samples and apply a Z-score threshold (e.g., NT Z-score > 1) to view taxa that are more abundant in samples than in negative controls.
- Apply NT r > 0 and NR r > 0 thresholds. Concordance between NT and NR matches provides more confidence in a given taxon match, so requiring both removes many spurious hits.
- Check the coverage visualization at the species level to see whether coverage is adequate to give confidence in the match.
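If you work with report values outside the web interface, the suggestions above can be combined into a single check. The sketch below is hypothetical: the field names (`nt_length`, `nt_rpm`, `nt_zscore`, `nt_reads`, `nr_reads`), the sample rows, and the `passes_noise_filters` helper are assumptions for illustration and are not CZ ID's export format or API. The cutoffs are the example values from the list, and the Z-score assumes a background model has been applied.

```python
def passes_noise_filters(row, default_rpm=10.0, viral_rpm=1.0,
                         min_length=50, min_zscore=1.0):
    """Return True if a taxon row passes all of the suggested noise filters."""
    # Viruses may be present at low levels, so use a lower rPM cutoff for them.
    rpm_cutoff = viral_rpm if row["category"] == "Viruses" else default_rpm
    return (
        row["nt_length"] > min_length      # NT L > 50 bp
        and row["nt_rpm"] > rpm_cutoff     # NT rPM > 10 (or > 1 for viruses)
        and row["nt_zscore"] > min_zscore  # NT Z-score > 1 against the background model
        and row["nt_reads"] > 0            # NT r > 0
        and row["nr_reads"] > 0            # NR r > 0 (NT/NR concordance)
    )


# Made-up rows, each chosen to exercise a different filter.
rows = [
    {"taxon": "Virus X", "category": "Viruses", "nt_length": 120,
     "nt_rpm": 2.0, "nt_zscore": 5.0, "nt_reads": 4, "nr_reads": 3},
    {"taxon": "Bacterium Y", "category": "Bacteria", "nt_length": 140,
     "nt_rpm": 2.0, "nt_zscore": 5.0, "nt_reads": 4, "nr_reads": 3},
    {"taxon": "Bacterium Z", "category": "Bacteria", "nt_length": 45,
     "nt_rpm": 300.0, "nt_zscore": 8.0, "nt_reads": 50, "nr_reads": 40},
    {"taxon": "Eukaryote W", "category": "Eukaryota", "nt_length": 130,
     "nt_rpm": 50.0, "nt_zscore": 3.0, "nt_reads": 20, "nr_reads": 0},
]

kept = [row["taxon"] for row in rows if passes_noise_filters(row)]
print(kept)
```

Here Virus X is kept because of the lower viral rPM cutoff, while the other rows fail the rPM, alignment length, and NT/NR concordance checks respectively. The coverage visualization still needs to be reviewed by eye for any taxon that remains.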