Overview
The AMR Sample Report contains the main AMR gene detection results, including AMR gene information, metrics for alignments against AMR reference sequences based on contig and read data, and pathogen-of-origin prediction results. Below we explain the information contained within the report using general categories, including AMR gene information and metrics for AMR sequence matches for contigs and reads. In addition, we describe how to find more information about detected AMR genes of interest through the AMR Gene Details Panel.
AMR Sample Report Layout
By default, the AMR Sample Report Table will be sorted by the Gene column and show a set of columns containing AMR gene information and metrics for contigs and reads. However, you can customize the number of columns, sort, and filter the table to suit your needs.
To view the AMR Sample Report, go to a Project page and click on a completed sample of interest.
The AMR Sample Report Table summarizes detected AMR gene information and alignment metrics for contigs and reads.
You may notice that some AMR genes are not supported by both contigs and reads. This is expected due to differences in the pipeline workflow for contigs and reads. See AMR FAQs for more details.
AMR Gene Information
The AMR pipeline organizes information regarding detected AMR sequences based on the Antibiotic Resistance Ontology (ARO) as implemented in the Comprehensive Antibiotic Resistance Database (CARD). The ARO provides a unified and controlled framework for naming and classifying known antibiotic resistance determinants (genes and mutations) and their associated products, mechanisms, and targets. Click here to see the ARO index listing known AMR determinants and their ARO accession numbers. The AMR Sample Report includes the following information regarding identified AMR genes:
Gene: Gene name for best match in CARD. Reported gene names are associated with ARO accession numbers that are not included in the report.
Gene Family: Refers to the ARO category for gene family. See CARD classification for a list of AMR gene families.
Drug Class: Refers to the ARO category specifying resistance to a given antibiotic molecule, including antibiotic/adjuvant combination medications. Below we summarize major antibiotic classes for your reference (see Merck Manual for more information).
High-level Drug Class: Refers to the antibiotic family of identified drug class. Antibiotic families are classified based on mechanism of action, chemical structure, or spectrum of activity. This antibiotic family level classification is only available for samples uploaded to projects created after April 19, 2023 (specifically those uploaded to AMR pipeline version 1.2.14 and later).
Mechanism: Refers to the ARO category specifying the resistance mechanism for a given AMR gene.
Model: Specifies the model used for AMR detection. For contigs, the detection models use predetermined BLAST bit-score cut-offs to detect AMR-related sequences within a “Perfect”, “Strict”, and “Loose” match paradigm (see “Cutoff” definition below). The AMR pipeline uses four types of models to align contig sequences to AMR genes:
- Protein Homolog Models (PHM): Use BLASTP to detect AMR-associated protein sequences based on their similarity to a curated reference sequence. This is the only model reported for read alignments (see note below).
- Protein Variant Models (PVM): Designed to detect AMR acquired via mutation of house-keeping genes or antibiotic targets. In addition to using BLASTP to detect AMR-associated protein sequences, PVMs screen query sequences for curated sets of mutations that could differentiate them from antibiotic susceptible genotypes.
- Protein Overexpression Models (POM): Used to detect mutations within regulatory proteins alone. POMs report wild-type sequences and/or sequences with mutations leading to overexpression of efflux complexes.
- Ribosomal RNA (rRNA) Gene Variant Models (RVM): Designed to detect AMR acquired via mutation of genes encoding ribosomal RNAs (rRNA). In addition to using BLASTN to detect AMR-associated rRNA sequences, RVMs screen query sequences for curated sets of mutations that could differentiate them from antibiotic susceptible genotypes.
Note:
The AMR pipeline only aligns reads to reference sequences from CARD's protein homolog models (PHM). Therefore, the AMR Sample Report will only contain metrics at the read level for matches found with PHM.
Contig Metrics
The AMR Sample Report includes information regarding AMR sequence matches based on contig sequences, including:
Contigs: Refers to the total number of contig sequences matching a given AMR gene.
Cutoff: Indicates the cutoff(s) used to detect AMR-associated contigs based on a “Perfect”, “Strict” and "Loose" match paradigm using curated BLAST bit-scores. The "Nudged" cutoff specifies Loose matches that have at least 95% identity. Below we describe each cutoff specified in the sample report.
- Perfect: Detects perfect or identical matches to the curated reference sequences in CARD.
- Strict: Detects previously unknown variants of known AMR genes and includes a secondary screen for key mutations.
- Loose: Detects new and more distant homologs of AMR genes by working outside of established BLAST bit-score cut-offs allowing for AMR gene discovery. However, this cutoff also finds spurious partial matches that may not have a role in AMR and, thus, may lead to artificially increased AMR detection. CZ ID only reports Loose contig matches that have at least 95% identity to known AMR genes (identified as "Nudged" in the AMR Sample Report).
- Nudged: Indicates Loose matches that have at least 95% identity to known AMR genes. It is important to double-check these nudged matches because they are based on a percent identity threshold (as opposed to curated BLAST bit-scores) and do not take into account alignment length. We recommend checking the coverage breadth to build confidence in Nudged matches.
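As a minimal sketch of the coverage-breadth check recommended for Nudged matches, the snippet below flags Nudged rows with low contig coverage breadth. The dictionary keys, the example genes, and the 50% threshold are all hypothetical and chosen for illustration; they are not the report's actual column headers or a CZ ID recommendation.

```python
# Hypothetical report rows; the keys are illustrative, not the report's real column headers.
rows = [
    {"gene": "geneA", "cutoff": "Nudged", "contig_pct_cov": 98.5, "contig_pct_id": 99.1},
    {"gene": "geneB", "cutoff": "Nudged", "contig_pct_cov": 12.0, "contig_pct_id": 96.4},
    {"gene": "geneC", "cutoff": "Strict", "contig_pct_cov": 100.0, "contig_pct_id": 97.8},
]


def low_coverage_nudged(rows, min_cov=50.0):
    """Return Nudged rows whose contig coverage breadth (%) is below min_cov.

    min_cov is an arbitrary example threshold, not a CZ ID recommendation.
    """
    return [r for r in rows if r["cutoff"] == "Nudged" and r["contig_pct_cov"] < min_cov]


for row in low_coverage_nudged(rows):
    print(f"{row['gene']}: {row['contig_pct_cov']}% coverage breadth, review manually")
```

A high percent identity over a short alignment can still produce a Nudged match, which is why the filter looks at coverage breadth rather than identity.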
Nudge from Strict to Perfect:
RGI, as implemented in CZ ID, bumps (or nudges) Strict matches to Perfect in some cases. These nudged Strict to Perfect matches have nearly perfect alignments to AMR genes but lack coverage for the C- or N-terminus of AMR proteins or have alternate start codons. You will not be able to distinguish these nudged Perfect matches from "true" Perfect matches based on the Cutoff column of the AMR Sample Report alone. If you want to distinguish nudged from true Perfect matches, look at the percent identity. In contrast to true Perfect matches, nudged matches will have less than 100% identity to AMR genes. You can also view details regarding nudged contig matches by downloading the Comprehensive AMR Metric file from the AMR Sample Report or Project pages (see Downloading AMR Data & Results for details).
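The percent-identity check described in this note can be sketched as follows. This is an illustrative heuristic only: the function name and the example tuples are hypothetical, and the values stand in for what you would read from the Cutoff and Contig % Id columns.

```python
def may_be_nudged_perfect(cutoff, pct_identity):
    """True when a match is labeled Perfect but its percent identity is below 100."""
    return cutoff == "Perfect" and pct_identity < 100.0


# Hypothetical (gene, cutoff, contig percent identity) tuples.
matches = [
    ("geneA", "Perfect", 100.0),
    ("geneB", "Perfect", 99.7),
    ("geneC", "Strict", 98.2),
]

nudged = [gene for gene, cutoff, pct_id in matches if may_be_nudged_perfect(cutoff, pct_id)]
print(nudged)
```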
Contig Coverage Breadth (Contig % Cov): Refers to the percentage length of the reference sequence that was covered by contig sequences. If you would like to see details about coverage breadth for individual contigs (as opposed to the total coverage breadth), see “contig_amr_report.tsv” within the raw reports folder.
Contig Percent Identity (Contig % Id): Refers to the average percentage identity between contig sequences and their top match in CARD.
Contig Species: Refers to pathogen-of-origin prediction based on AMR-associated contig sequences (beta testing phase). Identified agents, including pathogens at the species or genus level and plasmids, are determined using a classification system that identifies k-mer sequences (default k-mer length of 61 bp) found uniquely in AMR alleles of pathogenic bacteria or plasmids.
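To illustrate the general idea behind k-mer based pathogen-of-origin prediction, here is a simplified sketch, not the pipeline's actual implementation. It assumes a hypothetical `kmer_index` that maps each k-mer to the single category (species, genus, or plasmid) it is unique to. The `min_hits` default of 10 borrows the minimum stated for reads under Read Species; applying it to contigs, and picking the category with the most hits, are assumptions made for the example.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def classify(seq, kmer_index, k=61, min_hits=10):
    """Return the category with the most matching k-mers, or None.

    kmer_index maps a k-mer to the one category it is unique to
    (a hypothetical stand-in for the pipeline's k-mer database).
    A category is reported only if at least min_hits k-mers match it.
    """
    hits = Counter(kmer_index[km] for km in kmers(seq, k) if km in kmer_index)
    if not hits:
        return None
    category, count = hits.most_common(1)[0]
    return category if count >= min_hits else None


# Toy example with short k-mers so the logic is easy to follow.
toy_index = {"ACGT": "species_X", "CGTA": "species_X", "GTAC": "plasmid_Y"}
print(classify("ACGTAC", toy_index, k=4, min_hits=2))
```

In the toy example, two of the three k-mers point to `species_X`, so that category is returned; raising `min_hits` to 3 would leave the sequence unclassified.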
Read Metrics
The AMR Sample Report includes information regarding AMR sequence matches based on read data, including:
Reads: Refers to the total number of reads mapping to a given AMR reference sequence.
Reads Per Million (rPM): Refers to the number of reads aligning to a reference sequence in CARD per million reads sequenced.
Read Coverage Breadth (Reads % Cov): Refers to the percentage length of the reference sequence that was covered by read sequences.
Read Coverage Depth (Reads Cov Depth): Indicates the mean read depth across the reference sequence.
Read Depth Per Million (dPM): Indicates the number of bases that mapped to the reference sequence in CARD, divided by sequence length, per million reads sequenced.
Read Species: Refers to pathogen-of-origin prediction based on AMR-associated read data (beta testing phase). Identified agents, including pathogens at the species or genus level and pathogen-associated plasmids, are determined using a classification system that identifies k-mer sequences (default k-mer length of 61 bp) found uniquely in AMR alleles of pathogenic bacteria or plasmids. Identified pathogenic agents are accompanied by the number of identified reads that match the reported species, genus, or plasmid. For a given read sequence to be classified, it needs to contain at least 10 k-mers that match a single category.
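The numeric read metrics defined above (rPM, coverage breadth, coverage depth, and dPM) can be sketched from per-base depths as shown below. This is a worked illustration under stated assumptions, not the pipeline's code: the function and argument names are hypothetical, and `total_reads` stands in for whatever read total the pipeline uses as "reads sequenced".

```python
def read_metrics(depths, n_reads, total_reads):
    """Compute read-level metrics for one reference sequence.

    depths      -- read depth at each reference position (list of ints)
    n_reads     -- number of reads mapped to this reference
    total_reads -- total reads sequenced in the sample (assumed denominator)
    """
    ref_len = len(depths)
    per_million = total_reads / 1_000_000
    # Mean depth equals mapped bases divided by reference length.
    cov_depth = sum(depths) / ref_len
    return {
        "rpm": n_reads / per_million,
        "pct_cov": 100 * sum(d > 0 for d in depths) / ref_len,
        "cov_depth": cov_depth,
        "dpm": cov_depth / per_million,
    }


# Toy reference of 10 bases, 5 mapped reads, 2 million reads sequenced.
m = read_metrics(depths=[0, 2, 2, 4, 0, 0, 2, 2, 4, 4], n_reads=5, total_reads=2_000_000)
print(m)
```

In this toy case, 7 of 10 positions are covered (70% breadth), 20 mapped bases over 10 positions give a mean depth of 2.0, and normalizing by 2 million reads gives an rPM of 2.5 and a dPM of 1.0.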
View AMR Gene Details Panel
You can find more information about AMR genes of interest listed in the Sample Report Table by opening the Gene Details Panel. The panel will show information gathered from the CARD Antibiotic Resistance Ontology, including a general description, drug resistances, AMR gene family, and publications for the gene of interest.
To view the AMR Gene Details Panel, click on a gene name of interest within the Gene column of the Sample Report Table. The Gene Details Panel will open on the right-hand side of the page.
Click on a gene name of interest to view more information on the Gene Details Panel that will appear on the right-hand side of the page.
If there is no information for a given gene within CZ ID, you can search CARD directly using the gene name.
Use the CARD link to find more gene information by searching CARD directly.
AMR Heatmap
CZ ID does not offer a heatmap functionality to visualize AMR results. However, you can easily create a heatmap outside CZ ID by running a heatmap generator script that uses CZ ID's Combined AMR Reports as input data. The heatmap generator script can run locally on your machine or on the web via Google Colab. To learn more, visit the GitHub page for the AMR Heatmap Generator Script.
Notes regarding heatmap generator script:
- The heatmap generator script is not integrated into CZ ID. Google Colab is a separate service from CZ ID. Should you choose to upload your data to Google Colab, your data will be subject to Google’s Terms of Service and Privacy Policy. Click here to see frequently asked questions about Google Colab.
- We recommend uploading only the exact files specified in the instructions that are needed to generate the AMR heatmap. Do not upload host or raw sequencing data.
- Since the AMR heatmap generator script is external to CZ ID, our team will not be able to provide technical support. If you have any questions or comments, please post them on the GitHub page by creating a new issue.
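For readers curious about the data shape a heatmap needs, the sketch below pivots per-sample results into a gene-by-sample matrix. It is not the AMR Heatmap Generator Script and does not reflect the actual layout of the Combined AMR Reports; the record keys (`sample`, `gene`, `rpm`) and the example values are hypothetical.

```python
def pivot_gene_by_sample(records, value_key="rpm"):
    """Build {gene: {sample: value}}, using 0.0 where a gene was not detected."""
    samples = sorted({r["sample"] for r in records})
    genes = sorted({r["gene"] for r in records})
    matrix = {g: {s: 0.0 for s in samples} for g in genes}
    for r in records:
        matrix[r["gene"]][r["sample"]] = r[value_key]
    return matrix


# Hypothetical records, one per (sample, gene) detection.
records = [
    {"sample": "S1", "gene": "geneA", "rpm": 12.5},
    {"sample": "S2", "gene": "geneA", "rpm": 3.0},
    {"sample": "S2", "gene": "geneB", "rpm": 7.2},
]

matrix = pivot_gene_by_sample(records)
for gene, row in matrix.items():
    print(gene, row)
```

Each row of the resulting matrix corresponds to one gene and each column to one sample, which is the layout most plotting libraries expect for a heatmap.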