In IDseq, each sample you upload belongs to a project. Select the Public Data tab to explore the publicly available projects on IDseq. To see all the samples within a project, click the project name, or search for the project in the search bar and select it from the suggestions. The Project Page lists the samples within that project, and the panel on the right displays several metrics about the pipeline run.
To follow along with this documentation, navigate to the Medical Detectives project by searching for "Medical Detectives" in the search bar.
The number next to the Samples tab indicates that there are 15 samples in the Medical Detectives project. The table contains metrics that summarize the metadata and quality of the individual samples.
Customizing the View
The table below shows all possible columns for the Samples Table. The visible columns are marked with a check. The Samples Table shows a default set of columns, but you can add or remove columns by selecting the + button in the right-hand corner of the table and then selecting column names.
You can also collapse the summary statistics and the filters on either side of the Samples Table by selecting the blue icons above the sidebars (see below). Hiding the sidebars can increase the available space to show more columns in the Samples Table.
Sample Metrics Summary Table
The Sample Metrics Summary Table displays a list of samples with metadata and information about the IDseq pipeline run.
Metrics and Their Meanings
You can reference the table below to better understand the meaning of each available metric (or column) in the Samples Table.
|Samples Table Metric (* = not shown by default; add it to the table with the column selector)|Definition|
|Sample|The user-defined sample name.|
|Uploaded On|The date on which the sample was initially uploaded to IDseq.|
|Host|The user-supplied organism from which this sample was collected. This value is selected by the user at sample upload and dictates which genomes are used for the initial host subtraction pipeline steps.|
|Location|The user-supplied location from which the sample was collected.|
|Passed Filters|The percentage of reads remaining after host and quality filtering, compared to the initial reads input. To learn more about pipeline steps, you can check out our Wiki or read more in the Pipeline Details section.|
|Passed QC|The percentage of reads that passed the quality filtering thresholds imposed by Trimmomatic and PriceSeq. To learn more about pipeline steps, you can check out our Wiki or read more in the Pipeline Details section.|
|Total Reads*|The total number of reads uploaded.|
|DCR*|Duplicate Compression Ratio: the ratio of the total number of sequences present before running cd-hit-dup (duplicate identification) to the number of unique sequences. Higher values indicate more duplicate reads, which in turn indicates lower library complexity.|
|ERCC reads*|The total number of reads aligning to ERCC (External RNA Controls Consortium) sequences. The number of ERCC reads will be proportionally higher in samples with relatively low input RNA than in samples with high amounts of input RNA.|
|Nucleotide Type*|User-supplied metadata field indicating the nucleotide type (RNA or DNA).|
|Sample Type*|User-supplied metadata field indicating the sample type.|
|SubSampled Fraction*|After host filtration and QC, the remaining reads are subsampled to 1 million fragments (2 million paired reads). This field indicates the ratio of subsampled reads to total reads passing the host filtration and QC steps.|
|Total Runtime*|The total time required by the IDseq pipeline to process the uploaded files into IDseq reports.|
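The ratio-style metrics above (Passed Filters, DCR, and SubSampled Fraction) can be illustrated with a small worked example. This is only a sketch of the arithmetic described in the definitions, not the IDseq pipeline's code. The function names and the read counts are hypothetical. The sketch also assumes that a sample with fewer reads than the subsampling cap is kept whole (fraction 1.0), and that the cap for paired-end data is 2 million reads.

```python
def passed_filters_percent(reads_remaining: int, total_reads: int) -> float:
    """Percentage of the initial input reads left after host and quality filtering."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * reads_remaining / total_reads


def duplicate_compression_ratio(total_sequences: int, unique_sequences: int) -> float:
    """Sequences present before duplicate identification divided by unique sequences."""
    if unique_sequences <= 0:
        raise ValueError("unique_sequences must be positive")
    return total_sequences / unique_sequences


def subsampled_fraction(reads_passing: int, cap: int = 2_000_000) -> float:
    """Ratio of subsampled reads to reads passing host filtration and QC.

    Assumption: if fewer reads pass than the cap, all of them are kept.
    """
    if reads_passing <= 0:
        raise ValueError("reads_passing must be positive")
    return min(reads_passing, cap) / reads_passing


# Hypothetical counts for one paired-end sample.
print(passed_filters_percent(2_500_000, 10_000_000))       # 25.0
print(duplicate_compression_ratio(8_000_000, 4_000_000))   # 2.0
print(subsampled_fraction(4_000_000))                      # 0.5
```

In this made-up sample, a DCR of 2.0 means that, on average, each unique sequence appeared twice before duplicate identification.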
To download the data in the Samples Table as a CSV file for further analysis, select the cloud icon above the Samples Table. A dropdown will appear. Select Sample Table to start the download.
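Once downloaded, the CSV can be explored with any scripting language. The sketch below uses Python's standard `csv` module to list samples whose DCR exceeds a chosen threshold. The column headers (`sample_name`, `duplicate_compression_ratio`) and the threshold of 2.0 are assumptions made for illustration. Check the header row of your own export and pass the actual column names.

```python
import csv
import io


def samples_above_dcr(csv_file, threshold,
                      name_column="sample_name",
                      dcr_column="duplicate_compression_ratio"):
    """Return (sample name, DCR) pairs for rows whose DCR exceeds the threshold.

    The default column names are hypothetical; adjust them to match the export.
    Rows with a missing or non-numeric DCR value are skipped.
    """
    flagged = []
    for row in csv.DictReader(csv_file):
        raw = (row.get(dcr_column) or "").strip()
        try:
            dcr = float(raw)
        except ValueError:
            continue  # empty or non-numeric cell
        if dcr > threshold:
            flagged.append((row.get(name_column, ""), dcr))
    return flagged


# Stand-in for a downloaded file. With a real export, use:
#   with open("sample_table.csv", newline="") as f: ...
demo = io.StringIO(
    "sample_name,duplicate_compression_ratio\n"
    "sample_a,1.2\n"
    "sample_b,3.5\n"
    "sample_c,\n"
)
flagged = samples_above_dcr(demo, threshold=2.0)
print(flagged)  # [('sample_b', 3.5)]
```

The column names are parameters rather than constants, so the same helper works if the export labels its columns differently.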