Overview
You can upload metagenomic and/or whole genome sequence data to CZ ID through the command line interface (CLI) for antimicrobial resistance (AMR) gene detection. After you upload data, samples run through the AMR pipeline. You can then view identified AMR genes in the AMR Sample Report found in the CZ ID web application. Note that you will have to complete your account profile the first time you log in to the web application.
The CLI feature for AMR analysis is available for short-read data obtained with Illumina sequencers. Here we describe how to upload samples to CZ ID for AMR analysis through the CLI and provide troubleshooting tips. We list steps for uploading samples using Mac and Windows operating systems (OS).
After reading this guide, you will be able to:
- Install the CZ ID CLI on your computer
- Set up a connection with your CZ ID account
- Upload short-read data to CZ ID for AMR analysis using the CLI
- View sample reports in the CZ ID web application
Why Use the CLI?
Although uploading Illumina samples through the CZ ID web application is straightforward, the CLI offers some advantages over the web interface. Uploading samples through the CLI may be faster than the web upload in some cases. Therefore, if sample upload is taking too long or is failing through the web interface, you may want to consider trying the CLI. The CLI also enables you to upload samples directly from systems with no user interface (e.g., remote servers) and may allow you to incorporate sample upload to CZ ID into automated workflows using other tools.
Install the CZ ID CLI and Upload Samples to CZ ID Using a Mac OS
Below we provide instructions to install the CZ ID CLI on your Mac or Linux system and establish a connection with your CZ ID account. Note that you only need to perform these steps once. After setting up your connection, you only need to log in to CZ ID to work with samples through the CLI. We also describe how to upload files to CZ ID through the CLI for AMR analysis. The instructions are divided into five general steps listed below. If you have already installed the CLI and uploaded samples through it, go to Step 4 to view upload codes for AMR samples.
Step 1: Install the CZ ID CLI
Step 2: Set up an initial connection with your CZ ID account
Step 3: Get files ready
Step 4: Prepare upload command (includes templates)
Step 5: Upload samples to CZ ID for AMR analysis
Step 1: Install the CZ ID CLI
You can easily install the latest release for the CZ ID CLI using Homebrew. To install the CZ ID CLI through Homebrew:
1) Download and install Homebrew on your computer by following steps 2 through 4. If you already have Homebrew on your computer, go to step 5.
2) Open your terminal.
3) Go to Homebrew and copy the installation command on the web page.
4) Paste the command into your Terminal and continue with the installation by following the prompts. Make sure to run the last two commands listed in the instructions to add Homebrew to your PATH environment variables.
5) After installing Homebrew, add the “chanzuckerberg tap” by typing the following command into your Terminal:
brew tap chanzuckerberg/tap
Note: If everything is going well, you should see a “Tapping chanzuckerberg/tap” message.
6) After adding the tap, install the CZ ID CLI package by typing the following into your Terminal:
brew install czid-cli
Note: If everything is going well, you should see messages regarding the progress of package download and installation.
MacOS Terminal highlighting CZ ID CLI installation commands using Homebrew, including (1) the chanzuckerberg tap and (2) CZ ID's CLI.
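To confirm that the installation steps above took effect, you can check whether the tools are visible on your PATH. The following is a minimal sketch, not part of the CZ ID CLI; the helper names `have_tool` and `report_tool` are hypothetical and only use the standard `command -v` shell builtin.

```shell
# Return success if the named program can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Print a one-line status for the named program.
report_tool() {
  if have_tool "$1"; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
  fi
}

# Example usage (uncomment to run):
# report_tool brew
# report_tool czid
```

If `czid` is reported as not found right after installation, opening a new Terminal window so that PATH changes are picked up is a reasonable first thing to try.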
Step 2: Set up Initial Connection with Your CZ ID Account
Use your credentials to log in to CZ ID via CLI. To do this:
1) Open your Terminal.
2) Type the following command:
czid login
3) You will be provided a user code and directed to the web to log in to CZ ID with your username and password.
After typing "czid login" on your Terminal, you will be provided with a user code and directed to the web.
4) A Device Confirmation web page will automatically open on your browser. You should see the same user code that was provided through the CLI to confirm your device.
Look at your user code on the CLI and make sure it matches the one on the web page to confirm your device.
5) After confirming your device, a new page will appear for you to log in to CZ ID using your credentials.
Log in to CZ ID using your email and password.
6) An Authorization page will appear. Click "Accept" to authorize access to your CZ ID account.
7) A Confirmation page will appear indicating that your connection through the CLI has been established.
8) The first time you use the CLI you need to accept CZ ID's user agreement. To do this, go back to the Terminal and accept the user agreement by entering the following:
czid accept-user-agreement
Note: You will not be prompted to run this command, so simply type it as shown above. After you enter the command, the user agreement will be printed and you will be prompted to accept the agreement by typing "y" or "Y".
After setting up your CLI connection, go back to the CLI to accept CZ ID's user agreement. To do this, (1) type the user agreement command to view the terms of the agreement and (2) accept the agreement by typing "y".
9) You are all set to use CZ ID's CLI! Type your upload command to begin uploading data.
Note: The next time you need to use the CLI, simply log in to CZ ID using the login command (step 2) and confirm your device (step 4).
Step 3: Get Files Ready
To upload files, you need to have your project information and files ready. This information will be specified in your upload command (see Step 4). Make sure that all your files are in the same directory or folder.
In your upload command, you will specify project name, sample name, and filenames for metadata and read files. See details below.
Project name: Uploaded samples will be organized under a project.
- Uploading to an existing project: Reference the project by using the project name of interest in your upload command while uploading samples through the CLI.
- Uploading to a new project: If you would like to create a new project, you have to create it within your account using the CZ ID web interface first and use the new project name while uploading samples through the CLI. See Project Selection within Upload Data through the Web App for details.
Metadata file: Sample information can be provided in a comma-delimited file (“csv” file extension). See Metadata instructions and dictionary for details regarding metadata requirements and format.
- If you download metadata for samples on your CZ ID account, the metadata file will be already in the correct format.
- If you need to prepare a metadata file, we recommend using our Metadata template to generate your file. Not all metadata in the template is required. If you don’t have information for a given metadata field, simply leave it blank. Save your edited file as a comma-delimited file (“csv” file extension).
- Note that there are seven required metadata entries for samples, including:
- Sample Name
- Sample names should match filenames specified in the command or folder. You should remove the part of the filename indicating the sequenced end for Illumina data (i.e., _R1 or _R2) and the file extension (e.g., fastq or fastq.gz). For example, for filename “Sample1_S001_R1.fastq” you should only specify sample name “Sample1_S001” in your metadata file or command.
- Collection Location
- If possible, provide information specifying more than the country. However, don’t provide more than county-level information to protect personally identifiable information.
- Collection Date
- For privacy reasons, only include month and year in the following format: YYYY-MM.
- Nucleotide Type
- Sample Type
- Water Control
- Host Organism
Example metadata file for uploading a single sample. Make sure sample names provided in the metadata file match sequencing filenames.
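Before uploading, it can help to check a metadata file against the requirements listed above. The sketch below is a local helper, not a CZ ID feature. It assumes that the first row of the CSV holds unquoted column names spelled exactly as in the list of required fields; the function names `missing_metadata_fields` and `valid_collection_date` are hypothetical.

```shell
# Required metadata fields, as listed in this guide.
# Assumption: the CSV header row spells them exactly like this, without quotes.
REQUIRED_FIELDS='Sample Name
Collection Location
Collection Date
Nucleotide Type
Sample Type
Water Control
Host Organism'

# Print each required field that is absent from the header row of the CSV.
missing_metadata_fields() {
  header=$(head -n 1 "$1" | tr -d '\r')
  printf '%s\n' "$REQUIRED_FIELDS" | while IFS= read -r field; do
    case ",$header," in
      *",$field,"*) ;;                 # field present, nothing to report
      *) printf '%s\n' "$field" ;;     # field missing
    esac
  done
}

# Succeed only if the value looks like YYYY-MM with a month from 01 to 12.
valid_collection_date() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-(0[1-9]|1[0-2])$'
}

# Example usage (uncomment to run):
# missing_metadata_fields Your_metadata_file.csv
# valid_collection_date 2023-05 && echo "date format OK"
```

An empty result from `missing_metadata_fields` means that all seven required column names were found in the header row.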
Read files: You will need to specify sequencing files by providing filenames (single sample upload) or the path to the directory containing sequencing files (bulk upload). Note the following regarding sequencing files:
- CZ ID supports FASTQ formats for AMR analysis, including: ".fastq", ".fq", ".fastq.gz", ".fq.gz"
- File names must be no longer than 120 characters and can only contain letters from the English alphabet (A-Z, upper and lower case), numbers (0-9), periods (.), hyphens (-) and underscores (_). Spaces are not allowed.
- When uploading a single sample, you can specify a single file for single-end reads or two files for paired-end reads. FASTQ files from multiple lanes associated with a single sample are automatically concatenated during upload.
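The file name rules above can be checked locally before you start an upload. The following is a minimal sketch under the rules stated in this guide (at most 120 characters, only letters, numbers, periods, hyphens, and underscores, and a supported FASTQ extension). The function name `valid_read_filename` is hypothetical and is not part of the CZ ID CLI.

```shell
# Succeed only if the file name follows the naming rules described above.
valid_read_filename() {
  name=$(basename "$1")
  # Rule 1: no longer than 120 characters.
  [ "${#name}" -le 120 ] || return 1
  # Rules 2 and 3: allowed characters only, ending in a supported extension.
  printf '%s\n' "$name" |
    LC_ALL=C grep -Eq '^[A-Za-z0-9._-]+\.(fastq|fq)(\.gz)?$'
}

# Example usage (uncomment to run):
# for f in ./*; do
#   valid_read_filename "$f" || echo "check file name: $f"
# done
```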
If you are uploading more than 1 sample at a time, you have to specify the path to a directory containing sequencing files.
- The CZ ID CLI will search the directory for read files and automatically upload supported file types (.fastq/.fq/.fastq.gz/.fq.gz).
- Sample names will be assigned using file names. Sample names will include the base name of the file without the extension specifying the sequenced end ( e.g., _R1, _R2, _R1_001, and _R2_001 ) and file type. For example, the sample name for file “Sample1_R1_001.fq” will be “Sample1”.
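To preview the sample names that the naming rule above would produce, you can apply the same rule locally. This is a sketch of the rule as described in this guide, not the CLI's actual implementation; the function name `sample_name` is hypothetical.

```shell
# Derive a sample name from a read file name by removing the file type
# (.fastq, .fq, .fastq.gz, .fq.gz) and the read-end suffix
# (_R1, _R2, _R1_001, _R2_001).
sample_name() {
  basename "$1" | sed -E 's/\.(fastq|fq)(\.gz)?$//; s/_R[12](_001)?$//'
}

# Example usage (uncomment to run):
# sample_name Sample1_R1_001.fq
```

Comparing these names with the Sample Name column of your metadata file is a quick way to catch mismatches before uploading.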
Step 4: Prepare Sample Upload Command
Now that you have sample and file information ready, you can work on your code or command to upload sample files for AMR analysis through the CLI. You will use this command in Step 5.
Write your upload command using a plain text editor. Below we provide code templates for uploading files. You can copy the commands that suit your needs and edit accordingly using your text editor of choice. DO NOT USE MICROSOFT WORD or text editors that are not in plain text format because these programs will disrupt the required format and your code will not work.
You can use TextEdit, a built-in text editor on Mac OS, to work on your upload code. However, make sure to set the format to plain text before pasting the code template. Use the Format dropdown menu to set the format to plain text.
Upload code templates for Mac
Single sample (no metadata file): Upload a single sample by directly providing metadata using the -m flag and specifying sequencing files (e.g., paired-end data).
czid amr upload-sample \
--project 'Your Project ID' \
-m 'Collection Date=YYYY-MM' -m 'Collection Location=Location' -m 'Nucleotide Type=RNA or DNA' -m 'Sample Type=Sample Type' -m 'Water Control=Yes or No' -m 'Host Organism=Host' \
'Your_Sample_File_R1.fastq.gz' 'Your_Sample_File_R2.fastq.gz'
Single sample (with metadata file): Upload a single sample by specifying metadata and sequencing files (e.g., paired-end data).
czid amr upload-sample \
--project 'Your Project ID' \
--metadata-csv 'Your_metadata_file.csv' \
'Your_Sample_File_R1.fastq.gz' 'Your_Sample_File_R2.fastq.gz'
Multiple samples (with metadata file): Upload multiple samples by specifying metadata file and path to sequencing file folder.
czid amr upload-samples \
--project 'Your Project ID' \
--metadata-csv 'Your_metadata_file.csv' \
'Path_to_samples_directory'
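If you plan to call the multiple-sample template from a script, a small wrapper with a dry-run mode lets you inspect the command before anything is uploaded. The sketch below is an illustration only: the `run` and `upload_batch` functions and the `DRY_RUN` variable are hypothetical names, and the only CLI invocation used is the `czid amr upload-samples` template shown above.

```shell
# Print the command when DRY_RUN is 1 (the default here); run it otherwise.
# Note: the printed form is for inspection only and does not preserve quoting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 = project, $2 = metadata CSV, $3 = directory containing read files
upload_batch() {
  run czid amr upload-samples --project "$1" --metadata-csv "$2" "$3"
}

# Example usage (uncomment to run):
# upload_batch 'Your Project ID' Your_metadata_file.csv Path_to_samples_directory
# DRY_RUN=0 upload_batch 'Your Project ID' Your_metadata_file.csv Path_to_samples_directory
```

Defaulting to dry-run is a deliberate choice in this sketch so that an accidental call prints the command instead of starting an upload.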
Step 5: Upload Files to CZ ID for AMR Analysis
Now that your sample information and upload command are ready, you can upload sample files. To upload your data to CZ ID:
1) Log in to your CZ ID CLI account by opening your Terminal and typing the following command:
czid login
2) Set your directory to the folder containing sample files:
cd Path_to_directory
3) Copy and paste the upload code you edited in Step 4 into your Terminal and press enter. If everything is going well, you will see a "starting upload" message showing the upload progress.
Example command for uploading multiple samples (mNGS) and upload progress.
4) The AMR pipeline will begin running automatically after your files are uploaded.
5) Go to the CZ ID web interface and log in to your account (note that you will have to complete your account profile the first time you log in to the web application).
6) Go to the Project page of interest to check on the status of your sample. The image below highlights features of a Project page listing the status of AMR samples.
- Project Name
- Antimicrobial Resistance Tab
- Sample Status: Specifies sample progress. When the run is successfully completed, you will see a "Complete" status highlighted in green.
7) Once the AMR run is "Complete", click on the sample to go to the AMR Sample Report page. If you encounter issues, please get in touch with our team by selecting "Contact Us" from the Username dropdown menu in the upper right hand corner of your screen.
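Before pasting the upload command (steps 2 and 3 above), you may want to confirm that the directory exists and actually contains read files. The following is a minimal local check, not a CZ ID CLI command; the function names `count_read_files` and `preflight` are hypothetical.

```shell
# Count files in a directory that end in a supported FASTQ extension.
count_read_files() {
  ls "$1" 2>/dev/null | grep -Ec '\.(fastq|fq)(\.gz)?$' || true
}

# Report whether a directory looks ready for upload.
preflight() {
  dir=$1
  [ -d "$dir" ] || { echo "missing directory: $dir"; return 1; }
  n=$(count_read_files "$dir")
  [ "$n" -gt 0 ] || { echo "no FASTQ files in $dir"; return 1; }
  echo "$n read file(s) found in $dir"
}

# Example usage (uncomment to run):
# preflight Path_to_directory && cd Path_to_directory
```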
Install the CZ ID CLI and Upload Samples to CZ ID Using Windows OS
Below we provide instructions to install the CZ ID CLI on your Windows device and establish a connection with your CZ ID account. Note that you only need to perform these steps once. After setting up your connection, you only need to log in to CZ ID to work with samples through the CLI. We also describe how to upload files to CZ ID through the CLI for AMR analysis. The instructions are divided into five general steps listed below. If you have already installed the CLI and uploaded samples through it, go to Step 4 to view upload codes for AMR samples.
Step 1: Install the CZ ID CLI
Step 2: Set up an initial connection with your CZ ID account
Step 3: Get files ready
Step 4: Prepare upload command (includes templates)
Step 5: Upload samples to CZ ID for AMR analysis
Step 1: Install the CZ ID CLI
To install the CZ ID CLI on your Windows device you need to download the CZ ID CLI executable and run it on your computer. To do this:
1) Find the latest CLI release on the CZ ID CLI GitHub page and download the compressed file named "czid-cli_windows_amd64.zip".
2) Decompress or unzip the downloaded "czid-cli_windows_amd64.zip" file.
3) Move the czid executable ("czid.exe") to your desired directory and copy the path to your clipboard. You can copy the path by right clicking on the file and selecting “Copy as path”.
4) Add the "czid.exe" path to your environment variables by following steps 5 and 6.
5) Search for “environment variables” using the File explorer and select “Edit the system environment variables”.
6) A "System Properties" dialog box will appear where you can add the new path. Click "Environment Variables..." under the Advanced tab.
7) Select or highlight "Path" under the options for “System variables” and click "Edit". An "Environment Variables" dialog box will appear (Step 8).
8) Click "New" in the Edit environment variable dialog box and paste the path to the "czid.exe" file that you copied during step 3. Note that the file name should not be included in the path and that you need to delete the quotation marks.
Example:
C:\Users\UserX\Documents\CZID-CLI\czid-cli_windows_amd64\
9) Click “OK” on all the dialog boxes.
10) CZ ID's CLI ("czid.exe") path is now added to the Environment Variables.
Step 2: Set up an Initial Connection with Your CZ ID Account
Use your credentials to log in to CZ ID via CLI. To do this:
1) Open a Command Prompt window.
To open a Command Prompt window, search for “Command” using the File Explorer and click on the Command Prompt App.
2) Type the following command:
czid login
3) You will be provided a user code and directed to the web to log in to CZ ID with your username and password.
After typing "czid login" in your Command Prompt window, you will be provided with a user code and directed to the web.
4) A Device Confirmation web page will automatically open on your browser. You should see the same user code that was provided through the CLI to confirm your device.
Look at your user code on the CLI and make sure it matches the one on the web page to confirm your device.
5) After confirming your device, a new page will appear for you to log in to CZ ID using your credentials.
Log in to CZ ID using your email and password.
6) An Authorization page will appear. Click "Accept" to authorize access to your CZ ID account.
7) A Confirmation page will appear indicating that your connection through the CLI has been established.
8) The first time you use the CLI you need to accept CZ ID's user agreement. To do this, go back to the Command Prompt window and accept the user agreement by entering the following:
czid accept-user-agreement
Note: You will not be prompted to run this command, so simply type it as shown above. After you enter the command, the user agreement will be printed and you will be prompted to accept the agreement by typing "y" or "Y".
After setting up your CLI connection, go back to the CLI to accept CZ ID's user agreement. To do this, (1) type the user agreement command to view the terms of the agreement and (2) accept the agreement by typing "y".
9) You are all set to use CZ ID's CLI! Type your upload command to begin uploading data.
Note: The next time you need to use the CLI, simply log in to CZ ID using the login command (step 2) and confirm your device (step 4).
Step 3: Get Files Ready
To upload files, you need to have your project information and files ready. This information will be specified in your upload command (see Step 4). Make sure that all your files are in the same directory or folder.
In your upload command, you will specify project name, sample name, and filenames for metadata and read files. See details below.
Project name: Uploaded samples will be organized under a project.
- Uploading to an existing project: Reference the project by using the project name of interest in your upload command while uploading samples through the CLI.
- Uploading to a new project: If you would like to create a new project, you have to create it within your account using the CZ ID web interface first and use the new project name while uploading samples through the CLI. See Project Selection within Upload Data through the Web App for details.
Metadata file: Sample information can be provided in a comma-delimited file (“csv” file extension). See Metadata instructions and dictionary for details regarding metadata requirements and format.
- If you download metadata for samples on your CZ ID account, the metadata file will be already in the correct format.
- If you need to prepare a metadata file, we recommend using our Metadata template to generate your file. Not all metadata in the template is required. If you don’t have information for a given metadata field, simply leave it blank. Save your edited file as a comma-delimited file (“csv” file extension).
- Note that there are seven required metadata entries for samples, including:
- Sample Name
- Sample names should match filenames specified in the command or folder. You should remove the part of the filename indicating the sequenced end for Illumina data (i.e., _R1 or _R2) and the file extension (e.g., fastq or fastq.gz). For example, for filename “Sample1_S001_R1.fastq” you should only specify sample name “Sample1_S001” in your metadata file or command.
- Collection Location
- If possible, provide information specifying more than the country. However, don’t provide more than county-level information to protect personally identifiable information.
- Collection Date
- For privacy reasons, only include month and year in the following format: YYYY-MM.
- Nucleotide Type
- Sample Type
- Water Control
- Host Organism
Example metadata file for uploading a single sample. Make sure sample names provided in the metadata file match sequencing filenames.
Read files: You will need to specify sequencing files by providing filenames (single sample upload) or the path to the directory containing sequencing files (bulk upload). Note the following regarding sequencing files:
- CZ ID supports FASTQ formats for AMR analysis, including: ".fastq", ".fq", ".fastq.gz", ".fq.gz"
- File names must be no longer than 120 characters and can only contain letters from the English alphabet (A-Z, upper and lower case), numbers (0-9), periods (.), hyphens (-) and underscores (_). Spaces are not allowed.
- When uploading a single sample, you can specify a single file for single-end reads or two files for paired-end reads. FASTQ files from multiple lanes associated with a single sample are automatically concatenated during upload.
If you are uploading more than 1 sample at a time, you have to specify the path to a directory containing sequencing files.
- The CZ ID CLI will search the directory for read files and automatically upload supported file types (.fastq/.fq/.fastq.gz/.fq.gz).
- Sample names will be assigned using file names. Sample names will include the base name of the file without the extension specifying the sequenced end ( e.g., _R1, _R2, _R1_001, and _R2_001 ) and file type. For example, the sample name for file “Sample1_R1_001.fq” will be “Sample1”.
Step 4: Prepare Sample Upload Command
Now that you have sample and file information ready, you can work on your code or command to upload sample files for AMR analysis through the CLI. You will use this command in Step 5.
Write your upload command using a plain text editor (e.g., Notepad, WordPad). Below we provide code templates for uploading files. You can copy the commands that suit your needs and edit accordingly using your text editor of choice. DO NOT USE MICROSOFT WORD or text editors that are not in plain text format because these programs will disrupt the required format and your code will not work.
Upload code templates for Windows
Single sample (no metadata file): Upload a single sample by directly providing metadata using the -m flag and specifying sequencing files (e.g., paired-end data).
czid amr upload-sample "Your_Sample_file_R1.fastq.gz" "Your_Sample_file_R2.fastq.gz" --project "Your Project ID" -m "Collection Date=YYYY-MM" -m "Collection Location=Location" -m "Nucleotide Type=RNA or DNA" -m "Sample Type=Sample Type" -m "Water Control=Yes or No" -m "Host Organism=Host"
Single sample (with metadata file): Upload a single sample by specifying metadata and sequencing files (e.g., paired-end data).
czid amr upload-sample "Your_Sample_file_R1.fastq.gz" "Your_Sample_file_R2.fastq.gz" --project "Your Project ID" --metadata-csv "Your_metadata_file.csv"
Multiple samples (with metadata file): Upload multiple samples by specifying metadata file and path to sequencing file folder.
czid amr upload-samples "Path_to_your_sample_directory" --project "Your Project ID" --metadata-csv "Your_metadata_file.csv"
Step 5: Upload Files to CZ ID for AMR Analysis
Now that your sample information and upload command are ready, you can upload sample files. To upload your data to CZ ID:
1) Log in to your CZ ID CLI account by opening a Command Prompt window and typing the following command:
czid login
2) Set your directory to the folder containing sample files:
cd Path_to_directory
3) Copy and paste the upload code you edited in Step 4 into your Command Prompt window and press Enter. If everything is going well, you will see a "starting upload" message showing the upload progress.
Example command for uploading multiple samples (mNGS) and upload progress.
4) The AMR pipeline will begin running automatically after your files are uploaded.
5) Go to the CZ ID web interface and log in to your account (note that you will have to complete your account profile the first time you log in to the web application).
6) Go to the Project page of interest to check on the status of your sample. The image below highlights features of a Project page listing the status of AMR samples.
- Project Name
- Antimicrobial Resistance Tab
- Sample Status: Specifies sample progress. When the run is successfully completed, you will see a "Complete" status highlighted in green.
7) Once the AMR run is "Complete", click on the sample to go to the AMR Sample Report page. If you encounter issues, please get in touch with our team by selecting "Contact Us" from the Username dropdown menu in the upper right hand corner of your screen.
Troubleshooting Tips
- If you get an error message indicating "No such file or directory" after executing the upload command, make sure that:
- The spelling in your code matches the specified filenames
- The files are found in the correct directory
- There are no unexpected characters between command line arguments (make sure you edit the code using a plain text editor)
- If you are having problems and getting error messages, some of which may refer to the CZ ID CLI GitHub page, make sure that you have the latest release of the CZ ID CLI. Check for release updates on the CZ ID CLI GitHub page.
- If you see "czid API responded with error code 500", it is likely related to the Collection Location field of your metadata file. Try specifying locations without any spaces. For example, instead of "Texas, USA" specify "Texas,USA" (i.e., no spaces).
- If you see an error message that includes "...upload multipart failed...cause: operation error S3: UploadPart, exceeded maximum number of attempts...socket: too many open files" it means that you are trying to upload too many files at the same time. Try uploading in smaller batches.
- Use the --help flag (for example, czid login --help) to find more information about a command and its options.
- Find more details about the CLI on the CZ ID CLI GitHub page.
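For the "too many open files" tip above, one way to plan smaller batches is to group read files by sample and assign a fixed number of samples to each batch. The sketch below only prints a plan and does not move or upload anything. The function name `plan_batches` and the `batch_N` labels are hypothetical, and sample names are derived with the suffix rule described in Step 3.

```shell
# Print "batch_N<TAB>file" for every FASTQ file in a directory, placing at
# most $2 samples in each batch and keeping files of one sample together.
plan_batches() {
  dir=$1
  size=$2
  ls "$dir" | grep -E '\.(fastq|fq)(\.gz)?$' | LC_ALL=C sort |
  awk -v size="$size" '
    {
      name = $0
      sub(/\.(fastq|fq)(\.gz)?$/, "", name)   # drop the file type
      sub(/_R[12](_001)?$/, "", name)         # drop the read-end suffix
      if (!(name in batch)) {                 # first file of a new sample
        n++
        batch[name] = int((n - 1) / size) + 1
      }
      printf "batch_%d\t%s\n", batch[name], $0
    }'
}

# Example usage (uncomment to run):
# plan_batches Path_to_samples_directory 10
```

You could then copy or move the files of each batch into their own folder and run the multiple-sample upload command once per folder.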