Nvidia Clara Parabricks is a GPU-accelerated software suite for secondary analysis of next-generation sequencing (NGS) DNA and RNA data. It contains GPU-enabled versions of popular bioinformatics tools such as the aligners BWA-Mem and STAR.
Loading the container
On Rivanna, Clara Parabricks is available as a Singularity container. To load the clara-parabricks container module, type:
module load singularity clara-parabricks
The load command will load a default version of Clara Parabricks, unless another version is specified. To see the available versions, type:
module spider clara-parabricks
The Clara Parabricks container on Rivanna includes many bioinformatics tools for genomics and transcriptomics. Each tool is invoked in three parts: the Singularity run command to activate the container, then the Clara Parabricks pbrun command to call the designated tool, then the arguments specific to that tool. The example below uses the fq2bam pipeline tool, which performs a BWA-Mem alignment, sorts reads by coordinate, marks duplicate reads with GATK MarkDuplicates, and optionally generates a Base Quality Score Recalibration (BQSR) report.
#!/bin/bash
#SBATCH -A <allocation> # allocation name
#SBATCH -p gpu # partition name
#SBATCH --gres=gpu:1 # request one gpu
#SBATCH -C "v100|a100" # constrain to a100 or v100 gpus
#SBATCH -N 1 # request 1 node
#SBATCH -c 8 # request 8 cores
#SBATCH -t 24:00:00 # set time limit of 24 hours
# prepare the environment
module load singularity clara-parabricks/4.0.3
# run parabricks fq2bam pipeline
# NOTE: the .sif file name and the output BAM name below are assumptions;
# run "ls $CONTAINERDIR" to confirm the image name, and choose your own output name.
singularity run --nv \
    -B "$PWD":/workdir \
    -B "$PWD":/outputdir \
    $CONTAINERDIR/clara-parabricks-4.0.3.sif \
    pbrun fq2bam \
    --ref /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz \
    --out-bam /outputdir/fq2bam_output.bam
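The layered structure of this command (container launcher, then pbrun and the tool name, then tool arguments) can be made explicit by assembling it as a Bash array. The following is a minimal sketch, not part of Clara Parabricks: the function name build_fq2bam_cmd and its parameters are hypothetical, and the function only prints the command as a dry run so it can be inspected before it is placed in a Slurm script.

```shell
#!/bin/bash
# Hypothetical helper: build the fq2bam command layer by layer and print it.
# Arguments: image path, reference fasta, first FASTQ, second FASTQ, output BAM.
build_fq2bam_cmd() {
    local image="$1" ref="$2" fq1="$3" fq2="$4" out_bam="$5"
    local cmd

    # Layer 1: activate the container with GPU support and bind mounts.
    cmd=(singularity run --nv -B "$PWD:/workdir" -B "$PWD:/outputdir" "$image")

    # Layer 2: the Parabricks launcher followed by the tool name.
    cmd+=(pbrun fq2bam)

    # Layer 3: arguments specific to fq2bam.
    cmd+=(--ref "$ref" --in-fq "$fq1" "$fq2" --out-bam "$out_bam")

    # Dry run: print the assembled command on one line instead of executing it.
    printf '%s\n' "${cmd[*]}"
}
```

To execute the command rather than print it, the final printf line could be replaced with "${cmd[@]}", which runs the array as a command with each element passed as a separate argument.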
Notes on the fq2bam Slurm script:
- Replace <allocation> with your allocation name.
- The singularity flag --nv enables Nvidia GPU support inside the container.
- The singularity flag -B binds a directory into the container.
- In this case, we are binding the present working directory ($PWD) to both /workdir and /outputdir inside the container.
- The variable $CONTAINERDIR is defined by the container module; you do not need to assign it a value. The line of the script that uses it points the singularity run command to the appropriate .sif file for the desired container.
- The pbrun command tells Clara Parabricks which tool to run next (in this case, fq2bam).
- The arguments following pbrun fq2bam are specific to the Clara Parabricks tool being used. See the fq2bam reference for more detailed information on these arguments.
- In this case, the reference fasta file (Homo_sapiens_assembly38.fasta) and fastq data files (sample_1.fq.gz and sample_2.fq.gz) were downloaded ahead of time and stored in the referenced subdirectories. Change these paths and file names as needed to point to your own reference fasta and data files.
- Save this script in a file called, for example, job.slurm. To run your job, submit the script by typing:
sbatch job.slurm
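Because a GPU job can wait in the queue before it starts, it may be worth confirming that the reference and FASTQ files are readable before submitting. The following is a minimal sketch under stated assumptions: check_inputs is a hypothetical helper, not part of Slurm or Clara Parabricks, and the paths in the usage comment are the sample paths from the script, taken relative to the directory that is bound to /workdir.

```shell
#!/bin/bash
# Hypothetical helper: report every input file that is missing or unreadable.
# Returns 0 when all files are readable and 1 otherwise.
check_inputs() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use, run from the directory that contains parabricks_sample/:
#   check_inputs \
#       parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
#       parabricks_sample/Data/sample_1.fq.gz \
#       parabricks_sample/Data/sample_2.fq.gz \
#     && sbatch job.slurm
```

With this pattern the job is submitted only when every listed file passes the check; otherwise the names of the problem files are printed to standard error.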