* Please note that HISAT2 requires approval prior to installation on the VM
HISAT2 is a fast and sensitive tool for aligning short reads against the general human population (as well as against a single reference genome). It indexes the genome using a Hierarchical Graph FM index (HGFM) strategy, i.e. a large set of small indexes that collectively cover the whole genome, with each index representing a genomic region of 56 Kbp.
Running HISAT2 is a two-step process: indexing the reference genome, followed by aligning the sequence data.
Create indexes of your reference genome of interest, stored in the `reference.fasta` file: `hisat2-build [option(s)] <reference.fasta> <basename>`
This will create new files with the provided basename and the extension `*.ht2`. These files constitute the index.
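As a minimal sketch of the indexing step, the snippet below builds an index and lists the resulting files. The basename `reference_index` is a hypothetical choice, and `reference.fasta` is assumed to sit in the current directory; if `hisat2-build` or the FASTA file is not available, the script only prints the command it would have run.

```shell
# Hypothetical names; replace them with your own files.
REF="reference.fasta"         # reference genome in FASTA format
INDEX_BASE="reference_index"  # basename for the index files (an assumption)

if command -v hisat2-build >/dev/null 2>&1 && [ -f "$REF" ]; then
    # Build the index, then list the files that make it up.
    hisat2-build "$REF" "$INDEX_BASE"
    ls "$INDEX_BASE".*.ht2*
else
    # Dry run: show the command without executing it.
    echo "Dry run: hisat2-build $REF $INDEX_BASE"
fi
```

The same basename is what the alignment step expects after `-x`, so it is worth choosing a name that identifies the genome build.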
Align paired-end reads `sampleR1.fq` and `sampleR2.fq` to the reference genome indexed in the previous step, using N cores: `hisat2 -x <basename> -1 <sampleR1.fq> -2 <sampleR2.fq> -p <N> -S <output.sam>`
The alignment results in SAM format are written to the file `output.sam`.
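The alignment step might look like the sketch below. The index basename `reference_index` and the thread count of 4 are assumptions for illustration; the read and output file names follow the placeholders used above. If `hisat2` or the read files are missing, the script prints the command instead of running it.

```shell
# Hypothetical values; adjust them to your data and VM.
INDEX_BASE="reference_index"  # basename used when building the index (an assumption)
R1="sampleR1.fq"              # first mates of the paired-end reads
R2="sampleR2.fq"              # second mates of the paired-end reads
OUT="output.sam"              # alignments in SAM format
THREADS=4                     # assumed number of cores

if command -v hisat2 >/dev/null 2>&1 && [ -f "$R1" ] && [ -f "$R2" ]; then
    # -x: index basename, -1/-2: paired reads, -p: threads, -S: SAM output
    hisat2 -x "$INDEX_BASE" -1 "$R1" -2 "$R2" -p "$THREADS" -S "$OUT"
else
    # Dry run: show the command without executing it.
    echo "Dry run: hisat2 -x $INDEX_BASE -1 $R1 -2 $R2 -p $THREADS -S $OUT"
fi
```

HISAT2 writes its alignment summary to standard error, so redirecting it (for example with `2> align_summary.txt`, a hypothetical file name) keeps a record of the mapping rate alongside the SAM file.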
Note on using the `--sra-acc` option
Since Ivy VMs do not allow outside connections, the `--sra-acc` option will not work with HISAT2.
If users plan to use SRA data, they will have to download it and move it into Ivy prior to alignment.
Please refer to our Globus user guide for help on how to do that.
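One possible way to fetch the reads beforehand, on a machine that does have outside connections, is sketched below. It assumes the NCBI SRA Toolkit (`prefetch` and `fasterq-dump`) is installed there, which this guide does not cover; the accession `SRRXXXXXXX` is a placeholder rather than a real run, and `sra_fastq` is a hypothetical output directory. The script only prints the commands until the placeholder is replaced.

```shell
# Run this OUTSIDE Ivy, on a machine with internet access.
ACCESSION="SRRXXXXXXX"  # placeholder; replace with a real SRA run accession
OUTDIR="sra_fastq"      # hypothetical directory for the FASTQ files

if command -v prefetch >/dev/null 2>&1 \
   && command -v fasterq-dump >/dev/null 2>&1 \
   && [ "$ACCESSION" != "SRRXXXXXXX" ]; then
    # Download the run, then convert it to FASTQ with one file per mate.
    prefetch "$ACCESSION"
    fasterq-dump --split-files --outdir "$OUTDIR" "$ACCESSION"
else
    # Dry run: show the commands without executing them.
    echo "Dry run: prefetch $ACCESSION"
    echo "Dry run: fasterq-dump --split-files --outdir $OUTDIR $ACCESSION"
fi
```

The resulting FASTQ files can then be transferred into Ivy and passed to `hisat2` with `-1` and `-2` in place of `--sra-acc`.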
For more information, please refer to the HISAT2 [manual](https://ccb.jhu.edu/software/hisat2/manual.shtml).
If you use HISAT2 for your work, please cite:
Kim D, Langmead B and Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods 2015.