Module Category Description
R R is a free software environment for statistical computing and graphics.
abinit chem ABINIT is a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave or wavelet basis.
amber bio A suite of biomolecular simulation programs. It began in the late 1970s, and is maintained by an active development community.
anaconda lang Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers companies to adopt a modern open data science analytics architecture.
ansys ANSYS simulation software enables organizations to confidently predict how their products will operate in the real world. We believe that every product is a promise of something greater.
ant devel Apache Ant is a Java library and command-line tool whose mission is to drive processes described in build files as targets and extension points dependent upon each other. The main known usage of Ant is the build of Java applications.
ants data ANTs extracts information from complex datasets that include imaging. ANTs is useful for managing, interpreting and visualizing multidimensional data.
apr tools The mission of the Apache Portable Runtime (APR) project is to create and maintain software libraries that provide a predictable and consistent interface to underlying platform-specific implementations.
apr-util tools The mission of the Apache Portable Runtime (APR) project is to create and maintain software libraries that provide a predictable and consistent interface to underlying platform-specific implementations.
armadillo numlib Armadillo is an open-source C++ linear algebra library (matrix maths) aiming towards a good balance between speed and ease of use. Integer, floating point and complex numbers are supported, as well as a subset of trigonometric and statistics functions.
arpack-ng numlib ARPACK is a collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.
ascmeme bio ASC+MEME is a fast motif discovery tool that is 10,000 times faster than MEME while preserving the same accuracy.
ase chem The Atomic Simulation Environment (ASE) is a set of tools and Python modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.
aspera-connect tools Connect is an install-on-demand Web browser plug-in that facilitates high-speed uploads and downloads with an Aspera transfer server.
augustus bio AUGUSTUS is a program to find genes and their structures in one or more genomes.
autotools devel This bundle collects the standard GNU build tools: Autoconf, Automake and libtool
bamtools bio BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.
bart bio BART (Binding Analysis for Regulation of Transcription) is a bioinformatics tool for predicting functional transcription factors (TFs) that bind at genomic cis-regulatory regions to regulate gene expression in the human or mouse genomes, given a query gene set or a ChIP-seq dataset as input.
bbmap bio BBMap includes a short read aligner, and other bioinformatic tools.
bcftools bio BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF
bcl2fastq2 bio bcl2fastq Conversion Software both demultiplexes data and converts BCL files generated by Illumina sequencing systems to standard FASTQ file formats for downstream analysis.
bedops bio BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.
bedtools bio The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM.
berkeley_db tools Berkeley DB is a family of embedded key-value database libraries providing scalable high-performance data management services to applications.
bicseq2-norm bio BICseq2 is an algorithm developed for the normalization of high-throughput sequencing (HTS) data and the detection of copy number variations (CNV) in the genome. BICseq2 can be used for detecting CNVs with or without a control genome. BICseq2-norm is for normalizing potential biases in the sequencing data.
bicseq2-seg bio BICseq2 is an algorithm developed for the normalization of high-throughput sequencing (HTS) data and the detection of copy number variations (CNV) in the genome. BICseq2 can be used for detecting CNVs with or without a control genome. BICseq2-seg is for detecting CNVs based on the normalized data given by BICseq2-norm.
bioconda bio Bioconda is a channel for the conda package manager specializing in bioinformatics software.
bioperl bio Bioperl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and database searching objects.
biopython bio Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics.
bismark bio A tool to map bisulfite converted sequence reads and determine cytosine methylation states
blast bio Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.
blender vis Blender is the free and open source 3D creation suite. It supports the entirety of the 3D pipeline: modeling, rigging, animation, simulation, rendering, compositing and motion tracking, even video editing and game creation.
blitz++ lib Blitz++ is a (LGPLv3+) licensed meta-template library for array manipulation in C++ with a speed comparable to Fortran implementations, while preserving an object-oriented interface
boost devel Boost provides free peer-reviewed portable C++ source libraries.
bowtie2 bio Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
bwa bio Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome.
bzip2 tools bzip2 is a freely available, patent free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression.
caffe2 lib Caffe is a deep learning framework made with expression, speed, and modularity in mind
caviar bio caviar is a statistical framework that quantifies the probability of each variant being causal while allowing an arbitrary number of causal variants.
cd-hit bio CD-HIT is a very widely used program for clustering and comparing protein or nucleotide sequences.
cellprofiler bio CellProfiler is an image processing package to generate morphometric measurements.
cellranger bio A set of analysis pipelines that perform sample demultiplexing, barcode processing, and single cell 3' gene counting.
cellranger-dna bio Cell Ranger DNA is a set of analysis pipelines that process Chromium single cell DNA sequencing output to align reads, identify copy number variation (CNV), and compare heterogeneity among cells.
cesm geo CESM is a fully-coupled, community, global climate model that provides state-of-the-art computer simulations of the Earth's past, present, and future climate states.
chemps2 chem CheMPS2 is a scientific library which contains a spin-adapted implementation of the density matrix renormalization group (DMRG) for ab initio quantum chemistry.
circos bio Circos is a software package for visualizing data and information. It visualizes data in a circular layout - this makes Circos ideal for exploring relationships between objects or positions.
clearcut bio Clearcut is the reference implementation for the Relaxed Neighbor Joining (RNJ) algorithm by J. Evans, L. Sheneman, and J. Foster from the Initiative for Bioinformatics and Evolutionary Studies (IBEST) at the University of Idaho.
clhep numlib The CLHEP project is intended to be a set of HEP-specific foundation and utility classes such as random generators, physics vectors, geometry and linear algebra. CLHEP is structured in a set of packages independent of any external package.
cloudcompare vis CloudCompare is a 3D point cloud (and triangular mesh) processing software. It was originally designed to perform comparison between two dense 3D point clouds (such as the ones acquired with a laser scanner) or between a point cloud and a triangular mesh.
cmake devel CMake, the cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.
codeblocks lang Code::Blocks is a free C, C++ and Fortran IDE built to meet the most demanding needs of its users. It is designed to be very extensible and fully configurable.
cp-analyst bio CellProfiler Analyst (CPA) allows interactive exploration and analysis of data, particularly from high-throughput, image-based experiments. Included is a supervised machine learning system which can be trained to recognize complicated and subtle phenotypes, for automatic scoring of millions of cells. CellProfiler is an image processing package to generate morphometric measurements.
cp2k chem CP2K is a freely available (GPL) program, written in Fortran 95, to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems. It provides a general framework for different methods such as density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW), and classical pair and many-body potentials.
cppcheck tools Cppcheck is a static analysis tool for C/C++ code. It provides unique code analysis to detect bugs and focuses on detecting undefined behaviour and dangerous coding constructs.
cromwell tools Cromwell is a Workflow Management System geared towards scientific workflows.
cuda system CUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.
cudnn numlib The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.
curl tools libcurl is a free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP. libcurl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, Kerberos), file transfer resume, http proxy tunneling and more.
cushaw3 bio CUSHAW is a well-established leading next-generation sequencing read alignment software package based on multi-core and many-core computing.
cutadapt bio Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
cytoscape tools Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data.
danpos bio Danpos is a toolkit for Dynamic Analysis of Nucleosome and Protein Occupancy by Sequencing, version 2
ddd vis DDD is a graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, the bash debugger bashdb, the GNU Make debugger remake, or the Python debugger pydb.
deeptools bio deepTools contains useful modules to process the mapped reads data for multiple quality checks, creating normalized coverage files in standard bedGraph and bigWig file formats, that allow comparison between different files (for example, treatment and control). Finally, using such normalized and standardized files, deepTools can create many publication-ready visualizations to identify enrichments and for functional annotations of the genome.
diamond bio DIAMOND is a sequence aligner for protein and translated DNA searches and functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup over BLAST of up to 20,000x.
doxygen devel Doxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.
dragonn math DragoNN is a toolkit to learn how to model and interpret regulatory sequence data using deep learning.
eclipse tools Eclipse provides IDEs and platforms for many programming languages. This module includes Java support.
eigen math Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
eigensoft bio The EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction method (Price et al. 2006). The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.
emboss bio EMBOSS is 'The European Molecular Biology Open Software Suite'. EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community.
epacts tools EPACTS (Efficient and Parallelizable Association Container Toolbox) is a versatile software pipeline to perform various statistical tests for identifying genome-wide association from sequence data through a user-friendly interface, both to scientific analysts and to method developers.
epic bio epic is a software package for finding medium to diffusely enriched domains in chip-seq data. It is a fast, parallel and memory-efficient implementation of the popular SICER algorithm.
esmf geo The Earth System Modeling Framework (ESMF) is software for building and coupling weather, climate, and related models.
exonerate bio Exonerate is a generic tool for pairwise sequence comparison. It allows you to align sequences using many alignment models, with either exhaustive dynamic programming or a variety of heuristics.
fasta bio The FASTA programs find regions of local or global (new) similarity between protein or DNA sequences, either by searching Protein or DNA databases, or by identifying local duplications within a sequence.
fastqc bio FastQC is a Java application which takes a FastQ file and runs a series of tests on it to generate a comprehensive QC report.
fastx-toolkit bio The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
febio cae FEBio is a nonlinear finite element solver that is specifically designed for biomechanics and biophysics applications.
ffmpeg vis A complete, cross-platform solution to record, convert and stream audio and video.
fftw numlib FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data.
fgsl numlib FGSL: A Fortran interface to the GNU Scientific Library
fiji tools Fiji is an image processing distribution of ImageJ, bundling a lot of plugins which facilitate scientific image analysis.
fltk vis FLTK is a cross-platform C++ GUI toolkit for UNIX/Linux (X11), Microsoft Windows, and MacOS X. FLTK provides modern GUI functionality without the bloat and supports 3D graphics via OpenGL and its built-in GLUT emulation.
fortrangis geo FortranGIS project includes a collection of Fortran interfaces to some common Open Source GIS (Geographic Information System) software libraries, plus some more Fortran-specific tools.
freebayes bio FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.
freesurfer bio FreeSurfer is a set of tools for analysis and visualization of structural and functional brain imaging data. FreeSurfer contains a fully automatic structural imaging stream for processing cross sectional and longitudinal data.
freexl lib FreeXL is an open source library to extract valid data from within an Excel (.xls) spreadsheet.
fsa bio FSA (Fast Statistical Alignment) is a probabilistic multiple sequence alignment algorithm which uses a distance-based approach to aligning homologous protein, RNA or DNA sequences.
fsl bio FSL is a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data.
gatk bio The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
gaussian chem Gaussian is a suite of electronic-structure codes.
gcc compiler The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...).
gcloud-sdk tools The Cloud SDK is a set of tools for Cloud Platform. It contains gcloud, gsutil, and bq, which you can use to access Google Compute Engine, Google Cloud Storage, Google BigQuery, and other products and services from the command-line.
gd bio GD.pm - Interface to Gd Graphics Library
gdal data GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.
gdal-grass data The idea of the GDAL-GRASS plugin is to directly access GRASS raster and vector data from outside. Any GDAL enabled software (QGIS, R, ...) can read and write through the plugin from the GRASS database.
gdc-client bio The gdc-client provides several convenience functions over the GDC API which provides general download/upload via HTTPS.
geany tools Geany is a text editor using the GTK+ toolkit with basic features of an integrated development environment.
gemma bio Genome-wide Efficient Mixed Model Association
genometools bio The GenomeTools genome analysis system is a free collection of bioinformatics tools (in the realm of genome informatics) combined into a single binary named gt. It is based on a C library named “libgenometools” which consists of several modules.
geos math GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS)
gftp tools gFTP is a free multithreaded file transfer client for *NIX based machines.
git tools Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
glade vis Glade is a RAD tool to enable quick & easy development of user interfaces for the GTK+ toolkit and the GNOME desktop environment.
glew lib The OpenGL Extension Wrangler Library (GLEW) is a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.
globus_cli tools Globus CLI is a standalone application that can be installed on the user’s machine and used to access the Globus file transfer service.
gmap-gsnap bio GMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences; GSNAP: Genomic Short-read Nucleotide Alignment Program
gmp math GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers.
gmvapich2 toolchain GNU Compiler Collection (GCC) based compiler toolchain, including MVAPICH2 for MPI support.
gmvolf toolchain GNU Compiler Collection (GCC) based compiler toolchain, including MVAPICH2 for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
gnuplot vis Portable interactive, function plotting utility
go lang Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.
gobject-introspection devel GObject introspection is a middleware layer between C libraries (using GObject) and language bindings. The C library can be scanned at compile time and generate a metadata file, in addition to the actual native C library. Then at runtime, language bindings can read this metadata and automatically provide bindings to call into the C library.
gompi toolchain GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support.
google-api tools Google APIs give you programmatic access to Google Maps, Google Drive, YouTube, and many other Google products.
goolf toolchain GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
gparallel tools GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
gperf base None
grace vis Grace is a WYSIWYG 2D plotting tool for X Windows System and Motif.
grass geo The Geographic Resources Analysis Support System - used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization
grib_api data The ECMWF GRIB API is an application program interface accessible from C, FORTRAN and Python programs developed for encoding and decoding WMO FM-92 GRIB edition 1 and edition 2 messages. A useful set of command line tools is also provided to give quick access to GRIB messages.
gromacs chem GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
gsl numlib The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting.
gurobi math The Gurobi Optimizer is a state-of-the-art solver for mathematical programming. The solvers in the Gurobi Optimizer were designed from the ground up to exploit modern architectures and multi-core processors, using the most advanced implementations of the latest algorithms.
hdf data HDF (also known as HDF4) is a library and multi-object file format for storing and managing data between machines.
hdf-eos data HDF-EOS2 is a software library built on HDF4 to support EOS-specific data structures, namely Grid, Point, and Swath.
hdf5 data HDF5 is a unique technology suite that makes possible the management of extremely large and complex data collections.
hexrd phys HEXRD provides a collection of resources for analysis of x-ray diffraction data, especially high-energy x-ray diffraction. HEXRD is comprised of a library and API for writing scripts, a command line interface, and an interactive graphical user interface.
hic-pro bio HiC-Pro is an optimized and flexible pipeline for Hi-C data processing.
hisat2 bio HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) against the general human population (as well as against a single reference genome).
homer tools HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and ChIP-Seq analysis. It is a collection of command line programs for UNIX-style operating systems written mostly in Perl and C++. HOMER was primarily written as a de novo motif discovery algorithm that is well suited for finding 8-12 bp motifs in large scale genomics data.
hoomd chem HOOMD-blue is a general-purpose particle simulation toolkit. It scales from a single CPU core to thousands of GPUs.
htslib bio A C library for reading/writing high-throughput sequencing data. This package includes the utilities bgzip and tabix
hydrator data Hydrator converts Twitter IDs into JSON files.
hypre numlib Hypre is a library for solving large, sparse linear systems of equations on massively parallel computers. The problems of interest arise in the simulation codes being developed at LLNL and elsewhere to study physical phenomena in the defense, environmental, energy, and biological sciences.
idl IDL is an interpreted programming language used to create analyses and visualizations of numerical data.
idr bio The IDR (Irreproducible Discovery Rate) framework is a unified approach to measure the reproducibility of findings identified from replicate experiments and provide highly stable thresholds based on reproducibility. The IDR method compares a pair of ranked lists of identifications (such as ChIP-seq peaks).
iintelmpi toolchain Intel C/C++ and Fortran compilers with IntelMPI.
imagemagick vis ImageMagick is a software suite to create, edit, compose, or convert bitmap images
imsl numlib IMSL Libraries provide optimized mathematical and statistical algorithms that can be embedded into C, C++, .NET, Java™, and Fortran applications, including many databases. IMSL enhances application performance, reliability, portability, scalability, and maintainability as well as developer productivity.
imvapich2 toolchain Intel C/C++ and Fortran compilers, alongside MVAPICH2.
inkscape lib Inkscape is a free and open source professional vector graphics editor.
intel compiler Intel C and C++ compilers
intelmpi mpi IntelMPI from Intel.
intervene bio Intervene is a tool for intersection and visualization of multiple genomic region sets.
intltool lang The Intltool is an internationalization tool used for extracting translatable strings from source files, collecting the extracted strings with messages from traditional source files, and merging the translations into .xml, .desktop and .oaf files.
irfinder bio IRFinder is a tool for detecting intron retention from RNA-Seq experiments.
jags math JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation
jasper vis The JasPer Project is an open-source initiative to provide a free software-based reference implementation of the codec specified in the JPEG-2000 Part-1 standard.
java lang Java Platform, Standard Edition (Java SE) lets you develop and deploy Java applications on desktops and servers.
jcuda bio Java bindings for NVIDIA CUDA and related libraries.
jtreeview vis TreeView is an open-source Java app for visualizing large data matrices. It can load a dataset, cluster it, browse it, customize its appearance and export it (or parts of it) into a figure.
juicer bio Juicer is a one-click pipeline for processing terabase scale Hi-C datasets.
julia lang Julia is a high-level, high-performance dynamic programming language for numerical computing. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library.
junit devel A programmer-oriented testing framework for Java.
kallisto bio Kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
knime data KNIME is an analytics platform for data mining.
kraken bio Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.
lame data LAME is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.
lammps chem LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.
lapack numlib LAPACK is written in Fortran90 and provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems.
leda lib LEDA is a C++ class library of efficient data types and algorithms, providing in-depth algorithmic knowledge for graph and network problems, geometric computations, combinatorial optimization and more.
levmar numlib Levmar is an implementation of Levenberg-Marquardt in C
lftp tools lftp is a sophisticated file transfer program supporting a number of network protocols.
libgcrypt system Libgcrypt is a general purpose cryptographic library originally based on code from GnuPG. It provides functions for all cryptographic building blocks: symmetric cipher algorithms (AES, Arcfour, Blowfish, Camellia, CAST5, ChaCha20, DES, GOST28147, Salsa20, SEED, Serpent, Twofish) and modes (ECB, CFB, CBC, OFB, CTR, CCM, GCM, OCB, POLY1305, AESWRAP), hash algorithms (MD2, MD4, MD5, GOST R 34.11, RIPE-MD160, SHA-1, SHA2-224, SHA2-256, SHA2-384, SHA2-512, SHA3-224, SHA3-256, SHA3-384, SHA3-512, SHAKE-128, SHAKE-256, TIGER-192, Whirlpool), MACs (HMAC for all hash algorithms, CMAC for all cipher algorithms, GMAC-AES, GMAC-CAMELLIA, GMAC-TWOFISH, GMAC-SERPENT, GMAC-SEED, Poly1305, Poly1305-AES, Poly1305-CAMELLIA, Poly1305-TWOFISH, Poly1305-SERPENT, Poly1305-SEED), public key algorithms (RSA, Elgamal, DSA, ECDSA, EdDSA, ECDH), large integer functions, random numbers and a lot of supporting functions.
libgd lib GD is an open source code library for the dynamic creation of images by programmers.
libgeotiff lib Library for reading and writing coordinate system information from/to GeoTIFF files
libglade lib Libglade is a library for constructing user interfaces dynamically from XML descriptions.
libgpg-error system Libgpg-error is a small library that defines common error values for all GnuPG components.
libharu lib libHaru is a free, cross platform, open source library for generating PDF files.
libint chem Libint library is used to evaluate the traditional (electron repulsion) and certain novel two-body matrix elements (integrals) over Cartesian Gaussian functions used in modern atomic and molecular theory.
libmatheval lib GNU libmatheval is a library (callable from C and Fortran) to parse and evaluate symbolic expressions input as text.
libspatialite lib SpatiaLite is an open source library intended to extend the SQLite core to support fully fledged Spatial SQL capabilities.
libxsmm math LIBXSMM is a library for small dense and small sparse matrix-matrix multiplications targeting Intel Architecture (x86).
libyaml lib LibYAML is a YAML parser and emitter written in C.
llvm compiler The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs (as well as some less common ones!) These libraries are built around a well specified code representation known as the LLVM intermediate representation ("LLVM IR"). The LLVM Core libraries are well documented, and it is particularly easy to invent your own language (or port an existing compiler) to use LLVM as an optimizer and code generator.
longranger bio Long Ranger is a set of analysis pipelines that processes Chromium sequencing output to align reads and call and phase SNPs, indels, and structural variants.
macs2 bio MACS (Model-based Analysis of ChIP-Seq) identifies transcription factor binding sites. MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and MACS improves the spatial resolution of binding sites through combining the information of both sequencing tag position and orientation.
manta bio Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. Manta discovers, assembles and scores large-scale SVs, medium-sized indels and large insertions within a single efficient workflow.
marge bio MARGE is a robust methodology that leverages a comprehensive library of genome-wide H3K27ac ChIP-seq profiles to predict key regulated genes and cis-regulatory regions in human or mouse.
mathematica
matlab
maven devel Binary Maven install. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.
mayavi vis A tool for easy and interactive visualization of data.
meme bio The MEME Suite allows you to: * discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences, * search sequence databases with motifs using MAST, FIMO, MCAST or GLAM2SCAN, * compare a motif to all motifs in a database of motifs, * associate motifs with Gene Ontology terms via their putative target genes, and * analyse motif enrichment using SpaMo or CentriMo.
metis math METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices. The algorithms implemented in METIS are based on the multilevel recursive-bisection, multilevel k-way, and multi-constraint partitioning schemes.
moist tools moist is a Python database adaptor for MySQL, MariaDB, and (eventually) Drizzle. It is a continuation of the development fork of MySQLdb, i.e. pre-2.0.
mothur bio Mothur is a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
mpi4py lib MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.
mummer bio MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. AMOS makes use of it.
muscle bio MUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests, with accuracy and speed that are consistently better than CLUSTALW. MUSCLE can align hundreds of sequences in seconds. Most users learn everything they need to know about MUSCLE in a few minutes—only a handful of command-line options are needed to perform common alignment tasks.
mvapich2 mpi The MVAPICH2 software, based on MPI 3.1 standard, delivers the best performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE networking technologies.
mysqlclient lib Python interface to MySQL
nasm lang NASM: General-purpose x86 assembler
ncbi-vdb bio The SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives.
ncl data NCL is an interpreted language designed specifically for scientific data analysis and visualization.
netcdf data NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This module bundles the C++ and Fortran libraries.
netcdf-cxx data NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
netperf base None
neuron bio Empirically-based simulations of neurons and networks of neurons.
nextflow tools Nextflow is a reactive workflow framework and a programming DSL that eases the writing of computational pipelines with complex data.
ngs bio NGS is a new, domain-specific API for accessing reads, alignments and pileups produced from Next Generation Sequencing.
ngsplot bio ngs.plot allows easy visualization of next-generation sequencing (NGS) samples at functional genomic regions.
nseg bio Nseg is used to identify low complexity sequences.
ntl math NTL is a high-performance, portable C++ library providing data structures and algorithms for manipulating signed, arbitrary length integers, and for vectors, matrices, and polynomials over the integers and over finite fields.
openblas numlib OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
openbugs math OpenBUGS is a software application for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods.
openjpeg lib OpenJPEG is an open-source JPEG 2000 codec written in C.
openmpi mpi The Open MPI Project is an open source MPI-3 implementation.
openms bio OpenMS is an open-source software C++ library for LC-MS data management and analyses. It offers an infrastructure for rapid development of mass spectrometry related software.
openslide bio OpenSlide is a C library that provides a simple interface to read whole-slide images.
openslide-python bio Python bindings for the OpenSlide library
openspeedshop tool
p3dfft numlib P3DFFT is a library for large-scale computer simulations on parallel platforms.
p4vasp vis Visualization suite for VASP
p7zip tools p7zip is a quick port of 7z.exe and 7za.exe (command line version of 7zip) for Unix. 7-Zip is a file archiver with highest compression ratio.
paintor bio PAINTOR is a statistical fine-mapping method that integrates functional genomic data with association strength from potentially multiple populations (or traits) to prioritize variants for follow-up analysis.
paraview vis ParaView is a scientific parallel visualizer.
parmetis numlib ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.
patric bio PATRIC is an integration of different types of data and software tools that support research on bacterial pathogens.
pcmsolver chem An API for the Polarizable Continuum Model.
peakseq bio PeakSeq is a program for identifying and ranking peak regions in ChIP-Seq experiments. It takes as input mapped reads from a ChIP-Seq experiment and mapped reads from a control experiment, and outputs a file with peak regions ranked by increasing Q-value.
perl lang Larry Wall's Practical Extraction and Report Language
petsc numlib PETSc, pronounced PET-see (the S is silent), is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.
pgi compiler C, C++ and Fortran compilers from The Portland Group - PGI
phono3py chem phono3py calculates phonon-phonon interaction and related properties using the supercell approach.
phonopy lib Phonopy is an open source package of phonon calculations based on the supercell approach.
picard bio A set of tools (in Java) for working with next generation sequencing data in the BAM (http://samtools.github.io/hts-specs) format.
platform-mpi mpi Platform MPI is an MPI-2 implementation from IBM.
plink bio PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.
plumed chem PLUMED is an open source library for free energy calculations in molecular systems which works together with some of the most popular molecular dynamics engines. Free energy calculations can be performed as a function of many order parameters with a particular focus on biological problems, using state of the art methods such as metadynamics, umbrella sampling and Jarzynski-equation based steered MD. The software, written in C++, can be easily interfaced with both Fortran and C/C++ codes.
pompi toolchain Toolchain with PGI C, C++ and Fortran compilers, alongside OpenMPI.
postgresql data PostgreSQL is a powerful, open source object-relational database system. It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages). It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR, VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video. It has native programming interfaces for C/C++, Java, .Net, Perl, Python, Ruby, Tcl, ODBC, among others, and exceptional documentation.
proj lib Program proj is a standard Unix filter function which converts geographic longitude and latitude coordinates into Cartesian coordinates.
prokka bio Prokka is a software tool for the rapid annotation of prokaryotic genomes.
proteowiz bio ProteoWizard provides a set of open-source, cross-platform software libraries and tools (e.g. msconvert, Skyline, IDPicker, SeeMS) that facilitate proteomics data analysis. The libraries enable rapid tool creation by providing a robust, pluggable development framework that simplifies and unifies data file access, and performs standard chemistry and LCMS dataset computations.
pslib tools pslib is a C-library to create PostScript files on the fly. It offers many drawing primitives, inclusion of png and eps images and a very sophisticated text rendering including hyphenation, kerning and ligatures.
pybind11 lib pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.
pycairo vis Python bindings for the cairo library
pygobject vis Python Bindings for GLib/GObject/GIO/GTK+
pygtk vis PyGTK lets you easily create programs with a graphical user interface using the Python programming language.
pyopengl vis PyOpenGL is the most common cross platform Python binding to OpenGL and related APIs.
python lang Python is a programming language that lets you work more effectively.
pyyaml lib PyYAML is a YAML parser and emitter for the Python programming language.
qiime bio QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.
qt5 devel Qt is a comprehensive cross-platform C++ application framework.
quantumespresso chem Quantum ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials (both norm-conserving and ultrasoft).
qwt lib The Qwt library contains GUI Components and utility classes which are primarily useful for programs with a technical background.
raxml bio RAxML search algorithm for maximum likelihood based inference of phylogenetic trees.
rdp-classifier bio The RDP Classifier is a naive Bayesian classifier that can rapidly and accurately provide taxonomic assignments from domain to genus, with confidence estimates for each assignment.
readosm lib ReadOSM is an open source library to extract valid data from within an Open Street Map input file.
reframe system ReFrame Regression Testing Suite
relion bio RELION (for REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).
repast_hpc lang Repast for High Performance Computing (Repast HPC) is a next generation agent-based modeling system intended for large-scale distributed computing platforms.
rsem bio RNA-Seq by Expectation-Maximization
rstudio lang RStudio is a set of integrated tools designed to help you be more productive with R.
ruby lang Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.
rust lang Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.
sagemath math SageMath is a free open-source mathematics software system licensed under the GPL. It builds on top of many existing open-source packages: NumPy, SciPy, matplotlib, Sympy, Maxima, GAP, FLINT, R and many more
saint Significance Analysis of INTeractome (SAINT) consists of a series of software tools for assigning confidence scores to protein-protein interactions based on quantitative proteomics data in AP-MS experiments.
saintexpress Significance Analysis of INTeractome (SAINT) consists of a series of software tools for assigning confidence scores to protein-protein interactions based on quantitative proteomics data in AP-MS experiments.
salmon Salmon is a tool for quantifying the expression of transcripts using RNA-seq data. Salmon uses new algorithms (specifically, coupling the concept of quasi-mapping with a two-phase inference procedure) to provide accurate expression estimates very quickly (i.e. wicked-fast) and while using little memory.
sambamba Sambamba is a tool for processing BAM files.
samtools SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
sas math Statistical analysis package
sbt A build tool for Scala.
scalapack numlib The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers.
scotch math Software package and libraries for sequential and parallel graph partitioning, static mapping, and sparse matrix block ordering, and sequential mesh and hypergraph partitioning.
sdl2 lib SDL: Simple DirectMedia Layer, a cross-platform multimedia library
shapelib lib The Shapefile C Library provides the ability to write simple C programs for reading, writing and updating (to a limited extent) ESRI Shapefiles, and the associated attribute file (.dbf).
shengbte ShengBTE is a software package for solving the Boltzmann Transport Equation for phonons.
sibil-env vis This module sets up the environment for the SIBIL application.
sicerpy bio SICER.py is a Python wrapper for the SICER peak caller software.
siesta SIESTA is both a method and its computer program implementation, to perform efficient electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.
silo data Silo is a library for reading and writing a wide variety of scientific data to binary, disk files
singularity Singularity enables users to have full control of their environment. Singularity containers can be used to package entire scientific workflows, software and libraries, and even data.
slatec Fortran 77 numerical library.
slepc numlib SLEPc (Scalable Library for Eigenvalue Problem Computations) is a software library for the solution of large scale sparse eigenvalue problems on parallel computers. It is an extension of PETSc and can be used for either standard or generalized eigenproblems, with real or complex arithmetic. It can also be used for computing a partial SVD of a large, sparse, rectangular matrix, and to solve quadratic eigenvalue problems.
slicer tools 3D Slicer is an open source software platform for medical image informatics, image processing, and three-dimensional visualization.
slim bio SLiM is an evolutionary simulation package that provides facilities for very easily and quickly constructing genetically explicit individual-based evolutionary models.
smrtlink PacBio’s open-source SMRT Analysis software suite is designed for use with Single Molecule, Real-Time (SMRT) Sequencing data. You can analyze, visualize, and manage your data through an intuitive GUI or command-line interface. You can also integrate SMRT Analysis in your existing data workflow through the extensive set of APIs provided
snakemake tools The Snakemake workflow management system is a tool to create reproducible and scalable data analyses.
snap SNAP is a general purpose gene finding program suitable for both eukaryotic and prokaryotic genomes. SNAP is an acronym for Semi-HMM-based Nucleic Acid Parser.
snap-stanford vis Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges.
snap-stanford-py Snap.py is a Python interface for SNAP. SNAP is a general purpose, high performance system for analysis and manipulation of large networks. SNAP is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges.
spack Spack is a package manager for supercomputers, Linux, and macOS. It makes installing scientific software easy. With Spack, you can build a package with multiple versions, configurations, platforms, and compilers, and all of these builds can coexist on the same machine.
sparsehash An extremely memory-efficient hash_map implementation. 2 bits/entry overhead! The SparseHash library contains several hash-map implementations, including implementations that optimize for space or speed.
spglib Spglib is a library for finding and handling crystal symmetries written in C.
sprng Scalable Parallel Pseudo Random Number Generators Library
sqlite SQLite: SQL Database Engine in a C Library
sratoolkit The SRA Toolkit, and the source-code SRA System Development Kit (SDK), will allow you to programmatically access data housed within SRA and convert it from the SRA format
stacks Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.
star STAR aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
stata math Stata is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics.
suitesparse numlib SuiteSparse is a collection of libraries to manipulate sparse matrices.
sundials math SUNDIALS: SUite of Nonlinear and DIfferential/ALgebraic Equation Solvers
superlu SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations.
superlu_mt SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations.
swig SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages.
szip Szip compression software, providing lossless compression of scientific data
tabix bio Generic indexer for TAB-delimited genome position files
taggraph bio TagGraph is a computational tool that provides an unrestricted string-based search method that is as much as 350-fold faster than existing approaches, and a probabilistic validation model that was optimized for post-translational modification assignments.
tensorflow data TensorFlow is an open-source software library for Machine Intelligence.
thirdorder chem Python package for creating input for ShengBTE.
tmux tools tmux is a terminal multiplexer. It lets you switch easily between several programs in one terminal, detach them (they keep running in the background) and reattach them to a different terminal.
tophat bio TopHat is a fast splice junction mapper for RNA-Seq reads.
totalview debugger TotalView is a GUI-based source code defect analysis tool that gives you unprecedented control over processes and thread execution and visibility into program state and variables. It allows you to debug one or many processes and/or threads in a single window with complete control over program execution. This allows you to set breakpoints, step line by line through the code on a single thread or with coordinated groups of processes or threads, and run or halt arbitrary sets of processes or threads. You can reproduce and troubleshoot difficult problems that can occur in concurrent programs that take advantage of threads, OpenMP, MPI, GPUs or coprocessors.
trimgalore bio Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data.
trimmomatic bio Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-end data.
ucsc-tools bio A set of genome utilities developed at the University of California Santa Cruz.
udunits tools UDUNITS supports conversion of unit specifications between formatted and binary forms, arithmetic manipulation of units, and conversion of values between compatible scales of measurement.
vapor vis VAPOR is the Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Researchers. VAPOR provides an interactive 3D visualization environment that can also produce animations and still frame images
vasp chem The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling.
vcftools bio The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
velvet bio Sequence assembler for very short reads
vep bio VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
viennarna bio The Vienna RNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures.
vigra vis VIGRA stands for "Vision with Generic Algorithms". It's an image processing and analysis library that puts its main emphasis on customizable algorithms and data structures. VIGRA is especially strong for multi-dimensional images, because many algorithms (e.g. filters, feature computation, superpixels) are implemented for arbitrary high dimensions.
visit vis VisIt is an Open Source, interactive, scalable, visualization, animation and analysis tool.
vsearch bio VSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
vtk vis The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization. VTK consists of a C++ class library and several interpreted interface layers including Tcl/Tk, Java, and Python. VTK supports a wide variety of visualization algorithms including: scalar, vector, tensor, texture, and volumetric methods; and advanced modeling techniques such as: implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.
wannier90 chem Wannier90 is a package to calculate maximally-localised Wannier functions.
wdltool tools A Java command-line tool co-developed with WDL that performs utility functions, including syntax validation and generation of input JSON templates.
wxpython vis wxPython is a GUI toolkit for the Python programming language. It allows Python programmers to create programs with a robust, highly functional graphical user interface, simply and easily. It is implemented as a Python extension module (native code) that wraps the popular wxWidgets cross platform GUI library, which is written in C++.
wxwidgets vis wxWidgets is a cross-platform GUI library written in C++. It is the library wrapped by the wxPython toolkit.
xcrysden vis XCrySDen is a crystalline and molecular structure visualisation program aiming at display of isosurfaces and contours, which can be superimposed on crystalline structures and interactively rotated and manipulated.
xerces tools Xerces-C++ is a validating XML parser written in a portable subset of C++.
xxdiff tools xxdiff is a graphical file and directories comparator and merge tool.
yasm lang Yasm: Complete rewrite of the NASM assembler with BSD license
zstd lib Zstandard is a real-time compression algorithm, providing high compression ratios. It offers a very wide range of compression/speed trade-off, while being backed by a very fast decoder. It also offers a special mode for small data, called dictionary compression, and can create dictionaries from any sample set.