Module Category Description
7zip tools 7-Zip is a file archiver with a high compression ratio.
R lang R is a free software environment for statistical computing and graphics.
abinit chem ABINIT is a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave or wavelet basis.
abseil lib Abseil is an open-source collection of C++ library code designed to augment the C++ standard library. The Abseil library code is collected from Google's own C++ code base, has been extensively tested and used in production, and is the same code we depend on in our daily coding lives.
afni bio AFNI (Analysis of Functional NeuroImages) is a leading software suite of C, Python, R programs and shell scripts primarily developed for the analysis and display of anatomical and functional MRI (FMRI) data. It is freely available (both in source code and in precompiled binaries) for research purposes. The software is made to run on virtually any Unix system with X11 and Motif displays. Binary packages are provided for MacOS and Linux systems including Fedora and Ubuntu (including Ubuntu under the Windows Subsystem for Linux).
agrep tools AGREP - an approximate GREP.
alamode chem ALAMODE is designed for analyzing lattice anharmonicity and lattice thermal conductivity of solids. By using an external DFT package such as VASP and Quantum ESPRESSO, you can extract harmonic and anharmonic force constants straightforwardly with ALAMODE. Using the anharmonic force constants, you can also calculate lattice thermal conductivity from first principles.
alphafold bio Open source code for AlphaFold
alphagenome bio AlphaGenome is a model from Google DeepMind that predicts regulatory properties of DNA sequences and the effects of genetic variants on them.
alphapulldown bio AlphaPulldown is a Python package that streamlines protein-protein interaction screens and high-throughput modelling of higher-order oligomers using AlphaFold-Multimer
alsa-lib lib The Advanced Linux Sound Architecture (ALSA) provides audio and MIDI functionality to the Linux operating system.
amber chem Amber (originally Assisted Model Building with Energy Refinement) is software for performing molecular dynamics and structure prediction.
angsd bio Program for analysing NGS data.
ansys cae ANSYS simulation software enables organizations to confidently predict how their products will operate in the real world. We believe that every product is a promise of something greater.
ant devel Apache Ant is a Java library and command-line tool whose mission is to drive processes described in build files as targets and extension points dependent upon each other. The main known usage of Ant is the build of Java applications.
ants data ANTs extracts information from complex datasets that include imaging. ANTs is useful for managing, interpreting and visualizing multidimensional data.
anvio bio Anvi'o is an open-source, community-driven analysis and visualization platform for microbial 'omics. It brings together many aspects of today's cutting-edge strategies including genomics, metagenomics, metatranscriptomics, pangenomics, metapangenomics, phylogenomics, and microbial population genetics in an integrated and easy-to-use fashion through extensive interactive visualization capabilities.
aocc compiler AMD Optimized C/C++ & Fortran compilers (AOCC) based on LLVM
apptainer tools Apptainer/Singularity is an application containerization solution for High-Performance Computing (HPC). The goal of Apptainer is to allow for "mobility of computing": an application containerized on one Linux system should be able to run on another system, as it is, and without the need to reconcile software dependencies and Linux version differences between the source and target systems.
archspec tools A library for detecting, labeling, and reasoning about microarchitectures
aria2 tools aria2 is a lightweight multi-protocol & multi-source, cross platform download utility operated in command-line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.
armadillo numlib Armadillo is an open-source C++ linear algebra library (matrix maths) aiming towards a good balance between speed and ease of use. Integer, floating point and complex numbers are supported, as well as a subset of trigonometric and statistics functions.
arpack-ng numlib ARPACK is a collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.
ase chem The Atomic Simulation Environment (ASE) is a set of tools and Python modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.
aspera-connect tools Connect is an install-on-demand Web browser plug-in that facilitates high-speed uploads and downloads with an Aspera transfer server.
assimp vis Open Asset Import Library (assimp) is a library to import and export various 3d-model-formats including scene-post-processing to generate missing render data.
at-spi2-atk vis AT-SPI 2 toolkit bridge
at-spi2-core vis Assistive Technology Service Provider Interface.
atat chem The Alloy-Theoretic Automated Toolkit (ATAT) is a generic name that refers to a collection of alloy theory tools
atk vis ATK provides the set of accessibility interfaces that are implemented by other toolkits and applications. Using the ATK interfaces, accessibility tools have full access to view and control running applications.
attrdict3 lib AttrDict is a Python library that provides mapping objects that allow their elements to be accessed both as keys and as attributes.
augustus bio AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences
autoconf devel Autoconf is an extensible package of M4 macros that produce shell scripts to automatically configure software source code packages. These scripts can adapt the packages to many kinds of UNIX-like systems without manual user intervention. Autoconf creates a configuration script for a package from a template file that lists the operating system features that the package can use, in the form of M4 macro calls.
automake devel Automake: GNU Standards-compliant Makefile generator
autotools devel This bundle collects the standard GNU build tools: Autoconf, Automake and libtool.
awscli tools This package provides a unified command line interface to Amazon Web Services.
axel tools Lightweight CLI download accelerator
bamtools bio BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.
bart bio BART (Binding Analysis for Regulation of Transcription) is a bioinformatics tool for predicting functional transcription factors (TFs) that bind at genomic cis-regulatory regions to regulate gene expression in the human or mouse genomes, given a query gene set or a ChIP-seq dataset as input.
bart-mri bio The Berkeley Advanced Reconstruction Toolbox (BART) is a free and open-source image-reconstruction framework for Computational Magnetic Resonance Imaging developed by the research groups of Martin Uecker (Goettingen University), Jon Tamir (UT Austin), and Michael Lustig (UC Berkeley). It consists of a programming library and a toolbox of command-line programs. The library provides common operations on multi-dimensional arrays, Fourier and wavelet transforms, as well as generic implementations of iterative optimization algorithms. The command-line tools provide direct access to basic operations on multi-dimensional arrays as well as efficient implementations of many calibration and reconstruction algorithms for parallel imaging and compressed sensing.
bazel devel Bazel is a build tool that builds code quickly and reliably. It is used to build the majority of Google's software.
bbmap bio BBMap includes a short read aligner, and other bioinformatic tools.
bcftools bio SAMtools is a suite of programs for interacting with high-throughput sequencing data. BCFtools - Reading/writing BCF2/VCF/gVCF files and calling/filtering/summarising SNP and short indel sequence variants
bcl2fastq2 bio bcl2fastq Conversion Software both demultiplexes data and converts BCL files generated by Illumina sequencing systems to standard FASTQ file formats for downstream analysis.
beagle bio Beagle is a software package for phasing genotypes and for imputing ungenotyped markers.
bedops bio BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale. Tasks can be easily split by chromosome for distributing whole-genome analyses across a computational cluster.
bedtools bio The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM.
berkeley_db tools Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects.
bicseq2-norm bio BICseq2 is an algorithm developed for normalizing high-throughput sequencing (HTS) data and detecting copy number variations (CNVs) in the genome. BICseq2 can be used for detecting CNVs with or without a control genome. BICseq2-norm is for normalizing potential biases in the sequencing data.
bicseq2-seg bio BICseq2 is an algorithm developed for normalizing high-throughput sequencing (HTS) data and detecting copy number variations (CNVs) in the genome. BICseq2 can be used for detecting CNVs with or without a control genome. BICseq2-seg is for detecting CNVs based on the normalized data given by BICseq2-norm.
binutils tools binutils: GNU binary utilities
bioawk bio Bioawk is an extension to Brian Kernighan's awk, adding support for several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names. It also adds a few built-in functions and a command-line option to use TAB as the input/output delimiter. When the new functionality is not used, bioawk is intended to behave exactly the same as the original BWK awk.
bioconda bio Bioconda is a channel for the conda package manager specializing in bioinformatics software.
bioperl bio Bioperl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and database searching objects.
biopython bio Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics.
bismark bio A tool to map bisulfite converted sequence reads and determine cytosine methylation states
bison lang Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables.
blast bio Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.
blat bio BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more.
blender vis Blender is the free and open source 3D creation suite. It supports the entirety of the 3D pipeline, modeling, rigging, animation, simulation, rendering, compositing and motion tracking, even video editing and game creation.
blitz++ lib Blitz++ is an LGPLv3+-licensed meta-template library for array manipulation in C++ with a speed comparable to Fortran implementations, while preserving an object-oriented interface.
boost devel Boost provides free peer-reviewed portable C++ source libraries.
boost.mpi devel Boost provides free peer-reviewed portable C++ source libraries.
bowtie2 bio Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
bracken bio Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
brotli lib Brotli is a general-purpose lossless compression algorithm that compresses data using a combination of a modern variant of the LZ77 algorithm, Huffman coding and 2nd order context modeling, with a compression ratio comparable to the best currently available general-purpose compression methods. It is similar in speed to deflate but offers denser compression. The specification of the Brotli Compressed Data Format is defined in RFC 7932.
brunsli lib Brunsli is a lossless JPEG repacking library.
build devel A simple, correct Python build frontend.
busco bio BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs
bwa bio Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome.
bzip2 tools bzip2 is a freely available, patent free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression.
cairo vis Cairo is a 2D graphics library with support for multiple output devices. Currently supported output targets include the X Window System (via both Xlib and XCB), Quartz, Win32, image buffers, PostScript, PDF, and SVG file output. Experimental backends include OpenGL, BeOS, OS/2, and DirectFB
canu bio Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing
cargo-c tools Applet for cargo to build and install C-ABI compatible dynamic and static libraries. It produces and installs a correct pkg-config file, a static library and a dynamic library, and a C header to be used by any C (and C-compatible) software.
catch2 lib A modern, C++-native, header-only, test framework for unit-tests, TDD and BDD - using C++11, C++14, C++17 and later
cc3d bio CompuCell3D is a flexible scriptable modeling environment, which allows the rapid construction of sharable Virtual Tissue in silico simulations of a wide variety of multi-scale, multi-cellular problems including angiogenesis, bacterial colonies, cancer, developmental biology, evolution, the immune system, tissue engineering, toxicology and even non-cellular soft materials.
cd-hit bio CD-HIT is a very widely used program for clustering and comparing protein or nucleotide sequences.
cellpose bio A generalist algorithm for cellular segmentation.
cellprofiler bio CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.
cellranger bio A set of analysis pipelines that perform sample demultiplexing, barcode processing, and single cell 3' gene counting.
cellranger-arc bio Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell Multiome ATAC + Gene Expression sequencing data to generate a variety of analyses pertaining to gene expression, chromatin accessibility and their linkage.
cellranger-atac bio Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.
cellranger-dna bio Cell Ranger DNA is a set of analysis pipelines that process Chromium single cell DNA sequencing output to align reads, identify copy number variation (CNV), and compare heterogeneity among cells.
cereal lib cereal is a header-only C++11 serialization library. cereal takes arbitrary data types and reversibly turns them into different representations, such as compact binary encodings, XML, or JSON. cereal was designed to be fast, light-weight, and easy to extend - it has no external dependencies and can be easily bundled with other code or used standalone.
cesm geo CESM is a fully-coupled, community, global climate model that provides state-of-the-art computer simulations of the Earth's past, present, and future climate states.
cffi tools C Foreign Function Interface for Python. Interact with almost any C code from Python, based on C-like declarations that you can often copy-paste from header files or documentation.
cfitsio lib CFITSIO is a library of C and Fortran subroutines for reading and writing data files in FITS (Flexible Image Transport System) data format.
cgal numlib The goal of the CGAL Open Source Project is to provide easy access to efficient and reliable geometric algorithms in the form of a C++ library.
chemps2 chem CheMPS2 is a scientific library which contains a spin-adapted implementation of the density matrix renormalization group (DMRG) for ab initio quantum chemistry.
chopper bio Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file. Filtering is done on average read quality and minimal or maximal read length, and applying a headcrop (start of read) and tailcrop (end of read) while printing the reads passing the filter.
circos bio Circos is a software package for visualizing data and information. It visualizes data in a circular layout - this makes Circos ideal for exploring relationships between objects or positions.
clapack math C version of LAPACK
clara-parabricks bio NVIDIA Parabricks is the only GPU-accelerated computational genomics toolkit that delivers fast and accurate analysis for sequencing centers, clinical teams, genomics researchers, and next-generation sequencing instrument developers.
clhep numlib The CLHEP project is intended to be a set of HEP-specific foundation and utility classes such as random generators, physics vectors, geometry and linear algebra. CLHEP is structured in a set of packages independent of any external package.
cloudcompare geo 3D point cloud and mesh processing software
cmake devel CMake, the cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.
code-server tools Run VS Code on any machine anywhere and access it in the browser.
cp-analyst bio CellProfiler Analyst (CPA) allows interactive exploration and analysis of data, particularly from high-throughput, image-based experiments. Included is a supervised machine learning system which can be trained to recognize complicated and subtle phenotypes, for automatic scoring of millions of cells. CellProfiler is an image processing package to generate morphometric measurements.
cp2k chem CP2K is a freely available (GPL) program, written in Fortran 95, to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems. It provides a general framework for different methods such as density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW), and classical pair and many-body potentials.
cppcheck tools Cppcheck is a static analysis tool for C/C++ code. It provides unique code analysis to detect bugs and focuses on detecting undefined behaviour and dangerous coding constructs.
cppy tools A small C++ header library which makes it easier to write Python extension modules. The primary feature is a PyObject smart pointer which automatically handles reference counting and provides convenience methods for performing common object operations.
crest chem CREST is a utility/driver program for the xtb program. It was originally designed as a conformer sampling program, hence the abbreviation Conformer–Rotamer Ensemble Sampling Tool, but it now also offers some utility functions for calculations with the GFNn–xTB methods. Generally the program functions as an IO-based OMP scheduler (i.e., calculations are performed by the xtb program) and as a tool for the creation and analysis of structure ensembles.
crossftp tools CrossFTP is a free FTP, SFTP, WebDav, Amazon S3, Amazon Glacier, Microsoft Azure, Google storage, and OpenStack Swift client for Win, Mac, and Linux.
cryosparc data CryoSPARC is a state of the art scientific software platform for cryo-electron microscopy (cryo-EM) used in research and drug discovery pipelines.
cryptography tools cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
ctffind bio Program for finding CTFs of electron micrographs.
cuda system CUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.
cudnn numlib The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.
cufflinks bio Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.
cuquantum lib NVIDIA cuQuantum is an SDK of libraries and tools for quantum computing workflows.
curl tools libcurl is a free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP. libcurl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, Kerberos), file transfer resume, http proxy tunneling and more.
cutadapt bio Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
cython lang Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex).
cytoscape bio Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. A lot of Apps are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web.
datamash data GNU datamash performs basic numeric, textual and statistical operations on input data files
db tools Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects.
db_file data Perl5 access to Berkeley DB version 1.x.
dbcsr chem DBCSR stands for Distributed Blocked Compressed Sparse Row. DBCSR is a library designed to efficiently perform sparse matrix-matrix multiplication, among other operations.
dbus devel D-Bus is a message bus system, a simple way for applications to talk to one another. In addition to interprocess communication, D-Bus helps coordinate process lifecycle; it makes it simple and reliable to code a "single instance" application or daemon, and to launch applications and daemons on demand when their services are needed.
ddd vis DDD is a graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, the bash debugger bashdb, the GNU Make debugger remake, or the Python debugger pydb.
decontaminer bio decontaMiner, a tool for detecting contaminating organisms in human unmapped sequences.
deeplabcut bio DeepLabCut is a toolbox for markerless pose estimation of animals performing various tasks.
deeptools bio deepTools addresses the challenge of handling the large amounts of data that are now routinely generated from DNA sequencing centers. deepTools contains useful modules to process the mapped reads data for multiple quality checks, creating normalized coverage files in standard bedGraph and bigWig file formats, that allow comparison between different files (for example, treatment and control). Finally, using such normalized and standardized files, deepTools can create many publication-ready visualizations to identify enrichments and for functional annotations of the genome.
diamond bio DIAMOND is a sequence aligner for protein and translated DNA searches and functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup of BLAST ranging up to x20,000.
dotnet lang .NET is a free, cross-platform, open source developer platform for building many different types of applications. With .NET, you can use multiple languages, editors, and libraries to build for web, mobile, desktop, gaming, and IoT. Contains the SDK and the Runtime.
dotnet-sdk lang .NET is a free, cross-platform, open source developer platform for building many different types of applications.
double-conversion lib Efficient binary-decimal and decimal-binary conversion routines for IEEE doubles.
doxygen devel Doxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.
drmaa tools DRMAA for Slurm Workload Manager (Slurm) is an implementation of the Open Grid Forum Distributed Resource Management Application API (DRMAA) version 1 for submission and control of jobs to Slurm. Using DRMAA, grid application builders, portal developers and ISVs can use the same high-level API to link their software with different cluster/resource management systems.
dssp bio This is a rewrite of DSSP, now offering full mmCIF support. The difference with previous releases of DSSP is that it now writes out an annotated mmCIF file by default, storing the secondary structure information in the _struct_conf category.
easybuild tools EasyBuild is a software build and installation framework written in Python that allows you to install software in a structured, repeatable and robust way.
ecbuild tools A CMake-based build system, consisting of a collection of CMake macros and functions that ease the managing of software build systems
eccodes tools ecCodes is a package developed by ECMWF which provides an application programming interface and a set of tools for decoding and encoding messages in the following formats: WMO FM-92 GRIB edition 1 and edition 2, WMO FM-94 BUFR edition 3 and edition 4, WMO GTS abbreviated header (only decoding).
eigen math Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
eigensoft bio The EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction method (Price et al. 2006). The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.
elfutils lib The elfutils project provides libraries and tools for ELF files and DWARF data.
elpa math Eigenvalue SoLvers for Petaflop-Applications.
epstopdf tools Epstopdf is a Perl script that converts an EPS file to an ‘encapsulated’ PDF file (a single page file whose media box is the same as the original EPS’s bounding box).
esmf geo The Earth System Modeling Framework (ESMF) is a suite of software tools for developing high-performance, multi-component Earth science modeling applications.
expat tools Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags).
expressdiff bio Differential Analysis Pipeline for RNA-Seq Data
faad2 lib FAAD2 is an HE, LC, MAIN and LTP profile, MPEG-2 and MPEG-4 AAC decoder. FAAD2 includes code for SBR (HE AAC) decoding.
fasta bio The FASTA programs find regions of local or global (new) similarity between protein or DNA sequences, either by searching Protein or DNA databases, or by identifying local duplications within a sequence.
fastenloc bio fastENLOC: fast enrichment estimation aided colocalization analysis enables integrative genetic association analysis of molecular QTL data and GWAS data.
fastqc bio FastQC is a quality control application for high throughput sequence data. It reads in sequence data in a variety of formats and can either provide an interactive application to review the results of several different QC checks, or create an HTML based report which can be integrated into a pipeline.
fastx-toolkit bio The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
ffmpeg vis A complete, cross-platform solution to record, convert and stream audio and video.
fftw numlib FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data.
fiji vis Fiji is an image processing package—a 'batteries-included' distribution of ImageJ, bundling a lot of plugins which facilitate scientific image analysis. This release is based on ImageJ-2.1.0 and Fiji-2.1.1
file system The file command is 'a file type guesser', that is, a command-line tool that tells you in words what kind of data a file contains.
flex lang Flex (Fast Lexical Analyzer) is a tool for generating scanners. A scanner, sometimes called a tokenizer, is a program which recognizes lexical patterns in text.
flit tools A simple packaging tool for simple packages.
fltk vis FLTK is a cross-platform C++ GUI toolkit for UNIX/Linux (X11), Microsoft Windows, and MacOS X. FLTK provides modern GUI functionality without the bloat and supports 3D graphics via OpenGL and its built-in GLUT emulation.
fmriprep bio fMRIPrep is a NiPreps (NeuroImaging PREProcessing toolS) application (www.nipreps.org) for the preprocessing of task-based and resting-state functional MRI (fMRI).
fmt lib fmt (formerly cppformat) is an open-source formatting library.
fontconfig vis Fontconfig is a library designed to provide system-wide font configuration, customization and application access.
fonttools devel fontTools is a library for manipulating fonts, written in Python. The project includes the TTX tool, that can convert TrueType and OpenType fonts to and from an XML text format, which is also called TTX. It supports TrueType, OpenType, AFM and to an extent Type 1 and some Mac-specific formats.
fox-toolkit lib FOX is a C++ based Toolkit for developing Graphical User Interfaces easily and effectively. It offers a wide, and growing, collection of Controls, and provides state of the art facilities such as drag and drop, selection, as well as OpenGL widgets for 3D graphical manipulation. FOX also implements icons, images, and user-convenience features such as status line help, and tooltips.
freebayes bio FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.
freeglut lib freeglut is a completely OpenSourced alternative to the OpenGL Utility Toolkit (GLUT) library.
freesurfer bio FreeSurfer is a set of tools for analysis and visualization of structural and functional brain imaging data. FreeSurfer contains a fully automatic structural imaging stream for processing cross sectional and longitudinal data.
freetype vis FreeType 2 is a software font engine that is designed to be small, efficient, highly customizable, and portable while capable of producing high-quality output (glyph images). It can be used in graphics libraries, display servers, font conversion tools, text image generation tools, and many other products as well.
freexl lib FreeXL is an open source library to extract valid data from within an Excel (.xls) spreadsheet.
fribidi lang The Free Implementation of the Unicode Bidirectional Algorithm.
fsl bio FSL is a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data.
g2clib data Library contains GRIB2 encoder/decoder ('C' version).
g2lib data Library contains GRIB2 encoder/decoder and search/indexing routines.
gatk bio The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
gaussian chem Gaussian is a suite of electronic-structure codes.
gawk tools The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs with just a few lines of code.
gcc compiler The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...).
gcloud tools The Cloud SDK is a set of tools for Cloud Platform. It contains gcloud, gsutil, and bq, which you can use to access Google Compute Engine, Google Cloud Storage, Google BigQuery, and other products and services from the command-line.
gd bio GD.pm - Interface to Gd Graphics Library
gdal data GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.
gdb debugger The GNU Project Debugger
gdc-client tools The gdc-client provides several convenience functions over the GDC API which provides general download/upload via HTTPS.
gdk-pixbuf vis The Gdk Pixbuf is a toolkit for image loading and pixel buffer manipulation. It is used by GTK+ 2 and GTK+ 3 to load and manipulate images. In the past it was distributed as part of GTK+ 2 but it was split off into a separate package in preparation for the change to GTK+ 3.
geany tools Geany is a text editor using the GTK+ toolkit with basic features of an integrated development environment.
gemma bio Genome-wide Efficient Mixed Model Association
genometools bio The GenomeTools genome analysis system is a free collection of bioinformatics tools (in the realm of genome informatics) combined into a single binary named gt. It is based on a C library named “libgenometools” which consists of several modules.
geos math GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS)
gettext tools GNU 'gettext' is an important step for the GNU Translation Project, as it is an asset on which we may build many other steps. This package offers to programmers, translators, and even users, a well integrated set of tools and documentation
gffcompare bio The program gffcompare can be used to compare, merge, annotate, and estimate accuracy of one or more GFF files (the 'query' files), when compared with a reference annotation (also provided as GFF).
ghc compiler The Glorious/Glasgow Haskell Compiler
ghostscript tools Ghostscript is a versatile processor for PostScript data with the ability to render PostScript to different targets. It used to be part of the cups printing stack, but is no longer used for that.
giflib lib giflib is a library for reading and writing gif images. It is API and ABI compatible with libungif which was in wide use while the LZW compression algorithm was patented.
gildas astro GILDAS is a collection of state-of-the-art software oriented toward (sub-)millimeter radioastronomical applications (either single-dish or interferometer). It is used daily to reduce all data acquired with the IRAM 30M telescope and the NOrthern Extended Millimeter Array NOEMA (except VLBI observations).
git tools Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
git-annex tools git-annex allows managing large files with git, without storing the file contents in git. It can sync, backup, and archive your data, offline and online. Checksums and encryption keep your data safe and secure. Bring the power and distributed nature of git to bear on your large files with git-annex.
git-lfs tools Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com
gklib-metis ai A library of various helper routines and frameworks used by much of the lab's software
gl2ps vis GL2PS: an OpenGL to PostScript printing library
glew lib The OpenGL Extension Wrangler Library (GLEW) is a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.
glfw lib GLFW is an Open Source, multi-platform library for OpenGL, OpenGL ES and Vulkan development on the desktop. It provides a simple API for creating windows, contexts and surfaces, receiving input and events.
glib vis GLib is one of the base libraries of the GTK+ project
globus_cli tools Globus CLI is a standalone application that can be installed on the user’s machine and used to access the Globus file transfer service.
glow tools Render markdown on the CLI, with pizzazz!
glslang-spirv compiler Glslang is the official reference compiler front end for the OpenGL ES and OpenGL shading languages. It implements a strict interpretation of the specifications for these languages. It is open and free for anyone to use, either from a command line or programmatically. The OpenGL and OpenGL ES working groups are committed to maintaining consistency between the reference compiler and the corresponding shading language specifications.
gmap-gsnap bio GMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences. GSNAP: Genomic Short-read Nucleotide Alignment Program.
gmp math GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers.
gmvapich2 toolchain GNU Compiler Collection (GCC) based compiler toolchain, including MVAPICH2 for MPI support.
gnuplot vis Portable interactive, function plotting utility
go compiler Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.
gobject-introspection devel GObject introspection is a middleware layer between C libraries (using GObject) and language bindings. The C library can be scanned at compile time and generate a metadata file, in addition to the actual native C library. Then at runtime, language bindings can read this metadata and automatically provide bindings to call into the C library.
golf toolchain GNU Compiler Collection (GCC) based compiler toolchain, including OpenBLAS (BLAS and LAPACK support) and FFTW.
gompi toolchain GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support.
google-api tools Google APIs give you programmatic access to Google Maps, Google Drive, YouTube, and many other Google products.
googletest tools Google's framework for writing C++ tests on a variety of platforms
goolf toolchain GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
gperf devel GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.
gperftools tools gperftools is a collection of a high-performance multi-threaded malloc() implementation, plus some pretty nifty performance analysis tools. Includes TCMalloc, heap-checker, heap-profiler and cpu-profiler.
gpumd chem GPUMD stands for Graphics Processing Units Molecular Dynamics. It is a general-purpose molecular dynamics (MD) code fully implemented on graphics processing units (GPUs).
gpustat tools dstat-like utilization monitor for NVIDIA GPUs
grace vis Grace is a WYSIWYG 2D plotting tool for the X Window System and Motif.
grackle astro Grackle is a chemistry and radiative cooling library for astrophysical simulations and models. Grackle has interfaces for C, C++, Fortran, and Python codes
graphene lib Graphene is a thin layer of types for graphic libraries
graphite2 lib Graphite is a "smart font" system developed specifically to handle the complexities of lesser-known languages of the world.
grass geo The Geographic Resources Analysis Support System - used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization
gromacs chem GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
gsea bio Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes).
gsl numlib The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting.
gst-plugins-bad vis GStreamer is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
gst-plugins-base vis GStreamer is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
gstreamer vis GStreamer is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
gtk3 vis GTK+ is the primary library used to construct user interfaces in GNOME. It provides all the user interface controls, or widgets, used in a common graphical application. Its object-oriented API allows you to construct user interfaces without dealing with the low-level details of drawing and device interaction.
gtk4 vis GTK+ is the primary library used to construct user interfaces in GNOME. It provides all the user interface controls, or widgets, used in a common graphical application. Its object-oriented API allows you to construct user interfaces without dealing with the low-level details of drawing and device interaction.
gurobi math The Gurobi Optimizer is a state-of-the-art solver for mathematical programming. The solvers in the Gurobi Optimizer were designed from the ground up to exploit modern architectures and multi-core processors, using the most advanced implementations of the latest algorithms.
gzip tools gzip (GNU zip) is a popular data compression program as a replacement for compress
h5py data HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data.
harfbuzz vis HarfBuzz is an OpenType text shaping engine.
hatchling tools Extensible, standards compliant build backend used by Hatch, a modern, extensible Python project manager.
hdf data HDF (also known as HDF4) is a library and multi-object file format for storing and managing data between machines.
hdf5 data HDF5 is a unique technology suite that makes possible the management of extremely large and complex data collections.
hexrd phys HEXRD provides a collection of resources for analysis of x-ray diffraction data, especially high-energy x-ray diffraction. HEXRD is comprised of a library and API for writing scripts, a command line interface, and an interactive graphical user interface.
hic-pro bio HiC-Pro is an optimized and flexible pipeline for Hi-C data processing.
hisat2 bio HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) against the general human population (as well as against a single reference genome).
hmmer bio HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST. Patched according to https://github.com/google-deepmind/alphafold3/issues/525
homer bio HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and ChIP-Seq analysis. It is a collection of command line programs for Unix-style operating systems, written mostly in Perl and C++. HOMER was primarily written as a de novo motif discovery algorithm that is well suited for finding 8-12 bp motifs in large scale genomics data.
htslib bio A C library for reading/writing high-throughput sequencing data. This package includes the utilities bgzip and tabix
hunspell tools Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex word compounding or character encoding. Hunspell interfaces: Ispell-like terminal interface using Curses library, Ispell pipe interface, C++ class and C functions.
hwloc system The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.
hypothesis tools Hypothesis is an advanced testing library for Python. It lets you write tests which are parametrized by a source of examples, and then generates simple and comprehensible examples that make your tests fail. This lets you find more bugs in your code with less work.
hypre numlib Hypre is a library for solving large, sparse linear systems of equations on massively parallel computers. The problems of interest arise in the simulation codes being developed at LLNL and elsewhere to study physical phenomena in the defense, environmental, energy, and biological sciences.
icu lib ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications.
idl IDL is an interpreted programming language used to create analyses and visualizations of numerical data.
igvtools bio This package contains command line utilities for preprocessing, computing feature count density (coverage), sorting, and indexing data files.
iimpi toolchain Intel C/C++ and Fortran compilers, alongside Intel MPI.
iintelmpi toolchain Intel C/C++ and Fortran compilers with IntelMPI.
imagemagick vis ImageMagick is a software suite to create, edit, compose, or convert bitmap images
imath lib Imath is a C++ and Python library of 2D and 3D vector, matrix, and math operations for computer graphics
imb perf The Intel MPI Benchmarks perform a set of MPI performance measurements for point-to-point and global communication operations for a range of message sizes
imkl numlib Intel oneAPI Math Kernel Library
imkl-fftw numlib FFTW interfaces using Intel oneAPI Math Kernel Library
impi mpi Intel MPI Library, compatible with MPICH ABI
intel toolchain Compiler toolchain including Intel compilers, Intel MPI and Intel Math Kernel Library (MKL).
intel-compilers compiler Intel C, C++ & Fortran compilers
intelmpi mpi IntelMPI from Intel.
intltool lang The Intltool is an internationalization tool used for extracting translatable strings from source files, collecting the extracted strings with messages from traditional source files, and merging the translations into .xml, .desktop and .oaf files.
iqtree bio Efficient phylogenomic software by maximum likelihood
irfinder data IRFinder is a tool for detecting intron retention from RNA-Seq experiments.
isaacgym data NVIDIA’s physics simulation environment for reinforcement learning research.
isl math isl is a library for manipulating sets and relations of integer points bounded by linear constraints.
isoseq bio IsoSeq3 provides scalable de novo isoform discovery.
jags math JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation
jasper vis The JasPer Project is an open-source initiative to provide a free software-based reference implementation of the codec specified in the JPEG-2000 Part-1 standard.
java lang Java Platform, Standard Edition (Java SE) lets you develop and deploy Java applications on desktops and servers.
jbigkit vis JBIG-KIT is a software implementation of the JBIG1 data compression standard (ITU-T T.82), which was designed for bi-level image data, such as scanned documents.
jellyfish bio Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA.
jemalloc lib jemalloc is a general purpose malloc(3) implementation that emphasizes fragmentation avoidance and scalable concurrency support.
jobstats tools Jobstats is a job monitoring platform composed of data exporters, Prometheus, Grafana and the Slurm database, whereas jobstats is a command that operates on the Jobstats platform.
jq tools jq is a lightweight and flexible command-line JSON processor.
json-c lib JSON-C implements a reference counting object model that allows you to easily construct JSON objects in C, output them as JSON formatted strings and parse JSON formatted strings back into the C representation of JSON objects.
jsoncpp lib JsonCpp is a C++ library that allows manipulating JSON values, including serialization and deserialization to and from strings. It can also preserve existing comments in deserialization/serialization steps, making it a convenient format to store user input files.
judy lib A C library that implements a dynamic array.
julia lang Julia is a high-level, high-performance dynamic programming language for numerical computing
jupyterlab tools JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality.
kahip math The graph partitioning framework KaHIP -- Karlsruhe High Quality Partitioning.
kallisto bio Kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
kent-tools bio A set of genome utilities developed at the University of California Santa Cruz.
kim-api chem Open Knowledgebase of Interatomic Models. KIM is an API and OpenKIM is a collection of interatomic models (potentials) for atomistic simulations. This is a library that can be used by simulation programs to get access to the models in the OpenKIM database. This EasyBuild only installs the API; the models can be installed with the package openkim-models, or the user can install them manually by running kim-api-collections-management install user MODELNAME for a single model, or kim-api-collections-management install user OpenKIM to install them all.
knime data KNIME is an analytics platform for data mining.
kraken2 bio Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.
kubectl tools The Kubernetes command-line tool, kubectl, allows you to run commands against Kubernetes clusters. You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs.
lame data LAME is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.
lammps chem LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS is a classical molecular dynamics simulation code designed to run efficiently on parallel computers.
lapack numlib LAPACK is written in Fortran90 and provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems.
leda lib LEDA is a C++ class library for efficient data types and algorithms that provide algorithmic in-depth knowledge of graph and network problems, geometric computations, combinatorial optimization and others.
leptonica vis Leptonica is a collection of pedagogically-oriented open source software that is broadly useful for image processing and image analysis applications.
lerc lib LERC is an open-source image or raster format which supports rapid encoding and decoding for any pixel type (not just RGB or Byte). Users set the maximum compression error per pixel while encoding, so the precision of the original input image is preserved (within user defined error bounds).
lftp tools lftp is a sophisticated file transfer program supporting a number of network protocols.
lhapdf phys Les Houches Parton Density Function LHAPDF is the standard tool for evaluating parton distribution functions (PDFs) in high-energy physics.
libaec lib Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). The library achieves best results for low entropy data as often encountered in space imaging instrument data or numerical model output from weather or climate simulations. While floating point representations are not directly supported, they can also be efficiently coded by grouping exponents and mantissa.
libaio lib Asynchronous input/output library that uses the kernel's native interface.
libaom lib
libarchive tools Multi-format archive and compression library
libcerf math libcerf is a self-contained numeric library that provides an efficient and accurate implementation of complex error functions, along with Dawson, Faddeeva, and Voigt functions.
libcifpp bio This library contains code to work with mmCIF and PDB files
libclc lib libclc is an open source, BSD/MIT dual licensed implementation of the library requirements of the OpenCL C programming language, as specified by the OpenCL 1.1 Specification.
libcroco lib Libcroco is a standalone css2 parsing and manipulation library.
libde265 tools libde265 is an open source implementation of the h.265 video codec
libdeflate system Heavily optimized library for DEFLATE/zlib/gzip compression and decompression.
libdrm lib Direct Rendering Manager runtime library.
libepoxy lib Epoxy is a library for handling OpenGL function pointer management for you
libevent lib The libevent API provides a mechanism to execute a callback function when a specific event occurs on a file descriptor or after a timeout has been reached. Furthermore, libevent also supports callbacks due to signals or regular timeouts.
libfabric lib Libfabric is a core component of OFI. It is the library that defines and exports the user-space API of OFI, and is typically the only software that applications deal with directly. It works in conjunction with provider libraries, which are often integrated directly into libfabric.
libffi lib The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run-time.
libgd lib GD is an open source code library for the dynamic creation of images by programmers.
libgeotiff lib Library for reading and writing coordinate system information from/to GeoTIFF files
libgit2 devel libgit2 is a portable, pure C implementation of the Git core methods provided as a re-entrant linkable library with a solid API, allowing you to write native speed custom Git applications in any language which supports C bindings.
libglu vis The OpenGL Utility Library (GLU) is a computer graphics library for OpenGL.
libglvnd lib libglvnd is a vendor-neutral dispatch layer for arbitrating OpenGL API calls between multiple vendors.
libgsf lib libgsf -- The G Structured File Library aims to provide an efficient extensible i/o abstraction for dealing with different structured file formats.
libgtextutils bio libgtextutils is a dependency of fastx-toolkit and is provided via the same upstream.
libharu lib libHaru is a free, cross platform, open source library for generating PDF files.
libibmad system libibmad is a convenience library to encode, decode, and dump IB MAD packets. It is implemented on top of and in conjunction with libibumad (the umad kernel interface library.)
libibumad system libibumad is the umad kernel interface library.
libiconv lib Libiconv converts from one character encoding to another through Unicode conversion
libint chem Libint library is used to evaluate the traditional (electron repulsion) and certain novel two-body matrix elements (integrals) over Cartesian Gaussian functions used in modern atomic and molecular theory.
libjpeg-turbo lib libjpeg-turbo is a fork of the original IJG libjpeg which uses SIMD to accelerate baseline JPEG compression and decompression. libjpeg is a library that implements JPEG image encoding, decoding and transcoding.
libmatheval lib GNU libmatheval is a library (callable from C and Fortran) to parse and evaluate symbolic expressions input as text.
libmcfp bio A library for parsing command line arguments and configuration files and making them available throughout a program.
libogg lib Ogg is a multimedia container format, and the native file and stream format for the Xiph.org multimedia codecs.
libopus lib Opus is a totally open, royalty-free, highly versatile audio codec. Opus is unmatched for interactive speech and music transmission over the Internet, but is also intended for storage and streaming applications. It is standardized by the Internet Engineering Task Force (IETF) as RFC 6716 which incorporated technology from Skype’s SILK codec and Xiph.Org’s CELT codec.
libpciaccess system Generic PCI access library.
libpng lib libpng is the official PNG reference library
libpsl lib C library for the Public Suffix List
libreadline lib The GNU Readline library provides a set of functions for use by applications that allow users to edit command lines as they are typed in. Both Emacs and vi editing modes are available. The Readline library includes additional functions to maintain a list of previously-entered command lines, to recall and perhaps reedit those lines, and perform csh-like history expansion on previous commands.
librmath lib The routines supporting the distribution and special functions in R and a few others are declared in C header file Rmath.h. These can be compiled into a standalone library for linking to other applications.
librttopo lib The RT Topology Library exposes an API to create and manage standard (ISO 13249 aka SQL/MM) topologies using user-provided data stores.
libspatialite lib SpatiaLite is an open source library intended to extend the SQLite core to support fully fledged Spatial SQL capabilities.
libtiff lib tiff: Library and tools for reading and writing TIFF data files
libtirpc lib Libtirpc is a port of Sun's Transport-Independent RPC library to Linux.
libtommath lib LibTomMath is a free open source portable number theoretic multiple-precision integer (MPI) library written entirely in C.
libtool lib GNU libtool is a generic library support script. Libtool hides the complexity of using shared libraries behind a consistent, portable interface.
libtorch data A binary distribution of all headers, libraries and CMake configuration files required to depend on PyTorch.
libunwind lib The primary goal of libunwind is to define a portable and efficient C programming interface (API) to determine the call-chain of a program. The API additionally provides the means to manipulate the preserved (callee-saved) state of each call-frame and to resume execution at any point in the call-chain (non-local goto). The API supports both local (same-process) and remote (across-process) operation. As such, the API is useful in a number of applications.
libvorbis lib Ogg Vorbis is a fully open, non-proprietary, patent-and-royalty-free, general-purpose compressed audio format
libwebp lib WebP is a modern image format that provides superior lossless and lossy compression for images on the web. Using WebP, webmasters and web developers can create smaller, richer images that make the web faster.
libxc chem Libxc is a library of exchange-correlation functionals for density-functional theory. The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals.
libxml++ lib libxml++ is a C++ wrapper for the libxml XML parser library.
libxml2 lib Libxml2 is the XML C parser and toolchain developed for the GNOME project (but usable outside of the GNOME platform).
libxslt lib Libxslt is the XSLT C library developed for the GNOME project (but usable outside of the GNOME platform).
libxsmm math LIBXSMM is a library for small dense and small sparse matrix-matrix multiplications targeting Intel Architecture (x86).
libyaml lib LibYAML is a YAML parser and emitter written in C.
lit tools lit is a portable tool for executing LLVM and Clang style test suites, summarizing their results, and providing indication of failures.
littlecms vis Little CMS intends to be an OPEN SOURCE small-footprint color management engine, with special focus on accuracy and performance.
llama-cpp-python tools LLM inference in C/C++
llama.cpp tools Inference of Meta's LLaMA model (and others) in pure C/C++
llvm compiler The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs (as well as some less common ones!). These libraries are built around a well-specified code representation known as the LLVM intermediate representation ("LLVM IR"). The LLVM Core libraries are well documented, and it is particularly easy to invent your own language (or port an existing compiler) to use LLVM as an optimizer and code generator.
lmdb lib LMDB is a fast, memory-efficient database. With memory-mapped files, it has the read performance of a pure in-memory database while retaining the persistence of standard disk-based databases.
longranger bio Long Ranger is a set of analysis pipelines that processes Chromium sequencing output to align reads and call and phase SNPs, indels, and structural variants.
lpsolve math Mixed Integer Linear Programming (MILP) solver
lua lang Lua is a powerful, fast, lightweight, embeddable scripting language. Lua combines simple procedural syntax with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, runs by interpreting bytecode for a register-based virtual machine, and has automatic memory management with incremental garbage collection, making it ideal for configuration, scripting, and rapid prototyping.
lz4 lib LZ4 is a lossless compression algorithm, providing compression speed at 400 MB/s per core. It features an extremely fast decoder, with speed in multiple GB/s per core.
lzo devel Portable lossless data compression library
macs2 bio With the improvement of sequencing techniques, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) is becoming popular for studying genome-wide protein-DNA interactions. To address the lack of powerful ChIP-Seq analysis methods, we presented the Model-based Analysis of ChIP-Seq (MACS) for identifying transcription factor binding sites. MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and improves the spatial resolution of binding sites by combining the information of both sequencing tag position and orientation.
macs3 bio With the improvement of sequencing techniques, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) is becoming popular for studying genome-wide protein-DNA interactions. To address the lack of powerful ChIP-Seq analysis methods, we presented the Model-based Analysis of ChIP-Seq (MACS) for identifying transcription factor binding sites. MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and improves the spatial resolution of binding sites by combining the information of both sequencing tag position and orientation.
maestro bio MAESTRO (Model-based AnalysEs of Single-cell Transcriptome and RegulOme) is a comprehensive single-cell RNA-seq and ATAC-seq analysis suite built using Snakemake. MAESTRO combines several dozen tools and packages to create an integrative pipeline, which enables scRNA-seq and scATAC-seq analysis from raw sequencing data (fastq files) all the way through alignment, quality control, cell filtering, normalization, unsupervised clustering, differential expression and peak calling, cell type annotation and transcription regulation analysis.
mafft bio MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences), FFT-NS-2 (fast; for alignment of <∼30,000 sequences), etc.
make devel GNU version of make utility
makeinfo devel makeinfo is part of the Texinfo project, the official documentation format of the GNU project.
mako devel A super-fast templating language that borrows the best ideas from the existing templating languages
mariadb data MariaDB is an enhanced, drop-in replacement for MySQL. Included engines: MyISAM, Aria, InnoDB, RocksDB, TokuDB, OQGraph, Mroonga.
mathematica
matlab
matlab-proxy tools A Python package which enables you to launch MATLAB and access it from a web browser.
matplotlib vis matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in Python scripts, the Python and IPython shells, web application servers, and six graphical user interface toolkits.
maturin tools This project is meant as a zero-configuration replacement for setuptools-rust and milksnake. It supports building wheels for Python 3.5+ on Windows, Linux, macOS and FreeBSD, can upload them to PyPI and has basic PyPy and GraalPy support.
maven devel Binary Maven install. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.
maxquant bio MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data. Several labeling techniques as well as label-free quantification are supported.
mayavi vis A tool for easy and interactive visualization of data.
mcr math The MATLAB Runtime is a standalone set of shared libraries that enables the execution of compiled MATLAB applications or components on computers that do not have MATLAB installed.
meme bio The MEME Suite allows you to: discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences; search sequence databases with motifs using MAST, FIMO, MCAST or GLAM2SCAN; compare a motif to all motifs in a database of motifs; associate motifs with Gene Ontology terms via their putative target genes; and analyse motif enrichment using SpaMo or CentriMo.
mesa vis Mesa is an open-source implementation of the OpenGL specification - a system for rendering interactive 3D graphics.
meson tools Meson is a cross-platform build system designed to be both as fast and as user friendly as possible.
meson-python tools Python build backend (PEP 517) for Meson projects
metis math METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices. The algorithms implemented in METIS are based on the multilevel recursive-bisection, multilevel k-way, and multi-constraint partitioning schemes.
miniforge lang Miniforge is a free minimal installer for conda and Mamba specific to conda-forge.
minimap2 bio Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome, optionally with detailed alignment (i.e. CIGAR). At present, it works efficiently with query sequences from a few kilobases to ~100 megabases in length at an error rate of ~15%. Minimap2 outputs in the PAF or the SAM format. On limited test data sets, minimap2 is over 20 times faster than most other long-read aligners. It will replace BWA-MEM for long reads and contig alignment.
minizip lib Mini zip and unzip based on zlib
mm-common devel The mm-common module provides the build infrastructure and utilities shared among the GNOME C++ binding libraries.
mongosh tools The MongoDB Shell, mongosh, is a fully functional JavaScript and Node.js 14.x REPL environment for interacting with MongoDB deployments. You can use the MongoDB Shell to test queries and operations directly with your database.
mothur bio Mothur is a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
mpc math Gnu Mpc is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result. It extends the principles of the IEEE-754 standard for fixed precision real floating point numbers to complex numbers, providing well-defined semantics for every operation. At the same time, speed of operation at high precision is a major design goal.
mpfr math The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.
mpi4py lib MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.
mrc tools Maartens Resource Compiler to store resources in the static section of an ELF binary.
mrtrix3 bio MRtrix3 provides a set of tools to perform various types of diffusion MRI analyses, from various forms of tractography through to next-generation group-level analyses. It is designed with consistency, performance, and stability in mind, and is freely available under an open-source license. It is developed and maintained by a team of experts in the field, fostering an active community of users from diverse backgrounds.
mrtrix3tissue bio MRtrix3Tissue is a fork of the MRtrix3 project. It aims to add capabilities for 3-Tissue CSD modelling and analysis to a complete version of the MRtrix3 software.
multiqc bio MultiQC searches a given directory for analysis logs and compiles an HTML report. It's a general-use tool, perfect for summarising the output from numerous bioinformatics tools.
mumax3 tools GPU accelerated micromagnetic simulator.
mummer bio MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. AMOS makes use of it.
mumps math A parallel sparse direct solver
muscle bio MUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests, with accuracy and speed that are consistently better than CLUSTALW. MUSCLE can align hundreds of sequences in seconds. Most users learn everything they need to know about MUSCLE in a few minutes: only a handful of command-line options are needed to perform common alignment tasks.
mvapich2 mpi The MVAPICH2 software, based on MPI 3.1 standard, delivers the best performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE networking technologies.
mysqlclient lib Python interface to MySQL
namd chem NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
nasm lang NASM: General-purpose x86 assembler
ncbi-vdb bio The SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives.
nccl lib The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance optimized for NVIDIA GPUs.
nccl-tests tools Tests check both the performance and the correctness of NCCL operations.
ncdu tools Ncdu is a disk usage analyzer with an ncurses interface. It is designed to find space hogs on a remote server where you don't have an entire graphical setup available, but it is a useful tool even on regular desktop systems. Ncdu aims to be fast, simple and easy to use, and should be able to run in any minimal POSIX-like environment with ncurses installed.
ncurses devel The Ncurses (new curses) library is a free software emulation of curses in System V Release 4.0, and more. It uses Terminfo format, supports pads and color and multiple highlights and forms characters and function-key mapping, and has all the other SYSV-curses enhancements over BSD Curses.
ncview vis Ncview is a visual browser for netCDF format files. Typically you would use ncview to get a quick and easy, push-button look at your netCDF files. You can view simple movies of the data, view along various dimensions, take a look at the actual data values, change color maps, invert the data, etc.
netcdf data NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This module bundles the C++ and Fortran libraries.
netcdf-c data NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
netcdf-cxx data NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
netcdf-fortran data NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
netlogo math NetLogo is a multi-agent programmable modeling environment. It is used by tens of thousands of students, teachers and researchers worldwide. It also powers HubNet participatory simulations. It is authored by Uri Wilensky and developed at the CCL.
nettle lib Nettle is a cryptographic library that is designed to fit easily in more or less any context: In crypto toolkits for object-oriented languages (C++, Python, Pike, ...), in applications like LSH or GNUPG, or even in kernel space.
neuron bio Empirically-based simulations of neurons and networks of neurons.
nextflow tools Nextflow is a reactive workflow framework and a programming DSL that eases writing computational pipelines with complex data
ngsf bio ngsF is a program to estimate per-individual inbreeding coefficients under a probabilistic framework that takes the uncertainty of genotype's assignation into account. It avoids calling genotypes by using genotype likelihoods or posterior probabilities.
ninja tools Ninja is a small build system with a focus on speed.
nlohmann_json lib JSON for Modern C++
nlopt numlib NLopt is a free/open-source library for nonlinear optimization, providing a common interface for a number of different free optimization routines available online as well as original implementations of various other algorithms.
nodejs lang Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.
nspr lib Netscape Portable Runtime (NSPR) provides a platform-neutral API for system level and libc-like functions.
nss lib Network Security Services (NSS) is a set of libraries designed to support cross-platform development of security-enabled client and server applications.
ntl math NTL is a high-performance, portable C++ library providing data structures and algorithms for manipulating signed, arbitrary length integers, and for vectors, matrices, and polynomials over the integers and over finite fields.
numactl tools The numactl program allows you to run your application program on specific CPUs and memory nodes. It does this by supplying a NUMA memory policy to the operating system before running your program. The libnuma library provides convenient ways for you to add NUMA memory policies into your own program.
nvhpc toolchain Complete toolchain based on NVIDIA HPC SDK. Includes C, C++ and FORTRAN compilers (nvidia-compilers), an MPI implementation based on OpenMPI (NVHPCX) and math libraries based on OpenBLAS and ScaLAPACK.
nvidia-compilers compiler C, C++ and Fortran compilers included with the NVIDIA HPC SDK
nvompi toolchain NVHPC Compiler including OpenMPI for MPI support.
nvshmem devel NVSHMEM is a parallel programming interface based on OpenSHMEM that provides efficient and scalable communication for NVIDIA GPU clusters. NVSHMEM creates a global address space for data that spans the memory of multiple GPUs and can be accessed with fine-grained GPU-initiated operations, CPU-initiated operations, and operations on CUDA streams.
nvtop tools htop-like GPU usage monitor
ocaml compiler OCaml is an industrial-strength programming language supporting functional, imperative and object-oriented styles
ollama data Ollama is the easiest way to get up and running with large language models such as gpt-oss, Gemma 3, Qwen3 and more.
ollama-python tools The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.
openbabel chem Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It's an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.
openblas numlib OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
openexr vis OpenEXR is a high dynamic-range (HDR) image file format developed by Industrial Light & Magic for use in computer imaging applications
openfoam cae OpenFOAM is a free, open source CFD software package. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.
opengl vis Open Graphics Library (OpenGL) is a cross-language, cross-platform application programming interface (API) for rendering 2D and 3D vector graphics. This module is a bundle of software required for OpenGL rendering. It provides Mesa as an open-source implementation of the OpenGL specification with software rendering and AMD GPU support, libglvnd for a vendor-neutral dispatch layer for rendering with both NVIDIA GPUs & Mesa, Mesa-demos for sample applications, and GLU as a computer graphics library utilizing OpenGL.
openjpeg lib OpenJPEG is an open-source JPEG 2000 codec written in C.
openmpi mpi The Open MPI Project is an open source MPI-3 implementation.
openms bio OpenMS is an open-source software C++ library for LC-MS data management and analyses. It offers an infrastructure for rapid development of mass spectrometry related software.
opensim data OpenSim is software that lets users develop models of musculoskeletal structures and create dynamic simulations of movement
openslide vis OpenSlide is a C library that provides a simple interface to read whole-slide images (also known as virtual slides).
openslide-python vis Python bindings for the OpenSlide library
openssl system The OpenSSL Project is a collaborative effort to develop a robust, commercial-grade, full-featured, and Open Source toolchain implementing the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security (TLS v1) protocols as well as a full-strength general purpose cryptography library.
optix vis OptiX is NVIDIA's SDK for easy ray tracing performance. It provides a simple framework for accessing the GPU’s massive ray tracing power using state-of-the-art GPU algorithms.
orca chem ORCA is a flexible, efficient and easy-to-use general purpose tool for quantum chemistry with specific emphasis on spectroscopic properties of open-shell molecules. It features a wide variety of standard quantum chemical methods ranging from semiempirical methods to DFT to single- and multireference correlated ab initio methods. It can also treat environmental and relativistic effects.
ospray vis Open, Scalable, and Portable Ray Tracing Engine
osu-micro-benchmarks perf OSU Micro-Benchmarks
p4vasp chem p4vasp is a visualization suite for the Vienna Ab initio Simulation Package (VASP).
pango vis Pango is a library for laying out and rendering of text, with an emphasis on internationalization. Pango can be used anywhere that text layout is needed, though most of the work on Pango so far has been done in the context of the GTK+ widget toolkit. Pango forms the core of text and font handling for GTK+-2.x.
parallel tools parallel: Build and execute shell commands in parallel
paraview vis ParaView is a scientific parallel visualizer.
paraview-catalyst vis ParaView Catalyst provides a small, easy-to-use, API that any simulation developed in C++, C, Fortran or Python can use to do in situ analysis without developing its own custom data analysis code.
parmetis math ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices. ParMETIS extends the functionality provided by METIS and includes routines that are especially suited for parallel AMR computations and large scale numerical simulations. The algorithms implemented in ParMETIS are based on the parallel multilevel k-way graph-partitioning, adaptive repartitioning, and parallel multi-constrained partitioning schemes.
pasapipeline bio PASA, acronym for Program to Assemble Spliced Alignments, is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.
pbwt bio The pbwt package provides a core implementation and development environment for PBWT (Positional Burrows-Wheeler Transform) methods for storing and computing on genome variation data sets.
pcre devel The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5.
pcre2 devel The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5.
pdal tools PDAL is Point Data Abstraction Library. It is a C/C++ open source library and applications for translating and processing point cloud data. It is not limited to LiDAR data, although the focus and impetus for many of the tools in the library have their origins in LiDAR.
perceptual_tsne data Code to make perceptual embedding plots
perf tools Performance analysis tools for Linux
perl lang Larry Wall's Practical Extraction and Report Language. Includes a small selection of extra CPAN packages for core functionality.
petsc numlib PETSc, pronounced PET-see (the S is silent), is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.
phonopy chem Phonopy is an open source package of phonon calculations based on the supercell approach. Phono3py calculates phonon-phonon interaction and related properties using the supercell approach.
picard bio A set of tools (in Java) for working with next generation sequencing data in the BAM format.
pigz tools pigz, which stands for parallel implementation of gzip, is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data. pigz was written by Mark Adler, and uses the zlib and pthread libraries.
pillow vis Pillow is the 'friendly PIL fork' by Alex Clark and Contributors. PIL is the Python Imaging Library by Fredrik Lundh and Contributors.
pipenv tools Pipenv is a tool that aims to bring the best of all packaging worlds (bundler, composer, npm, cargo, yarn, etc.) to the Python world.
pixman vis Pixman is a low-level software library for pixel manipulation, providing features such as image compositing and trapezoid rasterization. Important users of pixman are the cairo graphics library and the X server.
pkg-config devel pkg-config is a helper tool used when compiling applications and libraries. It helps you insert the correct compiler options on the command line so an application can use gcc -o test test.c `pkg-config --libs --cflags glib-2.0` for instance, rather than hard-coding values on where to find glib (or other libraries).
pkgconf devel pkgconf is a program which helps to configure compiler and linker flags for development libraries. It is similar to pkg-config from freedesktop.org.
plink bio PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
plumed chem PLUMED is an open source library for free energy calculations in molecular systems which works together with some of the most popular molecular dynamics engines. Free energy calculations can be performed as a function of many order parameters with a particular focus on biological problems, using state-of-the-art methods such as metadynamics, umbrella sampling and Jarzynski-equation based steered MD. The software, written in C++, can be easily interfaced with both Fortran and C/C++ codes.
pmix lib Process Management for Exascale Environments. PMI Exascale (PMIx) represents an attempt to provide an extended version of the PMI standard specifically designed to support clusters up to and including exascale sizes. The overall objective of the project is not to branch the existing pseudo-standard definitions (in fact, PMIx fully supports both of the existing PMI-1 and PMI-2 APIs) but rather to (a) augment and extend those APIs to eliminate some current restrictions that impact scalability, and (b) provide a reference implementation of the PMI-server that demonstrates the desired level of scalability.
pnetcdf data Parallel netCDF: A Parallel I/O Library for NetCDF File Access
poetry tools Python packaging and dependency management made easy. Poetry helps you declare, manage and install dependencies of Python projects, ensuring you have the right stack everywhere.
postgresql data PostgreSQL is a powerful, open source object-relational database system. It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages). It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR, VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video. It has native programming interfaces for C/C++, Java, .Net, Perl, Python, Ruby, Tcl, ODBC, among others, and exceptional documentation.
proj lib Program proj is a standard Unix filter function which converts geographic longitude and latitude coordinates into cartesian coordinates
proteowiz bio ProteoWizard provides a set of open-source, cross-platform software libraries and tools (e.g. msconvert, Skyline, IDPicker, SeeMS) that facilitate proteomics data analysis. The libraries enable rapid tool creation by providing a robust, pluggable development framework that simplifies and unifies data file access, and performs standard chemistry and LCMS dataset computations.
protobuf devel Protocol Buffers (a.k.a., protobuf) are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data.
protobuf-python devel Python Protocol Buffers runtime library.
prrte lib PRRTE is the PMIx Reference RunTime Environment
psutil lib A cross-platform process and system utilities module for Python
pybind11 lib pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.
pycairo vis Python bindings for the cairo library
pygobject vis PyGObject is a Python package which provides bindings for GObject based libraries such as GTK, GStreamer, WebKitGTK, GLib, GIO and many more.
pymol vis PyMOL is a user-sponsored molecular visualization system on an open-source foundation, maintained and distributed by Schrödinger.
pyspark devel PySpark is the Python API for Apache Spark. It enables you to perform real-time, large-scale data processing in a distributed environment using Python. It also provides a PySpark shell for interactively analyzing your data.
python lang Intel® Distribution for Python. Powered by Anaconda. Accelerating Python performance on modern architectures from Intel.
python-bundle-pypi lang Bundle of Python packages from PyPI
pytorch data PyTorch is a deep learning framework that puts Python first. It provides Tensors and Dynamic neural networks in Python with strong GPU acceleration.
pyyaml lib PyYAML is a YAML parser and emitter for the Python programming language.
qgis geo A Free and Open Source Geographic Information System
qhull math Qhull computes the convex hull, Delaunay triangulation, Voronoi diagram, halfspace intersection about a point, furthest-site Delaunay triangulation, and furthest-site Voronoi diagram. The source code runs in 2-d, 3-d, 4-d, and higher dimensions. Qhull implements the Quickhull algorithm for computing the convex hull.
qiime2 bio QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.
qt5 devel Qt is a comprehensive cross-platform C++ application framework.
qt6 devel Qt is a comprehensive cross-platform C++ application framework.
qtltools bio QTLtools is a tool set for molecular QTL discovery and analysis. It allows users to go from raw sequence data to a collection of molecular Quantitative Trait Loci (QTLs) in a few easy-to-perform steps.
qualimap bio Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
quantumespresso chem Quantum ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials (both norm-conserving and ultrasoft).
qwt lib The Qwt library contains GUI Components and utility classes which are primarily useful for programs with a technical background.
radmc3d astro RADMC-3D is a tool for astrophysical research. It computes the observational appearance of an astrophysical object on the sky of the observer. It solves the non-local radiative transfer problem of dusty media, including thermal radiative transport and scattering.
rapidsai data The RAPIDS suite of open source software libraries and APIs gives you the ability to execute end-to-end data science and analytics pipelines entirely on GPUs. Licensed under Apache 2.0, RAPIDS is incubated by NVIDIA based on extensive hardware and data science experience. RAPIDS utilizes NVIDIA CUDA primitives for low-level compute optimization, and exposes GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
raxml bio RAxML search algorithm for maximum likelihood based inference of phylogenetic trees.
raxml-ng bio RAxML Next Generation is a phylogenetic tree inference tool which uses the maximum-likelihood (ML) optimality criterion. Its search heuristic is based on iteratively performing a series of Subtree Pruning and Regrafting (SPR) moves, which allows it to quickly navigate to the best-known ML tree.
rclone tools Rclone is a command line program to sync files and directories to and from a variety of online storage services
rdp-classifier bio The RDP Classifier is a naive Bayesian classifier that can rapidly and accurately provide taxonomic assignments from domain to genus, with confidence estimates for each assignment.
re2c tools re2c is a free and open-source lexer generator for C and C++. Its main goal is generating fast lexers: at least as fast as their reasonably optimized hand-coded counterparts. Instead of using a traditional table-driven approach, re2c encodes the generated finite state automata directly in the form of conditional jumps and comparisons.
redis data Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
reframe devel ReFrame is a framework for writing regression tests for HPC systems.
relion bio RELION (for REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).
relion-env bio RELION (for REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).
repeatmasker bio RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.
rip-md bio RIP-MD applies Residue Interaction Networks (RINs) to the analysis of molecular dynamics simulations of proteins.
rmblast bio RMBlast is a RepeatMasker compatible version of the standard NCBI BLAST suite. The primary difference between this distribution and the NCBI distribution is the addition of a new program 'rmblastn' for use with RepeatMasker and RepeatModeler.
root data The ROOT system provides a set of OO frameworks with all the functionality needed to handle and analyze large amounts of data in a very efficient way.
rosetta bio The Rosetta software suite includes algorithms for computational modeling and analysis of protein structures. It has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.
rstudio-server lang RStudio is an integrated development environment (IDE) for the R programming language.
ruby lang Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.
rust lang Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.
sagemath data SageMath is a free open-source mathematics software system licensed under the GPL. It builds on top of many existing open-source packages: NumPy, SciPy, matplotlib, SymPy, Maxima, GAP, FLINT, R and many more. Access their combined power through a common, Python-based language or directly via interfaces or wrappers.
salmon bio Salmon is a wicked-fast program to produce a highly-accurate, transcript-level quantification estimate from RNA-seq data.
sambamba bio Sambamba is a tool for processing BAM files.
samtools bio SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
sas math Statistical analysis package
sbt lang A build tool for Scala.
scalapack numlib The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers.
scikit-build lib Scikit-Build, or skbuild, is an improved build system generator for CPython C/C++/Fortran/Cython extensions.
scikit-build-core lib Scikit-build-core is a complete ground-up rewrite of scikit-build on top of modern packaging APIs. It provides a bridge between CMake and the Python build system, allowing you to make Python modules with CMake.
scipy-bundle lang Bundle of Python packages for scientific software
scons devel SCons is a software construction tool.
scotch math Software package and libraries for sequential and parallel graph partitioning, static mapping, and sparse matrix block ordering, and sequential mesh and hypergraph partitioning.
sdl2 lib SDL: Simple DirectMedia Layer, a cross-platform multimedia library
seqkit bio A cross-platform and ultrafast toolkit for FASTA/Q file manipulation
seqoutbias bio Molecular biology enzymes have nucleic acid preferences for their substrates; the preference of an enzyme is typically dictated by the sequence at or near the active site of the enzyme. This bias may result in spurious read count patterns when used to interpret high-resolution molecular genomics data. The seqOutBias program aims to correct this issue by scaling the aligned read counts by the ratio of genome-wide observed read counts to the expected sequence based counts for each k-mer.
setuptools-rust tools setuptools-rust is a plugin for setuptools to build Rust Python extensions implemented with PyO3 or rust-cpython.
shapeit4 bio SHAPEIT4 is a fast and accurate method for estimation of haplotypes (aka phasing) for SNP array and high coverage sequencing data. Version 4 is a refactored and improved version of the SHAPEIT algorithm.
shapelib lib The Shapefile C Library provides the ability to write simple C programs for reading, writing and updating (to a limited extent) ESRI Shapefiles, and the associated attribute file (.dbf).
shared-mime-info tools The shared-mime-info package contains the core database of common MIME types, their file extensions and icon names; the update-mime-database command, used to extend the database and install new MIME data; and the freedesktop.org shared MIME database spec. It is used by GLib, GNOME, KDE, XFCE and many others.
shengbte chem ShengBTE is a software package for solving the Boltzmann Transport Equation for phonons.
siesta phys SIESTA is both a method and its computer program implementation, to perform efficient electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.
silo data Silo is a library for reading and writing a wide variety of scientific data to binary, disk files
slepc numlib SLEPc (Scalable Library for Eigenvalue Problem Computations) is a software library for the solution of large scale sparse eigenvalue problems on parallel computers. It is an extension of PETSc and can be used for either standard or generalized eigenproblems, with real or complex arithmetic. It can also be used for computing a partial SVD of a large, sparse, rectangular matrix, and to solve quadratic eigenvalue problems.
slicer tools 3D Slicer is an open source software platform for medical image informatics, image processing, and three-dimensional visualization.
slim bio SLiM is an evolutionary simulation package that provides facilities for very easily and quickly constructing genetically explicit individual-based evolutionary models.
smrtlink bio PacBio’s open-source SMRT Analysis software suite is designed for use with Single Molecule, Real-Time (SMRT) Sequencing data. You can analyze, visualize, and manage your data through an intuitive GUI or command-line interface. You can also integrate SMRT Analysis in your existing data workflow through the extensive set of APIs provided
snakemake tools The Snakemake workflow management system is a tool to create reproducible and scalable data analyses.
snap-stanford vis Snap.py is a Python interface for SNAP. SNAP is a general purpose, high performance system for analysis and manipulation of large networks. SNAP is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges.
snappy lib Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression.
soci lang SOCI is a database access library for C++ that creates the illusion of embedding SQL queries in regular C++ code, staying entirely within Standard C++.
sortmerna bio SortMeRNA is a biological sequence analysis tool for filtering, mapping and OTU-picking NGS reads.
spaceranger bio A set of analysis pipelines that perform sample demultiplexing, barcode processing, and single cell 3' gene counting.
spades bio SPAdes - St. Petersburg genome assembler - is an assembly toolkit containing various assembly pipelines.
spark devel Spark is Hadoop MapReduce done in memory
spglib lib Spglib is a library for finding and handling crystal symmetries written in C.
spin devel Developer tool for scientific Python libraries
sprng lib Scalable Parallel Pseudo Random Number Generators Library
sqlite devel SQLite: SQL Database Engine in a C Library
sra-toolkit bio The SRA Toolkit, and the source-code SRA System Development Kit (SDK), will allow you to programmatically access data housed within SRA and convert it from the SRA format
sratoolkit bio The SRA Toolkit, and the source-code SRA System Development Kit (SDK), will allow you to programmatically access data housed within SRA and convert it from the SRA format
stack devel Stack is a cross-platform program for developing Haskell projects. It is intended for Haskellers both new and experienced.
stacks bio Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.
star bio STAR aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
stata math Stata is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics.
stringtie bio StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts.
suitesparse numlib SuiteSparse is a collection of libraries to manipulate sparse matrices.
sumo data "Simulation of Urban MObility" (SUMO) is an open source, highly portable, microscopic and continuous traffic simulation package designed to handle large networks. It allows for intermodal simulation including pedestrians and comes with a large set of tools for scenario creation.
sundials math SUNDIALS: SUite of Nonlinear and DIfferential/ALgebraic Equation Solvers
superlu numlib SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations.
superlu_dist numlib SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines.
superlu_mt numlib SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations.
svt-av1 tools The Scalable Video Technology for AV1 (SVT-AV1 Encoder) is an AV1-compliant software encoder library. The work on the SVT-AV1 encoder targets the development of a production-quality AV1-encoder with performance levels applicable to a wide range of applications, from premium VOD to real-time and live encoding/transcoding.
swig devel SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages.
szip tools Szip compression software, providing lossless compression of scientific data
tbb lib Intel(R) Threading Building Blocks (Intel(R) TBB) lets you easily write parallel C++ programs that take full advantage of multicore performance, that are portable, composable and have future-proof scalability.
tcl lang Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more.
tensorflow data TensorFlow is an open-source software library for Machine Intelligence.
tesseract vis Tesseract is an optical character recognition engine
texinfo devel Texinfo is the official documentation format of the GNU project.
texlive tools TeX Live is intended to be a straightforward way to get up and running with the TeX document production system. It provides a comprehensive TeX system with binaries for most flavors of Unix, including GNU/Linux, macOS, and also Windows. It includes all the major TeX-related programs, macro packages, and fonts that are free software, including support for many languages around the world.
thirdorder chem A Python script to help create input files for computing anharmonic interatomic force constants, harnessing the symmetries of the system to minimize the number of required DFT calculations. A second mode of operation allows the user to build the third-order IFC matrix from the results of those runs.
tk vis Tk is an open source, cross-platform widget toolkit that provides a library of basic elements for building a graphical user interface (GUI) in many different programming languages.
tkinter lang Tkinter module, built with the Python buildsystem
tmux tools tmux is a terminal multiplexer: it enables a number of terminals to be created, accessed, and controlled from a single screen. tmux may be detached from a screen and continue running in the background, then later reattached.
togl vis Togl is a Tk widget for OpenGL rendering
totalview debugger TotalView is a GUI-based source code defect analysis tool that gives you unprecedented control over processes and thread execution and visibility into program state and variables. It allows you to debug one or many processes and/or threads in a single window with complete control over program execution. This allows you to set breakpoints, step line by line through the code on a single thread or with coordinated groups of processes or threads, and run or halt arbitrary sets of processes or threads. You can reproduce and troubleshoot difficult problems that can occur in concurrent programs that take advantage of threads, OpenMP, MPI, GPUs or coprocessors.
tree tools Tree is a recursive directory listing command that produces a depth indented listing of files, which is colorized à la dircolors if the LS_COLORS environment variable is set and output is to tty.
trf bio Tandem Repeats Finder: a program to analyze DNA sequences.
trimgalore bio Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data.
trimmomatic bio Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-end data.
trinity bio Trinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-Seq reads.
ucc lib UCC (Unified Collective Communication) is a collective communication operations API and library that is flexible, complete, and feature-rich for current and emerging programming models and runtimes.
ucc-cuda lib UCC (Unified Collective Communication) is a collective communication operations API and library that is flexible, complete, and feature-rich for current and emerging programming models and runtimes. This module adds the UCC CUDA support.
ucx lib Unified Communication X (UCX) is an open-source, production-grade communication framework for data-centric and high-performance applications.
ucx-cuda lib Unified Communication X (UCX) is an open-source, production-grade communication framework for data-centric and high-performance applications. This module adds the UCX CUDA support.
udunits tools UDUNITS supports conversion of unit specifications between formatted and binary forms, arithmetic manipulation of units, and conversion of values between compatible scales of measurement.
unrar tools RAR is a powerful archive manager.
util-linux tools Set of Linux utilities
uv tools An extremely fast Python package installer and resolver, written in Rust.
vapor vis VAPOR is the Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Researchers. VAPOR provides an interactive 3D visualization environment that can also produce animations and still frame images
varscan bio VarScan - Variant calling and somatic mutation/CNV detection for next-generation sequencing data
vasp chem The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling.
vcell bio VCell (Virtual Cell) is a comprehensive platform for modeling cell biological systems that is built on a central database and disseminated as a web application.
vcftools bio The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
vesta vis VESTA is a 3D visualization program for structural models, volumetric data such as electron/nuclear densities, and crystal morphologies.
viennarna bio The Vienna RNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures.
virtualenv tools A tool for creating isolated virtual python environments.
vmd chem VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.
voro++ math Voro++ is a software library for carrying out three-dimensional computations of the Voronoi tessellation. A distinguishing feature of the Voro++ library is that it carries out cell-based calculations, computing the Voronoi cell for each particle individually. It is particularly well-suited for applications that rely on cell-based statistics, where features of Voronoi cells (e.g. volume, centroid, number of faces) can be used to analyze a system of particles.
vsearch bio VSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
vtk vis The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization. VTK consists of a C++ class library and several interpreted interface layers including Tcl/Tk, Java, and Python. VTK supports a wide variety of visualization algorithms including: scalar, vector, tensor, texture, and volumetric methods; and advanced modeling techniques such as: implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.
wannier90 chem A tool for obtaining maximally-localised Wannier functions
wayland vis Wayland is a project to define a protocol for a compositor to talk to its clients as well as a library implementation of the protocol. The compositor can be a standalone display server running on Linux kernel modesetting and evdev input devices, an X application, or a wayland client itself. The clients can be traditional applications, X servers (rootless or fullscreen) or other display servers.
wigtobigwig bio The bigWig format is useful for dense, continuous data that will be displayed in the Genome Browser as a graph. BigWig files are created from wiggle (wig) type files using the program wigToBigWig.
wxpython vis Wraps the wxWidgets C++ toolkit and provides access to the user interface portions of the wxWidgets API, enabling Python applications to have a native GUI on Windows, Macs or Unix systems, with a native look and feel and requiring very little (if any) platform specific code.
wxwidgets vis wxWidgets is a C++ library that lets developers create applications for Windows, Mac OS X, Linux and other platforms with a single code base. It has popular language bindings for Python, Perl, Ruby and many other languages, and unlike other cross-platform toolkits, wxWidgets gives applications a truly native look and feel because it uses the platform's native API rather than emulating the GUI.
x11 vis The X Window System (X11) is a windowing system for bitmap displays
x264 vis x264 is a free software library and application for encoding video streams into the H.264/MPEG-4 AVC compression format, and is released under the terms of the GNU GPL.
x265 vis x265 is a free software library and application for encoding video streams into the H.265/MPEG-H HEVC compression format, and is released under the terms of the GNU GPL.
xalt lib XALT 2 is a tool to allow a site to track user executables and library usage on a cluster. When installed it can tell a site what are the top executables by Node-Hours or by the number of users or the number of times it is run. XALT 2 also tracks library usage. XALT 2 can also track package use by R, MATLAB or Python. It tracks both MPI and non-MPI programs.
xcrysden vis XCrySDen is a crystalline and molecular structure visualisation program aiming at display of isosurfaces and contours, which can be superimposed on crystalline structures and interactively rotated and manipulated.
xerces tools Xerces-C++ is a validating XML parser written in a portable subset of C++.
xerces-c++ lib Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data. A shared library is provided for parsing, generating, manipulating, and validating XML documents using the DOM, SAX, and SAX2 APIs.
xml-compile data Perl module for compilation based XML processing
xml-libxml data Perl binding for libxml2
xorg-macros devel X.org macros utilities.
xprop vis The xprop utility is for displaying window and font properties in an X server. One window or font is selected using the command line arguments or possibly in the case of a window, by clicking on the desired window. A list of properties is then given, possibly with formatting information.
xvfb vis Xvfb is an X server that can run on machines with no display hardware and no physical input devices. It emulates a dumb framebuffer using virtual memory.
xxd tools xxd is part of the VIM package and this will only install xxd, not vim! xxd converts to/from hexdumps of binary files.
xxdiff tools xxdiff is a graphical file and directories comparator and merge tool.
xxhash tools xxHash is an extremely fast non-cryptographic hash algorithm, working at RAM speed limit.
xz tools xz: XZ utilities
yambo phys YAMBO implements Many-Body Perturbation Theory (MBPT) methods (such as GW and BSE) and Time-Dependent Density Functional Theory (TDDFT), which allows for accurate prediction of fundamental properties such as band gaps of semiconductors, band alignments, defect quasi-particle energies, optics and out-of-equilibrium properties of materials.
yaml-cpp tools yaml-cpp is a YAML parser and emitter in C++ matching the YAML 1.2 spec
yasm lang Yasm: Complete rewrite of the NASM assembler with BSD license
z3 tools Z3 is a theorem prover from Microsoft Research.
zlib lib zlib is designed to be a free, general-purpose, legally unencumbered -- that is, not covered by any patents -- lossless data-compression library for use on virtually any computer hardware and operating system.
zstd lib Zstandard is a real-time compression algorithm, providing high compression ratios. It offers a very wide range of compression/speed trade-off, while being backed by a very fast decoder. It also offers a special mode for small data, called dictionary compression, and can create dictionaries from any sample set.