DIAMOND (biotechnology)

DIAMOND
Original author(s)	Benjamin J. Buchfink
Developer(s)	Benjamin J. Buchfink
Initial release	November 17, 2014; 11 years ago
Stable release	2.1.23 / 15 February 2026; 5 months ago
Repository	github.com/bbuchfink/diamond
Written in	C++
Operating system	UNIX, Linux, Mac, MS-Windows
Type	Bioinformatics tool
License	GNU General Public License version 3

In bioinformatics, DIAMOND^[1]^[2] is an algorithm and program for aligning protein and translated DNA sequences, designed as a fast alternative to NCBI BLAST. As of February 2026, it has been cited more than 17,000 times in the scientific literature^[3] and has been built into many pipelines for functional and genome annotation, phylogenetics and other applications^[4].

Background

Figure: Seal of the U.S. Defense Threat Reduction Agency (DTRA). DTRA awarded a $1 million prize for fast alignment software to help defend the U.S. armed forces against biothreats^[5].

German computer scientist Benjamin J. Buchfink began developing DIAMOND in late 2012. An early predecessor version, then called SASS ("Sequence Alignment using Spaced Seeds"), was part of the winning entry in the U.S. Defense Threat Reduction Agency's $1 Million Algorithm Challenge^[5]. At the time, scientists were spending 800,000 CPU hours on a supercomputer to run BLASTX on their metagenomic reads against the KEGG database^[6], which created a need for faster software.

The first major version of DIAMOND was published in November 2014^[1]. It focused on short read alignment and on alignment sensitivity above 50% sequence identity. It reported significant performance gains over state-of-the-art methods and quickly gained popularity across a wide range of applications.

The second major version of DIAMOND was published in April 2021^[2] and extended the tool towards full pairwise alignment sensitivity (on par with BLAST) down to 20% sequence identity, again reporting significant speedups.

Sequence clustering, a natural downstream application of alignment, was released as a feature in January 2023. In addition to a cascaded clustering workflow, the tool offers highly sensitive clustering that scales in linear time and can be run in parallel across many compute nodes.

Algorithm

Like all fast aligners, DIAMOND is based on the "seed-and-extend" concept^[7]: it finds short exact matches (seeds) between query and target (also called subject) sequences, then extends them into longer gapped alignments. Seeds are usually chosen as k-mers. DIAMOND uses spaced k-mers, possibly over a reduced alphabet.
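The idea of spaced seeds over a reduced alphabet can be sketched in a few lines of Python. This is only an illustration: the seed mask and the amino acid grouping below are hypothetical choices made for the example and are not taken from DIAMOND.

```python
# Illustrative sketch only: SEED_MASK and REDUCED_GROUPS are hypothetical
# and are not the seed shapes or the reduced alphabet used by DIAMOND.

# '1' marks positions that must match, '0' marks positions that are ignored.
SEED_MASK = "110101"

# Amino acids with broadly similar properties share one group letter.
REDUCED_GROUPS = ["KRQE", "DN", "ST", "AG", "ILVM", "FYW", "C", "H", "P"]
REDUCE = {aa: group[0] for group in REDUCED_GROUPS for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group letter ('X' for unknown residues)."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


def spaced_seeds(seq, mask=SEED_MASK):
    """Yield (offset, seed) for every window of the reduced sequence."""
    reduced = reduce_sequence(seq)
    keep = [i for i, bit in enumerate(mask) if bit == "1"]
    for start in range(len(reduced) - len(mask) + 1):
        window = reduced[start:start + len(mask)]
        yield start, "".join(window[i] for i in keep)
```

With these assumptions, the windows `KDSAIF` and `RNPGCY` produce the same seed, because they differ only at masked-out positions or by substitutions within a group. This shows how a spaced seed in a reduced alphabet can tolerate differences that an exact k-mer match would reject.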

DIAMOND is an acronym for "Double Indexed Alignment of NGS Data", referring to its concept of building an index for both query and target sequences^[1]. Most aligners build an index data structure for the targets (a database index), then look up query seeds in it one after another^[8]. BLAST indexes the queries and processes the database linearly^[9]. DIAMOND indexes both queries and targets, then evaluates the seed hits between them in seed-by-seed order. The main advantage of this approach is better use of CPU caches, because cached data is reused for all comparisons associated with a seed.
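A minimal Python sketch of the double-indexing idea follows. It makes several simplifying assumptions: it uses contiguous k-mers of a hypothetical length `K` in the full alphabet, and it holds both indexes as sorted in-memory lists. The function names are invented for this example and do not correspond to DIAMOND's implementation.

```python
K = 3  # hypothetical seed length; contiguous k-mers keep the example short


def seed_index(sequences):
    """Return (seed, sequence id, offset) entries sorted by seed."""
    entries = [
        (seq[i:i + K], seq_id, i)
        for seq_id, seq in sequences.items()
        for i in range(len(seq) - K + 1)
    ]
    entries.sort()
    return entries


def seed_hits(queries, targets):
    """Index both sides, then walk the two sorted indexes seed by seed."""
    q_index, t_index = seed_index(queries), seed_index(targets)
    qi = ti = 0
    hits = []
    while qi < len(q_index) and ti < len(t_index):
        q_seed, t_seed = q_index[qi][0], t_index[ti][0]
        if q_seed < t_seed:
            qi += 1
        elif q_seed > t_seed:
            ti += 1
        else:
            # Find the runs of entries that share this seed on both sides.
            q_end = qi
            while q_end < len(q_index) and q_index[q_end][0] == q_seed:
                q_end += 1
            t_end = ti
            while t_end < len(t_index) and t_index[t_end][0] == q_seed:
                t_end += 1
            # All comparisons for one seed are handled together.
            for _, q_id, q_pos in q_index[qi:q_end]:
                for _, t_id, t_pos in t_index[ti:t_end]:
                    hits.append((q_seed, q_id, q_pos, t_id, t_pos))
            qi, ti = q_end, t_end
    return hits
```

Because both indexes are sorted by seed, every query and target location that shares a seed is visited in one contiguous step. That grouping is what allows cached data to be reused, in contrast to a scheme that looks up each query seed independently.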

Seed hits are passed through several heuristic filter stages, implemented in highly optimized code, before gapped Smith–Waterman^[10] extension computes the final alignments.
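For orientation, the textbook form of Smith–Waterman local alignment can be written as a short Python function. This sketch assumes a simple match/mismatch score and a linear gap penalty, with parameter values chosen arbitrarily for the example. It is not the optimized extension used in DIAMOND, and protein aligners normally use a substitution matrix and affine gap penalties instead.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; local alignment never lets a cell drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

The full matrix costs time proportional to the product of the two sequence lengths. This is why seed-and-extend aligners apply it only to candidate regions that have already passed the seed and filter stages.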

References

[1] Buchfink, Benjamin J.; Xie, Chao; Huson, Daniel H. (2014). "Fast and sensitive protein alignment using DIAMOND". Nature Methods. 12 (1): 59–60. doi:10.1038/nmeth.3176. ISSN 1548-7105.
[2] Buchfink, Benjamin J.; Reuter, Klaus; Drost, Hajk-Georg (2021). "Sensitive protein alignments at tree-of-life scale using DIAMOND". Nature Methods. 18 (4): 366–368. doi:10.1038/s41592-021-01101-x. ISSN 1548-7105.
[3] "Benjamin J. Buchfink - Google Scholar".
[4] "Applications - bbuchfink/diamond Wiki".
[5] Defense Threat Reduction Agency (DTRA) (2013). "DTRA/SCC-WMD Announces $1 Million Algorithm Challenge Winner". www.prweb.com. Retrieved 2025-12-24.
[6] Jansson, Janet (2011). "Towards tera terra: Terabase sequencing of terrestrial metagenomics". Lawrence Berkeley National Laboratory.
[7] Lipman, DJ; Pearson, WR (1985). "Rapid and sensitive protein similarity searches". Science. 227 (4693): 1435–41. Bibcode:1985Sci...227.1435L. doi:10.1126/science.2983426. PMID 2983426.
[8] Langmead, Ben; Trapnell, Cole; Pop, Mihai; Salzberg, Steven L. (4 March 2009). "Ultrafast and memory-efficient alignment of short DNA sequences to the human genome". Genome Biology. 10 (3): R25. doi:10.1186/gb-2009-10-3-r25. PMC 2690996. PMID 19261174.
[9] Altschul, Stephen; Gish, Warren; Miller, Webb; Myers, Eugene; Lipman, David J. (1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–410. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712.
[10] Smith, Temple F.; Waterman, Michael S. (1981). "Identification of Common Molecular Subsequences". Journal of Molecular Biology. 147 (1): 195–197. CiteSeerX 10.1.1.63.2897. doi:10.1016/0022-2836(81)90087-5. PMID 7265238.

External links

Official website

This article "DIAMOND (biotechnology)" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:DIAMOND (biotechnology). Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
