
DIAMOND (biotechnology)

DIAMOND
Original author(s): Benjamin J. Buchfink
Developer(s): Benjamin J. Buchfink
Initial release: November 17, 2014
Stable release: 2.1.23 / 15 February 2026
Repository: github.com/bbuchfink/diamond
Written in: C++
Operating system: UNIX, Linux, Mac, MS-Windows
Type: Bioinformatics tool
License: GNU General Public License version 3


    In bioinformatics, DIAMOND[1][2] is an algorithm and program for sequence alignment of protein and translated DNA sequences, designed as a fast alternative to NCBI BLAST. As of February 2026, it has more than 17,000 citations in the scientific literature[3] and has been built into many pipelines for functional and genome annotation, phylogenetics, and other applications[4].[citation needed]

    Background

    DTRA awarded a $1 million prize to fast alignment software defending against biothreats to the U.S. armed forces[5]

    German computer scientist Benjamin J. Buchfink began developing DIAMOND in late 2012. An early predecessor version, then called SASS ("Sequence Alignment using Spaced Seeds"), was part of the winning entry in the U.S. Defense Threat Reduction Agency's $1 Million Algorithm Challenge[5]. At the time, scientists were spending 800,000 CPU hours on a supercomputer to run BLASTX on their metagenomic reads against the KEGG database[6], which created the need for faster software.

    The first major version of DIAMOND was published in November 2014[1] and focused on short read alignment and on alignment sensitivity above 50% sequence identity. The publication reported significant performance gains over state-of-the-art methods, and the tool quickly gained popularity across a wide range of applications.

    The second major version of DIAMOND was published in April 2021[2] and extended the tool towards full pairwise alignment sensitivity (on par with BLAST) down to 20% sequence identity. The publication again reported significant speedups.

    Sequence clustering, a natural downstream application of alignment, was released as a feature in January 2023. In addition to a cascaded clustering workflow, the tool offers clustering with linear-time scaling and high sensitivity that can be run in parallel across many compute nodes.

    Algorithm

    Like all fast aligners, DIAMOND is based on the "seed-and-extend" concept[7]: it finds short exact matches (seeds) between query and target (also called subject) sequences, then extends them into longer gapped alignments. Seeds are usually chosen as k-mers or, in the case of DIAMOND, spaced k-mers, possibly also in a reduced alphabet.
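
The seeding step can be illustrated with a minimal Python sketch. It is not DIAMOND's implementation: the amino acid grouping in `GROUPS`, the seed shape `"1101"`, and the function names are assumptions chosen for illustration. A `1` in the shape marks a position that must match, a `0` a position that is ignored.

```python
# Illustrative reduced alphabet (an assumption for this sketch): each group of
# similar amino acids is represented by its first letter.
GROUPS = ["KREDQN", "C", "G", "H", "ILV", "M", "F", "Y", "W", "P", "STA"]
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group representative; unknown residues become 'X'."""
    return "".join(REDUCE.get(aa, "X") for aa in seq)


def spaced_seeds(seq, shape):
    """Return (seed, position) pairs for every window of the reduced sequence.

    Only the positions marked '1' in `shape` contribute to the seed string.
    """
    reduced = reduce_sequence(seq)
    keep = [i for i, c in enumerate(shape) if c == "1"]
    return [
        ("".join(reduced[pos + i] for i in keep), pos)
        for pos in range(len(reduced) - len(shape) + 1)
    ]


if __name__ == "__main__":
    # Two sequences that differ at two residues still share all seeds,
    # because K/R and V/I fall into the same groups.
    print(spaced_seeds("MKVLA", "1101"))
    print(spaced_seeds("MRILA", "1101"))
```

In this sketch, both the reduced alphabet and the ignored seed position let related sequences produce identical seeds despite substitutions, which is the reason such seeds are used when sensitivity matters.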

    DIAMOND is an acronym for "Double Indexed Alignment of NGS Data", referring to its concept of building an index for both query and target sequences[1]. Most aligners build an index data structure for the targets (a database index), then look up query seeds in it one after another[8]. BLAST instead indexes the queries and processes the database linearly[9]. DIAMOND indexes both queries and targets, then evaluates seed hits between them in seed-by-seed order. The main advantage of this approach is better use of CPU caches, because cached data is reused for all comparisons associated with a seed.
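
The double-indexing idea can be sketched as a merge of two sorted seed lists. The following Python example is a simplified illustration under stated assumptions: it uses plain contiguous k-mers and Python lists, and the names `seed_index` and `double_indexed_hits` are hypothetical. The source does not describe DIAMOND's actual index layout.

```python
def seed_index(seqs, k):
    """Return a sorted list of (seed, sequence id, position) for all k-mers."""
    return sorted(
        (seq[i:i + k], sid, i)
        for sid, seq in seqs.items()
        for i in range(len(seq) - k + 1)
    )


def double_indexed_hits(queries, targets, k=3):
    """Walk both sorted indexes in parallel and emit hits seed by seed."""
    q = seed_index(queries, k)
    t = seed_index(targets, k)
    hits = []
    i = j = 0
    while i < len(q) and j < len(t):
        if q[i][0] < t[j][0]:
            i += 1
        elif q[i][0] > t[j][0]:
            j += 1
        else:
            seed = q[i][0]
            # Find the run of entries sharing this seed in each index.
            i_end = i
            while i_end < len(q) and q[i_end][0] == seed:
                i_end += 1
            j_end = j
            while j_end < len(t) and t[j_end][0] == seed:
                j_end += 1
            # All query/target pairs for one seed are compared together.
            for _, qid, qpos in q[i:i_end]:
                for _, tid, tpos in t[j:j_end]:
                    hits.append((seed, qid, qpos, tid, tpos))
            i, j = i_end, j_end
    return hits


if __name__ == "__main__":
    queries = {"q1": "MKVLA"}
    targets = {"t1": "AMKVL", "t2": "KVLAG"}
    for hit in double_indexed_hits(queries, targets):
        print(hit)
```

Because both lists are sorted by seed, every occurrence of a seed in the queries and in the targets is handled in one contiguous step, which is the access pattern the cache argument above relies on.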

    Seed hits are passed through several heuristic filter stages, implemented in highly optimized code, before being subjected to gapped Smith-Waterman[10] extension, which computes the final alignments.
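
For reference, the core Smith-Waterman recurrence can be written in a few lines of Python. This is a textbook sketch, not DIAMOND's extension code: it assumes simple match/mismatch scores and a linear gap penalty (the parameter values are arbitrary), whereas protein aligners typically use a substitution matrix and affine gap costs, and it returns only the best local score rather than the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("MKVLA", "GGMKVLAGG"))  # exact local match
    print(smith_waterman_score("MKVLA", "MKLA"))       # one gap in the target
```

The dynamic program takes time proportional to the product of the two sequence lengths, which is why it is applied only to candidates that survive the earlier filter stages.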

    References

    1. Buchfink, Benjamin J.; Xie, Chao; Huson, Daniel H. (2014). "Fast and sensitive protein alignment using DIAMOND". Nature Methods. 12 (1): 59–60. doi:10.1038/nmeth.3176. ISSN 1548-7105.
    2. Buchfink, Benjamin J.; Reuter, Klaus; Drost, Hajk-Georg (2021). "Sensitive protein alignments at tree-of-life scale using DIAMOND". Nature Methods. 18 (4): 366–368. doi:10.1038/s41592-021-01101-x. ISSN 1548-7105.
    3. "Benjamin J. Buchfink - Google Scholar".
    4. "Applications - bbuchfink/diamond Wiki".
    5. Defense Threat Reduction Agency (DTRA) (2013). "DTRA/SCC-WMD Announces $1 Million Algorithm Challenge Winner". www.prweb.com. Retrieved 2025-12-24.
    6. Jansson, Janet (2011). "Towards tera terra: Terabase sequencing of terrestrial metagenomics". Lawrence Berkeley National Laboratory.
    7. Lipman, DJ; Pearson, WR (1985). "Rapid and sensitive protein similarity searches". Science. 227 (4693): 1435–41. Bibcode:1985Sci...227.1435L. doi:10.1126/science.2983426. PMID 2983426.
    8. Langmead, Ben; Trapnell, Cole; Pop, Mihai; Salzberg, Steven L. (4 March 2009). "Ultrafast and memory-efficient alignment of short DNA sequences to the human genome". Genome Biology. 10 (3): R25. doi:10.1186/gb-2009-10-3-r25. PMC 2690996. PMID 19261174.
    9. Altschul, Stephen; Gish, Warren; Miller, Webb; Myers, Eugene; Lipman, David J. (1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–410. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712.
    10. Smith, Temple F.; Waterman, Michael S. (1981). "Identification of Common Molecular Subsequences" (PDF). Journal of Molecular Biology. 147 (1): 195–197. CiteSeerX 10.1.1.63.2897. doi:10.1016/0022-2836(81)90087-5. PMID 7265238.

    This article "DIAMOND (biotechnology)" is from Wikipedia. The list of its authors can be seen in its history and/or the page Edithistory:DIAMOND (biotechnology). Articles copied from the Draft namespace on Wikipedia can be found in Wikipedia's Draft namespace rather than the main one.