Chemical Data Processing Toolkit
| Developer(s) | Thomas Seidel |
|---|---|
| Initial release | 2023 |
| Stable release | 1.3.0 / April 29, 2026 |
| Written in | C++ and Python |
| Operating system | Linux, macOS, and Microsoft Windows |
| Platform | Many |
| Available in | English |
| Type | Chemoinformatics |
| License | LGPL-2.1-or-later |
| Website | cdpkit |
The Chemical Data Processing Toolkit (CDPKit) is an open-source cheminformatics toolkit implemented in C++. CDPKit comprises a suite of command-line and GUI tools and a programming library, the Chemical Data Processing Library (CDPL), which provides a modular, well-tested implementation of the basic functionality typically required by higher-level cheminformatics applications. In addition to the CDPL C++ API, an equivalent Python layer makes all of CDPL's functionality accessible from Python code.[1]
CDPKit is developed at the Department of Pharmaceutical Sciences of the University of Vienna on behalf of the Christian Doppler Laboratory for Molecular Informatics in the Biosciences (CD-Lab MIB). It receives funding from the Ministry of Labor and Economics of the Republic of Austria (BMAW), the Christian Doppler Forschungsgesellschaft, BASF SE and Boehringer Ingelheim RCV.
CDPKit can be used together with machine learning (ML) libraries such as scikit-learn, PyTorch, and TensorFlow. Its utility in an ML context is demonstrated by several published scientific software tools that predict attributes of potential drug candidates such as lipophilicity and solubility,[2] biological activity,[3][4] and site of metabolism.[5] Apo2Ph4,[6] PharmacoMatch[7] and CHA[8] are further examples of computer-aided drug design software projects that rely on CDPKit's functionality.
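As an illustration of how toolkit output typically reaches an ML library, the following minimal sketch folds sets of hashed feature identifiers into fixed-length 0/1 vectors, the row format most ML libraries expect. It does not use CDPKit's API: the function name `bits_to_vector`, the molecule labels and the identifier values are hypothetical, and in practice the identifiers would come from a fingerprint generator.

```python
def bits_to_vector(feature_ids, n_bits):
    """Fold integer feature identifiers into a fixed-length 0/1 list."""
    vec = [0] * n_bits
    for fid in feature_ids:
        # Modulo folding maps an arbitrary identifier onto a bit position;
        # distinct identifiers may collide on the same bit.
        vec[fid % n_bits] = 1
    return vec


# Hypothetical feature identifiers for two molecules (illustrative values only).
fingerprints = {
    "mol_a": {3, 17, 42},
    "mol_b": {3, 99},
}

# One row per molecule; a nested list like this can be converted to an array
# and passed to an ML library as a feature matrix.
feature_matrix = [bits_to_vector(ids, 64) for ids in fingerprints.values()]
```

A shorter vector saves memory but increases the chance that two different features fold onto the same bit.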
Key Features (excerpt)
- Data structures for the representation and processing of molecules, chemical reactions and pharmacophores
- Routines for all typical cheminformatics pre-processing tasks (e.g. ring and aromaticity perception, stereochemistry processing, …)
- Powerful methods for molecule and reaction substructure searching
- Readers/writers for various file formats (Mol, SDF, Rxn, RDF, Mol2, PDB, mmCIF, MMTF, SMILES, SMARTS, etc.) allowing the I/O of small molecule, macromolecular, reaction and pharmacophore data
- Molecule fragmentation algorithms such as RECAP[9] and BRICS[10]
- Generation of molecule and pharmacophore fingerprints (e.g. ECFP[11])
- Large collection of implemented chemical structure descriptors
- 2D structure layout and rendering of molecules and reactions
- Gaussian shape-based molecule alignment and descriptor calculation[12]
- Pharmacophore generation, alignment and screening
- 3D structure and conformer generation[13]
- Prediction of a wide panel of physicochemical properties
- Test-suite compliant implementation of the MMFF94 force field
- C++ implementation that follows best practices for maximum robustness and speed
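The idea behind the Gaussian shape-based comparison listed above[12] can be sketched in a few lines. The example below is not CDPKit's implementation: it represents each atom as a spherical Gaussian, sums only the pairwise (first-order) overlap integrals, and derives a shape Tanimoto score from them. The amplitude `p`, the exponents and the coordinates are illustrative assumptions.

```python
import math


def gaussian_overlap(center_a, alpha_a, center_b, alpha_b, p=1.0):
    """Overlap integral of two spherical Gaussians p * exp(-alpha * |r - R|^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    s = alpha_a + alpha_b
    # Gaussian product theorem: the product of two Gaussians is again a
    # Gaussian, so its integral over all space has a closed form.
    return p * p * math.exp(-alpha_a * alpha_b * d2 / s) * (math.pi / s) ** 1.5


def first_order_overlap(mol_a, mol_b):
    """Sum of pairwise atom overlaps (higher-order terms are ignored)."""
    return sum(
        gaussian_overlap(ca, aa, cb, ab)
        for ca, aa in mol_a
        for cb, ab in mol_b
    )


def shape_tanimoto(mol_a, mol_b):
    """Shape similarity V_AB / (V_AA + V_BB - V_AB) for a fixed alignment."""
    v_ab = first_order_overlap(mol_a, mol_b)
    v_aa = first_order_overlap(mol_a, mol_a)
    v_bb = first_order_overlap(mol_b, mol_b)
    return v_ab / (v_aa + v_bb - v_ab)


# Two hypothetical "molecules": lists of (center, exponent) pairs.
mol_1 = [((0.0, 0.0, 0.0), 0.8), ((1.5, 0.0, 0.0), 0.8)]
mol_2 = [((0.2, 0.0, 0.0), 0.8), ((1.7, 0.3, 0.0), 0.8)]

score = shape_tanimoto(mol_1, mol_2)
```

A full alignment method would additionally optimise the rotation and translation of one molecule to maximise the overlap; the sketch only scores one fixed relative pose.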
References
- ↑ "Introduction — CDPKit 1.3.0 documentation". cdpkit.org. Retrieved 29 May 2026.
- ↑ Wieder, Oliver; Kuenemann, Mélaine; Seidel, Thomas; Meyer, Christophe; Bryant, Sharon D.; Langer, Thierry (2021). "Improved lipophilicity and aqueous solubility prediction with composite graph neural networks". Molecules. 26 (20): 6185. doi:10.3390/molecules26206185. PMC 8539502. PMID 34684766.
- ↑ Fellinger, Christian; Seidel, Thomas; Merget, Benjamin; Schleifer, Klaus-Jürgen; Langer, Thierry (2025). "GRADE and X-GRADE: unveiling novel protein–ligand interaction fingerprints based on GRAIL scores". Journal of Chemical Information and Modeling. 65 (5): 2456–2475. doi:10.1021/acs.jcim.4c01902. PMC 11898076. PMID 39980202.
- ↑ Kohlbacher, Stefan M.; Langer, Thierry; Seidel, Thomas (2021). "QPhAR: quantitative pharmacophore activity relationship: method and validation". Journal of Cheminformatics. 13 (1): 57. doi:10.1186/s13321-021-00537-9. PMC 8351372. PMID 34372940.
- ↑ Jacob, Roxane Axel; Gaskin, Leo; Seidel, Thomas; Chen, Ya; Mazzolari, Angelica; Kirchmair, Johannes (2026). "FAME3R: an efficient, practical and reliable open-source tool for predicting phase 1 and phase 2 sites of metabolism". Journal of Cheminformatics. 18 (1): 37. doi:10.1186/s13321-026-01161-1. PMC 13011438. PMID 41691256.
- ↑ Heider, Jörg; Kilian, Jonas; Garifulina, Aleksandra; Hering, Steffen; Langer, Thierry; Seidel, Thomas (2023). "Apo2ph4: a versatile workflow for the generation of receptor-based pharmacophore models for virtual screening". Journal of Chemical Information and Modeling. 63 (1): 101–110. doi:10.1021/acs.jcim.2c00814. PMC 9832483. PMID 36526584.
- ↑ Rose, Daniel; Wieder, Oliver; Seidel, Thomas; Langer, Thierry (2025). "PharmacoMatch: efficient 3D pharmacophore screening via neural subgraph matching" (PDF). International Conference on Learning Representations. 2025: 85726–85749.
- ↑ Wieder, Marcus; Garon, Arthur; Perricone, Ugo; Boresch, Stefan; Seidel, Thomas; Almerico, Anna Maria; Langer, Thierry (2017). "Common hits approach: combining pharmacophore modeling and molecular dynamics simulations". Journal of Chemical Information and Modeling. 57 (2): 365–385. doi:10.1021/acs.jcim.6b00674. PMID 28072524.
- ↑ Lewell, Xiao Qing; Judd, Duncan B.; Watson, Stephen P.; Hann, Michael M. (1998). "RECAP - retrosynthetic combinatorial analysis procedure: a powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry". Journal of Chemical Information and Computer Sciences. 38 (3): 511–522. doi:10.1021/ci970429i. PMID 9611787.
- ↑ Degen, Jörg; Wegscheid-Gerlach, Christof; Zaliani, Andrea; Rarey, Matthias (2008). "On the art of compiling and using 'drug-like' chemical fragment spaces". ChemMedChem. 3 (10): 1503–1507. doi:10.1002/cmdc.200800178. PMID 18792903.
- ↑ Rogers, David; Hahn, Mathew (2010). "Extended-connectivity fingerprints". Journal of Chemical Information and Modeling. 50 (5): 742–754. doi:10.1021/ci100050t. PMID 20426451.
- ↑ Grant, J. A.; Gallardo, M. A.; Pickup, B. T. (1996). "A fast method of molecular shape comparison: a simple application of a gaussian description of molecular shape". Journal of Computational Chemistry. 17 (14): 1653–1666. doi:10.1002/(SICI)1096-987X(19961115)17:14<1653::AID-JCC7>3.0.CO;2-K.
- ↑ Seidel, Thomas; Permann, Christian; Wieder, Oliver; Kohlbacher, Stefan M.; Langer, Thierry (2023). "High-quality conformer generation with conforge: algorithm and performance assessment". Journal of Chemical Information and Modeling. 63 (17): 5549–5570. doi:10.1021/acs.jcim.3c00563. PMC 10498443. PMID 37624145.
