GlyGen
Content | |
---|---|
Description | GlyGen provides computational and informatics resources for glycoscience. |
Data types captured | Glycans, Proteins, and Glycoproteins. |
Organisms | Homo sapiens, Mus musculus, and Rattus norvegicus. |
Contact | |
Primary citation | GlyGen announcement.[1] |
Access | |
Data format | FASTA, JSON. |
Website | www |
Web service URL | Yes – Python API |
Miscellaneous | |
License | Creative Commons Attribution 4.0 International (data); GNU General Public License v3 (source code) |
Versioning | Yes |
Data release frequency | Portal: 12 weeks; Data: 12 weeks |
Version | 1.4 (16 Sep 2019) |
Curation policy | Yes – manual and automatic. Rules for automatic annotation generated by database curators and computational algorithms. |
Bookmarkable entities | Yes – individual protein and glycan entries and search results. |
GlyGen is a knowledge base for glycans, glycoconjugates and related gene, protein and other molecular biology information. GlyGen retrieves information from multiple international data sources such as PDB, RefSeq, and UniProt, and integrates and harmonizes content to allow unique searches that cannot be executed in any of the integrated databases alone.
Organization
The GlyGen project is an international, multi-institutional effort led by the University of Georgia (UGA) and the George Washington University (GW), which collaborate on the development of the GlyGen portal. To gather and integrate data relevant to glycoscience, GlyGen also collaborates with international organizations such as the European Bioinformatics Institute (EMBL-EBI), the National Center for Biotechnology Information (NCBI), Georgetown University, Soka University, and Griffith University (Institute for Glycomics).
Integrated databases
Currently GlyGen integrates data from the following publicly available databases:
- BioXpress[2]
- BioMuta[2]
- Disease Ontology
- GlyTouCan[3]
- Mouse Genome Database
- NCBI PubChem
- NCBI PubMed
- NCBI RefSeq
- NCBI Taxonomy
- Orthologous MAtrix
- Protein Ontology
- RCSB Protein Data Bank
- The Monarch Initiative
- UniCarbKB[4]
- UniProt Knowledgebase
Content and features
The goal of the GlyGen project is to integrate and disseminate data describing glycoconjugate and complex carbohydrate structure, biosynthesis, and function. GlyGen accesses and retrieves data from international sources, integrates and harmonizes this data, and provides an interface for exploration. The GlyGen web portal allows users to execute unique searches of these integrated datasets to mine for new knowledge that cannot be acquired through queries of isolated databases.
- Data Collection - Data are collected with intensive data quality control. Metadata are captured using the BioCompute Object schema.
- Data Integration - Data from the different resources are accessed and downloaded in resource-specific formats (e.g. RDF, FASTA, CSV) and mapped to common identifiers (e.g., accession numbers).
- Quick Search - Complex multi-domain search queries can be performed using the "Quick Search" option, which is based on user-supplied use cases.
- Explore Searches - Filtered Glycan, Protein, and Glycoprotein lists are generated using simple or advanced search options.
- Data Visualization - GlyGen integrates Homo sapiens, Mus musculus, and Rattus norvegicus proteins, glycans, and glycoproteins.
- Resources - A library of glycobiology resources, including databases, informatics tools, learning material and tutorials, is provided.
- SPARQL Endpoint - All data sets are also RDFized using standard ontologies (e.g. UniProt RDF schema, GlycoCoO, FALDO) and made available via a public SPARQL endpoint.
- Feedback - An integrated feedback system allows users to submit comments and suggestions on every web page.
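The Data Integration step above, in which records arriving in resource-specific formats are mapped to common identifiers, can be illustrated with a minimal sketch. It is not GlyGen's actual pipeline: the sample FASTA and CSV contents, the accession values, the column names, and the helper functions `parse_fasta`, `parse_glycosylation_csv` and `integrate` are all hypothetical and exist only to show the idea of joining two sources on a shared accession.

```python
import csv
import io

# Hypothetical sample inputs standing in for two upstream resources.
# None of these accessions or sequences are real GlyGen data.
FASTA_TEXT = """>P00001 Example protein A
MKTAYIAK
QRQISFVK
>P00002 Example protein B
MSDNE
"""

CSV_TEXT = """protein_accession,glycan_accession,site
P00001,G00001AA,42
P00001,G00002BB,77
P00003,G00003CC,5
"""


def parse_fasta(text):
    """Return {accession: sequence} from FASTA-formatted text."""
    records = {}
    accession = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The first whitespace-delimited token of the header is the accession.
            accession = line[1:].split()[0]
            records[accession] = ""
        elif accession is not None:
            records[accession] += line
    return records


def parse_glycosylation_csv(text):
    """Return {protein accession: [(glycan accession, site), ...]}."""
    sites = {}
    for row in csv.DictReader(io.StringIO(text)):
        sites.setdefault(row["protein_accession"], []).append(
            (row["glycan_accession"], int(row["site"]))
        )
    return sites


def integrate(sequences, sites):
    """Join both sources on accession, keeping proteins present in both."""
    return {
        acc: {"sequence": sequences[acc], "glycosylation": sites[acc]}
        for acc in sequences
        if acc in sites
    }


merged = integrate(parse_fasta(FASTA_TEXT), parse_glycosylation_csv(CSV_TEXT))
print(merged)
```

In this toy data only `P00001` appears in both sources, so it is the only entry in `merged`. A real integration would also need to decide how to report records that fail to map, which this sketch silently drops.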
Availability
All GlyGen datasets are released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits users to copy, distribute, display and commercialize the data in all jurisdictions, provided appropriate credit is given. Project source code is released under the GNU General Public License v3 and is available at the GlyGen GitHub repository. GlyGen data is available without cost and can be accessed via the GlyGen GitHub repository, Portal, Data, API, and SPARQL.
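As an illustration of the SPARQL access route, the following sketch prepares an HTTP GET request in the way the SPARQL 1.1 Protocol describes, with the query text passed in a `query` parameter. The endpoint URL is a placeholder rather than GlyGen's real address, the helper names `build_query` and `build_request` are hypothetical, and the query is a generic triple pattern because the source does not describe the specific predicates in GlyGen's RDF. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address: an assumption for illustration, not the real GlyGen endpoint.
SPARQL_ENDPOINT = "https://sparql.example.org/sparql"


def build_query(limit=5):
    """Return a generic SPARQL query listing a few arbitrary triples."""
    return f"SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} LIMIT {int(limit)}"


def build_request(endpoint, query):
    """Prepare a GET request carrying the query in the 'query' parameter."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for the standard JSON serialization of SPARQL SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(SPARQL_ENDPOINT, build_query(limit=5))
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would execute it against a live endpoint once the placeholder is replaced with the address published by the project.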
Funding
GlyGen is funded by the National Institutes of Health (NIH) of the United States of America through the NIH Glycoscience Common Fund Program and is managed by the NIH Office of Strategic Coordination (grant # 1U01GM125267-01).
References
- [1] GlyGen, Article. (October 2019). "GlyGen: Computational and Informatics Resources for Glycoscience". Glycobiology. 1 (Resources for Glycoscience). doi:10.1093/glycob/cwz080. PMID 31616925.
- [2] BioMuta and BioXpress, Article. (January 2018). "BioMuta and BioXpress: mutation and expression knowledgebases for cancer biomarker discovery". PubMed. 1 (Nucleic Acids Resources). doi:10.1093/nar/gkx907. PMID 30053270.
- [3] GlyTouCan, Article. (October 2017). "GlyTouCan: an accessible glycan structure repository". Glycobiology. 1 (Resources for Glycoscience). doi:10.1093/glycob/cwx066. PMID 28922742.
- [4] UniCarbKB, Article. (August 2016). "UniCarbKB: New database features for integrating glycan structure abundance, compositional glycoproteomics data, and disease associations". PubMed. 1 (Biochim Biophys Acta.). doi:10.1016/j.bbagen.2016.02.016. PMID 26940363.
External links
- Official
- Funding
- Content and features
- Availability
GlyGen
This article "GlyGen" is from Wikipedia. The list of its authors can be seen in its history and/or on the page Edithistory:GlyGen. Articles copied from the Draft namespace on Wikipedia can be viewed in Wikipedia's Draft namespace rather than the main namespace.