GlyGen

GlyGen
Content
Description	GlyGen is a Computational and Informatics Resource for Glycoscience.
Data types captured	Glycans, Proteins, and Glycoproteins.
Organisms	Homo sapiens, Mus musculus, and Rattus norvegicus.
Contact
Primary citation	GlyGen announcement.^[1]
Access
Data format	FASTA, JSON.
Website	www.glygen.org
Web service URL	Yes – Python API
Miscellaneous
License	Data: Creative Commons Attribution 4.0 International (CC BY 4.0). Source code: GNU General Public License v3.
Versioning	Yes
Data release frequency	Portal: 12 weeks; Data: 12 weeks
Version	1.4 (16 Sep 2019)
Curation policy	Yes – manual and automatic. Rules for automatic annotation are generated by database curators and computational algorithms.
Bookmarkable entities	Yes – individual protein and glycan entries and search results.

GlyGen is a knowledge base for glycans, glycoconjugates, and related gene, protein, and other molecular biology information. GlyGen retrieves information from multiple international data sources such as PDB, RefSeq, and UniProt, and integrates and harmonizes content to allow unique searches that cannot be performed in any of the integrated databases alone.

Organization

The GlyGen project is an international, multi-institutional effort led by the University of Georgia (UGA) and the George Washington University (GW), which jointly develop the GlyGen portal. To gather and integrate data relevant to glycoscience, GlyGen collaborates with organizations such as the European Bioinformatics Institute (EMBL-EBI), the National Center for Biotechnology Information (NCBI), Georgetown University, Soka University, and Griffith University (Institute for Glycomics).

Integrated databases

GlyGen currently integrates data from the following publicly available databases:

BioXpress^[2]
BioMuta^[2]
Disease Ontology
GlyTouCan^[3]
Mouse Genome Database
NCBI PubChem
NCBI PubMed
NCBI RefSeq
NCBI Taxonomy
Orthologous Matrix
Protein Ontology
RCSB Protein Data Bank
The Monarch Initiative
UniCarbKB^[4]
UniProt Knowledgebase

Content and features

The goal of the GlyGen project is to integrate and disseminate data describing glycoconjugate and complex carbohydrate structure, biosynthesis, and function. GlyGen retrieves data from international sources, integrates and harmonizes these data, and provides an interface for exploring them. Through the GlyGen web portal, users can run searches across the integrated datasets and find information that cannot be obtained by querying the source databases separately.

Data Collection - Data are collected with rigorous data quality control. Metadata are captured using the BioCompute Object schema.
Data Integration - Data from the different resources are accessed and downloaded in resource-specific formats (e.g., RDF, FASTA, CSV) and mapped to common identifiers (e.g., accession numbers).
Quick Search - Complex multi-domain search queries can be performed using the "Quick Search" option, which is based on user-provided use cases.
Explore Searches - Filtered Glycan, Protein, and Glycoprotein lists are generated using simple or advanced search options.
Data Visualization - GlyGen integrates Homo sapiens, Mus musculus, and Rattus norvegicus proteins, glycans, and glycoproteins.
Resources - A library of glycobiology resources, including databases, informatics tools, learning materials, and tutorials, is provided.
SPARQL Endpoint - All data sets are also RDFized using standard ontologies (e.g., UniProt RDF schema, GlycoCoO, FALDO) and made available via a public SPARQL endpoint.
Feedback - An integrated feedback system allows users to submit comments and suggestions on every webpage.
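The Data Integration step above, in which records arriving in resource-specific formats are mapped to common identifiers, can be illustrated with a minimal sketch. This is not GlyGen's actual pipeline: the accessions, glycan identifiers, column names, and function names below are all made-up placeholders, and the sketch assumes that the first token of each FASTA header is the shared accession.

```python
import csv
import io

# Hypothetical inputs: a FASTA snippet and a CSV snippet that share accessions.
# All identifiers are invented placeholders, not real database entries.
FASTA_TEXT = """>ACC0001 example protein one
MKTAYIAK
QRQISFVK
>ACC0002 example protein two
MSDNELLK
"""

CSV_TEXT = """accession,glycan_id
ACC0001,GLYCAN_A
ACC0001,GLYCAN_B
ACC0003,GLYCAN_C
"""


def parse_fasta(text):
    """Return {accession: sequence}; the first header token is the accession."""
    sequences = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            sequences[current] = ""
        elif current is not None:
            # Sequence lines are concatenated until the next header.
            sequences[current] += line
    return sequences


def parse_annotations(text):
    """Return {accession: [glycan_id, ...]} from CSV rows."""
    annotations = {}
    for row in csv.DictReader(io.StringIO(text)):
        annotations.setdefault(row["accession"], []).append(row["glycan_id"])
    return annotations


def integrate(sequences, annotations):
    """Merge both sources into one record per accession seen in either source."""
    merged = {}
    for accession in sorted(set(sequences) | set(annotations)):
        merged[accession] = {
            "sequence": sequences.get(accession),
            "glycans": annotations.get(accession, []),
        }
    return merged


records = integrate(parse_fasta(FASTA_TEXT), parse_annotations(CSV_TEXT))
```

Keying every record on the shared accession is what makes a cross-source question answerable, for example "which proteins with a known sequence also carry a glycan annotation". An accession present in only one source is kept, with the missing field left empty, so gaps in coverage stay visible.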

Availability

All GlyGen datasets are released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits users to copy, distribute, display, and commercialize the data in all jurisdictions, provided appropriate credit is given. Project source code is released under the GNU General Public License v3 and is available in the GlyGen GitHub repository. GlyGen data are available without cost and can be accessed through the GlyGen GitHub repository, the portal, the data download pages, the API, and the SPARQL endpoint.
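As a rough illustration of programmatic access through a SPARQL endpoint, the sketch below builds a query request following the general SPARQL 1.1 Protocol convention of passing the query text in a `query` URL parameter. The endpoint address is a placeholder, not GlyGen's real address, and the query is deliberately generic because it assumes nothing about GlyGen's RDF schema. The helper name `build_sparql_request` is also hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder only: replace with the endpoint address published by GlyGen.
ENDPOINT = "https://sparql.example.org/sparql"

# Generic query that lists a few triples; it assumes nothing about the schema.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request asking for JSON-formatted results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
```

To run the query, the request object could be passed to `urllib.request.urlopen` and the response decoded with `json.load`. The sketch stops short of sending anything so that it works without network access.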

Funding

GlyGen is funded by the National Institutes of Health (NIH) of the United States of America through the NIH Glycoscience Common Fund Program and is managed by the NIH Office of Strategic Coordination (grant # 1U01GM125267-01).

References

[1] "GlyGen: Computational and Informatics Resources for Glycoscience". Glycobiology. October 2019. doi:10.1093/glycob/cwz080. PMID 31616925.
[2] "BioMuta and BioXpress: mutation and expression knowledgebases for cancer biomarker discovery". Nucleic Acids Research. January 2018. doi:10.1093/nar/gkx907. PMID 30053270.
[3] "GlyTouCan: an accessible glycan structure repository". Glycobiology. October 2017. doi:10.1093/glycob/cwx066. PMID 28922742.
[4] "UniCarbKB: New database features for integrating glycan structure abundance, compositional glycoproteomics data, and disease associations". Biochim Biophys Acta. August 2016. doi:10.1016/j.bbagen.2016.02.016. PMID 26940363.

External links

Official website: www.glygen.org

This article "GlyGen" is from Wikipedia. The list of its authors can be seen in its history and/or on the page Edithistory:GlyGen. Articles copied from the Draft namespace on Wikipedia can be found in Wikipedia's Draft namespace rather than the main one.