Modernized uniform representation of carbohydrate molecules in the Protein Data Bank

Author(s)

C. Shao, Z. Feng, J.D. Westbrook, E. Peisach, J. Berrisford, Y. Ikegawa, G. Kurisu, S. Velankar, S.K. Burley & J.Y. Young

Sources

Glycobiology, 2021, vol. 31, no. 9, 1204–1218 https://doi.org/10.1093/glycob/cwab039

Carbohydrate molecules present in more than 14,000 Protein Data Bank (PDB) structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable representation of the carbohydrates in PDB structures, together with the corresponding reference data, improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules.

The PDB Exchange/Macromolecular Crystallographic Information File (PDBx/mmCIF) data dictionary now supports:
(i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates;
(ii) uniform representation of branched entities for oligosaccharides;
(iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community;
(iv) annotation of glycosylation sites in proteins.
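To make the idea of a branched entity concrete, the sketch below models an oligosaccharide as a numbered list of monosaccharides plus a list of glycosidic links, then walks the links to print the branching pattern. It is only an illustration: the `Link` class, its field names, the example pentasaccharide and the bracketed output notation are assumptions made here, not the data dictionary's actual item names and not one of the community linear descriptors.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One glycosidic bond (hypothetical field names, for illustration only)."""
    child_num: int     # position of the attached monosaccharide in the list
    child_atom: str    # anomeric atom on the attached monosaccharide
    parent_num: int    # position of the monosaccharide it attaches to
    parent_atom: str   # oxygen on the parent that carries the bond


# Assumed example: an N-glycan core written with PDB chemical component IDs.
MONOSACCHARIDES = {1: "NAG", 2: "NAG", 3: "BMA", 4: "MAN", 5: "MAN"}
LINKS = [
    Link(2, "C1", 1, "O4"),
    Link(3, "C1", 2, "O4"),
    Link(4, "C1", 3, "O3"),
    Link(5, "C1", 3, "O6"),
]


def build_children(links):
    """Group links by the monosaccharide they attach to."""
    children = defaultdict(list)
    for link in links:
        children[link.parent_num].append(link)
    return children


def render(num, residues, children):
    """Return a nested text form of the subtree rooted at position num."""
    label = residues[num]
    branches = sorted(children.get(num, []), key=lambda link: link.parent_atom)
    if not branches:
        return label
    inner = ", ".join(
        f"{link.parent_atom}<-{render(link.child_num, residues, children)}"
        for link in branches
    )
    return f"{label}[{inner}]"


def branch_points(children):
    """Positions that carry more than one attached monosaccharide."""
    return sorted(num for num, links in children.items() if len(links) > 1)


CHILDREN = build_children(LINKS)
TREE_TEXT = render(1, MONOSACCHARIDES, CHILDREN)
print(TREE_TEXT)
print(branch_points(CHILDREN))
```

Storing the monosaccharides and the links separately, rather than as one opaque ligand, is what allows branch points to be found by a simple lookup, as `branch_points` does here.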
For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
