Understanding how carbohydrates regulate proteins in physiological and pathological processes provides opportunities to address key biological problems and develop new therapeutics. The diversity and complexity of carbohydrates pose a challenge in experimentally identifying the sites where carbohydrates bind to and act on proteins. The authors present a deep learning model, DeepGlycanSite, that can accurately predict carbohydrate binding sites on a given protein structure. By incorporating geometric and evolutionary features of proteins into a deep equivariant graph neural network with the transformer architecture, DeepGlycanSite remarkably outperforms previous state-of-the-art methods and effectively predicts binding sites for diverse carbohydrates. When integrated with a mutagenesis study, DeepGlycanSite reveals the guanosine 5′-diphosphate sugar recognition site of an important G protein-coupled receptor. These results demonstrate that DeepGlycanSite is an invaluable tool for predicting carbohydrate binding sites and could provide insights into the molecular mechanisms underlying the carbohydrate regulation of therapeutically important proteins.
(*) The official implementation of DeepGlycanSite, a state-of-the-art method for predicting carbohydrate binding sites, is available at https://github.com/xichengeva/DeepGlycanSite. This repository contains all the code, instructions, and model weights needed to run the method or to retrain a model.