Thursday 14 March 2024, 1.00PM
Speaker(s): Dr Elisa Fadda, Maynooth University, Ireland
The introduction of machine learning (ML) for protein structure prediction 1,2 revolutionized structural biology, boosting our ability to source and resolve protein structures, and broadening the potential for therapeutic discovery 3.
One limitation affecting all ML-derived structures is the lack of post-translational modifications 4, which are key to the correct folding, structural stability, and function of the underlying protein. Glycosylation is the most common post-translational modification of proteins, with an estimated 3 to 4% of the human genome dedicated exclusively to encoding glycosylation pathways 5. Yet glycans remain largely ‘unseen’ due to their heterogeneity, complexity and highly dynamic nature 6.
In this talk, I will introduce and discuss the science behind GlycoShape 7, a unique resource based on high-performance computing that allows users to rapidly and easily restore glycoproteins from ML (AlphaFold/RoseTTAFold), as well as from the Protein Data Bank, to their native, functional state by adding the missing glycan 3D information in seconds. Because of the robustness of its 3D database and of the algorithm, GlycoShape can also predict N-glycosylation site occupancy with 92% accuracy against all experimentally profiled glycoproteins in the PDB.
This remarkable level of agreement with glycoproteomics data provides further evidence that the type of glycosylation and occupancy depend on site accessibility and complementarity of the glycan to the protein surface 8,9, revealing real potential for training upcoming ML algorithms, with enormous impact on scientific and therapeutic advances.
To this end, I will provide some examples, ranging from pathogen infection to protein folding, to underscore the importance of rebuilding glycosylation to understand biomolecular structure and function in life sciences.
Location: B/K/018, Biology Building
Admission: In-person