
Presentation: UND, NDSU, & ND-ACES bio and biomedical computation networking seminar 

November 20, 2024, Alerus Center, Grand Forks, North Dakota

Fine-Tuning ESM2 for Enhanced Protein Function Prediction via GO Term Associations

Yusuf Akbulut

Doctoral Student
North Dakota State University

Co-authors: Mishkatur Rahman, Doctoral Student, NDSU; Harun Pirim, Assistant Professor, NDSU

Session

Presentation Session 1

The functional annotation of hypothetical proteins remains a significant challenge in bioinformatics, particularly given the rapid increase in sequenced but uncharacterized proteins. We propose an approach that improves protein function prediction by fine-tuning the Evolutionary Scale Modeling 2 (ESM2) model. The study uses protein sequences and their Gene Ontology (GO) term annotations to train ESM2 in an end-to-end framework, with the aim of improving the model's ability to predict functional categories for hypothetical proteins. Fine-tuning ESM2, a transformer-based protein language model, on a curated dataset of protein sequences and GO annotations allows it to learn complex relationships between sequence features and functional labels. Initial results show significant improvements in prediction accuracy over traditional sequence alignment-based methods. The model also shows potential for inferring the functions of proteins without characterized homologs, a longstanding challenge in protein annotation. This research underscores the power of transformer-based language models in protein function prediction, with implications for molecular biology and drug discovery.
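The abstract does not say how GO terms are encoded or which training objective is used. One common framing, assumed here purely for illustration, treats GO prediction as multi-label classification: each protein gets a multi-hot target vector over the GO vocabulary, and the model's per-term logits are scored with binary cross-entropy. The sketch below shows only that target encoding and loss in plain Python; the protein IDs, helper names, and toy annotations are hypothetical, and the ESM2 encoder that would produce the logits is omitted.

```python
import math

def build_go_vocab(annotations):
    """Map each distinct GO term to a column index (sorted for determinism)."""
    terms = sorted({t for go_terms in annotations.values() for t in go_terms})
    return {term: i for i, term in enumerate(terms)}

def multi_hot(go_terms, vocab):
    """Encode one protein's GO terms as a 0/1 vector over the vocabulary."""
    vec = [0.0] * len(vocab)
    for term in go_terms:
        vec[vocab[term]] = 1.0
    return vec

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, in a numerically stable form."""
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Toy annotations (assumption: protein IDs and term assignments are made up).
annotations = {
    "protein_A": ["GO:0003677", "GO:0005634"],
    "protein_B": ["GO:0005524"],
}

vocab = build_go_vocab(annotations)
target_a = multi_hot(annotations["protein_A"], vocab)

# Stand-in logits; in a fine-tuning setup these would come from a
# classification head on top of the protein language model.
confident_logits = [4.0, -4.0, 4.0]
uninformed_logits = [0.0, 0.0, 0.0]

loss_confident = bce_with_logits(confident_logits, target_a)
loss_uninformed = bce_with_logits(uninformed_logits, target_a)
```

In this framing a protein can carry any number of GO terms at once, which is why a per-term sigmoid loss is used rather than a single softmax over mutually exclusive classes.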
