“Enhancing Pathogenicity Prediction of BRCA Variants Using BERT-Based Genomic Language Models” is the title of the third scientific webinar, to be held on 28 February from 12:00 to 13:00 and presented by Leonardo Masci of the University of L’Aquila. He graduated in Scienze e Tecnologie Fisiche (Physical Sciences and Technologies) and completed a specialization with honors in Data and Life Science at the Università degli Studi dell’Aquila. He has advanced skills in data analysis, with a strong focus on technological innovation and applied research.
The abstract follows: Accurately classifying genetic mutations, especially variants of uncertain significance (VUS), is crucial for determining their possible effect on human health and for informing medical choices. Recent developments in deep learning, specifically transformer models that use attention mechanisms, have made great progress in NLP and offer promising possibilities for the interpretation of genomic data. This study examines how DNABERT2, a Genomic Language Model based on BERT, can improve pathogenicity predictions for BRCA variants. We focus on BRCA genes because of their significant connection to breast cancer and the large number of expert annotations available. The DNABERT2 model has been fine-tuned on sequences labeled as benign or pathogenic, constructed from ClinVar variant annotations by modifying an initial wild-type sequence. The transformer-based model achieves around 90% accuracy in classification tasks, surpassing traditional tools, whose performance can vary depending on genomic region and variant type. The model successfully detects hidden patterns in the genetic code, offering potential help to biological and medical professionals in their annotation work.
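To give a rough idea of what "modifying an initial wild-type sequence" can look like, here is a minimal Python sketch. It is not the speaker's pipeline: the function names, the 1-based position convention, the dictionary keys, and the 0/1 label encoding are all assumptions made for illustration.

```python
# Illustrative sketch only: names, coordinates, and label encoding are assumptions.

def apply_variant(wild_type: str, pos: int, ref: str, alt: str) -> str:
    """Return a copy of wild_type with ref replaced by alt at 1-based position pos."""
    start = pos - 1
    end = start + len(ref)
    observed = wild_type[start:end]
    # Check that the reference allele really matches the wild-type sequence.
    if observed.upper() != ref.upper():
        raise ValueError(f"expected {ref!r} at position {pos}, found {observed!r}")
    return wild_type[:start] + alt + wild_type[end:]


# Hypothetical label encoding for a binary classifier.
LABELS = {"benign": 0, "pathogenic": 1}


def build_labeled_example(wild_type: str, variant: dict) -> tuple:
    """Turn one annotated variant (hypothetical dict layout) into (sequence, label)."""
    mutated = apply_variant(wild_type, variant["pos"], variant["ref"], variant["alt"])
    return mutated, LABELS[variant["significance"]]


if __name__ == "__main__":
    toy_wild_type = "ACGTACGT"  # toy sequence, not a real BRCA region
    toy_variant = {"pos": 3, "ref": "G", "alt": "T", "significance": "pathogenic"}
    print(build_labeled_example(toy_wild_type, toy_variant))
```

Pairs of mutated sequence and label built this way could then be tokenized and passed to a sequence classifier for fine-tuning; that step is outside the scope of this sketch.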
To attend the seminar, join via this link: https://meet.google.com/vaa-oczm-yjy