Identifying microbial protease allergens through protein language model-guided homology

Molecular Therapy. 2026-02
Kumar Thurimella, Elena Wu, Chenhao Li, Daniel B Graham, Róisín M Owens, Damian R Plichta, Caroline L Sokol, Ramnik J Xavier, Sergio Bacallado
Department of Pure Mathematics and Mathematical Statistics, University of Cambridge

Abstract

Emerging research links the gut, skin, and oral microbiomes to allergies, with serine proteases (SPs) identified as potential allergens. This study leverages deep learning and pre-trained protein language models (pLMs) to uncover allergenic SPs in metagenomic data. First, we develop a model to identify the catalytic serine residue in serine hydrolases, demonstrating how pLMs capture structural information. Next, we create a deep learning framework to detect candidate SP allergens across gene catalogs, using the conserved catalytic triad to identify homologs in gut and oral sites despite low sequence identity. Our model predicts a putative SP allergen resembling V8 protease, a known trigger for protease-activate...

Keywords