PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling

Abstract

We introduce PepMLM, a purely target sequence-conditioned de novo generator of linear peptide binders. By employing a novel masking strategy that uniquely positions cognate peptide sequences at the terminus of target protein sequences, PepMLM tasks ESM-2 with fully reconstructing the binder region, achieving low perplexities that match or improve upon those of previously validated peptide-protein sequence pairs. (Accepted as Poster at ICLR GEM Workshop)
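
The masking strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`build_masked_input`, `build_training_pair`, `perplexity`) are hypothetical, the peptide is assumed to be appended at the end of the target sequence, and `<mask>` is assumed to be the mask token string. The sketch only shows how the whole binder region is hidden from the model and how a perplexity could be computed from per-residue log-probabilities that a masked language model such as ESM-2 would assign to the true binder residues.

```python
import math

# Assumption: "<mask>" is the mask token string of the tokenizer in use.
MASK_TOKEN = "<mask>"


def build_masked_input(target: str, peptide_len: int) -> list:
    """Return the target residues followed by `peptide_len` mask tokens.

    Assumption: the binder region is placed at the end of the target.
    """
    return list(target) + [MASK_TOKEN] * peptide_len


def build_training_pair(target: str, peptide: str):
    """Build (tokens, labels) for one target-peptide pair.

    Every peptide position is masked, so the model has to reconstruct
    the full binder from the target sequence alone. Labels are None on
    target positions, which would be ignored by the loss.
    """
    tokens = build_masked_input(target, len(peptide))
    labels = [None] * len(target) + list(peptide)
    return tokens, labels


def perplexity(log_probs) -> float:
    """Perplexity from natural-log probabilities of the true residues.

    Computed as exp of the mean negative log-likelihood over the
    masked binder positions.
    """
    return math.exp(-sum(log_probs) / len(log_probs))


if __name__ == "__main__":
    # Toy sequences for illustration only, not real binder data.
    tokens, labels = build_training_pair("MKT", "AC")
    print(tokens)
    print(labels)
```

At generation time the same construction would apply with only the target and a chosen peptide length: the model fills in the masked positions, and a lower perplexity on a known pair indicates that the model finds the binder more predictable given the target.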

Publication
ArXiv