moPPIt: De Novo Generation of Motif-Specific Peptide Binders via Multi-Objective Discrete Flow Matching
Targeting specific functional motifs, whether conserved viral epitopes, intrinsically disordered regions (IDRs), or fusion breakpoints, is essential for modulating protein function and protein-protein interactions (PPIs). Current design methods, however, depend on stable tertiary structures, limiting their utility for disordered or dynamic targets. Here, we present a motif-specific PPI targeting algorithm (moPPIt), a framework for the de novo generation of motif-specific peptide binders derived solely from target sequence data. The core of this approach is BindEvaluator, a transformer architecture that interpolates protein language model embeddings to predict peptide-protein binding site interactions with high accuracy (AUC = 0.97). We integrate this predictor into a novel Multi-Objective-Guided Discrete Flow Matching (MOG-DFM) framework, which steers generative trajectories toward peptides that simultaneously maximize binding affinity and motif specificity. After comprehensive in silico validation of binding and motif-specific targeting, we validate moPPIt in vitro by generating binders that strictly discriminate between the FN3 and IgG domains of NCAM1, confirming domain-level specificity, and further demonstrate precise targeting of IDRs by generating binders specific to the N-terminal disordered domain of β-catenin. In functional, disease-relevant assays, moPPIt-designed peptides targeting the GM-CSF receptor effectively block macrophage polarization. Finally, we demonstrate therapeutic utility in cell engineering, where binders directed against the tumor antigen AGR2 drive specific CAR T regulatory cell activation. In total, moPPIt serves as a purely sequence-based paradigm for controllably targeting the "undruggable" and disordered proteome.
1. Google Colab Notebooks
We provide two Google Colab notebooks to help you run and evaluate moPPIt without any local setup:
moPPIt Colab (generate motif-specific binders while optimizing other therapeutic-related properties): Link
PeptiDerive Colab (compute Relative Interaction Scores (RIS) for residues on the target protein): Link
2. Command-line Usage
You can also run moPPIt and BindEvaluator from the command line.
2.1 Run moPPIt
Example command:
python -u moo.py \
--output_file './samples.csv' \
--length 10 \
--n_batches 600 \
--weights 1 1 1 4 4 2 \
--motifs '16-31,62-79' \
--motif_penalty \
--objectives Hemolysis Non-Fouling Half-Life Affinity Motif Specificity \
--target_protein MHVPSGAQLGLRPDLLARRRLKRCPSRWLCLSAAWSFVQVFSEPDGFTVIFSGLGNNAGGTMHWNDTRPAHFRILKVVLREAVAECLMDSYSLDVHGGRRTAAG
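Before launching a long run, it can help to sanity-check the arguments in the command above. The following is a minimal sketch, not part of the repository: the helper names (parse_motifs, check_motifs_fit, pair_weights) are hypothetical, and it assumes that motif ranges are 1-based and inclusive and that each value passed to --weights pairs with the objective at the same position in --objectives. Please confirm both conventions against moo.py.

```python
def parse_motifs(spec: str) -> list[int]:
    """Expand a spec such as '16-31,62-79' into a sorted list of positions.

    Assumption: ranges are 1-based and inclusive. Check moo.py for the
    convention it actually uses.
    """
    positions: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid motif range: {part!r}")
        positions.update(range(start, end + 1))
    return sorted(positions)


def check_motifs_fit(positions: list[int], target: str) -> None:
    """Raise if any motif position lies beyond the end of the target sequence."""
    if positions and positions[-1] > len(target):
        raise ValueError(
            f"motif position {positions[-1]} exceeds target length {len(target)}"
        )


def pair_weights(objectives: list[str], weights: list[float]) -> dict[str, float]:
    """Pair objectives with weights by position (assumed ordering)."""
    if len(objectives) != len(weights):
        raise ValueError(
            f"{len(objectives)} objectives but {len(weights)} weights were given"
        )
    return dict(zip(objectives, weights))


# Values copied from the example command above.
target = (
    "MHVPSGAQLGLRPDLLARRRLKRCPSRWLCLSAAWSFVQVFSEPDGFTVIFSGLGNNAGG"
    "TMHWNDTRPAHFRILKVVLREAVAECLMDSYSLDVHGGRRTAAG"
)
motif_positions = parse_motifs("16-31,62-79")
check_motifs_fit(motif_positions, target)
weight_by_objective = pair_weights(
    ["Hemolysis", "Non-Fouling", "Half-Life", "Affinity", "Motif", "Specificity"],
    [1, 1, 1, 4, 4, 2],
)
print(len(motif_positions), weight_by_objective)
```

Checking the inputs up front surfaces a mismatched weight count or an out-of-range motif immediately rather than partway through a 600-batch run.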
2.2 Run BindEvaluator
Given a target protein sequence and a binder sequence, BindEvaluator predicts the binding sites on the target protein.
Example command:
python -u bindevaluator.py \
-target MHVPSGAQLGLRPDLLARRRLKRCPSRWLCLSAAWSFVQVFSEPDGFTVIFSGLGNNAGGTMHWNDTRPAHFRILKVVLREAVAECLMDSYSLDVHGGRRTAAG \
-binder YVEICRCVVC \
-sm ./classifier_ckpt/finetuned_BindEvaluator.ckpt \
-n_layers 8 \
-d_model 128 \
-d_hidden 128 \
-n_head 8 \
-d_inner 64
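To score several candidate binders against the same target, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in the example; it does not run anything. The helper name build_bindevaluator_cmd is hypothetical, and the restriction to the 20 standard amino-acid letters is an assumption about what the model accepts, not something the repository states.

```python
import shlex

# Hyperparameter flags and values copied from the example command above.
DEFAULT_HPARAMS = {
    "-n_layers": 8,
    "-d_model": 128,
    "-d_hidden": 128,
    "-n_head": 8,
    "-d_inner": 64,
}

# Assumption: only the 20 standard amino-acid letters are valid input.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def build_bindevaluator_cmd(
    target: str,
    binder: str,
    checkpoint: str = "./classifier_ckpt/finetuned_BindEvaluator.ckpt",
) -> list[str]:
    """Return the argv list for one bindevaluator.py call (hypothetical helper)."""
    for name, seq in (("target", target), ("binder", binder)):
        unexpected = set(seq) - VALID_RESIDUES
        if not seq or unexpected:
            raise ValueError(
                f"{name} sequence is empty or has unexpected characters: "
                f"{sorted(unexpected)}"
            )
    cmd = [
        "python", "-u", "bindevaluator.py",
        "-target", target,
        "-binder", binder,
        "-sm", checkpoint,
    ]
    for flag, value in DEFAULT_HPARAMS.items():
        cmd += [flag, str(value)]
    return cmd


# Print one shell-ready command per candidate binder.
target = (
    "MHVPSGAQLGLRPDLLARRRLKRCPSRWLCLSAAWSFVQVFSEPDGFTVIFSGLGNNAGG"
    "TMHWNDTRPAHFRILKVVLREAVAECLMDSYSLDVHGGRRTAAG"
)
for binder in ["YVEICRCVVC"]:
    print(shlex.join(build_bindevaluator_cmd(target, binder)))
```

Each returned list can be passed directly to subprocess.run(cmd, check=True) from the repository root if you prefer to execute the calls from Python rather than paste them into a shell.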
Repository Authors
Tong Chen, PhD Student at University of Pennsylvania
Pranam Chatterjee, Assistant Professor at University of Pennsylvania
Reach out to us with any questions!