Property-Driven Protein Inverse Folding with Multi-Objective Preference Alignment

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 6.0

Keywords: protein design, preference alignment

Abstract:

Protein sequence design must balance designability, defined as the ability to recover a target backbone, with multiple, often competing, developability properties such as solubility, thermostability, and expression. Existing approaches address these properties through post hoc mutation, inference-time biasing, or retraining on property-specific subsets, yet they are target dependent and demand substantial domain expertise or careful hyperparameter tuning. In this paper, we introduce ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained inverse folding models to satisfy diverse developability objectives while preserving structural fidelity. ProtAlign employs a semi-online Direct Preference Optimization strategy with a flexible preference margin to mitigate conflicts among competing objectives and constructs preference pairs using in silico property predictors. Applied to the widely used ProteinMPNN backbone, the resulting model MoMPNN enhances developability without compromising designability across tasks including sequence design for CATH 4.3 crystal structures, de novo generated backbones, and real-world binder design scenarios, making it an appealing framework for practical protein sequence design.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces ProtAlign, a multi-objective preference alignment framework that fine-tunes inverse folding models to balance designability with developability properties such as solubility and thermostability. It resides in the Multi-Objective Preference Alignment leaf, which contains three papers including the original work. This leaf sits within the broader Preference-Based Optimization Methods branch, indicating a moderately populated research direction focused on aligning generative models with multiple objectives through preference signals rather than post-hoc filtering or single-objective optimization.

The taxonomy reveals neighboring approaches in sibling leaves: Direct Preference Optimization for Designability focuses solely on structural fidelity using confidence scores, while Guided Generation and Sampling Methods employ classifier guidance or MCMC strategies without explicit preference learning. The Multi-Objective Preference Alignment leaf explicitly excludes single-objective methods, positioning this work at the intersection of structural accuracy and practical therapeutic constraints. Related branches like Antibody-Specific Design Pipelines and Developability Prediction provide complementary tools for property assessment, but the preference alignment approach distinguishes itself by integrating objectives during model training rather than relying on external guidance or iterative refinement.

Seventeen candidates were examined across the three claimed contributions. For the ProtAlign framework, the single retrieved candidate was judged able to refute the claim, suggesting that some prior work already addresses multi-objective preference alignment for protein design. For the semi-online Direct Preference Optimization strategy, ten candidates were examined and none clearly refutes it, indicating potential novelty in the specific training procedure and flexible margin mechanism. For the MoMPNN model, six candidates were examined without clear refutation, though the limited search scope means substantial related work may exist beyond the top semantic matches. These statistics suggest the framework's novelty lies more in its training methodology than in the general concept of multi-objective protein design.

Based on the limited literature search covering seventeen candidates, the work appears to occupy a moderately explored niche within preference-based protein design. The taxonomy structure shows this is an active area with established sibling methods, but the specific combination of semi-online DPO and flexible margins may offer incremental advances. The analysis does not exhaustively cover prior work on reinforcement learning for proteins or the broader multi-objective optimization literature, leaving open how this approach compares to methods outside the semantic search scope.

Taxonomy

Research Landscape Overview

Core task: Multi-objective protein sequence design balancing designability and developability properties.

The field has organized itself around several complementary strategies for navigating the tension between generating structurally sound proteins and ensuring they meet practical therapeutic or functional criteria. Preference-Based Optimization Methods focus on aligning generative models with multiple objectives through techniques like reinforcement learning and preference learning, enabling direct trade-off management during sequence generation. Guided Generation and Sampling Methods steer diffusion models or other generative processes toward desired property profiles without full retraining, while Antibody-Specific Design Pipelines address the unique constraints of therapeutic antibody development. Generative Models for De Novo Protein Design explore foundational architectures for creating novel sequences, and Developability Prediction and Assessment provide the predictive tools needed to evaluate manufacturability and stability. Foundations and Principles of Protein Design and Specialized Design Applications round out the taxonomy by covering theoretical underpinnings and domain-specific challenges such as enzyme design or binder engineering.

A particularly active area involves reconciling structural designability with developability constraints, where works like Designability Preference Optimization[2] and Multi-objective Antibody Design[1] demonstrate how preference alignment can simultaneously optimize binding affinity, stability, and manufacturability. Property-Driven Inverse Folding[0] sits within this preference-based optimization cluster, emphasizing the integration of multiple property objectives directly into the inverse folding process rather than treating them as post-hoc filters. This contrasts with approaches like Multi-objective Binder Design[26], which may rely more heavily on guided sampling or iterative refinement. Nearby methods such as Guided Discrete Diffusion[3] illustrate alternative strategies that guide generative processes without explicit preference models, highlighting an ongoing tension between end-to-end optimization and modular design pipelines. The central challenge remains how to efficiently explore the high-dimensional space of sequences that satisfy both biophysical plausibility and practical therapeutic requirements.

Claimed Contributions

ProtAlign multi-objective preference alignment framework

Can Refute

1 retrieved paper

The authors propose ProtAlign, a framework that aligns pretrained inverse folding models with both designability and multiple developability properties (such as solubility, thermostability, and expression) without requiring target-dependent hyperparameter tuning or domain expertise.

Semi-online Direct Preference Optimization with flexible preference margin

10 retrieved papers

The authors develop a novel semi-online DPO algorithm that uses an adaptive preference margin to balance competing developability objectives while maintaining sequence-structure fidelity during optimization.

MoMPNN model for property-driven protein design

6 retrieved papers

The authors present MoMPNN, a model created by applying ProtAlign to ProteinMPNN, which improves developability properties while maintaining designability across various protein design tasks including crystal structures, de novo backbones, and binder design.
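
The contribution summaries above refer to a DPO objective with a preference margin but do not state its form. As a rough illustration only, the sketch below shows a generic margin-augmented DPO loss for a single preference pair, in the style of offset-based DPO variants. It is not taken from the paper: the function name, the `beta` and `margin` parameters, and the example log-likelihood values are all assumptions, and this report does not describe how ProtAlign actually sets its margin or builds its pairs.

```python
import math


def dpo_margin_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1, margin=0.0):
    """Generic DPO loss for one (preferred, dispreferred) pair with an additive margin.

    logp_w / logp_l: sequence log-likelihoods under the model being fine-tuned.
    ref_logp_w / ref_logp_l: the same quantities under the frozen reference model.
    margin: hypothetical per-pair offset; a larger value demands a larger
            implicit-reward gap before the pair counts as satisfied.
    """
    # Implicit reward gap between the preferred (w) and dispreferred (l) sequence,
    # shifted down by the margin.
    z = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l)) - margin
    # -log(sigmoid(z)) == softplus(-z), written in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Illustrative numbers only (not from the paper): the same pair scored with
# and without a margin. The margin makes the same pair incur a higher loss.
no_margin = dpo_margin_loss(-10.0, -12.0, -11.0, -11.0, margin=0.0)
with_margin = dpo_margin_loss(-10.0, -12.0, -11.0, -11.0, margin=0.5)
print(no_margin, with_margin)
```

In a multi-objective setting, one plausible use of such a margin is to scale it with how strongly the property predictors agree on a pair, so that pairs with conflicting objectives push the model less; whether ProtAlign does this is not stated here.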

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[1] Multi-objective antibody design with constrained preference optimization

M Ren, ZK He, H Zhang (2025)

[26] Preference optimization of protein language models as a multi-objective binder design paradigm

Pouria A. Mistani, Venkatesh Mysore (2024)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

ProtAlign multi-objective preference alignment framework

[26] Preference optimization of protein language models as a multi-objective binder design paradigm

Can Refute

Contribution

Semi-online Direct Preference Optimization with flexible preference margin

[36] Direct Preference Optimization with an Offset

Cannot Refute

[37] β-DPO: Direct Preference Optimization with Dynamic β

Cannot Refute

[38] Token-level Direct Preference Optimization

Cannot Refute

[39] Robust Preference Optimization via Dynamic Target Margins

Cannot Refute

[40] Gradient Imbalance in Direct Preference Optimization

Cannot Refute

[41] Adaptive Margin RLHF via Preference over Preferences

Cannot Refute

[42] SPPD: Self-training with process preference learning using dynamic value margin

Cannot Refute

[43] Length-controlled margin-based preference optimization without reference model

Cannot Refute

[44] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization

Cannot Refute

[45] BalancedDPO: Adaptive multi-metric alignment

Cannot Refute

Contribution

MoMPNN model for property-driven protein design

[7] Sparks of function by de novo protein design

Cannot Refute

[24] Relaxed Sequence Sampling for Diverse Protein Design

Cannot Refute

[25] ALLM-Ab: Active Learning-Driven Antibody Optimization Using Fine-Tuned Protein Language Models

Cannot Refute

[46] Computational methods to engineer antibodies for vaccines and therapeutics

Cannot Refute

[47] Rational Design of Artificial Protein Platform for the Efficacy of Genetically Fused Functional Peptides

Cannot Refute

[48] Enhancing Functional Protein Design Using Heuristic Optimization and Deep Learning for Anti-Inflammatory and Gene Therapy Applications

Cannot Refute
