A Genetic Algorithm for Navigating Synthesizable Molecular Spaces

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: synthesizability, molecular design, genetic algorithms
Abstract:

Inspired by the effectiveness of genetic algorithms and the importance of synthesizability in molecular design, we present SynGA, a simple genetic algorithm that operates directly over synthesis routes. Our method features custom crossover and mutation operators that explicitly constrain it to synthesizable molecular space. By modifying the fitness function, we demonstrate the effectiveness of SynGA on a variety of design tasks, including synthesizable analog search and sample-efficient property optimization, for both 2D and 3D objectives. Furthermore, by coupling SynGA with a machine learning-based filter that focuses the building block set, we boost SynGA to state-of-the-art performance. For property optimization, this manifests as a model-based variant, SynGBO, which employs SynGA and block filtering in the inner loop of Bayesian optimization. Since SynGA is lightweight and enforces synthesizability by construction, our hope is that SynGA can serve not only as a strong standalone baseline but also as a versatile module that can be incorporated into larger synthesis-aware workflows in the future.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces SynGA, a genetic algorithm that evolves molecules by directly manipulating synthesis routes rather than molecular graphs. It resides in the 'Direct Synthesis Route Manipulation' leaf of the taxonomy, which contains only two papers total (including this one). This places the work in a notably sparse research direction within the broader field of synthesis-aware molecular design, suggesting that direct synthesis-route-based genetic algorithms remain relatively underexplored compared to post-hoc filtering or graph-based approaches.

The taxonomy reveals that neighboring leaves—'Reaction-Regulated Graph-Based Methods' and 'Retrosynthetic Planning with Evolutionary Algorithms'—contain methods that incorporate reaction rules or retrosynthetic planning but do not operate directly on synthesis trees as primary representations. The broader 'Synthesizability-Filtered Optimization' branch (containing graph-based and hybrid deep learning methods) is more densely populated, indicating that most prior work applies synthesizability constraints after molecular generation rather than embedding them in the evolutionary operators themselves. SynGA's approach diverges from these directions by making synthesis routes the fundamental unit of evolution.

Among the thirty candidates examined, none clearly refutes any of the three core contributions. For the main SynGA framework, ten candidates were reviewed with zero refutable overlaps; the same holds for the machine learning-based building block filtering and for the SynGBO Bayesian optimization variant. Within this limited search scope, no prior work appears to combine direct synthesis route manipulation with custom crossover/mutation operators and ML-guided block filtering in the manner proposed. Each contribution therefore appears relatively novel given the examined literature, though the search was not exhaustive.

Based on the top-thirty semantic matches and the sparse taxonomy leaf, the work appears to occupy a distinct niche. The analysis covers synthesis-aware genetic algorithm methods but does not extend to all retrosynthetic planning or deep generative modeling approaches. The limited search scope means that related work outside the top-K candidates or in adjacent fields may exist but was not captured here.

Taxonomy

Core-task Taxonomy Papers: 30
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: synthesis-aware molecular design using genetic algorithms over synthesis routes. The field organizes around four main branches that reflect different strategies for balancing molecular property optimization with practical synthesizability. Synthesis-Constrained Genetic Algorithm Frameworks directly manipulate synthesis routes or reaction pathways during evolution, ensuring that candidate molecules remain grounded in feasible chemistry from the outset. Synthesizability-Filtered Optimization applies post-hoc filters or scoring functions to rank generated molecules by synthetic accessibility, often using retrosynthetic analysis or rule-based checks. Property-Driven Design with Genetic Algorithms prioritizes target properties such as binding affinity, solubility, or catalytic activity, sometimes incorporating synthesizability as one objective among many in a multiobjective framework. Finally, Synthesis-Aware Computational Support Tools provide enabling technologies (retrosynthetic planners, reaction databases, and visualization platforms) that underpin the other branches by making synthesis knowledge computationally accessible.

Recent work highlights contrasting philosophies: some methods evolve molecules by directly editing synthesis trees or applying reaction rules (e.g., Procedural Synthesis[12], Reaction Rules Evolution[17]), while others generate candidates freely and then filter or rank them using retrosynthetic feasibility scores (e.g., RetroEA[18], Synthetic Accessibility Visualization[26]). Genetic Algorithm Synthesizable[0] sits squarely within the Direct Synthesis Route Manipulation cluster, sharing conceptual ground with Procedural Synthesis[12] by operating on synthesis pathways rather than finished molecular graphs. This approach contrasts with post-hoc filtering strategies and with purely property-driven frameworks like Schrock Catalyst Optimization[3] or Monte Carlo Multiobjective[5], which may treat synthesizability as a secondary constraint.

The central tension across these branches remains how tightly to couple synthesis planning with evolutionary search: tight integration promises chemically grounded designs but may limit exploration, whereas looser coupling offers broader chemical space coverage at the risk of proposing impractical targets.

Claimed Contributions

SynGA: A genetic algorithm operating directly over synthesis routes

The authors introduce SynGA, a genetic algorithm that evolves synthesis routes directly using custom crossover and mutation operators. This design explicitly constrains the search to synthesizable molecular space by construction, without requiring post-hoc synthesis validation.

10 retrieved papers
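
To make the idea of evolving synthesis routes concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the route encoding (nested tuples), the placeholder block and reaction names, the size cap MAX_LEAVES, and the toy fitness are all assumptions made for illustration, and reaction compatibility is not checked. The point it shows is that when crossover and mutation only ever rearrange subtrees and swap leaves drawn from the allowed block set, every individual stays a well-formed route over that set.

```python
import random

# Hypothetical placeholders, not real reagents or reaction templates.
BLOCKS = ["A", "B", "C", "D", "E"]
REACTIONS = ["amide", "suzuki"]
MAX_LEAVES = 4  # assumed cap so routes cannot grow without bound

def random_route(rng, depth=2):
    """A route is a block name (leaf) or a (reaction, left, right) tuple."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(BLOCKS)
    return (rng.choice(REACTIONS),
            random_route(rng, depth - 1),
            random_route(rng, depth - 1))

def leaves(route):
    """List the building blocks used by a route, left to right."""
    if isinstance(route, str):
        return [route]
    return leaves(route[1]) + leaves(route[2])

def subtrees(route, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child indices (1 or 2)."""
    yield path, route
    if not isinstance(route, str):
        yield from subtrees(route[1], path + (1,))
        yield from subtrees(route[2], path + (2,))

def replace_at(route, path, new):
    """Return a copy of route with the subtree at path replaced by new."""
    if not path:
        return new
    node = list(route)
    node[path[0]] = replace_at(route[path[0]], path[1:], new)
    return tuple(node)

def crossover(rng, a, b):
    """Graft a random subtree of b onto a random position in a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    child = replace_at(a, path, donor)
    return child if len(leaves(child)) <= MAX_LEAVES else a

def mutate(rng, route):
    """Swap one leaf for a block drawn from the allowed set."""
    leaf_paths = [p for p, s in subtrees(route) if isinstance(s, str)]
    return replace_at(route, rng.choice(leaf_paths), rng.choice(BLOCKS))

def evolve(fitness, generations=20, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [random_route(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half, refill with mutated crossover children.
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = [mutate(rng, crossover(rng, rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

def toy_fitness(route):
    # Placeholder objective: reward routes that use many "A" blocks.
    return leaves(route).count("A")

best = evolve(toy_fitness)
```

Changing the task here only means passing a different fitness function to evolve, which mirrors the abstract's remark that the design tasks differ by fitness function. A real system would additionally need to check that each reaction can actually be applied to its reactants.
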
Machine learning-based building block filtering to enhance SynGA

The authors propose an ML-guided building block filtering approach that dynamically restricts the building block set depending on the optimization task. For analog search, this uses a lightweight classifier; for property optimization, it employs a neural additive model over building blocks.

10 retrieved papers
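
As a rough illustration of task-dependent block filtering, the Python sketch below scores each building block from past evaluations and keeps only the top-ranked ones. It deliberately swaps in a much cruder stand-in for the neural additive model mentioned above: each block's score is simply the mean property value of the evaluated molecules that used it. The function names, the data layout, and the rule for unseen blocks (they receive the average score) are assumptions made for this example.

```python
from collections import defaultdict

def block_scores(observations):
    """observations: list of (blocks_used, property_value) pairs."""
    totals, counts = defaultdict(float), defaultdict(int)
    for blocks, value in observations:
        for b in set(blocks):  # count each block once per molecule
            totals[b] += value
            counts[b] += 1
    return {b: totals[b] / counts[b] for b in totals}

def filter_blocks(all_blocks, observations, keep):
    """Return the `keep` highest-scoring blocks; unseen blocks get the mean score."""
    scores = block_scores(observations)
    default = sum(scores.values()) / len(scores) if scores else 0.0
    ranked = sorted(all_blocks, key=lambda b: scores.get(b, default), reverse=True)
    return ranked[:keep]

# Made-up evaluations: blocks used by a molecule and its measured property.
observations = [(["A", "B"], 1.0), (["A", "C"], 3.0), (["D"], 0.0)]
focused = filter_blocks(["A", "B", "C", "D", "E"], observations, keep=3)
print(focused)  # ['C', 'A', 'E']
```

The search would then draw its leaves from the focused set instead of the full library. Giving unseen blocks the average score keeps them eligible, which is one simple way to avoid discarding blocks that have never been tried.
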
SynGBO: A Bayesian optimization algorithm using SynGA in its inner loop

The authors develop SynGBO, a model-based Bayesian optimization algorithm that integrates SynGA with block filtering as a subroutine for optimizing acquisition functions. This approach achieves state-of-the-art performance on property optimization benchmarks.

10 retrieved papers
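
The inner-loop structure described above can be sketched as follows in Python. This is a toy under stated assumptions, not SynGBO itself: a "route" is reduced to a sorted tuple of three placeholder block names, the surrogate is a Bayesian linear regression over block counts, the acquisition is an upper confidence bound, and a tiny mutation-only evolutionary loop stands in for SynGA. Block filtering is omitted, and ALPHA, BETA, KAPPA, and the hidden oracle weights are made up.

```python
import math
import random

BLOCKS = ["A", "B", "C", "D", "E"]   # hypothetical building-block names
ROUTE_LEN = 3                         # toy "route": a sorted tuple of three blocks
ALPHA, BETA, KAPPA = 1.0, 4.0, 2.0    # prior precision, noise precision, UCB weight

def featurize(route):
    """Count how often each block appears in the route."""
    return [float(route.count(b)) for b in BLOCKS]

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination (A is small and positive definite)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit(data):
    """Bayesian linear regression: posterior precision matrix and mean of the weights."""
    d = len(BLOCKS)
    A = [[ALPHA * (i == j) for j in range(d)] for i in range(d)]
    rhs = [0.0] * d
    for route, y in data.items():
        x = featurize(route)
        for i in range(d):
            rhs[i] += BETA * x[i] * y
            for j in range(d):
                A[i][j] += BETA * x[i] * x[j]
    return A, solve(A, rhs)

def ucb(route, A, mean):
    """Upper confidence bound: predictive mean plus KAPPA predictive std devs."""
    x = featurize(route)
    mu = sum(xi * mi for xi, mi in zip(x, mean))
    var = 1.0 / BETA + sum(xi * zi for xi, zi in zip(x, solve(A, x)))
    return mu + KAPPA * math.sqrt(var)

def random_route(rng):
    return tuple(sorted(rng.choice(BLOCKS) for _ in range(ROUTE_LEN)))

def inner_search(rng, acquisition, seen, generations=10, pop_size=20):
    """Mutation-only evolutionary loop that maximizes the acquisition."""
    pop = [random_route(rng) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=acquisition, reverse=True)[: pop_size // 2]
        children = []
        for p in parents:
            child = list(p)
            child[rng.randrange(ROUTE_LEN)] = rng.choice(BLOCKS)
            children.append(tuple(sorted(child)))
        pop = parents + children
    unseen = [r for r in pop if r not in seen]  # prefer routes not yet evaluated
    return max(unseen or pop, key=acquisition)

def bayes_opt(oracle, budget=15, n_init=5, seed=0):
    rng = random.Random(seed)
    data = {}
    while len(data) < n_init:             # distinct random starting routes
        r = random_route(rng)
        data[r] = oracle(r)
    for _ in range(budget - n_init):      # one oracle call per outer iteration
        A, mean = fit(data)
        pick = inner_search(rng, lambda r: ucb(r, A, mean), data)
        data[pick] = oracle(pick)
    return max(data, key=data.get), data

HIDDEN = {"A": 1.0, "B": 0.2, "C": -0.5, "D": 0.0, "E": 0.6}  # made-up weights

def toy_oracle(route):
    return sum(HIDDEN[b] for b in route)

best, history = bayes_opt(toy_oracle)
```

The expensive oracle is called only once per outer iteration, while the cheap acquisition function is queried many times inside the inner search. That split is what makes a model-based loop of this kind sample-efficient with respect to oracle calls.
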
