SR-Scientist: Scientific Equation Discovery With Agentic AI

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: Symbolic Regression, Equation Discovery, Large Language Models, Agentic AI
Abstract:

Recently, Large Language Models (LLMs) have been applied to scientific equation discovery, leveraging their embedded scientific knowledge for hypothesis generation. However, current methods typically confine LLMs to the role of an equation proposer within search algorithms like genetic programming. In this paper, we present SR-Scientist, a framework that elevates the LLM from a simple equation proposer to an autonomous AI scientist that writes code to analyze data, implements the equation as code, submits it for evaluation, and optimizes the equation based on experimental feedback. Specifically, we wrap the code interpreter into a set of tools for data analysis and equation evaluation. The agent is instructed to optimize the equation by utilizing these tools over a long horizon with minimal human-defined pipelines. Empirical results show that SR-Scientist outperforms baseline methods by an absolute margin of 6% to 35% on datasets covering four science disciplines. Additionally, we demonstrate our method's robustness to noise, the generalization of the discovered equations to out-of-domain data, and their symbolic accuracy. Furthermore, we develop an end-to-end reinforcement learning framework to enhance the agent's capabilities.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces SR-Scientist, a framework that positions large language models as autonomous agents capable of writing code, analyzing data, and iteratively refining equations based on experimental feedback. This work resides in the LLM-Enhanced Symbolic Regression leaf, which contains five papers total, indicating a moderately populated but still emerging research direction. Unlike sibling papers that primarily use LLMs as equation proposers within traditional search algorithms, SR-Scientist elevates the LLM to a more autonomous role, integrating code interpretation and tool-driven evaluation into the discovery loop.

The broader Symbolic Regression Methods branch encompasses classical genetic programming approaches, reinforcement learning-based methods, and LLM-enhanced techniques. SR-Scientist bridges these areas by combining LLM reasoning with an agent-based workflow, distinguishing it from purely evolutionary methods in the Classical Symbolic Regression leaf and from policy-gradient approaches in the Reinforcement Learning-Based Symbolic Regression leaf. The taxonomy also reveals adjacent directions such as Differential Equation Discovery and Hybrid Neural-Symbolic Methods, which focus on temporal dynamics and neural network integration respectively, rather than the autonomous code-writing paradigm proposed here.

Among thirty candidates examined across three contributions, none were identified as clearly refuting the proposed approach. The SR-Scientist framework contribution examined ten candidates with zero refutable overlaps, as did the reinforcement learning pipeline and tool-driven evaluation system contributions. This suggests that within the limited search scope, the combination of autonomous agent behavior, code-based equation implementation, and iterative optimization through tool use appears distinct from prior LLM-enhanced symbolic regression methods, which typically confine LLMs to hypothesis generation rather than end-to-end scientific workflow orchestration.

The analysis reflects a focused literature search of thirty semantically related papers, not an exhaustive survey of all symbolic regression or LLM-based discovery work. While the statistics indicate no direct prior work overlap within this sample, the relatively small size of the LLM-Enhanced Symbolic Regression leaf and the rapid evolution of LLM-agent research suggest that closely related efforts may exist outside the examined candidate set. The framework's novelty appears strongest in its integration of autonomous coding and tool use, though the reinforcement learning component overlaps conceptually with existing RL-based symbolic regression methods.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: scientific equation discovery from observational data. The field is organized around several complementary branches that reflect different problem settings and methodological emphases. Symbolic Regression Methods focus on discovering closed-form algebraic expressions directly from data, often using genetic programming or neural-guided search strategies. Differential Equation Discovery targets systems governed by temporal or spatial dynamics, aiming to recover governing PDEs or ODEs from trajectory observations. Domain-Specific Equation Discovery tailors techniques to particular scientific disciplines such as physics, biology, or climate science, where prior knowledge can guide the search. Causal Discovery and Structure Learning emphasizes identifying not just predictive equations but the underlying causal relationships among variables. Hybrid and Multi-Task Approaches combine symbolic and neural components or tackle multiple discovery objectives simultaneously, while Foundations and Evaluation addresses benchmarking, theoretical guarantees, and reproducibility challenges across the entire landscape.

Recent work has seen growing interest in leveraging large language models to enhance symbolic regression, a direction that blends domain knowledge encoded in pre-trained models with classical search heuristics. SR-Scientist[0] exemplifies this trend by integrating LLM-based reasoning into the equation discovery pipeline, positioning itself within the LLM-Enhanced Symbolic Regression cluster alongside LLM-SR[3] and DrSR[2]. While LLM-SR[3] primarily uses language models to propose candidate expressions, SR-Scientist[0] emphasizes a more interactive, scientist-in-the-loop workflow that refines hypotheses iteratively. Nearby efforts such as MLLM-based Discovery of Intrinsic[9] explore multimodal inputs to capture richer contextual cues, and Mimicking the Physicists Eye[30] incorporates visual or geometric reasoning into the discovery process.

These LLM-enhanced methods contrast with purely evolutionary or sparsity-driven approaches, raising open questions about the trade-offs between interpretability, computational cost, and the risk of overfitting to patterns seen during pre-training versus discovering genuine physical laws.

Claimed Contributions

SR-SCIENTIST framework for autonomous equation discovery

The authors introduce SR-SCIENTIST, a framework where an LLM agent autonomously discovers scientific equations through long-horizon optimization. The agent uses code interpreters as tools to analyze data and evaluate equations, operating with minimal human-defined pipelines and maintaining an experience buffer to overcome context length limitations.

10 retrieved papers
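The long-horizon loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the class and function names are not from the paper): the agent's candidate equations are scored by a tool, and a small bounded buffer retains only the best attempts so that accumulated experience fits in a limited context window.

```python
# Hypothetical sketch of SR-Scientist's outer loop: candidate equations
# (standing in for LLM proposals) are scored, and a bounded "experience
# buffer" keeps only the top-k results to respect context length limits.

def evaluate(equation, data):
    """Mean squared error of a candidate equation over observed (x, y) pairs."""
    return sum((equation(x) - y) ** 2 for x, y in data) / len(data)

class ExperienceBuffer:
    """Keeps the top-k (score, description) pairs seen so far."""
    def __init__(self, k=3):
        self.k, self.items = k, []

    def add(self, score, description):
        self.items.append((score, description))
        self.items.sort(key=lambda t: t[0])   # lower MSE is better
        self.items = self.items[: self.k]

    def render(self):
        """Text summary the agent would see in its next prompt."""
        return "\n".join(f"MSE={s:.4g}: {d}" for s, d in self.items)

# Toy run: data from y = 2x + 1; three hand-written hypotheses stand in
# for equations the agent would propose and implement as code.
data = [(x, 2 * x + 1) for x in range(10)]
buffer = ExperienceBuffer(k=2)
for desc, eq in [("y = x", lambda x: x),
                 ("y = 2x", lambda x: 2 * x),
                 ("y = 2x + 1", lambda x: 2 * x + 1)]:
    buffer.add(evaluate(eq, data), desc)
print(buffer.render())
```

In the actual framework the proposals come from the LLM and the scoring happens inside a code-interpreter tool; the buffer here only illustrates why pruning to the best few experiences keeps multi-turn optimization tractable.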
End-to-end reinforcement learning pipeline for agent capability enhancement

The authors develop a complete RL pipeline, including training data construction and reward design, that enables the LLM agent to evolve and improve its equation discovery capabilities through self-experience, using the GRPO algorithm for optimization.

10 retrieved papers
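The core of GRPO (Group Relative Policy Optimization) is a critic-free advantage estimate: several rollouts are sampled for the same task, each receives a scalar reward, and each rollout's advantage is its reward normalized against the group's mean and standard deviation. The sketch below shows only that normalization step; the reward values and group size are illustrative, not the paper's exact design.

```python
# Group-relative advantage computation as used by GRPO: rollouts that
# beat their group's average get positive advantage, the rest negative.

def grpo_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward within its group."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# One group of 4 rollouts for the same equation-discovery task, where the
# reward might reflect the accuracy of the best equation found per rollout.
rewards = [0.1, 0.4, 0.4, 0.9]
advs = grpo_advantages(rewards)
print([round(a, 3) for a in advs])  # → [-1.219, -0.174, -0.174, 1.567]
```

Because the baseline is the group mean rather than a learned value function, this fits naturally with multi-turn agent rollouts, where training a separate critic over long tool-use trajectories would be costly.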
Tool-driven data analysis and equation evaluation system

The authors design a tool system that wraps code interpreters into two primary tools: a data analyzer for exploring observed data and an equation evaluator for testing hypotheses. This enables the agent to conduct long-horizon optimization through multi-turn interactions without rigid predefined workflows.

10 retrieved papers
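A minimal sketch of the two-tool design, assuming a simple dictionary-based interface (the tool names and signatures here are assumptions, not the paper's API): a data analyzer returns summary statistics the agent can reason over, and an equation evaluator executes agent-submitted code defining an equation and reports its fit.

```python
# Illustrative two-tool wrapper around a code interpreter: one tool for
# inspecting observations, one for executing and scoring equation code.

def data_analyzer(data):
    """Return summary statistics of the observed (x, y) pairs."""
    xs, ys = zip(*data)
    return {"n": len(data),
            "x_range": (min(xs), max(xs)),
            "y_range": (min(ys), max(ys))}

def equation_evaluator(equation_code, data):
    """Execute code defining `equation(x)` and return its MSE on the data."""
    namespace = {}
    exec(equation_code, namespace)   # real systems would sandbox this call
    eq = namespace["equation"]
    mse = sum((eq(x) - y) ** 2 for x, y in data) / len(data)
    return {"mse": mse}

# The agent first inspects the data, then submits an equation as code.
data = [(x, x ** 2 + 3.0) for x in range(1, 6)]
print(data_analyzer(data))
print(equation_evaluator("def equation(x):\n    return x**2 + 3.0", data))
```

Exposing only these two entry points, rather than a fixed multi-stage pipeline, is what lets the agent decide for itself when to analyze, when to hypothesize, and when to re-evaluate across many turns.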

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

SR-SCIENTIST framework for autonomous equation discovery

The authors introduce SR-SCIENTIST, a framework where an LLM agent autonomously discovers scientific equations through long-horizon optimization. The agent uses code interpreters as tools to analyze data and evaluate equations, operating with minimal human-defined pipelines and maintaining an experience buffer to overcome context length limitations.

Contribution

End-to-end reinforcement learning pipeline for agent capability enhancement

The authors develop a complete RL pipeline, including training data construction and reward design, that enables the LLM agent to evolve and improve its equation discovery capabilities through self-experience, using the GRPO algorithm for optimization.

Contribution

Tool-driven data analysis and equation evaluation system

The authors design a tool system that wraps code interpreters into two primary tools: a data analyzer for exploring observed data and an equation evaluator for testing hypotheses. This enables the agent to conduct long-horizon optimization through multi-turn interactions without rigid predefined workflows.