FineNib: A Query Synthesizer For Static Analysis of Security Vulnerabilities

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 5.6

Static Analysis, Program Synthesis, Vulnerability Detection

Abstract:

CodeQL is a powerful static analysis engine that represents programs’ abstract syntax trees as databases that can be queried to detect security vulnerabilities. While CodeQL supports expressive interprocedural dataflow queries, the coverage and precision of its existing security queries remain limited, and writing new queries is challenging even for experts. Automatically synthesizing CodeQL queries from known vulnerabilities (CVEs) can provide fine-grained vulnerability signatures, enabling both improved detection and systematic variant analysis. We present FineNib, an agentic framework for synthesizing CodeQL queries from known CVE descriptions. FineNib leverages the Model Context Protocol (MCP) for agentic tool use, integrates abstract syntax tree guidance, and incorporates CodeQL’s language infrastructure and documentation into the synthesis loop. A key challenge is that state-of-the-art large language models hallucinate deprecated CodeQL syntax due to limited training data and outdated knowledge. FineNib addresses this by combining contextual engineering, iterative query feedback, and structured tool interaction to reliably generate executable, up-to-date queries.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

FineNib contributes an agentic framework that synthesizes CodeQL queries from CVE descriptions, addressing the challenge of hallucinated deprecated syntax in LLM-generated queries. The paper resides in the 'CodeQL Query Synthesis from CVE Descriptions' leaf, which contains only two papers including FineNib itself. This represents a highly sparse research direction within the broader taxonomy of fifty papers across thirty-six topics, suggesting that automated CVE-to-CodeQL synthesis is an emerging rather than crowded area. The sibling paper QLCoder shares the same goal of LLM-driven query generation from vulnerability descriptions.

The taxonomy tree reveals that FineNib's parent branch 'LLM-Assisted Query and Specification Synthesis' sits alongside 'Neuro-Symbolic Integration for Vulnerability Detection' and 'Domain-Specific Static Analysis Techniques'. While neuro-symbolic approaches like LLM-guided interprocedural analysis combine neural models with traditional static analysis engines for whole-repository reasoning, FineNib focuses specifically on query synthesis rather than detection execution. The neighboring 'LLM-Based Static Analyzer Synthesis' leaf addresses generating complete analyzers from bug patterns, whereas FineNib targets individual queries for an existing analyzer (CodeQL). This positioning clarifies that FineNib operates at the query specification layer rather than the analysis engine layer.

Among twenty candidates examined, the core agentic framework contribution shows one refutable candidate from ten examined, indicating some prior work in CVE-to-query synthesis exists within this limited search scope. The custom MCP interface contribution was not evaluated against candidates, leaving its novelty unassessed in this analysis. The evaluation contribution examined ten candidates with none appearing to refute it, suggesting that systematic validation on real-world CVEs and repositories may be less explored. These statistics reflect a focused semantic search rather than exhaustive coverage, and the single refutable match likely corresponds to the sibling paper QLCoder in the same taxonomy leaf.

Based on the limited search of twenty candidates, FineNib appears to address a relatively sparse research direction with modest overlap with prior work. The taxonomy structure confirms that automated CVE-to-CodeQL synthesis remains an emerging area compared to more established branches like domain-specific smart contract analysis or foundational dataflow techniques. However, this assessment is constrained by the top-K semantic search methodology and does not capture potentially relevant work outside the examined candidate set or in adjacent communities such as program synthesis or automated software engineering.

Taxonomy

Research Landscape Overview

Core task: Automated synthesis of static analysis queries for security vulnerability detection. The field encompasses a broad spectrum of approaches, from foundational static analysis methods and frameworks that establish core techniques for program inspection, to modern LLM-assisted query and specification synthesis that leverages large language models to generate detection rules from natural-language descriptions. Neuro-symbolic integration combines neural learning with symbolic reasoning to improve detection accuracy, while domain-specific static analysis techniques target particular languages, platforms, or vulnerability classes. Hybrid and multi-method analysis approaches merge static and dynamic methods or combine diverse tools to overcome individual limitations, and specialized analysis contexts address unique environments such as smart contracts, IoT firmware, or mobile applications. Evaluation, benchmarking, and empirical studies provide the datasets and metrics necessary to assess these varied techniques, and vulnerability repair and mitigation extend detection into automated patching. Advanced analysis techniques and infrastructure explore novel representations, program slicing, and scalable architectures that support large-scale code inspection.

Recent work has intensified around LLM-driven synthesis, where models generate CodeQL or similar queries directly from CVE descriptions or natural-language specifications. FineNib[0] and QLCoder[21] exemplify this trend, both focusing on translating vulnerability reports into executable static analysis queries with minimal manual intervention. These efforts contrast with earlier foundational methods like Pixy[10] and Static Analysis Survey[7], which relied on hand-crafted rules and expert knowledge. A key tension lies in balancing automation and precision: while LLM-assisted approaches promise rapid query generation and adaptability to emerging threats, they must contend with the semantic subtleties and false-positive rates that have long challenged static analysis.

FineNib[0] sits squarely within the LLM-assisted query synthesis branch, sharing its emphasis on automating rule creation with QLCoder[21], yet it also draws on insights from benchmarking studies like Benchmarking Web Security[6] to validate generated queries against real-world vulnerability datasets. This positioning highlights an ongoing shift toward integrating machine learning with traditional static analysis rigor, aiming to scale detection capabilities without sacrificing the interpretability and correctness that domain experts require.

Claimed Contributions

FineNib agentic framework for CVE-to-query synthesis

Can Refute

10 retrieved papers

FineNib is an agentic framework that translates CVE descriptions into executable CodeQL queries. It embeds an LLM in a synthesis loop with execution feedback and constrains reasoning using a custom MCP interface that provides structured interaction with a Language Server Protocol and a RAG database.
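To make the idea of a synthesis loop with execution feedback concrete, here is a minimal Python sketch. It is not FineNib's implementation: `RunResult`, `synthesize`, `fake_propose`, and `fake_run` are hypothetical names, the LLM and the analysis engine are replaced by stand-in callables, and the stopping rule (alert on the vulnerable version, stay silent on the patched one) is borrowed from the evaluation criterion described in this report.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RunResult:
    """Outcome of running one candidate query (hypothetical shape)."""
    compiled: bool
    alerts_on_vulnerable: int
    alerts_on_patched: int
    message: str = ""


def synthesize(
    cve_description: str,
    propose: Callable[[str, List[str]], str],  # stand-in for the LLM call
    run_query: Callable[[str], RunResult],     # stand-in for the analysis engine
    max_iters: int = 5,
) -> Optional[str]:
    """Propose a query, run it, and feed the outcome back until it succeeds."""
    feedback: List[str] = []
    for _ in range(max_iters):
        query = propose(cve_description, feedback)
        result = run_query(query)
        if not result.compiled:
            feedback.append(f"compile error: {result.message}")
        elif result.alerts_on_vulnerable == 0:
            feedback.append("no alert on the vulnerable version")
        elif result.alerts_on_patched > 0:
            feedback.append("still alerts on the patched version")
        else:
            return query  # fires on the vulnerable code only
    return None  # iteration budget exhausted


def fake_propose(cve_description: str, feedback: List[str]) -> str:
    """Toy policy: each round of feedback unlocks the next canned draft."""
    drafts = ["draft-deprecated-syntax", "draft-too-broad", "draft-ok"]
    return drafts[min(len(feedback), len(drafts) - 1)]


def fake_run(query: str) -> RunResult:
    """Toy engine: canned outcomes for the three canned drafts."""
    outcomes = {
        "draft-deprecated-syntax": RunResult(False, 0, 0, "unknown class"),
        "draft-too-broad": RunResult(True, 3, 2),
        "draft-ok": RunResult(True, 1, 0),
    }
    return outcomes[query]
```

With these stubs, `synthesize("example CVE text", fake_propose, fake_run)` goes through a compile failure and an over-broad draft before accepting the third one; a real system would replace both stubs with actual tool calls.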

Custom MCP interface for structured reasoning

0 retrieved papers

The framework introduces a novel integration that combines execution-guided synthesis with semantic retrieval and structured reasoning. The MCP interface provides syntax guidance via LSP and semantic guidance via a vector database of CodeQL queries and documentation.
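The semantic-guidance half of this description can be illustrated with a toy retrieval step. The sketch below is an assumption-laden stand-in, not the paper's vector database: it uses bag-of-words counts instead of learned embeddings, and the corpus strings and the names `embed`, `cosine`, and `top_k` are invented for illustration.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (not a learned vector)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, corpus: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k corpus entries most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical documentation snippets standing in for indexed queries and docs.
CORPUS = [
    "taint tracking configuration for sql injection sinks",
    "path traversal sanitizer for file names",
    "regular expression denial of service",
]
```

Calling `top_k("sql injection taint sinks", CORPUS, k=1)` ranks the first snippet highest, which is the kind of context an agent could then place in the model's prompt before drafting a query.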

Evaluation on real-world CVEs and repositories

10 retrieved papers

The authors evaluate FineNib on CWE-Bench-Java comprising 176 CVEs across 111 Java projects, covering 42 vulnerability types. The evaluation demonstrates how FineNib identifies sources, sinks, sanitizers, and taint propagation steps to synthesize queries that detect vulnerabilities in vulnerable versions while remaining silent on patched versions.
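The roles of sources, sinks, sanitizers, and taint steps, and the vulnerable-versus-patched criterion, can be sketched on a toy dataflow graph. This is a simplified illustration, not CodeQL semantics or the benchmark harness: the graphs, node names, and the functions `reaches_sink` and `query_succeeds` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # node -> nodes that taint flows to in one step


def reaches_sink(graph: Graph, sources: Set[str], sinks: Set[str],
                 sanitizers: Set[str]) -> bool:
    """Breadth-first search from the sources; flow stops at sanitizer nodes."""
    seen: Set[str] = set()
    queue = deque(s for s in sources if s not in sanitizers)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node in sinks:
            return True
        for nxt in graph.get(node, []):
            if nxt not in sanitizers:
                queue.append(nxt)
    return False


def query_succeeds(vulnerable: Graph, patched: Graph, sources: Set[str],
                   sinks: Set[str], sanitizers: Set[str]) -> bool:
    """Alert on the vulnerable version, stay silent on the patched one."""
    return (reaches_sink(vulnerable, sources, sinks, sanitizers)
            and not reaches_sink(patched, sources, sinks, sanitizers))


# Toy path-traversal example: the patch routes the input through `normalize`.
VULNERABLE: Graph = {"getParameter": ["path"], "path": ["openFile"]}
PATCHED: Graph = {
    "getParameter": ["normalize"],
    "normalize": ["path"],
    "path": ["openFile"],
}
SOURCES = {"getParameter"}
SINKS = {"openFile"}
SANITIZERS = {"normalize"}
```

If the sanitizer set is left empty, the same specification also fires on the patched graph, which shows why identifying sanitizers matters for staying silent after a fix.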

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[21] QLCoder: A Query Synthesizer For Static Analysis of Security Vulnerabilities

Claire Wang, Ziyang Li, Saikat Dutta, Mayur Naik (2025)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: FineNib agentic framework for CVE-to-query synthesis

[21] QLCoder: A Query Synthesizer For Static Analysis of Security Vulnerabilities
Can Refute

[22] Automating the early detection of security design flaws
Cannot Refute

[51] Automatic inference of search patterns for taint-style vulnerabilities
Cannot Refute

[52] Automatic detection of access control vulnerabilities via API specification processing
Cannot Refute

[53] Towards automatic generation of vulnerability-based signatures
Cannot Refute

[54] A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes
Cannot Refute

[55] Automating ROS2 Security Policies Extraction through Static Analysis
Cannot Refute

[56] Supporting automated vulnerability analysis using formalized vulnerability signatures
Cannot Refute

[57] Vulnerability detection in ethereum smart contracts via machine learning: A qualitative analysis
Cannot Refute

[58] Towards automated security design flaw detection
Cannot Refute

Contribution: Custom MCP interface for structured reasoning

No candidate papers were retrieved for this contribution.

Contribution: Evaluation on real-world CVEs and repositories

[59] How far have we gone in vulnerability detection using large language models
Cannot Refute

[60] Empirical validation of automated vulnerability curation and characterization
Cannot Refute

[61] Advanced smart contract vulnerability detection via llm-powered multi-agent systems
Cannot Refute

[62] SIExVulTS: Sensitive Information Exposure Vulnerability Detection System using Transformer Models and Static Analysis
Cannot Refute

[63] Out of Distribution, Out of Luck: How Well Can LLMs Trained on Vulnerability Datasets Detect Top 25 CWE Weaknesses?
Cannot Refute

[64] CVE-Bench: Benchmarking LLM-based Software Engineering Agent's Ability to Repair Real-World CVE Vulnerabilities
Cannot Refute

[65] VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching
Cannot Refute

[66] CVE2CWE: Automated Mapping of Software Vulnerabilities to Weaknesses Based on CVE Descriptions
Cannot Refute

[67] SecVulEval: Benchmarking LLMs for Real-World C/C++ Vulnerability Detection
Cannot Refute

[68] Automatic Software Vulnerabilty Detection Using Code Metrics and Feature Extraction
Cannot Refute
