FineNib: A Query Synthesizer For Static Analysis of Security Vulnerabilities

ICLR 2026 Conference Submission
Anonymous Authors
Static Analysis, Program Synthesis, Vulnerability Detection
Abstract:

CodeQL is a powerful static analysis engine that represents programs’ abstract syntax trees as databases that can be queried to detect security vulnerabilities. While CodeQL supports expressive interprocedural dataflow queries, the coverage and precision of its existing security queries remain limited, and writing new queries is challenging even for experts. Automatically synthesizing CodeQL queries from known vulnerabilities (CVEs) can provide fine-grained vulnerability signatures, enabling both improved detection and systematic variant analysis. We present FineNib, an agentic framework for synthesizing CodeQL queries from known CVE descriptions. FineNib leverages the Model Context Protocol (MCP) for agentic tool use, integrates abstract syntax tree guidance, and incorporates CodeQL’s language infrastructure and documentation into the synthesis loop. A key challenge is that state-of-the-art large language models hallucinate deprecated CodeQL syntax due to limited training data and outdated knowledge. FineNib addresses this by combining contextual engineering, iterative query feedback, and structured tool interaction to reliably generate executable, up-to-date queries.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

FineNib contributes an agentic framework that synthesizes CodeQL queries from CVE descriptions, addressing the challenge of hallucinated deprecated syntax in LLM-generated queries. The paper resides in the 'CodeQL Query Synthesis from CVE Descriptions' leaf, which contains only two papers including FineNib itself. This represents a highly sparse research direction within the broader taxonomy of fifty papers across thirty-six topics, suggesting that automated CVE-to-CodeQL synthesis is an emerging rather than crowded area. The sibling paper QLCoder shares the same goal of LLM-driven query generation from vulnerability descriptions.

The taxonomy tree reveals that FineNib's parent branch 'LLM-Assisted Query and Specification Synthesis' sits alongside 'Neuro-Symbolic Integration for Vulnerability Detection' and 'Domain-Specific Static Analysis Techniques'. While neuro-symbolic approaches like LLM-guided interprocedural analysis combine neural models with traditional static analysis engines for whole-repository reasoning, FineNib focuses specifically on query synthesis rather than detection execution. The neighboring 'LLM-Based Static Analyzer Synthesis' leaf addresses generating complete analyzers from bug patterns, whereas FineNib targets individual queries for an existing analyzer (CodeQL). This positioning clarifies that FineNib operates at the query specification layer rather than the analysis engine layer.

Twenty candidates were examined in total. For the core agentic framework contribution, one of the ten candidates examined appears able to refute it, indicating that some prior work on CVE-to-query synthesis exists within this limited search scope. The custom MCP interface contribution was not evaluated against any candidates, so its novelty is unassessed in this analysis. For the evaluation contribution, ten candidates were examined and none appears to refute it, suggesting that systematic validation on real-world CVEs and repositories may be less explored. These statistics reflect a focused semantic search rather than exhaustive coverage, and the single refuting match likely corresponds to the sibling paper QLCoder in the same taxonomy leaf.

Based on the limited search of twenty candidates, FineNib appears to address a relatively sparse research direction with modest prior work overlap. The taxonomy structure confirms that automated CVE-to-CodeQL synthesis remains an emerging area compared to more established branches like domain-specific smart contract analysis or foundational dataflow techniques. However, this assessment is constrained by the top-K semantic search methodology and does not capture potential relevant work outside the examined candidate set or in adjacent communities such as program synthesis or automated software engineering.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 20
Refutable Paper: 1

Research Landscape Overview

Core task: Automated synthesis of static analysis queries for security vulnerability detection. The field encompasses a broad spectrum of approaches, from foundational static analysis methods and frameworks that establish core techniques for program inspection, to modern LLM-assisted query and specification synthesis that leverages large language models to generate detection rules from natural-language descriptions. Neuro-symbolic integration combines neural learning with symbolic reasoning to improve detection accuracy, while domain-specific static analysis techniques target particular languages, platforms, or vulnerability classes. Hybrid and multi-method analysis approaches merge static and dynamic methods or combine diverse tools to overcome individual limitations, and specialized analysis contexts address unique environments such as smart contracts, IoT firmware, or mobile applications. Evaluation, benchmarking, and empirical studies provide the datasets and metrics necessary to assess these varied techniques, and vulnerability repair and mitigation extend detection into automated patching. Advanced analysis techniques and infrastructure explore novel representations, program slicing, and scalable architectures that support large-scale code inspection.

Recent work has intensified around LLM-driven synthesis, where models generate CodeQL or similar queries directly from CVE descriptions or natural-language specifications. FineNib[0] and QLCoder[21] exemplify this trend, both focusing on translating vulnerability reports into executable static analysis queries with minimal manual intervention. These efforts contrast with earlier foundational work such as Pixy[10] and Static Analysis Survey[7], which centers on hand-crafted rules and expert knowledge.
A key tension lies in balancing automation and precision: while LLM-assisted approaches promise rapid query generation and adaptability to emerging threats, they must contend with the semantic subtleties and false-positive rates that have long challenged static analysis. FineNib[0] sits squarely within the LLM-assisted query synthesis branch, sharing its emphasis on automating rule creation with QLCoder[21], yet it also draws on insights from benchmarking studies like Benchmarking Web Security[6] to validate generated queries against real-world vulnerability datasets. This positioning highlights an ongoing shift toward integrating machine learning with traditional static analysis rigor, aiming to scale detection capabilities without sacrificing the interpretability and correctness that domain experts require.

Claimed Contributions

FineNib agentic framework for CVE-to-query synthesis

FineNib is an agentic framework that translates CVE descriptions into executable CodeQL queries. It embeds an LLM in a synthesis loop with execution feedback and constrains reasoning using a custom MCP interface that provides structured interaction with a Language Server Protocol and a RAG database.
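
The synthesis loop with execution feedback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `propose`, `execute`, and `accept` are hypothetical callables standing in for the LLM call, the query run, and the acceptance check, and the stub functions exist only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Feedback:
    compiled: bool   # whether the candidate query compiled
    message: str     # compiler error or a summary of the query results


def synthesize(
    cve_description: str,
    propose: Callable[[str, Optional[str], Optional[Feedback]], str],
    execute: Callable[[str], Feedback],
    accept: Callable[[Feedback], bool],
    max_iters: int = 5,
) -> Optional[str]:
    """Refine a candidate query until it compiles and is accepted.

    All three callables are hypothetical placeholders for this sketch.
    """
    query: Optional[str] = None
    feedback: Optional[Feedback] = None
    for _ in range(max_iters):
        # The proposer sees the previous attempt and its feedback.
        query = propose(cve_description, query, feedback)
        feedback = execute(query)
        if feedback.compiled and accept(feedback):
            return query
    return None  # iteration budget exhausted without an accepted query


# Stubs (assumptions): the first draft fails to compile, the second succeeds.
def fake_propose(cve: str, prev: Optional[str], fb: Optional[Feedback]) -> str:
    return "draft-1" if prev is None else "draft-2"


def fake_execute(query: str) -> Feedback:
    if query == "draft-1":
        return Feedback(False, "unresolved predicate")
    return Feedback(True, "1 result")


result = synthesize(
    "placeholder CVE description", fake_propose, fake_execute, lambda fb: True
)
```

In this sketch the feedback object is the only channel between execution and the next proposal, which is one simple way to keep the loop stateless apart from the latest attempt.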

Retrieved papers: 10 (Can Refute)

Custom MCP interface for structured reasoning

The framework introduces a novel integration that combines execution-guided synthesis with semantic retrieval and structured reasoning. The MCP interface provides syntax guidance via LSP and semantic guidance via a vector database of CodeQL queries and documentation.
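
To illustrate the semantic-retrieval half of this description, the sketch below ranks a small corpus by similarity to a request. It is a toy under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory dictionary stands in for a vector database, and the corpus entries are invented for illustration rather than taken from CodeQL's documentation.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    # Toy embedding (assumption): token counts instead of a learned vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(request: str, corpus: Dict[str, str], k: int = 2) -> List[str]:
    """Return the names of the k corpus entries most similar to the request."""
    q = embed(request)
    ranked = sorted(
        corpus, key=lambda name: cosine(q, embed(corpus[name])), reverse=True
    )
    return ranked[:k]


# Hypothetical corpus entries, invented for this example.
corpus = {
    "taint-tracking-doc": "taint tracking dataflow source sink configuration",
    "path-injection-query": "path injection file access taint sink",
    "style-guide": "naming conventions and formatting for query files",
}

top = retrieve("taint sink for file path injection", corpus, k=2)
```

The retrieved entries would then be placed in the model's context before it proposes a query; how the actual system selects and formats them is not specified in this report.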

Retrieved papers: 0

Evaluation on real-world CVEs and repositories

The authors evaluate FineNib on CWE-Bench-Java comprising 176 CVEs across 111 Java projects, covering 42 vulnerability types. The evaluation demonstrates how FineNib identifies sources, sinks, sanitizers, and taint propagation steps to synthesize queries that detect vulnerabilities in vulnerable versions while remaining silent on patched versions.
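
The success criterion described above (alerts on the vulnerable version, silence on the patched version) can be expressed as a small check. This is a sketch under assumptions: the `QueryOutcome` record, its field names, and the sample values are hypothetical, and the paper's actual criterion may be stricter, for example by requiring that an alert match the fixed code location.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueryOutcome:
    cve_id: str                 # placeholder identifier, not a real CVE
    alerts_on_vulnerable: int   # alerts raised on the vulnerable version
    alerts_on_patched: int      # alerts raised on the patched version


def is_successful(outcome: QueryOutcome) -> bool:
    # Simplified criterion (assumption): detect before the fix, silent after.
    return outcome.alerts_on_vulnerable > 0 and outcome.alerts_on_patched == 0


def success_rate(outcomes: List[QueryOutcome]) -> float:
    if not outcomes:
        return 0.0
    return sum(is_successful(o) for o in outcomes) / len(outcomes)


# Invented sample data, for illustration only.
outcomes = [
    QueryOutcome("cve-a", alerts_on_vulnerable=2, alerts_on_patched=0),  # success
    QueryOutcome("cve-b", alerts_on_vulnerable=3, alerts_on_patched=3),  # still fires after the patch
    QueryOutcome("cve-c", alerts_on_vulnerable=0, alerts_on_patched=0),  # misses the vulnerability
]

rate = success_rate(outcomes)
```

Counting a query that still fires on the patched version as a failure is what separates a fine-grained vulnerability signature from a generic pattern match.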

Retrieved papers: 10
