Jet Expansions: Restructuring LLM Computation for Model Inspection
Overview
Overall Novelty Assessment
The paper introduces Jet Expansions, a mathematical framework for decomposing language model computations into explicit input-to-output paths using generalized Taylor series operators. It resides in the 'Computational Path Formalization and Theory' leaf, which contains only two papers total. This is one of the sparsest research directions in the taxonomy, indicating a relatively underexplored theoretical niche focused on rigorous mathematical formalizations rather than empirical circuit discovery or application-driven interpretability.
The taxonomy reveals substantial activity in neighboring areas: Activation-Based Decomposition (five papers across two leaves) focuses on extracting features from hidden states, while Weight-Based and Circuit-Level Analysis (eight papers) emphasizes causal subgraph identification. Reasoning Path Decomposition (sixteen papers across three leaves) targets multi-step logic chains. Jet Expansions diverges by providing mathematical foundations for these empirical methods rather than proposing new feature extraction or circuit-tracing techniques. Its scope_note explicitly excludes empirical circuit discovery, positioning it as theoretical infrastructure.
Among the twenty-eight candidates examined, the contribution-level analysis shows mixed novelty signals. The core Jet Expansions framework (ten candidates examined, zero refutations) and the function decomposition perspective (ten candidates, zero refutations) appear relatively novel within the limited search scope. However, the claim of grounding existing interpretability tools encountered one partially refuting candidate among the eight examined, suggesting some theoretical overlap with prior formalization efforts. The search scale is modest (top-K semantic matches plus citations), so these findings reflect local rather than exhaustive coverage.
Given the sparse theoretical leaf and limited search scope, the work appears to occupy a distinct formal niche. The framework's mathematical rigor and higher-order expansion machinery differentiate it from variance-based methods like Neural-ANOVA, though the grounding of existing tools shows some precedent. The analysis covers approximately thirty semantically related papers, leaving open whether broader theoretical literature in adjacent fields might reveal additional connections.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors propose a principled mathematical framework that uses jet operators, the functional counterparts of truncated Taylor series, to systematically decompose a language model into explicit input-to-output computational paths plus a complementary remainder (the standard jet definition is recalled after the third contribution below). The resulting decomposition acts as a systematic operator for disentangling LLM computations, enabling scalable model inspection without additional data or training.
The authors introduce a conceptual shift in interpretability methodology by framing it as a problem of function decomposition rather than traditional data-driven approaches. This perspective enables manipulation of functions directly in function space, requiring no probe datasets or sampling, and allows arbitrary portions of computation to be isolated from the monolithic transformer.
The authors establish a rigorous mathematical foundation using jet operators that subsumes and generalizes existing interpretability techniques like Logit Lens and path expansion methods. This framework provides formal justification for these tools and extends them to new instantiations such as extracting n-gram probability tables directly from LLMs without requiring corpus data.
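For readers parsing these claims, it helps to recall the standard definition from multivariate calculus that the term "jet" refers to (a general mathematical fact, not notation taken from the paper): the order-k jet of a smooth function at a reference point is its truncated Taylor expansion, and the function splits exactly into jet plus remainder.

```latex
% Order-k jet of a smooth f : R^d -> R^m at a reference point x_0,
% in multi-index notation (standard calculus, not paper-specific notation):
\[
  J^{k}_{x_0}[f](x) \;=\; \sum_{|\alpha| \le k}
      \frac{D^{\alpha} f(x_0)}{\alpha!}\,(x - x_0)^{\alpha},
  \qquad
  f \;=\; J^{k}_{x_0}[f] \;+\; R^{k}_{x_0}[f],
\]
% where the remainder R^k vanishes to order k+1 as x -> x_0.
```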
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[44] Neural-ANOVA: Model Decomposition for Interpretable Machine Learning
Contribution Analysis
Detailed comparisons for each claimed contribution
Jet Expansions framework for restructuring LLM computations
The authors propose a principled mathematical framework that uses jet operators (functional counterparts of truncated Taylor series) to systematically decompose a language model into explicit input-to-output computational paths plus a complementary remainder. The resulting decomposition acts as a systematic operator for disentangling LLM computations, enabling scalable model inspection without additional data or training.
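To make the path-plus-remainder structure concrete, here is a minimal numerical sketch, not the paper's implementation: a toy residual layer with a linear readout, where the first-order jet of the nonlinear block yields an explicit linear path and a complementary remainder that vanishes quadratically near the expansion point. All names (g, f, U, W, x0) are illustrative stand-ins.

```python
# Minimal sketch (illustrative stand-ins, not the paper's code): split a toy
# residual layer f(x) = U(x + g(x)) into an identity path, a first-order
# jet of the g-path, and a complementary remainder.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)) / 2.0            # toy block weight
U = rng.normal(size=(2, 4)) / 2.0            # toy linear readout

def g(x):
    """Nonlinear block on the residual stream."""
    return np.tanh(W @ x)

def f(x):
    """Residual layer followed by the readout."""
    return U @ (x + g(x))

x0 = np.zeros(4)                             # expansion (reference) point
Jg = np.diag(1 - np.tanh(W @ x0) ** 2) @ W   # Jacobian of g at x0

def jet1_g(x):
    """First-order jet of g at x0: its truncated Taylor expansion."""
    return g(x0) + Jg @ (x - x0)

x = 0.1 * rng.normal(size=4)
direct = U @ x                               # skip-connection path
jet_path = U @ jet1_g(x)                     # explicit first-order g-path
remainder = U @ (g(x) - jet1_g(x))           # everything the jet truncates

assert np.allclose(f(x), direct + jet_path + remainder)
print("remainder norm:", np.linalg.norm(remainder))  # O(|x - x0|^2)
```

Higher-order jets refine the extracted path; the remainder is the complement that the framework keeps explicit rather than discarding.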
[59] Unfolding Videos Dynamics via Taylor Expansion
[60] M-Rule: An Enhanced Deep Taylor Decomposition for Multi-model Interpretability
[61] HOPE: High-order polynomial expansion of black-box neural networks
[62] GTEA: Guided Taylor Expansion Approximation Network for Optical Flow Estimation
[63] Towards explaining anomalies: A deep Taylor decomposition of one-class models
[64] CAT: Interpretable concept-based Taylor additive models
[65] Explaining COVID-19 diagnosis with Taylor decompositions
[66] An integrated model based on feedforward neural network and Taylor expansion for indicator correlation elimination
[67] Explaining nonlinear classification decisions with deep Taylor decomposition
[68] TaylorAECNet: A Taylor Style Neural Network For Full-Band Echo Cancellation
Treating interpretability as function decomposition
The authors introduce a conceptual shift in interpretability methodology by framing it as a problem of function decomposition rather than traditional data-driven approaches. This perspective enables manipulation of functions directly in function space, requiring no probe datasets or sampling, and allows arbitrary portions of computation to be isolated from the monolithic transformer.
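As a toy illustration of working directly in function space (a sketch under strong simplifying assumptions, not the paper's construction): when the residual blocks are linear, a two-layer residual network factors exactly into a sum over four explicit input-to-output paths, and any subset of paths can be isolated as a function in its own right, with no probe dataset and no sampling. Nonlinear blocks are where the jet machinery becomes necessary.

```python
# Sketch with linear blocks only (a simplifying assumption): a two-layer
# residual network factors exactly into 2^2 explicit input-to-output paths,
# each a function that can be isolated and inspected without any data.
from itertools import product

import numpy as np

rng = np.random.default_rng(1)
d = 3
A1 = rng.normal(size=(d, d)) / d             # linear block, layer 1
A2 = rng.normal(size=(d, d)) / d             # linear block, layer 2
I = np.eye(d)

def full_model(x):
    h1 = x + A1 @ x                          # residual layer 1
    return h1 + A2 @ h1                      # residual layer 2

# Each layer contributes either its skip connection (I) or its block (A_i)
# to a path; composing the choices enumerates every path.
paths = {}
for (n1, M1), (n2, M2) in product([("skip", I), ("A1", A1)],
                                  [("skip", I), ("A2", A2)]):
    paths[f"{n1}->{n2}"] = M2 @ M1           # layer 1 first, then layer 2

x = rng.normal(size=d)
assert np.allclose(full_model(x), sum(P @ x for P in paths.values()))
print(sorted(paths))  # ['A1->A2', 'A1->skip', 'skip->A2', 'skip->skip']
```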
[69] Tensorization of neural networks for improved privacy and interpretability
[70] A survey on Kolmogorov-Arnold network
[71] Multilevel wavelet decomposition network for interpretable time series analysis
[72] Kolmogorov-Arnold Networks for Interpretable and Efficient Function Approximation
[73] A comprehensive survey on self-interpretable neural networks
[74] Beyond the Black Box: A Review of Quantitative Metrics for Neural Network Interpretability and Their Practical Implications
[75] Neural additive models: Interpretable machine learning with neural nets
[76] Interpretable basis decomposition for visual explanation
[77] Tensor Product Neural Networks for Functional ANOVA Model
[78] Neural basis models for interpretability
Theoretical grounding of existing interpretability tools
The authors establish a rigorous mathematical foundation using jet operators that subsumes and generalizes existing interpretability techniques like Logit Lens and path expansion methods. This framework provides formal justification for these tools and extends them to new instantiations such as extracting n-gram probability tables directly from LLMs without requiring corpus data.
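A hedged toy rendering of the two instantiations named above, using random stand-in weights rather than a trained model: truncating the computation immediately after the embedding and decoding the result gives a Logit-Lens-style readout, and evaluating that same truncated path on every vocabulary item yields a bigram-style logit table without any corpus data. How the paper's jet operators define such truncations in general is beyond this sketch.

```python
# Toy sketch (random stand-in weights, not a trained model): a Logit-Lens-
# style truncated readout and a corpus-free bigram-style table from the
# same isolated embedding -> unembedding path.
import numpy as np

rng = np.random.default_rng(2)
V, d = 5, 8                                  # toy vocabulary and hidden sizes
E = rng.normal(size=(V, d)) / np.sqrt(d)     # embedding matrix
U = rng.normal(size=(V, d)) / np.sqrt(d)     # unembedding matrix
W = rng.normal(size=(d, d)) / np.sqrt(d)     # one residual block

def block(h):
    return h + np.tanh(W @ h)                # residual layer

# (1) Logit-Lens-style readout: truncate after the embedding and decode,
# skipping the residual block entirely.
def logit_lens(token):
    return U @ E[token]

# (2) Bigram-style table from the isolated direct path, no corpus needed:
# row i holds the truncated-path logits for input token i.
bigram_logits = E @ U.T                      # shape [V, V]

token = 3
assert np.allclose(logit_lens(token), bigram_logits[token])
full_logits = U @ block(E[token])            # untruncated model, for contrast
print(bigram_logits[token], full_logits)
```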