Path Channels and Plan Extension Kernels: a Mechanistic Description of Planning in a Sokoban RNN
Overview
Overall Novelty Assessment
The paper reverse-engineers a Sokoban-playing RNN to uncover how it internally represents plans as 'path channels' in its hidden states. It sits in the 'Plan Representation Discovery in Game-Playing RNNs' leaf of the taxonomy, which contains only three papers: this work and two siblings. This is a notably sparse research direction within the broader mechanistic interpretability branch, which suggests the paper addresses a relatively underexplored niche: interpretable plan representations in puzzle-solving agents, rather than general RNN debugging or application-oriented planning.
The taxonomy reveals that mechanistic interpretability of planning sits alongside much larger branches devoted to RNN-based planning applications in robotics and logistics. The closest neighboring leaf, 'Neural Circuit Mechanisms for Sequential Planning,' examines circuit-level mechanisms but excludes high-level planning models without mechanistic analysis. The paper's focus on convolutional kernels encoding transition models and bidirectional plan construction distinguishes it from application-driven work in robotic motion planning or cognitive neuroscience-inspired models, which do not prioritize reverse-engineering learned algorithms in trained networks.
Thirty candidates were examined across the three contributions, and none was found to clearly refute the paper's claims. For the first contribution, the discovery of path channels as a direct plan representation, ten candidates were examined and none refuted it. Likewise, ten candidates each were examined for the mechanistic explanation via plan extension kernels and for the bidirectional planning algorithm with backtracking, without identifying overlapping prior work. Within this limited search scope, the specific combination of path channel discovery, kernel-based transition models, and backtracking mechanisms appears relatively novel, though the small candidate pool limits definitive conclusions.
Based on the limited literature search of thirty semantically similar papers, the work appears to occupy a sparsely populated research direction with minimal direct overlap among examined candidates. However, the analysis does not cover exhaustive citation networks or domain-specific venues, leaving open the possibility of relevant prior work outside the top-K semantic matches. The taxonomy structure confirms this is an emerging area with few directly comparable studies.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors identify that specific hidden state channels in the DRC(3,3) network directly encode the agent's and boxes' future movement directions without requiring linear probes. High activation in a path channel at a location indicates the propensity to move in that channel's assigned direction.
The authors reverse-engineer the convolutional kernels that operate on path channels, showing these kernels implement forward and backward plan extension by propagating activations along movement directions and enabling backtracking through negative value propagation.
The authors describe how the network implements bidirectional search by initializing path segments at boxes and targets, extending them via specialized kernels, and pruning unpromising paths by propagating negative activations backward along path segments as a form of backtracking.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] Planning in a recurrent neural network that plays Sokoban
[40] Interpreting learned search: finding a transition model and value function in an RNN that plays Sokoban
Contribution Analysis
Detailed comparisons for each claimed contribution
Discovery of path channels as direct plan representation
The authors identify that specific hidden state channels in the DRC(3,3) network directly encode the agent's and boxes' future movement directions without requiring linear probes. High activation in a path channel at a location indicates the propensity to move in that channel's assigned direction.
[60] PredRNN: A recurrent neural network for spatiotemporal predictive learning
[61] Multi-condition latent diffusion network for scene-aware neural human motion prediction
[62] Temporal recurrent networks for online action detection
[63] Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
[64] Learning semantic latent directions for accurate and controllable human motion prediction
[65] A prospective approach for human-to-human interaction recognition from Wi-Fi channel data using attention bidirectional gated recurrent neural network with GUI …
[66] DESIRE: Distant future prediction in dynamic scenes with interacting agents
[67] On human motion prediction using recurrent neural networks
[68] Text2Action: Generative adversarial synthesis from language to action
[69] Adversarial generative learning and timed path optimization for real-time visual image prediction to guide robot arm movements
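To make the path-channel claim for this contribution concrete, the following is a minimal sketch of what "reading a plan directly off the hidden state" could look like. It assumes one activation grid per movement direction; the channel layout, the `decode_plan` helper, and the threshold value are hypothetical illustrations, not the channel indices or readout used in the paper.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]

# Hypothetical layout: one grid of activations per movement direction.
DIRECTIONS = ("up", "down", "left", "right")


def decode_plan(path_channels: Dict[str, Grid],
                threshold: float = 0.5) -> Dict[Tuple[int, int], str]:
    """At each square, pick the direction whose channel is most active.

    A square is included in the plan only if that activation exceeds
    the (assumed) threshold. No probe is trained: the channels are read as-is.
    """
    height = len(path_channels["up"])
    width = len(path_channels["up"][0])
    plan: Dict[Tuple[int, int], str] = {}
    for r in range(height):
        for c in range(width):
            best = max(DIRECTIONS, key=lambda d: path_channels[d][r][c])
            if path_channels[best][r][c] > threshold:
                plan[(r, c)] = best
    return plan


# Toy 3x3 example: move right along the middle row, then up.
channels = {d: [[0.0] * 3 for _ in range(3)] for d in DIRECTIONS}
channels["right"][1][0] = 0.9
channels["right"][1][1] = 0.8
channels["up"][1][2] = 0.7

plan = decode_plan(channels)
```

Here `plan` maps each square on the toy path to a direction, and squares with no strongly active channel are left out.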
Mechanistic explanation via plan extension kernels
The authors reverse-engineer the convolutional kernels that operate on path channels, showing these kernels implement forward and backward plan extension by propagating activations along movement directions and enabling backtracking through negative value propagation.
[40] Interpreting learned search: finding a transition model and value function in an RNN that plays Sokoban
[51] Residual Mask in Cascaded Convolutional Transformer for Spectral Reconstruction
[52] Trajectory convolution for action recognition
[53] PDSketch: Integrated planning domain programming and learning
[54] Online planner selection with graph neural networks and adaptive scheduling
[55] Continuity-Preserving Convolutional Autoencoders for Learning Continuous Latent Dynamical Models from Images
[56] Spatiotemporal trajectories in resting-state fMRI revealed by convolutional variational autoencoder
[57] FIR filters for online trajectory planning with time- and frequency-domain specifications
[58] Universal value iteration networks: When spatially-invariant is not universal
[59] Multi-Objective Optimization of Route Planning Based on Distributed Breadth Convolutional Neural Network and Markov Decision Processes
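The plan extension idea in this contribution can be illustrated with a small hand-written kernel. This is a sketch under assumptions: the 3x3 weights, the clipping to [0, 1], and the wall mask are all hypothetical stand-ins chosen for readability, not the learned weights or nonlinearity of the network. It shows only forward extension of a "move right" channel; backward extension would mirror the off-centre weight.

```python
from typing import List

Grid = List[List[float]]

# Hypothetical kernel for a "move right" channel: each square keeps its own
# activation (centre weight) and receives activation from its left neighbour,
# so a path segment grows one square to the right per application.
FORWARD_RIGHT_KERNEL: Grid = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]


def apply_kernel(channel: Grid, kernel: Grid, walls: List[List[int]]) -> Grid:
    """One zero-padded cross-correlation step, clipped to [0, 1].

    Wall squares are forced to 0, which stops the segment from growing
    through them (an assumed simplification of the transition model).
    """
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if walls[r][c]:
                continue
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        total += kernel[dr + 1][dc + 1] * channel[rr][cc]
            out[r][c] = max(0.0, min(1.0, total))
    return out


# A one-row corridor with a wall at column 3; the segment starts at column 0.
right = [[1.0, 0.0, 0.0, 0.0, 0.0]]
walls = [[0, 0, 0, 1, 0]]
for _ in range(4):  # four recurrent ticks
    right = apply_kernel(right, FORWARD_RIGHT_KERNEL, walls)
```

After four ticks the segment has extended up to the wall and no further, which is the sense in which such a kernel acts as a local transition model.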
Bidirectional planning algorithm with backtracking
The authors describe how the network implements bidirectional search by initializing path segments at boxes and targets, extending them via specialized kernels, and pruning unpromising paths by propagating negative activations backward along path segments as a form of backtracking.
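As a rough analogy for the procedure described above, the sketch below grows path segments from both a box and a target on a small graph, stitches them when they meet, and marks dead-end segments with negative activation that is pushed backward along the segment. It is an explicit graph-search stand-in, not the network's computation: the graph, the node names, and the helpers `prune`, `stitch`, and `bidirectional_plan` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical map as an adjacency list. "d1" -> "d2" is a dead-end branch.
GRAPH: Dict[str, List[str]] = {
    "box": ["d1", "a"], "d1": ["box", "d2"], "d2": ["d1"],
    "a": ["box", "b"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "target"], "target": ["d"],
}


def prune(act: Dict[str, float], parent: Dict[str, Optional[str]], node: str) -> None:
    """Backtracking: push -1 backward along a dead-end segment until a
    square with another live continuation (or a seed) is reached."""
    while parent[node] is not None:
        act[node] = -1.0
        node = parent[node]
        if any(parent[ch] == node and act[ch] > 0 for ch in act):
            break


def stitch(parent: Dict[str, Optional[str]], side: Dict[str, str],
           x: str, y: str) -> List[str]:
    """Join the forward segment (box side) and backward segment (target side)."""
    fwd, bwd = (x, y) if side[x] == "fwd" else (y, x)
    head: List[str] = []
    while fwd is not None:
        head.append(fwd)
        fwd = parent[fwd]
    tail: List[str] = []
    while bwd is not None:
        tail.append(bwd)
        bwd = parent[bwd]
    return head[::-1] + tail


def bidirectional_plan(graph: Dict[str, List[str]], start: str, goal: str,
                       max_ticks: int = 10) -> Tuple[Optional[List[str]], Dict[str, float]]:
    act = {start: 1.0, goal: 1.0}            # +1 = on a candidate path, -1 = pruned
    side = {start: "fwd", goal: "bwd"}       # which seed each segment grew from
    parent: Dict[str, Optional[str]] = {start: None, goal: None}
    frontier = [start, goal]
    for _ in range(max_ticks):
        next_frontier: List[str] = []
        for node in frontier:
            grew = False
            for nb in graph[node]:
                if nb not in act:            # extend the segment by one square
                    act[nb], side[nb], parent[nb] = 1.0, side[node], node
                    next_frontier.append(nb)
                    grew = True
                elif side[nb] != side[node] and act[nb] > 0:
                    return stitch(parent, side, node, nb), act  # segments met
            if not grew:
                prune(act, parent, node)     # dead end: backtrack
        frontier = next_frontier
    return None, act


plan, activations = bidirectional_plan(GRAPH, "box", "target")
```

In this toy run the dead-end branch ends up with negative activation on both of its squares, while the stitched plan runs from the box to the target through the live branch. The dead-end test here (no unvisited neighbour) is deliberately crude and would mis-prune on maps with cycles.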