Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis
Claimed Contributions
The authors introduce Task Subspace Logit Attribution (TSLA), a theoretically grounded method for identifying the attention heads responsible for Task Recognition and Task Learning in in-context learning. TSLA measures each head's contribution geometrically, relative to the subspace of the task-label unembeddings, and is intended to address limitations of prior attribution approaches.
Through steering experiments and geometric analysis, the authors demonstrate that Task Recognition (TR) heads align hidden states to the task-label subspace, which supports recognition of the label space, while Task Learning (TL) heads rotate hidden states within that subspace toward the correct label, which enables accurate prediction.
The authors establish a unified framework that connects attention-head-level mechanistic analysis with the holistic Task Recognition and Task Learning decomposition, and use it to reconcile prior findings on induction heads and task vectors.
Contribution Analysis
Detailed comparisons for each claimed contribution
Task Subspace Logit Attribution (TSLA) framework for identifying TR and TL heads
The authors introduce TSLA, a theoretically grounded method for identifying the attention heads responsible for Task Recognition and Task Learning in in-context learning. It measures each head's contribution geometrically, relative to the subspace of the task-label unembeddings, and is intended to address limitations of prior attribution approaches.
Geometric characterization of TR and TL head mechanisms
Through steering experiments and geometric analysis, the authors demonstrate that TR heads align hidden states to the task-label subspace for label-space recognition, while TL heads perform within-subspace rotations toward correct labels to enable accurate prediction.
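The two geometric roles described above can be made concrete with a toy calculation. The sketch below is illustrative only and is not the authors' TSLA implementation: the function names, the 4-dimensional vectors, and the two scores (fraction of norm inside the label subspace, and cosine to the correct label within that subspace) are assumptions chosen to mirror the summary. It also assumes the label unembeddings are linearly independent.

```python
import numpy as np


def label_subspace_basis(label_unembeddings: np.ndarray) -> np.ndarray:
    """Return an orthonormal basis (d x k) for the span of k label unembeddings (k x d).

    Assumes the k rows are linearly independent.
    """
    q, _ = np.linalg.qr(label_unembeddings.T)  # reduced QR: q has shape (d, k)
    return q


def subspace_alignment(vec: np.ndarray, basis: np.ndarray) -> float:
    """Fraction of vec's norm that lies inside the subspace (1.0 means fully inside)."""
    projected = basis @ (basis.T @ vec)
    return float(np.linalg.norm(projected) / np.linalg.norm(vec))


def within_subspace_cosine(vec: np.ndarray, basis: np.ndarray,
                           correct_unembedding: np.ndarray) -> float:
    """Cosine between vec and the correct label direction, both expressed in subspace coordinates."""
    v = basis.T @ vec
    u = basis.T @ correct_unembedding
    return float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))


# Hypothetical toy setup: hidden size 4, two labels; label 0 is the correct one.
labels = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
basis = label_subspace_basis(labels)

raw = np.array([1.0, 3.0, 0.0, 4.0])         # mostly outside the label subspace
recognized = np.array([1.0, 3.0, 0.0, 1.0])  # TR-like change: more norm inside the subspace
learned = np.array([3.0, 1.0, 0.0, 1.0])     # TL-like change: rotated toward label 0 within it

for name, h in [("raw", raw), ("recognized", recognized), ("learned", learned)]:
    print(name,
          round(subspace_alignment(h, basis), 3),
          round(within_subspace_cosine(h, basis, labels[0]), 3))
```

In this toy example, the TR-like step raises the alignment score without changing the within-subspace cosine, and the TL-like step raises the cosine without changing the alignment, which is the separation of roles the summary attributes to TR and TL heads.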
Unified framework reconciling component-level and holistic ICL perspectives
The authors establish a unified framework that connects attention-head-level mechanistic analysis with the holistic Task Recognition and Task Learning decomposition, and use it to reconcile prior findings on induction heads and task vectors.