Formalising Human-in-the-Loop: Computational Reductions, Failure Modes, and Legal-Moral Responsibility

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Human-in-the-loop, Automated decision making systems, Human oversight in sociotechnical systems, Oracle machines, AI safety, Trustworthy AI
Abstract:

We use the notion of oracle machines and reductions from computability theory to formalise different human-in-the-loop (HITL) setups for AI systems, distinguishing between trivial human monitoring (i.e., total functions), single endpoint human action (i.e., many-one reductions), and highly involved human-AI interaction (i.e., Turing reductions). We then show that the legal status and safety of these setups vary greatly. We present a taxonomy that categorises HITL failure modes, highlighting the practical limitations of HITL setups. We then identify omissions in UK and EU legal frameworks, which focus on HITL setups that may not always achieve the desired ethical, legal, and sociotechnical outcomes. We suggest areas where the law should recognise the effectiveness of different HITL setups and assign responsibility in these contexts, avoiding human 'scapegoating'. Our work shows an unavoidable trade-off between attribution of legal responsibility and technical explainability. Overall, we show how HITL setups involve many technical design decisions and can be prone to failures outside the humans' control. Our formalisation and taxonomy open up a new analytic perspective on the challenges in creating HITL setups, helping AI developers and lawmakers design HITL setups that better achieve their desired outcomes.
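The three setups map onto standard notions from computability theory. As a minimal sketch, assuming the paper uses the textbook definitions (the summary above suggests it does, but the exact formulation may differ):

```latex
% Sketch of the three setups in standard computability-theoretic notation.
% These are textbook definitions; the paper's exact formalisation may differ.

% Trivial monitoring: the system computes a total function; the human
% observes, but the output never depends on a human answer.
f \colon \Sigma^* \to \Sigma^* \quad \text{with } f(x) \text{ defined for every } x

% Single endpoint action: a many-one reduction. The machine computes f once;
% the human oracle's verdict on f(x) is the final decision.
A \le_m B \iff \exists\, \text{computable } f \ \forall x \ \bigl[\, x \in A \iff f(x) \in B \,\bigr]

% Highly involved interaction: a Turing reduction. An oracle machine M may
% query the human oracle B adaptively, many times, before deciding A.
A \le_T B \iff \exists\, \text{oracle machine } M \ \text{such that } M^B \ \text{decides } A
```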

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes a paper's claimed tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs), and the system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper formalizes human-in-the-loop (HITL) setups using oracle machines and computational reductions from computability theory, distinguishing trivial monitoring, single-endpoint action, and highly interactive collaboration. It resides in the 'Computational Reduction Models for Human-AI Interaction' leaf, which contains only two papers total. This leaf sits within the broader 'Theoretical Foundations and Formal Frameworks' branch, indicating a relatively sparse research direction focused on rigorous formal characterizations rather than empirical or application-driven work.

The taxonomy reveals neighboring leaves addressing 'Interaction Protocols and Decision Frameworks' (tractable protocols and agreement mechanisms) and 'Safety and Reliability Frameworks' (mode confusion and formal verification). These adjacent areas share the theoretical branch but diverge in focus: the sibling leaves emphasize decision-theoretic models and fault detection, whereas the paper's leaf concentrates on reduction-based abstractions. The taxonomy's scope notes clarify that applied implementations belong elsewhere, reinforcing that this work occupies a foundational niche distinct from domain-specific applications scattered across the 'Application Domains' branch.

Among 29 candidates examined, the three contributions—formalizing HITL via reductions (9 candidates), taxonomizing failure modes (10 candidates), and analyzing legal frameworks (10 candidates)—show no clear refutations. The limited search scope means these statistics reflect top-K semantic matches and citation expansion, not exhaustive coverage. The formalization contribution appears particularly novel given the sparse leaf population, while the failure taxonomy and legal analysis may overlap with broader human-AI interaction literature not captured in this focused search. The absence of refutable pairs suggests either genuine novelty or gaps in the candidate pool.
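For readers unfamiliar with the retrieval step, the sketch below illustrates what "top-K semantic matches and citation expansion" can look like. It is a hypothetical reconstruction: the function, the field names (`vec`, `refs`, `id`), and the scoring are this report's assumptions, not WisPaper's actual pipeline.

```python
import numpy as np

def top_k_with_citation_expansion(claim_vec, corpus, k=10):
    """Hypothetical candidate retrieval: rank papers by cosine similarity to a
    claimed contribution, then expand the pool with references of the top hits.
    Field names are illustrative, not WisPaper's API."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Top-K semantic matches against precomputed paper embeddings.
    scored = sorted(corpus, key=lambda p: cosine(claim_vec, p["vec"]), reverse=True)
    top = scored[:k]

    # Citation expansion: pull in works cited by the top hits. This is why
    # coverage is approximate -- papers outside both pools are never seen.
    by_id = {p["id"]: p for p in corpus}
    expanded = {p["id"]: p for p in top}
    for p in top:
        for ref_id in p.get("refs", []):
            if ref_id in by_id:
                expanded.setdefault(ref_id, by_id[ref_id])
    return list(expanded.values())
```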

Based on the limited search of 29 candidates, the work appears to occupy a sparsely populated formal niche, with its reduction-based approach distinguishing it from neighboring protocol-oriented or safety-focused frameworks. The analysis cannot confirm whether larger-scale searches or domain-specific legal literature would reveal closer prior work, particularly for the legal responsibility and failure mode contributions.

Taxonomy

Core-task Taxonomy Papers: 25
Claimed Contributions: 3
Contribution Candidate Papers Compared: 29
Refutable Papers: 0

Research Landscape Overview

Core task: formalising human-in-the-loop setups using computational reductions.

The field spans a diverse set of concerns, from foundational theory to practical deployment. At the highest level, the taxonomy organizes work into six main branches:

- Theoretical Foundations and Formal Frameworks, which develop rigorous models and reduction-based abstractions for human-AI interaction;
- Interactive System Design and Optimization, which addresses interface design, user modeling, and adaptive workflows;
- Machine Learning with Human Feedback, covering reinforcement learning from human preferences and related training paradigms;
- Dimensionality Reduction and Reliability, which tackles visualization, interpretability, and robustness;
- Application Domains, spanning robotics, healthcare, energy systems, and beyond;
- Supporting Methodologies, which provide cross-cutting techniques such as protocol design and data fusion.

Representative works illustrate these themes: Humans Out Loop[11] and Formal Frameworks Mode Confusion[12] anchor the theoretical side, RRHF[18] and LLM Interactive Code Generation[3] exemplify machine learning with feedback, and Visual Analytics Dimensionality Reduction[2] highlights interpretability challenges.

Several active lines of work reveal key trade-offs and open questions. One tension lies between formal guarantees, pursued by reduction-based frameworks that treat human input as an oracle or computational resource, and the messy realities of adaptive interfaces and noisy feedback in deployed systems. Another contrast emerges between domain-agnostic methodologies, such as dimensionality reduction techniques like Linear t-SNE[22], and domain-specific applications like Brain Stimulation Optimization[10] or Vehicle Platooning Intervention[14], each of which must reconcile general principles with specialized constraints.

Within this landscape, Formalising Human-in-the-Loop[0] sits squarely in the Theoretical Foundations branch, specifically under Computational Reduction Models for Human-AI Interaction. Its emphasis on rigorous reduction-based abstractions aligns closely with Humans Out Loop[11], which also explores formal characterizations of human involvement, though the two may differ in how they model the boundary between automated and human-driven decision-making. By anchoring human-in-the-loop setups in computational complexity and reduction theory, this work provides a unifying lens that complements more empirical or application-focused studies elsewhere in the taxonomy.

Claimed Contributions

Contribution 1: Formalisation of HITL setups using computational reductions (9 candidate papers retrieved)

The authors introduce a novel computational framework that characterises HITL setups through oracle machines and reduction types from computability theory. This formalisation distinguishes three setup types: trivial monitoring (total functions), endpoint action (many-one reductions), and involved interaction (Turing reductions), unifying disparate HITL concepts under a consistent theoretical lens.
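A minimal illustration of how these three setup types can be rendered in code, with the human modelled as a query-answering oracle. This is the reviewer's sketch of the standard reduction notions, not the paper's implementation; the example computations inside each function are placeholders.

```python
from typing import Callable

Oracle = Callable[[str], bool]  # the human, modelled as a query-answering oracle

def trivial_monitoring(x: str) -> str:
    """Total function: the output is defined on every input and never
    depends on a human answer; the human merely observes."""
    return x.upper()  # placeholder computation

def endpoint_action(x: str, human: Oracle) -> bool:
    """Many-one reduction: the machine transforms the input exactly once,
    and the human's single answer on f(x) *is* the final decision."""
    f_x = x.strip().lower()  # the computable transformation f
    return human(f_x)        # one query; its answer is returned unmodified

def involved_interaction(x: str, human: Oracle) -> bool:
    """Turing reduction: the machine may query the human adaptively and
    post-process the answers before deciding."""
    queries = [x, x[::-1], x + "?"]        # in general, chosen adaptively
    answers = [human(q) for q in queries]  # multiple oracle calls
    return sum(answers) >= 2               # decision computed from the answers

if __name__ == "__main__":
    lazy_human: Oracle = lambda q: len(q) % 2 == 0  # stand-in for a real human
    print(involved_interaction("deploy model?", lazy_human))
```

Note the structural difference the paper's distinction turns on: in `endpoint_action` the human's answer passes through untouched, whereas in `involved_interaction` the final decision is computed by the machine from many answers, which is what blurs responsibility attribution.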
Contribution 2: Taxonomy of HITL failure modes (10 candidate papers retrieved)

The authors develop a taxonomy organised into five main failure categories (machine components, process and workflow, human–machine interface, human component, and exogenous circumstances) that systematically captures how HITL setups can fail in practice. This taxonomy connects failure modes to the different computational reduction types identified in their formalisation.
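The five categories lend themselves to a simple data-structure sketch. The category names come from the summary above; the per-category failure examples are this report's illustrative assumptions, not the paper's own examples.

```python
from enum import Enum

class FailureCategory(Enum):
    """The five top-level failure categories named in the paper's taxonomy."""
    MACHINE_COMPONENTS = "machine components"
    PROCESS_AND_WORKFLOW = "process and workflow"
    HUMAN_MACHINE_INTERFACE = "human-machine interface"
    HUMAN_COMPONENT = "human component"
    EXOGENOUS_CIRCUMSTANCES = "exogenous circumstances"

# Illustrative (assumed) failure per category -- the paper's examples may differ.
EXAMPLES = {
    FailureCategory.MACHINE_COMPONENTS: "model emits a miscalibrated risk score",
    FailureCategory.PROCESS_AND_WORKFLOW: "review step skipped under time pressure",
    FailureCategory.HUMAN_MACHINE_INTERFACE: "alert is shown but easy to miss",
    FailureCategory.HUMAN_COMPONENT: "automation bias: human rubber-stamps output",
    FailureCategory.EXOGENOUS_CIRCUMSTANCES: "network outage severs the oracle channel",
}

for category, example in EXAMPLES.items():
    print(f"{category.value}: {example}")
```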
Contribution 3: Analysis of legal frameworks and responsibility trade-offs (10 candidate papers retrieved)

The authors analyse UK and EU legal frameworks (GDPR and EU AI Act) to identify gaps in how they address HITL requirements, and reveal an inherent trade-off: HITL setups with greater explainability (involved interactions) create responsibility gaps, while setups with clearer responsibility attribution (endpoint actions) are less transparent. They provide suggestions for improving legal frameworks to prevent humans from becoming scapegoats.

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: Formalisation of HITL setups using computational reductions

Contribution 2: Taxonomy of HITL failure modes

Contribution 3: Analysis of legal frameworks and responsibility trade-offs