InclusiveVidPose: Bridging the Pose Estimation Gap for Individuals with Limb Deficiencies in Video-Based Motion

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: disabled person, individuals with limb deficiencies, dataset, human pose estimation
Abstract:

Approximately 445.2 million individuals worldwide are living with traumatic amputations, and an estimated 31.64 million children aged 0–14 have congenital limb differences, yet this population remains largely underrepresented in human pose estimation (HPE) research. Accurate HPE could significantly benefit these individuals in applications such as rehabilitation monitoring and health assessment. However, existing HPE datasets and methods assume that humans possess a full complement of upper and lower extremities and fail to model missing or altered limbs; as a result, current models cannot generalize to these unique anatomies or account for absent joints. To bridge this gap, we introduce the InclusiveVidPose dataset, the first large-scale video-based HPE dataset specifically for individuals with limb deficiencies. We collect 313 videos, totaling 327k frames and covering nearly 400 individuals with amputations, congenital limb differences, and prosthetic limbs. We adopt 8 extra keypoints at the residual limb ends to capture individual anatomical variations. Under the guidance of an internationally accredited para-athletics classifier, we annotate each frame with pose keypoints, segmentation masks, bounding boxes, tracking IDs, and per-limb prosthesis status. Experiments on InclusiveVidPose highlight the limitations of existing HPE models for individuals with limb deficiencies. We introduce a new evaluation metric, Limb-specific Confidence Consistency (LiCC), which assesses the consistency of pose estimates between residual and intact limb keypoints. We also provide a rigorous benchmark for evaluating inclusive and robust pose estimation algorithms, demonstrating that our dataset poses significant challenges. We hope InclusiveVidPose spurs research toward methods that fairly and accurately serve all body types. The project website is available at: InclusiveVidPose.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces InclusiveVidPose, a large-scale video dataset for human pose estimation in individuals with limb deficiencies, comprising 313 videos and 327k frames across nearly 400 individuals. Within the taxonomy, it resides in the 'Video-Based Limb Deficiency Datasets' leaf under 'Benchmark Datasets and Evaluation Frameworks'. This leaf contains only two papers total, indicating a sparse research direction. The dataset addresses a critical gap where standard pose estimation resources assume intact anatomy, making this a relatively underexplored area despite its clinical importance.

The taxonomy reveals that neighboring leaves focus on prosthetic user gait datasets and general limb deficiency pose datasets, while sibling branches address vision-based methods, wearable sensors, and clinical applications. The 'Video-Based Limb Deficiency Datasets' node explicitly excludes image-only datasets and those without limb deficiency focus, positioning InclusiveVidPose as complementary to depth-focused efforts and gait-specific collections. The broader 'Benchmark Datasets' branch contains only four papers across three leaves, suggesting that dataset creation for this population remains nascent compared to methodological development in adjacent vision-based and clinical branches.

Among 16 candidates examined, the dataset contribution shows one refutable candidate from 10 examined, the extended keypoint schema shows one from 4 examined, and the LiCC metric shows one from 2 examined. The limited search scope means these statistics reflect top-K semantic matches rather than exhaustive coverage. The dataset contribution appears to have substantial prior work in the form of at least one overlapping resource among the candidates reviewed, while the keypoint schema and evaluation metric each face at least one potentially overlapping prior approach within their smaller candidate pools.

Based on the limited literature search of 16 candidates, the work addresses a sparsely populated research direction with only one sibling paper in its taxonomy leaf. The dataset's video-based temporal sequences and scale differentiate it within the benchmark branch, though the analysis cannot determine whether similar resources exist beyond the top-K semantic matches examined. The extended keypoint schema and LiCC metric each show potential overlap with at least one candidate, warranting closer examination of those specific prior works.

Taxonomy

Core-task Taxonomy Papers: 26
Claimed Contributions: 3
Contribution Candidate Papers Compared: 15
Refutable Papers: 3

Research Landscape Overview

Core task: human pose estimation for individuals with limb deficiencies. This emerging field addresses the challenge of accurately capturing body pose when standard skeletal models assume intact limbs. The taxonomy reveals a multifaceted landscape organized around several complementary directions. Vision-based methods adapt existing pose estimation architectures to handle missing or prosthetic limbs, while wearable sensor approaches leverage inertial and kinematic data for tracking. A critical branch focuses on benchmark datasets and evaluation frameworks, recognizing that standard pose datasets rarely include limb differences and that new data resources are essential for progress. Parallel branches address adaptive mesh recovery for 3D reconstruction, clinical and rehabilitation applications that translate pose estimates into therapeutic insights, and motion prediction for assistive device control. Additional clusters examine upper-limb motion analysis, mixed reality therapeutic tools, and curated conference proceedings that document the field's evolution.

Recent work highlights tensions between data scarcity and model generalization. Several studies introduce specialized datasets for wheelchair users (WheelPose[4], WheelPoser[5]) or develop diffusion-based methods to synthesize training data for prosthetic scenarios (Diffusion Prosthetic Pose[1]). Others propose domain adaptation techniques that repurpose models trained on able-bodied populations (Limb Loss Reprogramming[7], LDPose[6]).

Within this landscape, InclusiveVidPose[0] sits squarely in the benchmark datasets branch, contributing a video-based resource that captures diverse limb deficiency presentations. Its emphasis on video data contrasts with depth-focused efforts like Depth Impairment Pose[3] and complements gait-oriented datasets such as ProGait[12]. By providing temporal sequences rather than static frames, InclusiveVidPose[0] enables richer motion analysis and supports downstream applications in rehabilitation and assistive technology, addressing a gap that many vision-based and clinical branches depend upon for validation and real-world deployment.

Claimed Contributions

InclusiveVidPose Dataset

The authors present the first large-scale video-based human pose estimation dataset focused on individuals with limb deficiencies, containing 313 videos with 327k frames covering nearly 400 individuals with amputations, congenital limb differences, and prosthetic limbs, annotated with pose keypoints, segmentation masks, bounding boxes, tracking IDs, and prosthesis status.

Retrieved papers: 10 · Verdict: Can Refute
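The annotation suite described above (keypoints, masks, boxes, tracking IDs, prosthesis status) can be pictured as a per-frame record. The sketch below assumes a COCO-style JSON layout; all field names and values are illustrative, not the dataset's actual schema.

```python
# Hypothetical per-frame annotation record for one tracked person.
# Field names are assumptions made for illustration only.
annotation = {
    "video_id": "vid_0001",
    "frame_index": 120,
    "track_id": 3,                        # person identity across frames
    "bbox": [412.0, 96.5, 180.0, 420.0],  # [x, y, width, height]
    # 17 COCO keypoints + 8 residual-limb endpoints, each (x, y, visibility)
    "keypoints": [0.0] * (17 + 8) * 3,
    "segmentation": [],                   # polygon or RLE mask
    "prosthesis_status": {                # per-limb status flags
        "left_arm": "intact",
        "right_arm": "residual_with_prosthesis",
        "left_leg": "intact",
        "right_leg": "intact",
    },
}

print(len(annotation["keypoints"]) // 3)      # 25 keypoint slots
print(len(annotation["prosthesis_status"]))   # 4 limbs
```

The flat `(x, y, visibility)` triplet layout mirrors the COCO keypoint convention, which is why the keypoint list has three entries per joint.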
Extended keypoint schema with residual limb endpoints

The authors introduce an extended keypoint schema that builds on the COCO format by adding eight residual-limb endpoint keypoints (above and below elbow/knee on both sides) to explicitly represent anatomical variations in individuals with limb deficiencies, enabling models to distinguish between intact and residual structures.

Retrieved papers: 3 · Verdict: Can Refute
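The extended schema can be sketched as the 17 standard COCO keypoints plus the 8 residual-limb endpoints (above and below the elbow and knee on both sides). The names of the added keypoints below are assumptions for illustration; the paper's exact labels may differ.

```python
# The 17 keypoints of the standard COCO person schema.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Eight residual-limb endpoints: above/below elbow and knee, both sides.
# These names are hypothetical placeholders, not the dataset's labels.
RESIDUAL_ENDPOINTS = [
    "left_above_elbow_end", "right_above_elbow_end",
    "left_below_elbow_end", "right_below_elbow_end",
    "left_above_knee_end", "right_above_knee_end",
    "left_below_knee_end", "right_below_knee_end",
]

EXTENDED_KEYPOINTS = COCO_KEYPOINTS + RESIDUAL_ENDPOINTS
print(len(EXTENDED_KEYPOINTS))  # 25
```

Keeping the original 17 COCO keypoints unchanged lets existing COCO-pretrained models be fine-tuned on the extended schema with only the output head resized.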
Limb-specific Confidence Consistency (LiCC) metric

The authors propose a new evaluation metric called Limb-specific Confidence Consistency that measures whether pose estimation models can correctly distinguish intact limbs from residual or missing limbs by comparing predicted confidence scores for visible keypoints against mutually exclusive anatomically impossible keypoints.

Retrieved papers: 2 · Verdict: Can Refute
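A minimal sketch of a LiCC-style consistency check, assuming the metric counts how often a model scores anatomically present keypoints above the mutually exclusive "impossible" ones on the same limb. The paper's exact formula may differ; this only illustrates the idea, and the keypoint names are hypothetical.

```python
def licc(confidences, present, impossible):
    """Fraction of (present, impossible) keypoint pairs scored consistently.

    confidences: dict mapping keypoint name -> predicted confidence.
    present: keypoints that exist on this limb (e.g. a residual-limb end).
    impossible: keypoints ruled out by the amputation (e.g. the wrist
    distal to an above-elbow residual limb).
    """
    pairs = [(p, q) for p in present for q in impossible]
    if not pairs:
        return 1.0  # nothing to contradict
    consistent = sum(confidences[p] > confidences[q] for p, q in pairs)
    return consistent / len(pairs)

# Right arm with an above-elbow amputation: the residual end is present,
# while the elbow and wrist distal to it are anatomically impossible.
conf = {"right_above_elbow_end": 0.9, "right_elbow": 0.2, "right_wrist": 0.1}
score = licc(conf, ["right_above_elbow_end"], ["right_elbow", "right_wrist"])
print(score)  # 1.0
```

A score near 1.0 would indicate the model reliably suppresses confidence on joints that cannot exist, while a low score would reveal the failure mode the metric is designed to expose.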

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

The three claimed contributions compared against candidates:

Contribution: InclusiveVidPose Dataset
Contribution: Extended keypoint schema with residual limb endpoints
Contribution: Limb-specific Confidence Consistency (LiCC) metric

Descriptions of each contribution are given under Claimed Contributions above.