BANZ-FS: BANZSL Fingerspelling Dataset

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Sign Language, BANZSL, Fingerspelling
Abstract:

Fingerspelling plays a vital role in sign languages, particularly for conveying names, technical terms, and words not found in the standard lexicon. However, evaluation of two-handed fingerspelling detection and recognition is rarely addressed in existing sign language datasets—particularly for BANZSL (British, Australian, and New Zealand Sign Language), which share a common two-handed manual alphabet. To bridge this gap, we curate a large-scale dataset, dubbed BANZ-FS, focused on BANZSL fingerspelling in both controlled and real-world environments. Our dataset is compiled from three distinct sources: (1) live sign language interpretation in news broadcasts, (2) controlled laboratory recordings, and (3) diary vlogs from online platforms and social media. This composition enables BANZ-FS to capture variations in signing tempos and fluency across diverse signers and contents. Each instance in BANZ-FS is carefully annotated with multi-level alignment: video ↔ subtitles, video ↔ fingerspelled letters, and video ↔ target lexicons. In total, BANZ-FS includes over 35,000 video-aligned fingerspelling instances. Importantly, BANZ-FS highlights the unique linguistic and visual challenges posed by two-handed fingerspelling, including handshape coarticulation, self-occlusion, intra-letter variation, and rapid inter-letter transitions. We benchmark state-of-the-art models on the key tasks, including fingerspelling detection, isolated fingerspelling recognition, and fingerspelling recognition in context. Experimental results show that BANZ-FS presents substantial challenges while offering rich opportunities for BANZSL understanding and broader sign language technology. The dataset and benchmarks are available at BANZ-FS.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholar search engine). It analyzes the tasks and contributions of academic papers against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Taxonomy

Core-task Taxonomy Papers: 35
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 1

Research Landscape Overview

Core task: two-handed fingerspelling detection and recognition in sign language. The field organizes around several complementary branches. Recognition Methods and Architectures explores neural network designs and feature extraction strategies, ranging from early CNN approaches like JSL CNN Recognition[4] to more recent transformer-based models. Datasets and Benchmarks provides the empirical foundation, with resources spanning multiple sign languages (such as BSL Two Hand[9], Thai Two Handed[8], and Arabic Sign Keypoints[3]) that enable systematic evaluation. Weakly-Supervised and Automatic Annotation addresses the challenge of scaling data collection when manual labeling is costly, exemplified by Weakly Supervised BSL[1]. Language-Specific Recognition Systems tailors methods to individual sign languages, while Applications and Learning Tools translate research into educational platforms like Mixed Reality Learning[20]. Linguistic and Cognitive Studies examines how signers produce and perceive fingerspelling, Animation and Synthesis generates realistic signing avatars, and Non-Sign Detection and Robustness ensures systems handle real-world variability.

A particularly active line of work focuses on building high-quality two-handed fingerspelling datasets, which remain scarce compared to one-handed resources. The BANZ Fingerspelling Dataset[0] contributes to this effort by providing annotated examples for a less-studied sign language, joining a small handful of similar benchmarks like Thai Fingerspelling Benchmark[2] and Turkish Bimanual Alphabet[30]. These datasets enable researchers to move beyond American Sign Language and explore cross-linguistic variation in two-handed alphabets. Meanwhile, weakly-supervised methods such as Weakly Supervised BSL[16] offer a complementary strategy for expanding coverage when full annotation is impractical.

The original paper sits squarely within the Datasets and Benchmarks branch, addressing the fundamental need for diverse, well-annotated corpora that can support both language-specific recognition systems and broader studies of fingerspelling across different signing communities.

Claimed Contributions

BANZ-FS: Large-scale BANZSL fingerspelling dataset

The authors introduce BANZ-FS, a dataset containing over 35,000 video-aligned fingerspelling instances for British, Australian, and New Zealand Sign Language. The dataset is compiled from three sources: news broadcasts, laboratory recordings, and online vlogs, capturing diverse signing tempos and contexts with multi-level annotations including video-subtitle alignment, fingerspelled letters, and target lexicons.

10 retrieved papers
Multi-level annotation protocol for fingerspelling tasks

The authors develop a comprehensive annotation framework that includes temporal boundaries of sign video clips, temporal boundaries of fingerspellings, lexical forms of fingerspellings, and English transcriptions. This protocol supports multiple fingerspelling-related tasks and explicitly annotates linguistic phenomena such as abbreviations, acronyms, misspellings, and inline corrections.

10 retrieved papers
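To make the annotation levels described above concrete, here is a minimal sketch of how one annotated clip could be represented. This is not the BANZ-FS release format, which this report does not describe: every class name, field name, and example value below is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FingerspellingSpan:
    """One fingerspelled segment inside a clip (hypothetical field names)."""
    start_s: float      # start of the fingerspelling, seconds from clip start
    end_s: float        # end of the fingerspelling, seconds from clip start
    letters: str        # letters as actually signed
    lexical_form: str   # intended target word or phrase
    phenomena: List[str] = field(default_factory=list)  # e.g. ["acronym"]


@dataclass
class ClipAnnotation:
    """One sign video clip with its transcription and fingerspelling spans."""
    video_id: str
    clip_start_s: float  # clip boundaries within the source video
    clip_end_s: float
    english: str         # English transcription of the clip
    spans: List[FingerspellingSpan] = field(default_factory=list)

    def validate(self) -> None:
        """Check that every span is well-formed and lies inside the clip."""
        duration = self.clip_end_s - self.clip_start_s
        if duration <= 0:
            raise ValueError("clip must have positive duration")
        for span in self.spans:
            if not (0 <= span.start_s < span.end_s <= duration):
                raise ValueError(f"span {span.letters!r} falls outside the clip")


# Made-up example values, for illustration only.
example = ClipAnnotation(
    video_id="clip_0001",
    clip_start_s=12.0,
    clip_end_s=18.5,
    english="The ABC reported the story.",
    spans=[FingerspellingSpan(2.1, 2.9, "ABC", "ABC", ["acronym"])],
)
example.validate()
```

Keeping the signed letters separate from the lexical form is what would let a record express misspellings and inline corrections, where the two differ.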
Benchmark evaluation of fingerspelling recognition methods

The authors establish comprehensive benchmarks for fingerspelling detection, isolated fingerspelling recognition, and fingerspelling recognition in context using publicly available state-of-the-art models. The experimental results demonstrate that BANZ-FS poses significant challenges to existing methods while providing a platform for evaluating two-handed fingerspelling understanding.

10 retrieved papers; status: Can Refute
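This report does not state which metrics the BANZ-FS benchmarks use. As a hedged illustration only, the sketch below shows two measures commonly used for tasks of this kind: temporal intersection-over-union for scoring a detected fingerspelling interval, and an edit-distance-based letter accuracy for scoring a recognized letter sequence. The function names and the example inputs are assumptions, not taken from the paper.

```python
from typing import Tuple


def temporal_iou(pred: Tuple[float, float], gold: Tuple[float, float]) -> float:
    """Intersection-over-union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(pred[1], gold[1]) - max(pred[0], gold[0]))
    union = (pred[1] - pred[0]) + (gold[1] - gold[0]) - inter
    return inter / union if union > 0 else 0.0


def edit_distance(hyp: str, ref: str) -> int:
    """Levenshtein distance between two letter sequences."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            curr.append(min(prev[j] + 1,              # drop a hypothesis letter
                            curr[j - 1] + 1,          # insert a reference letter
                            prev[j - 1] + (h != r)))  # substitute or match
        prev = curr
    return prev[-1]


def letter_accuracy(hyp: str, ref: str) -> float:
    """1 - edit_distance / len(ref); negative if hyp is much longer than ref."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 1.0 - edit_distance(hyp, ref) / len(ref)


# Illustrative inputs: a detection overlapping half of each interval,
# and a recognition output that is missing one letter.
iou = temporal_iou((1.0, 3.0), (2.0, 4.0))
acc = letter_accuracy("SYDNY", "SYDNEY")
```

An edit-distance measure gives partial credit for near-miss spellings, which matters when rapid inter-letter transitions cause a single letter to be dropped.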

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is one partial signal of novelty, though that signal is still constrained by search coverage and taxonomy granularity.
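The isolation check described above can be sketched on a toy tree. The taxonomy below is entirely hypothetical (the report's actual tree is not reproduced here), and the helper names are assumptions. A node's siblings are the other children of its parent, and its cousin branches are the other children of its grandparent.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy stored as child -> parent; names are placeholders.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "topic_a": "root",
    "topic_b": "root",
    "sub_a1": "topic_a",
    "sub_b1": "topic_b",
    "sub_b2": "topic_b",
    "leaf_x": "sub_a1",
    "leaf_y": "sub_b1",
    "leaf_z": "sub_b1",
}


def siblings(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Other nodes that share this node's parent."""
    p = parent[node]
    if p is None:
        return []
    return [n for n, q in parent.items() if q == p and n != node]


def cousin_branches(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Other branches under this node's grandparent (the parent's siblings)."""
    p = parent[node]
    if p is None:
        return []
    return siblings(p, parent)


def is_structurally_isolated(node: str, parent: Dict[str, Optional[str]]) -> bool:
    """True when the node has neither siblings nor cousin branches."""
    return not siblings(node, parent) and not cousin_branches(node, parent)


isolated = is_structurally_isolated("leaf_x", PARENT)      # no siblings, no cousins
not_isolated = is_structurally_isolated("leaf_y", PARENT)  # has sibling leaf_z
```

Because the result depends entirely on which papers were retrieved and how finely the tree is split, a check like this can only ever be a partial signal, as the report itself notes.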

