BANZ-FS: BANZSL Fingerspelling Dataset

ICLR 2026 Conference Submission | Anonymous Authors

OpenReview Score: 7.0

Sign Language, BANZSL, Fingerspelling

Abstract:

Fingerspelling plays a vital role in sign languages, particularly for conveying names, technical terms, and words not found in the standard lexicon. However, evaluation of two-handed fingerspelling detection and recognition is rarely addressed in existing sign language datasets, particularly for BANZSL (British, Australian, and New Zealand Sign Language), which share a common two-handed manual alphabet. To bridge this gap, we curate a large-scale dataset, dubbed BANZ-FS, focused on BANZSL fingerspelling in both controlled and real-world environments. Our dataset is compiled from three distinct sources: (1) live sign language interpretation in news broadcasts, (2) controlled laboratory recordings, and (3) diary vlogs from online platforms and social media. This composition enables BANZ-FS to capture variations in signing tempos and fluency across diverse signers and contents. Each instance in BANZ-FS is carefully annotated with multi-level alignment: video ↔ subtitles, video ↔ fingerspelled letters, and video ↔ target lexicons. In total, BANZ-FS includes over 35,000 video-aligned fingerspelling instances. Importantly, BANZ-FS highlights the unique linguistic and visual challenges posed by two-handed fingerspelling, including handshape coarticulation, self-occlusion, intra-letter variation, and rapid inter-letter transitions. We benchmark state-of-the-art models on the key tasks, including fingerspelling detection, isolated fingerspelling recognition, and fingerspelling recognition in context. Experimental results show that BANZ-FS presents substantial challenges while offering rich opportunities for BANZSL understanding and broader sign language technology. The dataset and benchmarks are available at BANZ-FS.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes the tasks and contributions of academic papers against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Research Landscape Overview

Core task: two-handed fingerspelling detection and recognition in sign language. The field organizes around several complementary branches. Recognition Methods and Architectures explores neural network designs and feature extraction strategies, ranging from early CNN approaches like JSL CNN Recognition[4] to more recent transformer-based models. Datasets and Benchmarks provides the empirical foundation, with resources spanning multiple sign languages, such as BSL Two Hand[9], Thai Two Handed[8], and Arabic Sign Keypoints[3], that enable systematic evaluation. Weakly-Supervised and Automatic Annotation addresses the challenge of scaling data collection when manual labeling is costly, exemplified by Weakly Supervised BSL[1]. Language-Specific Recognition Systems tailors methods to individual sign languages, while Applications and Learning Tools translates research into educational platforms like Mixed Reality Learning[20]. Linguistic and Cognitive Studies examines how signers produce and perceive fingerspelling, Animation and Synthesis generates realistic signing avatars, and Non-Sign Detection and Robustness ensures systems handle real-world variability.

A particularly active line of work focuses on building high-quality two-handed fingerspelling datasets, which remain scarce compared to one-handed resources. The BANZ Fingerspelling Dataset[0] contributes to this effort by providing annotated examples for a less-studied sign language, joining a handful of similar benchmarks like Thai Fingerspelling Benchmark[2] and Turkish Bimanual Alphabet[30]. These datasets enable researchers to move beyond American Sign Language and explore cross-linguistic variation in two-handed alphabets. Meanwhile, weakly-supervised methods such as Weakly Supervised BSL[16] offer a complementary strategy for expanding coverage when full annotation is impractical.

The original paper sits squarely within the Datasets and Benchmarks branch, addressing the fundamental need for diverse, well-annotated corpora that can support both language-specific recognition systems and broader studies of fingerspelling across different signing communities.

Claimed Contributions

BANZ-FS: Large-scale BANZSL fingerspelling dataset

10 retrieved papers

The authors introduce BANZ-FS, a dataset containing over 35,000 video-aligned fingerspelling instances for British, Australian, and New Zealand Sign Language. The dataset is compiled from three sources: news broadcasts, laboratory recordings, and online vlogs, capturing diverse signing tempos and contexts with multi-level annotations including video-subtitle alignment, fingerspelled letters, and target lexicons.

Multi-level annotation protocol for fingerspelling tasks

10 retrieved papers

The authors develop a comprehensive annotation framework that includes temporal boundaries of sign video clips, temporal boundaries of fingerspellings, lexical forms of fingerspellings, and English transcriptions. This protocol supports multiple fingerspelling-related tasks and explicitly annotates linguistic phenomena such as abbreviations, acronyms, misspellings, and inline corrections.

Benchmark evaluation of fingerspelling recognition methods

Can Refute

10 retrieved papers

The authors establish comprehensive benchmarks for fingerspelling detection, isolated fingerspelling recognition, and fingerspelling recognition in context using publicly available state-of-the-art models. The experimental results demonstrate that BANZ-FS poses significant challenges to existing methods while providing a platform for evaluating two-handed fingerspelling understanding.
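
The annotation levels named under the second contribution (clip boundaries, fingerspelling boundaries, lexical forms, English transcriptions, and flags for phenomena such as acronyms) can be pictured as a nested record. The sketch below is only an illustration of that idea: the class names, field names, and example values are all hypothetical and are not taken from the BANZ-FS release, whose actual file format is not described in this report.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FingerspellingSpan:
    """One fingerspelled sequence inside a clip (times in seconds). Hypothetical schema."""
    start_s: float
    end_s: float
    letters: str       # letters as signed
    lexical_form: str  # intended word or phrase
    phenomena: List[str] = field(default_factory=list)  # e.g. ["acronym"]

    def duration(self) -> float:
        return self.end_s - self.start_s


@dataclass
class ClipAnnotation:
    """A sign video clip with its transcription and fingerspelling spans."""
    clip_start_s: float
    clip_end_s: float
    transcription: str
    spans: List[FingerspellingSpan] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if any span is empty or falls outside the clip."""
        for span in self.spans:
            inside = self.clip_start_s <= span.start_s < span.end_s <= self.clip_end_s
            if not inside:
                raise ValueError(f"span {span.letters!r} is not inside the clip")


# Invented example values, not taken from BANZ-FS.
clip = ClipAnnotation(
    clip_start_s=0.0,
    clip_end_s=6.0,
    transcription="Heavy rain is expected across NSW today.",
    spans=[FingerspellingSpan(2.0, 3.5, "NSW", "New South Wales", ["acronym"])],
)
clip.validate()
print(clip.spans[0].duration())
```

Nesting spans inside clips keeps the two sets of temporal boundaries checkable against each other, which is one plausible way a single record could serve detection, isolated recognition, and recognition in context.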

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is one partial signal of novelty, though that signal is limited by search coverage and taxonomy granularity.
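
To make the "no siblings, no cousins" criterion concrete, here is a minimal sketch of how such a check could be run on a tree stored as child-to-parent links. It assumes that "cousins" means the children of the parent's siblings; the function names and the toy taxonomy are hypothetical and do not reproduce the taxonomy or the pipeline behind this report.

```python
from typing import Dict, List, Optional

ParentMap = Dict[str, Optional[str]]  # node -> parent (None for the root)


def children_of(parent_of: ParentMap, node: str) -> List[str]:
    """Return the direct children of `node`, sorted for stable output."""
    return sorted(child for child, parent in parent_of.items() if parent == node)


def siblings(parent_of: ParentMap, node: str) -> List[str]:
    """Other nodes that share `node`'s parent."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    return [c for c in children_of(parent_of, parent) if c != node]


def cousins(parent_of: ParentMap, node: str) -> List[str]:
    """Children of the parent's siblings, i.e. nodes under the same grandparent."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    found: List[str] = []
    for aunt in siblings(parent_of, parent):
        found.extend(children_of(parent_of, aunt))
    return found


def is_structurally_isolated(parent_of: ParentMap, node: str) -> bool:
    """True when the node has neither siblings nor cousins."""
    return not siblings(parent_of, node) and not cousins(parent_of, node)


# Toy taxonomy with made-up labels.
toy: ParentMap = {
    "root": None,
    "A": "root", "B": "root",
    "A1": "A", "B1": "B",
    "leaf_x": "A1",
    "leaf_y": "B1", "leaf_z": "B1",
}
print(is_structurally_isolated(toy, "leaf_x"))
print(is_structurally_isolated(toy, "leaf_y"))
```

In the toy tree, `leaf_x` is the only leaf under `A1` and `A1` is the only branch under `A`, so it satisfies the criterion, while `leaf_y` does not because `leaf_z` shares its parent.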

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: BANZ-FS: Large-scale BANZSL fingerspelling dataset

[2] Deep multimodal-based finger spelling recognition for Thai sign language: a new benchmark and model composition (Cannot Refute)
[36] Fingerspelling within sign language translation (Cannot Refute)
[37] TLFS23 Tamil language fingerspelling dataset (Cannot Refute)
[38] Deep motion templates and extreme learning machine for sign language recognition (Cannot Refute)
[39] TFRS: Thai finger-spelling sign language recognition system (Cannot Refute)
[40] AzSLD: Azerbaijani sign language dataset for fingerspelling, word, and sentence translation with baseline software (Cannot Refute)
[41] HandReader: Advanced Techniques for Efficient Fingerspelling Recognition (Cannot Refute)
[42] Recent advances of deep learning for sign language recognition (Cannot Refute)
[43] American sign language fingerspelling recognition in the wild (Cannot Refute)
[44] Spelling it out: Real-time ASL fingerspelling recognition (Cannot Refute)

Contribution: Multi-level annotation protocol for fingerspelling tasks

[45] Understanding vision-based continuous sign language recognition (Cannot Refute)
[46] Thai fingerspelling recognition using hand landmark clustering (Cannot Refute)
[47] Point-Supervised Japanese Fingerspelling Localization via HR-Pro and Contrastive Learning (Cannot Refute)
[48] Finger spelling recognition using depth information and support vector machine (Cannot Refute)
[49] A multi-class pattern recognition system for practical finger spelling translation (Cannot Refute)
[50] Documentary and corpus approaches to sign language research (Cannot Refute)
[51] Arabic Sign Language Recognition: A Multimodal Systematic Review, Taxonomy, and Benchmark Recommendations (Cannot Refute)
[52] Simultaneous spotting of signs and fingerspellings based on hierarchical conditional random fields and boostmap embeddings (Cannot Refute)
[53] Vision-Based recognition of fingerspelled acronyms using hierarchical temporal memory (Cannot Refute)
[54] Public DGS Corpus: Annotation Conventions / Öffentliches DGS-Korpus: Annotationskonventionen (Cannot Refute)

Contribution: Benchmark evaluation of fingerspelling recognition methods

[55] Fingerspelling detection in American Sign Language (Can Refute)
[1] Weakly-supervised fingerspelling recognition in British Sign Language videos (Cannot Refute)
[2] Deep multimodal-based finger spelling recognition for Thai sign language: a new benchmark and model composition (Cannot Refute)
[56] ASL STEM Wiki: Dataset and benchmark for interpreting STEM articles (Cannot Refute)
[57] Investigating motion history images and convolutional neural networks for isolated Irish sign language fingerspelling recognition (Cannot Refute)
[58] A new extension of FDOSM based on Pythagorean fuzzy environment for evaluating and benchmarking sign language recognition systems (Cannot Refute)
[59] Seeing in 2D, thinking in 3D: 3D hand mesh-guided feature learning for continuous fingerspelling (Cannot Refute)
[60] A convolutional neural network to classify American Sign Language fingerspelling from depth and colour images (Cannot Refute)
[61] Real-Time Sign Language Fingerspelling Recognition System Using 2D Deep CNN with Two-Stream Feature Extraction Approach (Cannot Refute)
[62] LSE-FS-UVigo Dataset and Keypoint-Based Fingerspelling Recognition (Cannot Refute)
