ENHANCED SIGN LANGUAGE DETECTION USING DEEP LEARNING TECHNIQUES: A REVIEW

Abstract
Sign language recognition (SLR) systems leveraging deep learning have made significant strides in recent years, yet critical gaps remain between laboratory performance and real-world applicability. This systematic review analyzes 20 peer-reviewed studies (2021-2024) to evaluate methodological approaches, identify persistent challenges, and propose actionable solutions. While convolutional neural networks (CNNs) and recurrent architectures achieve 88-98% accuracy on constrained datasets, our analysis reveals four fundamental limitations: (1) an accuracy-efficiency trade-off that prevents real-time deployment, (2) inadequate temporal modeling for continuous signing, (3) dataset biases that hinder cross-lingual generalization, and (4) the neglect of non-manual linguistic features. Transformer-based and hybrid models show promise but face computational bottlenecks and substantial data requirements. The review highlights successful techniques such as attention mechanisms (improving accuracy by 4-7%) and skeleton-based approaches (reducing computation by 40%), while exposing critical shortcomings in current evaluation protocols. It proposes four key recommendations: developing edge-optimized lightweight architectures, implementing multimodal fusion for linguistic completeness, expanding diverse dataset curation through community engagement, and adopting self-supervised learning for low-resource scenarios. These findings provide a roadmap for bridging the gap between experimental SLR systems and practical, inclusive deployment, emphasizing the need for closer collaboration among computer scientists, linguists, and deaf communities. The review thus serves as both a comprehensive reference for SLR researchers and a call to action for more linguistically informed, computationally efficient solutions.
Keywords
Sign Language Recognition, Deep Learning Techniques, Systematic Review, Self-Supervised Learning, Multimodal Fusion