What’s Your Sign?
USE CASE STUDY
American Sign Language Handshape Recognition with Vertex AI
Powering the SDSU research team with a Vertex AI experimentation workbench to unlock AI/ML for American Sign Language recognition.
General challenges for automated sign recognition
The state of sign language AI is far behind the state of AI systems for spoken and written languages. Several factors contribute: a lack of adequate sign language datasets, limited knowledge exchange between computational scientists and sign language linguists, the absence of a conventionalized writing system for signed languages, and the fact that most existing language models are built on spoken or written language. Together, these factors result in unreliable models. AI-assisted sign language recognition should be leveraged in a way that benefits research and the various stakeholders, in particular sign language communities.
Current challenges for SDSU
Analyzing signed language datasets is laborious and costly because trained annotators must watch vast amounts of video footage and manually label signs and their components. Even partial automation of this annotation, as has been possible for speech recognition, would significantly advance researchers’ ability to examine signed languages at scale. The Laboratory for Language and Cognitive Neuroscience (LLCN) at San Diego State University (SDSU) wants to develop AI models that recognize handshape parts rather than whole signs (the whole-sign approach is prevalent in current models). This approach parallels automatic speech recognition, which breaks speech audio down into individual speech sounds, and it is expected to produce models that are more robust to natural variation in how signers articulate signs (e.g., dialects).
The SDSU team would like to use Google’s AI/ML tools to recognize and classify the most common American Sign Language (ASL) handshapes and handshape parts in order to boost linguistic research on ASL. The team wants to train models on large structured datasets (it has collected thousands of annotated video clips of people producing signs) and to easily run and manage experiments in search of the best ML-based solution. SDSU plans to use GCP tools to create a benchmark (an experimental workbench) for recognizing the most common handshapes and handshape parts in ASL.
In the first phase, SDSU wants to evaluate the capabilities of a baseline model for recognizing handshapes and/or handshape parts.
SOLUTION DELIVERED BY F33
F33 followed its AI/ML Framework to deliver an AI/ML solution for SDSU. Along the way, we helped SDSU formulate requirements, prepare datasets, and train its research team to use the solution.
“An independent multi-channel training and recognition will support automatic annotation of signs and their parts, such as handshape, what fingers are selected, are they spread or bent, etc., and can aid or fully automate corpus annotation. Such corpora could be used to improve models of sign recognition and translation. Why is a language-based “sub-unit” based modeling approach important? For example, the meaning of a sign can change depending on which fingers are selected or whether they are bent or straight. There could also be subtle variation in the way a person articulates a sign (e.g., due to dialects/accents). Models informed by findings from sign linguistics research will lead to systems that more closely mirror real-world sign language usage.”
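To make the sub-unit idea concrete, here is a minimal Python sketch of what multi-channel handshape labels could look like. It is an illustration only, not the SDSU or F33 implementation: the `HandshapeParts` class, the three channels, the example handshape entries, and both helper functions are hypothetical and simplified, and they are not taken from the SDSU dataset or annotation scheme.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HandshapeParts:
    """Hypothetical sub-unit description of a handshape (one field per channel)."""
    selected_fingers: frozenset  # which fingers are selected
    spread: bool                 # are the selected fingers spread apart?
    bent: bool                   # are the selected fingers bent?


# Illustrative, simplified entries; NOT annotations from the SDSU dataset.
HANDSHAPES = {
    "1": HandshapeParts(frozenset({"index"}), spread=False, bent=False),
    "U": HandshapeParts(frozenset({"index", "middle"}), spread=False, bent=False),
    "V": HandshapeParts(frozenset({"index", "middle"}), spread=True, bent=False),
    "bent-V": HandshapeParts(frozenset({"index", "middle"}), spread=True, bent=True),
}

CHANNELS = [f.name for f in fields(HandshapeParts)]


def differing_channels(a, b):
    """Return the names of the channels on which two handshapes differ."""
    return [name for name in CHANNELS if getattr(a, name) != getattr(b, name)]


def channel_accuracy(predicted, gold):
    """Score each channel independently over paired predictions and gold labels."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and the same length")
    correct = {name: 0 for name in CHANNELS}
    for p, g in zip(predicted, gold):
        for name in CHANNELS:
            if getattr(p, name) == getattr(g, name):
                correct[name] += 1
    return {name: correct[name] / len(gold) for name in CHANNELS}


if __name__ == "__main__":
    # "U" and "V" differ only in whether the selected fingers are spread.
    print(differing_channels(HANDSHAPES["U"], HANDSHAPES["V"]))

    # A prediction of "bent-V" for a gold "V" is wrong as a whole label,
    # but still correct on two of the three channels.
    gold = [HANDSHAPES["V"], HANDSHAPES["U"]]
    predicted = [HANDSHAPES["bent-V"], HANDSHAPES["U"]]
    print(channel_accuracy(predicted, gold))
```

Scoring each channel separately, rather than only the whole handshape label, is one way a benchmark could credit a model for getting the selected fingers right even when it misjudges bending or spreading.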