Sign Language Recognition with Computer Vision

Quick Intro
Building a real-time action-tracking model requires a blend of computer vision savvy and a good grasp of anatomy. But here’s the good news: the heavy technical lifting has already been done, thanks to Google’s MediaPipe library.
This nifty tool lets us use its holistic model to track body movements in real time, so the main puzzle left for us is developing and training a model that classifies those movements. For sequence tasks like this, an LSTM is one of the top choices.
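As a rough illustration, here is a minimal sketch of the keypoint-extraction step with MediaPipe Holistic. The `extract_keypoints` helper and the landmark counts follow MediaPipe’s documented defaults, but this is an assumed setup, not necessarily the exact code used in this project:

```python
# Minimal sketch: stream webcam frames through MediaPipe Holistic and
# flatten the pose/face/hand landmarks into one feature vector per frame.
# Landmark counts (33 pose, 468 face, 21 per hand) are MediaPipe's defaults.
import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    """Concatenate all landmarks into one 1D vector (zeros when a part is not detected)."""
    pose = (np.array([[p.x, p.y, p.z, p.visibility] for p in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[p.x, p.y, p.z] for p in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    lh = (np.array([[p.x, p.y, p.z] for p in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[p.x, p.y, p.z] for p in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, lh, rh])  # 1662 values per frame

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        keypoints = extract_keypoints(results)  # feed these into the sequence model
cap.release()
```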
The Workflow

The Result



Final Thoughts
- Data quality ALWAYS comes first: the quality of your data (think camera resolution, lighting, backgrounds, frame duration) is a game-changer for the model’s performance. Sometimes a simple, shallow LSTM model works wonders with top-notch data.
- Holistic + LSTM = a winning combo: the holistic model is a star – quick, accurate, and dependable. Pair it with a few shallow LSTM layers and you have an efficient setup for action recognition (see the sketch after this list).
- Feature selection tricks: in this project I played around with face, hand, and body landmarks. My guess is that dropping the face landmarks would speed up prediction even more without hurting the results much, but I haven’t tested this yet.
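For concreteness, here is roughly what such a shallow LSTM setup could look like in Keras. The sequence length, feature size, and layer widths below are illustrative assumptions rather than the project’s exact configuration:

```python
# Illustrative shallow LSTM classifier over sequences of holistic keypoints.
# Shapes and layer sizes are assumptions for the sake of the example.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

NUM_FRAMES = 30      # frames per action clip (assumed)
NUM_FEATURES = 1662  # flattened holistic keypoints per frame (from the sketch above)
NUM_ACTIONS = 5      # size of the gesture vocabulary (assumed)

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(NUM_FRAMES, NUM_FEATURES)),
    LSTM(128),
    Dense(64, activation="relu"),
    Dense(NUM_ACTIONS, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["categorical_accuracy"])
```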
Improvements
- Training is not smooth and the model overfits: adding some dropout and normalization layers might improve training stability, since the training curves are far from smooth (see the sketch after this list).
- Recognition speed: not yet fast enough for real-world hand gesture translation. I currently use 50% as the threshold, meaning an action is only emitted once its predicted probability reaches 50%. Lowering that threshold and improving the model’s generalization might lead to much faster predictions (the sketch after this list shows where the threshold sits in the inference loop).
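To make both improvement ideas concrete, here is a hedged sketch: dropout and batch normalization slotted between the recurrent layers, plus the 50% probability threshold applied in the inference loop. Layer sizes, the action vocabulary, and the 30-frame rolling window are assumptions for the sake of the example:

```python
# Sketch of the two suggested improvements; names and sizes are assumptions.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, BatchNormalization

actions = ["action_a", "action_b", "action_c"]  # placeholder gesture vocabulary (assumed)

# 1) Regularization: dropout + normalization between the recurrent layers.
model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(30, 1662)),
    BatchNormalization(),
    Dropout(0.3),
    LSTM(128),
    Dropout(0.3),
    Dense(64, activation="relu"),
    Dense(len(actions), activation="softmax"),
])

# 2) Thresholded inference: only emit an action once its probability clears the cut-off.
THRESHOLD = 0.5  # the 50% cut-off discussed above; lowering it yields results sooner
sequence = []    # rolling window of the most recent frames' keypoints

def update_and_predict(keypoints):
    """Append the latest frame's keypoints and predict once 30 frames are buffered."""
    global sequence
    sequence = (sequence + [keypoints])[-30:]
    if len(sequence) < 30:
        return None
    probs = model.predict(np.expand_dims(sequence, axis=0), verbose=0)[0]
    best = int(np.argmax(probs))
    return actions[best] if probs[best] > THRESHOLD else None
```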
Project Repository
👉 GitHub