
AUM researchers introduce AI tool ‘Drishti’ for interpreting sign language

Tucked away in a corner of Goodwyn Hall, Harshavardhan Meka, a computer science major, gently opens his laptop, logs in, and taps the space bar.

At first glance, Meka appears to be a typical student preparing to study, but the graduate student is part of a research team at Auburn University at Montgomery (AUM) developing an artificial intelligence-powered tool designed to recognize sign language.

What unfolds next is a glimpse of the team’s cutting-edge creation.

Harshavardhan Meka runs a test with Drishti.

While still pressing the space bar, Meka holds up the sign language symbol for the number five as the webcam on his laptop mirrors the gesture. Instantly, the computer traces an outline of his hand, recognizes the sign, and types the number into a search bar on his screen, a simple act that showcases the team's transformative AI technology for assisting individuals with hearing or speech impairments.

Meet “Drishti,” the AI-powered tool named after the Hindi word for “vision” and the idea of Tathagata Bhattacharya, assistant professor of computer science at AUM. The tool is designed to bridge communication gaps for individuals with hearing or speech impairments by recognizing and translating sign language gestures into text.

“Drishti primarily serves individuals with hearing or speech impairments by allowing them to perform tasks such as searching or typing without relying on voice or keyboard inputs,” he said. “It’s an inclusive, gesture-based interface that empowers users who might otherwise be marginalized by traditional technology.”

Many popular conversational AI platforms — such as Siri, Alexa, and Google Assistant — rely heavily on voice or keyboard interaction, creating accessibility challenges for users who cannot speak or type efficiently. Drishti addresses this gap by allowing users to interact with the digital world through intuitive, real-time gesture recognition.

“Unlike many static assistive tools, Drishti integrates generative AI to allow for a more natural, adaptive, and user-friendly experience,” Bhattacharya said, noting that the system also leverages large language models, a type of generative AI that produces human-like text. “Its simplicity and real-time responsiveness set it apart from more complex or hardware-dependent solutions.”

The tool requires only a standard laptop or desktop with a webcam and operates on lightweight software with minimal computing resources. Its accessible design makes it practical and affordable, even for users with basic setups.

“Being involved in the development of Drishti has been an incredibly rewarding experience,” said Meka, who will graduate in December 2025. “It’s allowed me to apply what I’ve learned in the classroom to a project that has real-world, social impact.”

Currently, Drishti can recognize sign language gestures for the English alphabet (A–Z), digits (0–9), and basic commands such as “space” and “delete.” Upon detection, the system converts gestures directly into text, enabling users to compose messages, run search queries, and issue digital commands seamlessly.
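The recognize-then-convert step described above can be sketched as a simple mapping from predicted gesture labels to text edits. The label names and functions below are illustrative assumptions, not Drishti's actual code, which has not been published:

```python
# Hypothetical sketch: turning a stream of recognized gesture labels
# (A-Z, 0-9, "space", "delete") into composed text. The label names
# and these functions are illustrative, not Drishti's real API.

def apply_gesture(buffer: str, label: str) -> str:
    """Apply one recognized gesture label to the current text buffer."""
    if label == "space":
        return buffer + " "
    if label == "delete":
        return buffer[:-1]        # remove the last typed character
    return buffer + label         # letters A-Z and digits 0-9

def compose(labels) -> str:
    """Fold a sequence of gesture labels into the final text."""
    text = ""
    for label in labels:
        text = apply_gesture(text, label)
    return text

# e.g. compose(["H", "I", "space", "5", "delete"]) yields "HI "
```

Treating “space” and “delete” as commands rather than characters is what lets a purely visual interface stand in for a keyboard.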

To give Drishti life and build its capabilities, the research team collected 87,000 images for alphabetic gestures and 15,000 images for number gestures.

“We created a data set of 102,000 images and developed a proprietary library for training the AI model,” Meka said. “This hybrid dataset is housed on a website we developed and is not publicly available anywhere else.”

Generative AI — capable of producing images, code and more — plays a key role in enhancing Drishti’s robustness by predicting and interpreting incomplete or partially formed hand gestures, Bhattacharya said.

“This makes the system more forgiving and adaptable compared to rigid keyboard-based systems, reducing the need for users to repeat gestures,” he said.
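The idea of interpreting a partially formed gesture can be illustrated with a toy nearest-neighbor matcher: an observed hand-landmark vector with missing values is compared against stored templates and the closest label wins. Drishti itself uses generative AI models for this, so the template values, vector size, and function below are purely illustrative assumptions:

```python
# Hypothetical sketch of recovering an incomplete gesture: match a
# partial hand-landmark vector against stored templates and pick the
# closest one. The described system uses learned generative models;
# this nearest-neighbor toy only illustrates the underlying idea.
import math

TEMPLATES = {                      # toy 4-value "landmark" vectors (assumed)
    "5": [1.0, 1.0, 1.0, 1.0],    # all fingers extended
    "0": [0.0, 0.0, 0.0, 0.0],    # closed fist
}

def interpret(partial, templates=TEMPLATES):
    """Return the label whose template lies nearest to the observed
    vector; None entries (unobserved landmarks) are simply skipped."""
    def dist(template):
        return math.sqrt(sum((p - q) ** 2
                             for p, q in zip(partial, template)
                             if p is not None))
    return min(templates, key=lambda k: dist(templates[k]))
```

Because missing landmarks are skipped rather than treated as errors, a gesture that is only three-quarters formed can still resolve to the intended sign, which is the "forgiving" behavior Bhattacharya describes.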

Drishti’s development process was not without challenges. The researchers had to ensure accurate gesture recognition under various environmental conditions such as lighting and account for differences in hand shapes and sizes. Achieving real-time responsiveness while maintaining high accuracy required extensive fine-tuning of the AI algorithms, Bhattacharya explained.

“We overcame these challenges by conducting usability testing and collecting extensive datasets from diverse participants to confirm clarity and ease of use across different user groups,” he said. “The team also implemented data augmentation techniques to enhance model generalization and robustness, enabling the AI to perform reliably in real-world scenarios.”
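The data augmentation the team mentions typically means generating modified copies of each training image so the model sees more variation in hands and lighting than the raw dataset contains. The transforms below, applied to a toy grayscale image stored as a list of pixel rows, are a minimal sketch of that idea, not the team's actual pipeline:

```python
# Hypothetical sketch of data augmentation for gesture images: each
# training image spawns mirrored and brightness-shifted variants so a
# model generalizes across left/right hands and lighting conditions.
# A toy grayscale image here is a list of rows of 0-255 pixel values.

def hflip(img):
    """Mirror the image left-right (covers left- vs right-handed signing)."""
    return [row[::-1] for row in img]

def brighten(img, delta):
    """Shift pixel intensities, clamped to 0-255 (lighting changes)."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]

def augment(img):
    """Return the original image plus three augmented variants."""
    return [img, hflip(img), brighten(img, 40), brighten(img, -40)]
```

Each captured image thus contributes several effective training examples, which is one common way a dataset of 102,000 images can be stretched to cover far more real-world conditions.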

Tathagata Bhattacharya and Harshavardhan Meka stand together on campus.

Bhattacharya said the experience reinforced the research team’s belief that accessibility must be an integral part of technological advancement, not an afterthought.

“This project underscores the importance of designing with inclusivity at the core,” he said. “We envision expanding Drishti to support full sign language interpretation, which would further enhance its value and accessibility.”

The team’s future plans for Drishti also include integrating hand gestures from other languages, improving its AI accuracy, and collaborating with accessibility organizations to broaden user reach.

“Our overarching goal is to scale Drishti for broader deployment and maximize its real-world impact,” Bhattacharya said. “AI is uniquely positioned to make assistive technologies increasingly personalized, intuitive, and accessible for everyone, especially in digital spaces that can be empowering for individuals with disabilities.”

In June, the research team’s paper, “Drishti: A Generative AI-Based Application for Gesture Recognition and Execution,” will be published as a book chapter in a Springer Nature series. The paper is authored by Bhattacharya, with co-authors Meka and computer science graduate students Srikanth Ponaganti, Peddi Adithya Vardhan, and Irshad Ali Mohammad.

“I’m really excited about what this study means for individuals with speech and hearing impairments,” said Meka, who presented the Drishti research project at AUM’s Celebration of Research, Creative Activity, and Community Engagement event in April. “This technology has the potential to transform how people interact with the digital world.”
