
Universal Text Recognition: A Wayfinding Tool for People with Visual Impairments

Principal researcher

Name: Prof. Erik G. Learned-Miller

Contact details: University of Massachusetts Amherst, Department of Computer Science, 140 Governors Drive, Amherst, MA 01003-9242, USA.
Tel: +1 413 545 2993
Email: elm@cs.umass.edu

Website: www.cs.umass.edu/faculty/directory/learned-miller_erik

Project details

Start date: 01/09/2007
End date: 31/08/2010

Description: Visually impaired individuals have achieved impressive autonomy using a combination of traditional aids (guide dogs and long canes) and more recent advances such as global positioning systems, reading devices for printed text (e.g., the Kurzweil-National Federation of the Blind text-to-speech reader), and other technologies.

Still, the desire or need to read street signs, storefront banners, marquees, and other forms of text that are ubiquitous in the world cannot be met without help from another person. Our goal is to develop software for reading text in complex indoor and outdoor environments. We believe this is the key missing piece of technology in a universal reader, a device that could assist those who are blind and visually impaired in navigating and operating in natural scenes and environments.

We propose the development of new algorithms and software for such a device. Specifically, the software will read text from digital camera input in highly diverse and complex environments, such as those found in street scenes or inside commercial buildings, and convert that text to speech or Braille for use by people who are blind.

We focus on three central issues:

Accuracy - The most fundamental task is to increase the accuracy of the basic text detection and recognition algorithms. Current systems simply do not recognize enough words to be practically useful.

Incorporating user input and goals - We aim to develop software mechanisms whereby a user can provide input that appropriately narrows the image analysis. For example, if the user specifies that he or she is seeking "coffee", then the search can be tailored to that request, lowering the detection threshold for text that matches the user's goal and pruning out irrelevant text.

Graceful failure - Just as important as increasing accuracy - perhaps even more important to the user of such a device - is to provide what we refer to as graceful failure, i.e. the minimization of harmful effects due to errors made by the device.
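The goal-conditioned filtering described above can be sketched in a few lines; the function name, score bands, and threshold values here are illustrative assumptions, not part of the project's actual software.

```python
def filter_detections(detections, goal_words=None,
                      base_threshold=0.7, goal_threshold=0.4):
    """Keep detected words whose recognition confidence clears a threshold.

    Words matching the user's stated goal (e.g. "coffee") are accepted at a
    lower threshold, while unrelated low-confidence text is pruned.
    `detections` is a list of (word, confidence) pairs; all numbers are
    hypothetical placeholders for a real detector's scores.
    """
    goal_words = {w.lower() for w in (goal_words or [])}
    kept = []
    for word, score in detections:
        threshold = goal_threshold if word.lower() in goal_words else base_threshold
        if score >= threshold:
            kept.append((word, score))
    return kept

# Example: a user searching for "coffee" in a cluttered street scene.
detections = [("COFFEE", 0.55), ("PARKING", 0.9), ("EXIT", 0.5)]
print(filter_detections(detections, goal_words=["coffee"]))
# → [('COFFEE', 0.55), ('PARKING', 0.9)]
```

Here "COFFEE" survives despite a middling score because it matches the user's goal, while the equally uncertain "EXIT" is pruned as irrelevant.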

This is a critical and often overlooked aspect of such systems. It is essential that the user of such a device not be misled into believing that the device is correct when it is not, for example, when crossing the street. Because the technical task of reading text in outdoor environments can be arbitrarily difficult, the device will inevitably make errors.

It is a primary goal to design software that produces feedback about the confidence level of returned results, and other cues that will mitigate the impact of errors. The user can then assess the reliability of the information provided by the device and make an intelligent decision about whether to accept the results, depending upon the specifics of the current situation.
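One simple way to surface such confidence feedback is to attach a qualitative cue to each result before it is spoken; the cue wording and score bands below are hypothetical, chosen only to illustrate the idea.

```python
def report_with_confidence(word, score):
    """Prefix a recognized word with a qualitative confidence cue so a
    text-to-speech front end conveys reliability, not just content.
    The band boundaries (0.9, 0.6) are illustrative assumptions."""
    if score >= 0.9:
        cue = "high confidence"
    elif score >= 0.6:
        cue = "possibly"
    else:
        cue = "uncertain"
    return f"{cue}: {word}"

print(report_with_confidence("WALK", 0.95))    # → "high confidence: WALK"
print(report_with_confidence("MAIN ST", 0.5))  # → "uncertain: MAIN ST"
```

A user hearing "uncertain: WALK" at a crosswalk can then decide to seek additional confirmation rather than trusting the device outright.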

Currently, people who are visually impaired must rely heavily on others who are sighted to travel to destinations that are important for everyday living. The goal of this project is to produce software for a device that can read (and speak) words on signs, placards, marquees, and storefronts to visually impaired users. Such a device would dramatically increase the independence and autonomy of such individuals.

Other organisations involved in this project

Last updated: 20/03/2010