Speech for Seniors
Among the valuable functions that computers have brought to our lives are enhancing the ability to interact socially with people who are not present and managing the details of everyday life. However, these functions are not currently available in any real sense to many of the very people who would most benefit from them. Elderly and disabled people in particular are often unable to use computers because of the complexity and unnaturalness of current user interfaces. In addition, people with perceptual and motor impairments often find using a computer difficult and uncomfortable. Better interfaces could greatly improve these users' independence and reduce their social isolation.
Although human-computer interaction has been a topic of academic study for many years, today's human-computer interfaces are almost entirely limited to keyboard input, which was first used in the 1950s, and the graphical mouse input styles that were designed in the 1980s. Speech is generally not part of today's user interfaces, even though the ability to speak is a capability that many people with sensory and motor impairments retain. Speech interfaces have long been predicted to make human-computer interaction dramatically more natural, but this potential has not been realized. Nevertheless, speech recognition technology has made tremendous strides in the last few years, and I think it's time to reconsider how it could be applied to improve the lives of seniors by enabling more natural user interfaces.
At this year's SpeechTEK conference I moderated a session on Designing Speech and Multimodal Applications for Senior Users. I talked about a few general considerations involving the physical, perceptual and cognitive changes associated with aging that affect the design of speech and multimodal applications. There was also an interactive discussion in which some audience members brought up their own experiences. Key points from the discussion included:
- Participants had found it difficult to find acoustic data for training and testing recognizers on older people's voices.
- Some participants found that older users prefer a more formal interaction style than younger users.
- Slower speech rates and longer timeouts are useful in applications for older users.
- The question of male vs. female computer voices was interesting: some data suggested that male voices are more intelligible than female voices, but some participants' testing showed a preference for female voices.
- One participant described an outbound application that checked in with older users, and found that the users didn't feel they were receiving second-class service because a computer rather than a person was calling them.
- Multimodal interfaces can provide supplementary sources of information for users who have difficulty hearing or seeing. In addition, graphical vs. spoken information isn't an either/or choice. People who have difficulty seeing and hearing can benefit from having information presented both visually and auditorily.
How Can Speech Help Seniors?
There are at least two ways that speech can help seniors interact with technology. One is simply to provide voice access to functions that would otherwise require a keyboard, mouse or screen. This might help people who have arthritis or other mobility impairments, or who have visual impairments. Dictation applications like Dragon NaturallySpeaking or Windows Speech Recognition can be used in this way. They do work, especially for highly motivated users, but in essence they provide a roundabout way of accessing what is basically a visual user interface, and they are not particularly natural.
The second, and much more revolutionary, way that speech could help seniors is through applications that are truly multimodal from the ground up and that seamlessly integrate the advantages of speech with the advantages of the visual interface. This might seem daunting, because there are millions of computer applications in existence. Does every one of them have to be redesigned from the ground up as a multimodal application? No, because there are just a few functions that most non-professional users need the vast majority of the time. For example, most people want to read and write email, keep track of their schedules and do searches. They don't need to debug C programs or install network routers. A good multimodal application that helped people with these basic everyday tasks would be very useful.