Overview of SAPI 4.0

SAPI is designed to provide an API layer between applications that use speech technology and the speech engines. In this way, programs that use SAPI can take advantage of an upgraded version of a speech engine without having to be recompiled. An additional benefit is the ability to share speech resources between applications.

The SAPI Suite is a collection of tools, source code, documentation, and speech engines designed for developing text-to-speech and speech recognition applications. The SAPI SDK includes source code and binary files for a set of development tools.
You can download the SAPI SDK Suite from Microsoft's research site at the following address: http://www.research.microsoft.com/research/sdk/

Text-to-speech is a technology that enables written text to be spoken through the sound card in your computer. Voice recognition is a complementary technology that enables voice commands to direct the flow of a program or the actions of your computer. These voice commands are issued through a microphone attached to the sound card.

The Software Development Kit (SDK) provides everything you need to begin using the Speech API (except the microphone). The SDK contains a large number of samples written in Visual Basic, C++, and Java. It also includes six ActiveX controls that enable you to create applications ranging from simple speech recognition to a voice-operated word processor. The ActiveX controls included in the Speech API SDK are described below.
The controls that are prefixed with Direct give access to the complete Speech API. These controls load the speech engines in-process. For developers seeking the maximum amount of control and flexibility, these are the controls to use.

The Voice Command and Voice Text controls do not provide nearly the level of access to the API that their Direct counterparts do. However, what you lose in flexibility is made up for by the ability to develop applications with just a few lines of code. These controls are out-of-process controls that provide for resource and memory sharing between voice applications. If you are planning to use speech in a real-world application, you will most likely want to use the Direct versions, as they are quicker.

If you are planning to create a new word processor that responds to voice commands, the Dictation control is an excellent place to start. This control enables you to add features such as inverse text and word correction.

The Telephone control is designed to aid you in the creation of telephony applications. This control combines voice synthesis, voice recognition, and DTMF into a single ActiveX control.
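For readers unfamiliar with DTMF (the touch-tone signaling that the Telephone control handles alongside speech), the following is a minimal Python sketch of the standard keypad layout, in which each key is signaled as a pair of tones: one row frequency and one column frequency. It is only an illustration of what DTMF is. It does not use or represent the Speech API, and the names `ROW_HZ`, `COL_HZ`, `KEYPAD`, and `dtmf_frequencies` are hypothetical helpers introduced here.

```python
from typing import Tuple

# Standard DTMF keypad frequencies in Hz (illustration only, not SAPI code).
ROW_HZ = (697, 770, 852, 941)        # one low-group tone per keypad row
COL_HZ = (1209, 1336, 1477, 1633)    # one high-group tone per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> Tuple[int, int]:
    """Return the (row, column) tone pair in Hz for a single keypad key."""
    if len(key) != 1:
        raise ValueError("expected exactly one keypad character")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key.upper())
        if col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError("not a DTMF key: %r" % key)


if __name__ == "__main__":
    # Show the tone pair a caller would send for each key of a menu choice.
    for key in "10#":
        print(key, dtmf_frequencies(key))
```

A telephony application typically maps each detected key to a menu action, and falls back to voice recognition when the caller speaks instead of pressing a key.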