Beijing Infoquick Sinovoice Speech Technology Corp.

 


 

OpenSpeech Recognizer

Optimizations for VoiceXML

OpenSpeech Recognizer (OSR) is the first speech recognition engine optimized for use in VoiceXML systems. VoiceXML is a markup language defined by the World Wide Web Consortium (W3C) for specifying telephony speech applications. OSR does not provide an interpreter that processes the VoiceXML language. Rather, it includes many features that simplify construction of such an interpreter and improve its performance. OSR is equally suited to non-VoiceXML systems.

VoiceXML is not yet a formal standard but is already growing in importance in open system architectures for speech. Compared with traditional approaches to speech application deployment, executing VoiceXML applications places unique demands on the speech recognition engine, particularly with respect to dynamic grammar handling and specific technical requirements. OSR is designed specifically to meet these challenges with features such as:

•  Grammar Definition: OSR includes native support for the Speech Recognition Grammar Specification (SRGS) file format required by VoiceXML. A tool is provided to convert grammars written in the Augmented Backus-Naur Form (ABNF), used by older SpeechWorks speech recognition products, into SRGS format.

•  Grammar Loading: OSR refers to grammars using a Uniform Resource Identifier (URI) as required by VoiceXML. OSR will fetch the grammar if it resides on a remote system. Grammars are stored in a two-level memory and disk cache for greater efficiency.

•  Dynamic Grammars: OSR automatically compiles grammars when necessary. The OSR grammar compiler is extremely fast, processing several thousand words without noticeable delay. Larger grammars can be provided in pre-compiled form. OSR can also be configured to use a centralized grammar compilation server if desired.

•  Parallel Grammars: OSR allows multiple grammars to be loaded in parallel, as suggested by VoiceXML, without the use of a "wrapper grammar" to combine them. This improves efficiency by eliminating compilation of a single combined grammar, particularly when some of the included grammars are large. OSR allows compiled and source grammars to be mixed.

•  ECMAScript Support: OSR supports ECMAScript embedded in grammars as required by VoiceXML, allowing for application-specific processing during the recognition process. ECMAScript is a standard, general purpose scripting language. Scripting is most often used to compute return values but can also be used to prune illegal grammar paths.

•  DTMF Grammars: OSR will process DTMF grammars as required by VoiceXML. Note that OSR does not decode the audio signal itself but requires any detected DTMF to be passed in as symbols. By loading DTMF grammars in parallel with speech grammars, OSR can give callers the option of speaking or keying input.

•  Built-in Grammars: OSR includes the seven built-in grammars required by VoiceXML to handle common tasks (boolean, currency, date, digits, number, phone number, and time).

•  Result Format: OSR returns results in the Natural Language Semantic Markup Language (NLSML) format proposed by VoiceXML for semantic interpretation.
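The Grammar Loading item above mentions a two-level memory and disk cache keyed by grammar URI. The following is only a minimal sketch of that general idea in Python, not OSR's actual implementation or API: the class name TwoLevelGrammarCache, the injected fetch callable, and the example URI are all hypothetical and chosen for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class TwoLevelGrammarCache:
    """Hypothetical URI-keyed cache: check memory, then disk, then fetch."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]) -> None:
        self._memory: Dict[str, bytes] = {}   # level 1: in-process dictionary
        self._cache_dir = cache_dir           # level 2: files on local disk
        self._fetch = fetch                   # assumed loader for remote grammars
        self._cache_dir.mkdir(parents=True, exist_ok=True)

    def _disk_path(self, uri: str) -> Path:
        # Hash the URI so any URI maps to a safe, fixed-length file name.
        digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
        return self._cache_dir / digest

    def get(self, uri: str) -> bytes:
        # Level 1: memory hit.
        if uri in self._memory:
            return self._memory[uri]
        # Level 2: disk hit, otherwise fetch and store on disk.
        path = self._disk_path(uri)
        if path.exists():
            data = path.read_bytes()
        else:
            data = self._fetch(uri)
            path.write_bytes(data)
        # Promote to memory so the next lookup is a level-1 hit.
        self._memory[uri] = data
        return data

    def drop_memory(self) -> None:
        # Simulates a process restart: only the disk level survives.
        self._memory.clear()


def demo() -> List[str]:
    """Return the list of URIs that actually had to be fetched."""
    fetched: List[str] = []

    def fake_fetch(uri: str) -> bytes:
        # Stand-in for a remote fetch; records each call.
        fetched.append(uri)
        return b"<grammar/>"

    uri = "http://example.com/yesno.grxml"  # illustrative URI only
    with tempfile.TemporaryDirectory() as tmp:
        cache = TwoLevelGrammarCache(Path(tmp), fake_fetch)
        cache.get(uri)        # miss on both levels: fetched once
        cache.get(uri)        # served from memory
        cache.drop_memory()
        cache.get(uri)        # served from disk, no second fetch
    return fetched
```

In this sketch the grammar is fetched only on the first request; later requests are served from memory, or from disk once the memory level has been cleared. A real engine would also need expiry and invalidation rules, which are left out here.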

Language Availability

OSR supports multi-language recognition, which allows the caller to speak any one of several languages in a single recognition, or to mix languages within a single recognition.

OSR standard language packs are designed for a broad range of callers and support a wide variety of applications. Standard language packs include:

•  zh-TW Mandarin, Taiwan

•  ce-HK Cantonese, Hong Kong; includes English for multilingual recognition

•  de-DE German, Germany

•  en-AU English, Australia

•  en-NZ English, New Zealand

•  en-UK English, United Kingdom; will be renamed en-GB in a future release

•  en-US English, United States

•  en-SG English, Singapore

•  es-US Spanish, United States

•  es-MX Spanish, Mexico

•  fr-CA French, Canada

•  fr-FR French, France

•  ja-JP Japanese, Japan

•  ko-KR Korean, Korea


Copyright © 2001-2008, Beijing Infoquick Sinovoice Speech Technology Corp. All Rights Reserved.