

| Orator SDK: Russian speech synthesis (TTS) |
Applications
Speech synthesis technology is becoming increasingly important in society.
Speech synthesis systems are widely used in many solutions. The following list
of possible applications is short and far from exhaustive.
- Telecommunication:
  - IVR (Interactive Voice Response) systems;
  - Inquiry and information services;
  - Call centers;
  - Messaging systems;
  - Automatic announcement systems;
- Consumer and professional products:
  - Interactive web services with voice synthesis;
  - Talking pages;
  - Dialogue systems;
  - Technical support for complex tasks;
  - Foreign language acquisition and tutorial systems;
  - Electronic dictionaries and translators;
  - Speech-to-speech translation systems;
  - Radio and TV entertainment projects;
  - Advertising;
  - Document and speech preparation;
- Software for visually or speech impaired people:
  - Reading books, web pages and e-mails aloud;
  - Alternative access to information.
Technology
By definition, a text-to-speech system generates speech for a given text.
This simple definition hides numerous sub-tasks that call for effective
solutions from professionals in different fields: linguists, software
developers, sound engineers and many others.
There are several principal approaches to building text-to-speech systems.
We offer high-quality concatenative triphone synthesis. This approach strikes a
good balance between low memory requirements and natural-sounding, high-quality
synthesized speech.
Speech Technology Center, as one of the leading companies in the field of
speech technologies, has gathered a strong team of experienced professionals
to build a TTS system that outperforms its predecessors.
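To illustrate the concatenative triphone approach mentioned above, the
following minimal C++ sketch expands a phoneme sequence into triphone labels.
It assumes the common definition of a triphone as a phoneme taken together with
its left and right neighbours. The function name toTriphones, the "sil" padding
label and the left-center+right label format are illustrative assumptions; the
actual unit inventory and labelling of the STC engine are not described here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the STC SDK): expands a phoneme sequence
// into context-dependent triphone labels of the form "left-center+right".
// The label "sil" (silence) is assumed as the context at both utterance edges.
std::vector<std::string> toTriphones(const std::vector<std::string>& phonemes) {
    std::vector<std::string> units;
    units.reserve(phonemes.size());
    for (std::size_t i = 0; i < phonemes.size(); ++i) {
        // Left context: previous phoneme, or silence at the start.
        const std::string left = (i == 0) ? "sil" : phonemes[i - 1];
        // Right context: next phoneme, or silence at the end.
        const std::string right =
            (i + 1 == phonemes.size()) ? "sil" : phonemes[i + 1];
        units.push_back(left + "-" + phonemes[i] + "+" + right);
    }
    return units;
}
```

For the phoneme sequence d, o, m the function returns the labels sil-d+o,
d-o+m and o-m+sil. A concatenative synthesizer would then look up a recorded
unit for each label in its acoustic database and join the units.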
Implementation
STC TTS SDK 1.5
Text-to-Speech Software Development Kit 1.5.
This is a set of tools for integrating a speech synthesis system into your
software products. The kit comprises several DLLs, header files, examples
and the system documentation. The DLLs do not depend on a particular
development environment and can be integrated easily. The SDK ships with a
male synthesis voice. A broad acoustic database, developed jointly with
leading specialists of St. Petersburg State University, allows for the fullest
possible coverage of Russian phonetics. Using this database together with
comprehensive text processing and prosody modeling algorithms ensures high
naturalness of the synthesized speech.
STC TTS Engine 1.5
Text-to-Speech Engine 1.5
STC TTS Engine 1.5 is the kernel of the speech synthesis system developed at
Speech Technology Center. It is built in accordance with the Microsoft Speech
API 5.1 (www.microsoft.com/speech) recommendations. This makes it possible to
plug in the TTS system quickly in a standard way and to add new voices to it.
Fast and effective integration of the speech synthesis system into your
applications, together with natural-sounding synthesized speech, is the main
benefit of STC TTS Engine.
Features
- Fully compatible with SAPI 5.1;
- Full support of SAPI XML tags;
- SAPI lexicon support;
- Independent of the synthesis voice.
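As an illustration of the SAPI XML tag support listed above, the following
C++ sketch builds a marked-up string using the SAPI 5 rate and pitch tags
(absspeed and absmiddle attributes, each ranging from -10 to 10). The helper
names escapeXml and withRateAndPitch are hypothetical and not part of the SDK.
In a Windows application the resulting string would typically be passed to
ISpVoice::Speak with the SPF_IS_XML flag; that call is omitted here so that the
sketch stays portable.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: escapes the characters that are special in XML text,
// so that plain input cannot be mistaken for markup.
std::wstring escapeXml(const std::wstring& text) {
    std::wstring out;
    for (wchar_t c : text) {
        switch (c) {
            case L'&': out += L"&amp;"; break;
            case L'<': out += L"&lt;";  break;
            case L'>': out += L"&gt;";  break;
            default:   out += c;        break;
        }
    }
    return out;
}

// Hypothetical helper: wraps text in SAPI 5 <rate> and <pitch> tags.
// Both values are clamped to the -10..10 range documented for SAPI 5.
std::wstring withRateAndPitch(const std::wstring& text, int rate, int pitch) {
    rate  = std::max(-10, std::min(10, rate));
    pitch = std::max(-10, std::min(10, pitch));
    return L"<rate absspeed=\"" + std::to_wstring(rate) + L"\">"
         + L"<pitch absmiddle=\"" + std::to_wstring(pitch) + L"\">"
         + escapeXml(text)
         + L"</pitch></rate>";
}
```

Clamping and escaping are design choices of this sketch: they keep the markup
well-formed and within the documented range whatever the caller passes in.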
STC TTS Tools
This is a set of tools built on top of TTS SDK or TTS Engine, such as Orator,
Orator SP Edition, Personal Voice Settings, DicEditor and some others.
Orator
Orator is a Windows application for synthesizing speech from texts in .txt or
.rtf format. Its main purpose is to demonstrate the synthesis quality of
STC TTS SDK 1.5.
Orator SP Edition
Orator SP Edition is a Windows application for text-to-speech conversion. It
uses MS SAPI 5.x technology and works with an installed STC TTS Engine 1.5.
The system has a user-friendly interface and can save generated speech in mp3
and wav formats. Thanks to its flexible built-in bookmark system, Orator SP
Edition can also be used as a text editor with a personal library.
The standard pack includes the Personal Voice Settings module for tuning the
synthesis voice and DicEditor, a module for editing the stress dictionary.
Advertising Mode
Advertising Mode is a unique mode for the appropriate handling of advertising
information (ads, vacancies, etc.). It is supplied as an add-on module for
Orator SP Edition 1.5. It allows our corporate clients to create multi-channel
services for these tasks.
Advantages
- Most natural-sounding synthesized Russian speech;
- Natural voice timbre and prosody;
- 24 intonation models for different types of utterances
(questions, exclamations, etc.);
- Manual adjustment of the pitch and tempo of synthesized speech;
- Adjustable sampling frequency.
Additional Features
- Additional synthesis voices on client request;
- Implementation of any operating mode a client needs;
- Development of additional software packages.
Standard Pack Characteristics
- One male synthesis voice (sampled at 32 kHz);
- Spoken language as the working style;
- Vocabulary (www.aot.ru) of about 120 000 lemmas;
- Coverage of about 3 000 000 word forms.
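To give a rough sense of what the 32 kHz sampling rate means for generated
audio, the following C++ sketch computes the size of uncompressed PCM data.
Only the sampling rate comes from the characteristics above; the bit depth and
channel count are parameters here because they are not stated for the product,
and the helper name pcmBytes is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: size in bytes of uncompressed PCM audio.
// bitsPerSample and channels are assumptions supplied by the caller,
// not values documented for the STC voice. bitsPerSample is expected
// to be a multiple of 8.
std::uint64_t pcmBytes(std::uint32_t sampleRateHz, std::uint16_t bitsPerSample,
                       std::uint16_t channels, double seconds) {
    const std::uint64_t bytesPerSecond =
        static_cast<std::uint64_t>(sampleRateHz) * (bitsPerSample / 8) * channels;
    // Fractional bytes are truncated.
    return static_cast<std::uint64_t>(bytesPerSecond * seconds);
}
```

Assuming 16-bit mono audio, one second at 32 kHz takes 64 000 bytes and one
minute takes 3 840 000 bytes, not counting the wav file header.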
Implementation Specification
- Programming language: C++;
- Technologies involved: WinAPI, DLL, DSP, COM, ATL, SAPI.
System Requirements
- Processor: Pentium II or better;
- OS: Windows 95/98/Me/NT/2000/XP/2003;
- RAM: 64 MB or more;
- HDD space required: 100 MB;
- Sound card;
- Audio output devices: loudspeakers, headphones, etc.
|