



Scientific reviews

21.10.2008 Morphological Random Forests for Language Modeling of Inflectional Languages

In this paper, we investigate decision trees (DTs) and random forests (RFs) for language modeling in Czech LVCSR. We show that the RF approach can be successfully applied to language modeling of an inflectional language. The performance of word-based and morphological DTs and RFs was evaluated on a lecture recognition task. While DTs perform worse than conventional trigram language models (LMs), RFs outperform them. Morphological RFs reduce WER by up to 3.4% relative and perplexity by 10% compared with the trigram model. Interpolating the DT and RF LMs with the trigram LM yields further improvement (relative reductions of up to 15.6% in perplexity and 4.8% in WER). We also investigate the distribution of morphological feature types chosen for splitting data at different levels of the DTs.
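The interpolation and perplexity figures mentioned in the abstract can be illustrated with a minimal sketch. It assumes simple linear interpolation with a single weight; the abstract does not state which interpolation scheme or weight the authors used, and the function names and probabilities below are hypothetical.

```python
import math

def interpolate(p_model, p_trigram, lam):
    """Linearly interpolate two probabilities of the same word given the same history.

    lam is the weight of the first model (an assumed, hypothetical parameter).
    """
    return lam * p_model + (1.0 - lam) * p_trigram

def perplexity(word_probs):
    """Perplexity of a word sequence from its per-word probabilities."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))

# Hypothetical per-word probabilities, not taken from the paper.
rf_probs = [0.10, 0.02, 0.30, 0.05]
trigram_probs = [0.08, 0.04, 0.20, 0.01]
mixed_probs = [interpolate(a, b, 0.5) for a, b in zip(rf_probs, trigram_probs)]

for name, probs in [("rf", rf_probs), ("trigram", trigram_probs), ("mixed", mixed_probs)]:
    print(name, perplexity(probs))
```

In practice the weight would be tuned on held-out data rather than fixed at 0.5.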

21.10.2008 Inflectional Language Modeling with Random Forests for ASR

In this paper we show that the Random Forest (RF) approach can be successfully implemented for language modeling of an inflectional language for Automatic Speech Recognition (ASR) tasks. While Decision Trees (DTs) perform worse than a conventional trigram language model (LM), RFs outperform the latter. WER (up to 3.4% relative) and perplexity (10%) reduction over the trigram model can be gained with morphological RFs. Further improvement is obtained after interpolation of DT and RF LMs with the trigram one (up to 15.6% perplexity and 4.8% WER relative reduction).
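As a reading aid for the "relative" figures, the sketch below shows how a relative reduction is computed from a baseline and an improved value. The WER values in the example are hypothetical and chosen only to reproduce a 3.4% relative reduction; the abstract does not report absolute WERs.

```python
def relative_reduction(baseline, improved):
    """Relative reduction in percent: how much of the baseline value was removed."""
    return 100.0 * (baseline - improved) / baseline

# Hypothetical absolute WERs (in percent), not taken from the paper.
baseline_wer = 25.0
rf_wer = 24.15
print(round(relative_reduction(baseline_wer, rf_wer), 2))
```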

20.10.2008 Large Scale Russian Hybrid Unit Selection TTS

This paper outlines a project on the development of a new hybrid unit-selection and concatenative Russian TTS system. The project is carried out within the Federal Research and Development Program in Priority Directions of Development of Scientific and Technological Complex of Russia in 2007-2012. The major aim of the project is a new-generation Russian TTS system that makes use of syntactic and semantic analysis and can be implemented in various types of electronic devices.

18.10.2007 Outline of a New Hybrid Russian TTS System

This paper outlines a recently started project on the development of a new hybrid unit-selection and concatenative Russian TTS system. The project is carried out within the Federal Research and Development Program in Priority Directions of Development of Scientific and Technological Complex of Russia in 2007-2012 (http://www.fcntp.ru/). Major features of the proposed system are presented. The main aim of the paper is to stimulate a wide scientific discussion that would help to improve the system at the early stages.

16.10.2007 Eigen Channel Method for Text-Independent Russian Speaker Verification

This paper presents a method for compensating for session variability in text-independent speaker verification. It is based on maximum likelihood estimation for modeling speaker sessions. The method is shown to reduce the verification error by 21% for 4-second and by 36% for 20-second test segments compared to the GMM-UBM baseline. The evaluation was performed on conversational speech recorded in GSM channels.

16.10.2007 Using Parameters of Identical Pitch Contour Elements

A formalized approach to pitch-based speaker discrimination using identical components of pitch contour structure is presented. The designed list of pitch units includes 7 basic unit types (16 subtypes). Each unit is described with a set of relevant pitch parameters. The effectiveness of three unit types (nuclear fall, nuclear rise and a filled hesitation pause) was tested on a speech corpus of 10 male speakers, first on 2-session and then on 3-session data. The results show that certain pitch parameters have discriminating potential. The lowest EER values were obtained for the so-called “physical” F0 parameters of a rising nucleus (18% for F0 minimum in a 2-session comparison and 22% for F0 mean in a 3-session comparison). The fusion of all parameters for the three contour unit types produced an EER of 13% in a 3-session comparison.
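For readers unfamiliar with the EER figures quoted above, here is a minimal sketch of how an equal error rate can be estimated from comparison scores. It assumes higher scores mean "same speaker" and sweeps thresholds over the observed scores; the function name and the scores are hypothetical, and the abstract does not describe the authors' scoring or fusion procedure.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER as the operating point where the false acceptance
    rate and the false rejection rate are closest to each other."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: a same-speaker score falls below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker score reaches the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer

# Hypothetical scores, not taken from the paper.
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(same_speaker, different_speaker))
```

With only a handful of trials the estimate is coarse; larger trial lists give a smoother error curve.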

15.10.2007 Phone Recognition driven Method for Creating Context-Dependent Phones

Progress in the development of the Large Vocabulary Speech Recognition System created at Speech Technology Center is presented in this paper. The most widely used method for creating context-dependent phones is based on growing a decision tree with branches defined by binary questions about neighboring phones. The list of questions may vary and is to some extent arbitrary. The decision to split is based on the change in entropy.
A new method based on recognition scores of phones is proposed. At each step of the algorithm, additional models are introduced whenever existing models of monophones or triphones are poorly recognized. Retraining the new model redistributes the training data among all models.
The problem of unseen triphones is solved by introducing a measure of similarity of contexts and by clustering monophones according to this measure.
Context-dependent phones obtained with this method were tested on the task of keyword spotting. This limited task was chosen because of known, not yet properly solved problems in creating a decoder for LVCSR.
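The conventional decision-tree approach mentioned at the start of this abstract can be sketched as follows. This is a simplified illustration that scores a binary question about the left neighbor phone by entropy reduction over discrete class labels; real systems typically score splits on acoustic model statistics, and the phone sets, labels and function names here are hypothetical. It does not represent the new recognition-score-based method, whose details the abstract does not give.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split_gain(samples, question):
    """Entropy reduction from splitting samples with a yes/no question.

    samples: list of (left_phone, label) pairs.
    question: function mapping a left phone to True or False.
    """
    yes = [label for left, label in samples if question(left)]
    no = [label for left, label in samples if not question(left)]
    if not yes or not no:
        return 0.0  # The question does not separate the data at all.
    labels = [label for _, label in samples]
    n = len(labels)
    weighted = (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)
    return entropy(labels) - weighted

# Hypothetical data: the left neighbor phone and a cluster label for the center phone.
samples = [("m", "A"), ("n", "A"), ("p", "B"), ("t", "B")]
is_nasal = lambda left: left in {"m", "n"}
print(split_gain(samples, is_nasal))
```

A tree grower would evaluate every question in its list this way and split on the one with the largest gain.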





07.05.2010

Following the overwhelming success of the SpeechTEK event in the US, SpeechTEK Europe launches in London on 26 & 27 May 2010. SpeechTEK Europe will feature two packed days of conference sessions, keynotes and case studies plus an exhibit hall showcasing leading vendors and solutions providers, dedicated to bringing European buyers and sellers together in a focused setting.


06.05.2010

The 1st EU–Russia Innovation Forum will be held on 25–27 May 2010 in the City of Lappeenranta. The events will take place at the Lappeenranta City Hall and at the Lappeenranta University of Technology.


25.03.2010

Speech Technology Center invites you to visit our booth at DSA 2010, which has now become one of the world's top 5 defense and security exhibitions and firmly remains the Asia Pacific region's most vital procurement hub for Defense and Security.





Speech Technology Center © 2007-2009