G. Dreyfus, O. Macchi, S. Marcos, O. Nerrand, L. Personnaz, P. Roussel-Ragot, D. Urbani, C. Vignat
Adaptive Training of Feedback Neural Networks for Non-Linear Adaptive Filtering,
Neural Networks for Signal Processing II, 550 (IEEE, 1992).

Abstract
The paper proposes a general framework which encompasses the training of neural networks and the adaptation of filters. It is shown that neural networks can be considered as general non-linear filters which can be trained adaptively, i.e. which can undergo continual training. A unified view of gradient-based training algorithms for feedback networks is proposed, which gives rise to new algorithms. The use of some of these algorithms is illustrated by examples of non-linear adaptive filtering and process identification.

Click here to download the document: postscript (or ps.gz) or .pdf.

I. Rivals, L. Personnaz, G. Dreyfus, D. Canas
Modeling and Control of a Wheeled Mobile Robot Using Recurrent Neural Networks: an Application to the Path Following Problem
International Workshop on Neural Networks for Computing (1994)

Abstract
We address the problem of controlling an autonomous vehicle using recurrent neural networks for the path following problem. Our testbed is the full-scale outdoor robot REMI (Robot Evaluator for Mobile Investigations), a standard four-wheel-drive truck that has been equipped with actuators for the steering wheel, the throttle, the brakes and the gear, and with the sensors needed for navigation (an inertial dead-reckoning unit and an odometer).
Wheeled mobile robots are highly non-linear dynamical systems, their kinematics involving geometrical non-linearities and their actuators introducing dynamical non-linearities, such as saturations and the non-linear dynamics of the thermal motor. Thus, the identification of such processes requires non-linear modeling methods. We therefore start by presenting the identification of REMI's dynamical behaviour using recurrent neural networks.
We then outline a general framework for the training of neural controllers for non-linear dynamical systems, using a reference model, a model of the process to be controlled, and a neural controller implementing a state-feedback control law. We apply the principles presented in this framework to the control of the vehicle for the path following problem. The problem is to steer the vehicle along a known reference path with a predetermined velocity profile. We express the path following problem in terms of controlling the vehicle orientation, the velocity being imposed either by a human operator or by a neural speed-controller which is designed and trained independently. The aim is to derive a state-feedback control law for the steering command so as to (i) have the vehicle reach and follow the path and (ii) have the orientation of the vehicle tangent to the path when the vehicle follows the path.
Simulation results illustrating the training and the behaviour of the control system designed are shown. This control system has also been successfully tested with the robot REMI on various trajectories, on smooth and rough terrain, and we present the corresponding real-time experimental results, as well as comparisons with the classical approaches previously used.

B. Quenet, G. Dreyfus, C. Masson
Towards an Analytically Tractable Model of the Olfactory System
International Workshop on Neural Networks for Computing (1996). Invited presentation.

Abstract
We present an analytically tractable model for the formation of olfactory images in the glomerular system. The essential features of the model are the following:
- inhibition is dominant; this is expressed in the model by the fact that all inter- and intraglomerular connections are inhibitory;
- each family of receptors projects essentially onto a single glomerulus; this is expressed in the model by the fact that there is a one-to-one excitatory connection from receptors to glomeruli.
In addition, the activity of each glomerulus is modeled as that of a formal inhibitory neuron. For simplicity, all delays and all synaptic weights are assumed to be equal; synchronous dynamics is investigated.
This approach gives an understanding of the representation space of the stimuli: we show that the system is able to extract time-invariant key features of the input signal, thereby producing stable "images" at the glomerular level. This property results in an intrinsic robustness to the noise present in the signal. The existence of a Lyapunov function provides a tool for studying the dynamical behaviour of the system; the influence of synaptic noise is investigated.
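
The synchronous dynamics described above can be illustrated with a small sketch. It is not the authors' model: the threshold rule, the zero initial state, the weight value and the receptor inputs are all assumptions chosen for illustration, and the names `step` and `find_cycle` are hypothetical. The sketch only shows that a finite network of binary units with uniform inhibitory coupling (self-inhibition included) and one-to-one excitatory inputs settles into a cyclic sequence of states.

```python
import numpy as np

def step(state, receptor_input, weight=1.0):
    """One synchronous update of all binary units (assumed threshold rule)."""
    # Every unit receives the same inhibition from all units, itself included.
    inhibition = weight * state.sum()
    # One-to-one excitation: unit i sees only receptor input i.
    return (receptor_input - inhibition > 0).astype(int)

def find_cycle(receptor_input, weight=1.0, max_steps=1000):
    """Iterate from the all-zero state and return the cyclic part of the trajectory."""
    receptor_input = np.asarray(receptor_input, dtype=float)
    state = np.zeros(len(receptor_input), dtype=int)
    first_seen = {}
    trajectory = []
    for t in range(max_steps):
        key = tuple(int(v) for v in state)
        if key in first_seen:
            # The state repeats, so everything since its first visit is the cycle.
            return trajectory[first_seen[key]:]
        first_seen[key] = t
        trajectory.append(key)
        state = step(state, receptor_input, weight)
    raise RuntimeError("no cycle found within max_steps")

# Illustrative constant receptor activities (assumed values).
cycle = find_cycle([0.5, 1.5, 3.5])
print(cycle)
```

With these assumed inputs the network passes through two transient states and then alternates between two glomerular patterns; the unit with the strongest input is active in both.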

B. Quenet, G. Dreyfus, V. Cerny
The Effect of the Synaptic Noise on the Coding Properties of a Model of the Olfactory System: Analytical study and simulations
International Workshop on "Machines that Learn" (1997).

Abstract
The deterministic version of a model of the glomerular stage in the olfactory tract has revealed an interesting behaviour, which consists in spontaneously extracting key features from fluctuating signals. In the model, binary neurons represent the glomeruli, each connected to the others by inhibitory synapses. Receptor neurons send excitatory connections to the glomerular neurons according to a one-to-one correspondence. The model performs a mapping from the space of all possible activities of the receptors to the space of spatio-temporal patterns of the glomerular activities (glomerular images). The simplicity of the model, with all delays and all synaptic weights equal, allows an analytical approach which has already led to an in-depth understanding of its dynamics.
We have proved that the coding principles are conserved when synaptic noise is added; indeed, the limit probability vector of the glomerular activities tends to the cyclic steady states that minimize a Lyapunov function. We observe three noise regimes: (1) a low noise regime, where the only steady states are those that minimize the Lyapunov function, thereby enhancing the efficiency of the extraction of key features with respect to the noise-free regime; (2) a medium noise regime, where the mean activity of the glomeruli comes close to the mean activity of the receptors themselves; and (3) a high noise regime, where all coding properties are blurred out. To summarize, a modulation of the synaptic noise may lead the model to perform a more or less sketchy representation of the receptor activities, the most efficient key feature extraction occurring in the low noise regime.

P. Roussel, F. Moncet, B. Barrieu, A. Viola
Modélisation d'un processus dynamique à l'aide de réseaux de neurones bouclés. Application à la modélisation de la relation pluie-hauteur d'eau dans un réseau d'assainissement et à la détection de défaillance de capteur.
Innovative technologies in urban drainage, 1, 919-926, G.R.A.I.E.

Abstract
Systematic measurements of the rainfall as well as of the resulting water height, flow and velocity in the pipes, have been performed for several years in the sewer system of a French department. Due to possible electrical failures or slow sensor drifts, a validation procedure must be performed in order to guarantee the validity of the stored measurements.
To this end, we have developed a neural dynamic black-box model of the rainfall-water height relationship on a simple urban catchment equipped with a single water gauge and a single rainfall gauge. The validity of the measurements is assessed from the comparison between the measured heights and the corresponding values predicted by the model.

B. Quenet, S. Sirapian, R. Dubois, G. Dreyfus
Generation of Olfactory Neural Codes by a Network of Hodgkin-Huxley Neurones
4th International Workshop on Neural Coding, Plymouth, UK (2001)

Networks of synchronously updated McCulloch & Pitts neurones spontaneously exhibit complex spatio-temporal patterns that can be compared to the activities of biological neurones in phase with a periodic LFP, as demonstrated experimentally by Wehr & Laurent in the locust olfactory pathway. Modelling biological neural nets with networks of very simple formal units makes the dynamics of the model analytically tractable. It is thus possible to determine the constraints that must be satisfied by its connection matrix and inputs in order to make its neurones exhibit a given sequence of activity. In the presentation, we address the following question: once a formal network has been built that quantitatively reproduces experimentally observed neuronal codes, can it serve as a guide to design a network of more realistic (Hodgkin-Huxley) formal neurones that exhibits the same dynamical behaviour? We demonstrate that such a strategy is indeed fruitful: it allowed us to design a model that reproduces the Wehr-Laurent olfactory codes, and to investigate the robustness of these codes to synaptic noise.

B. Quenet, D. Horn, R. Dubois, S. Sirapian, G. Dreyfus
Dynamic Neural Filter of Hodgkin-Huxley units
International Workshop on Learning, Snowbird (2002)

Abstract
The coding properties of a Dynamic Neural Filter (DNF) made of McCulloch and Pitts units can be completely described analytically, even in the presence of intrinsic noise. The main property of this type of network is its ability to map stable input vectors to spatiotemporal sequences of the neuronal activity. Two major problems can be solved with such a DNF: (1) the direct problem, which investigates the emerging spatiotemporal pattern, given the connection matrix and the input, and (2) the inverse problem, which investigates the family of connection matrices and inputs that can elicit a given spatiotemporal pattern. The existence of such a family is not guaranteed: the emergence of a given spatiotemporal pattern may require the existence of hidden neurons. Given an arbitrary temporal sequence of binary activities, we suggest here a method to (1) build an appropriate activity pattern for a parsimonious number of hidden neurons, when they are necessary, and (2) find the family of matrices and inputs compatible with the given sequence. Such a modeling approach may be useful when applied to experimental data recorded in biological neurons, once they are represented as binary temporal sequences of activities. This method provides a number of hidden neurons, which is an index of the complexity of the observed neural task, and also provides a numerical guide for the design of a model network of more biologically plausible formal neurons, such as Hodgkin-Huxley ones. We have therefore developed a tool that combines the advantages of an analytical approach and of biologically plausible modeling, allowing us to reproduce exactly experimentally observed sequences with spiking neurons. In this presentation, we illustrate our method with experimental data recorded by Wehr and Laurent in the locust antennal lobe.

Long-term electrocardiographic records (Holter) are an important tool in non-invasive electrocardiology. A 24-hour record contains at least 100,000 heart beats, but only a few of them may express a heart anomaly. Therefore, a fully automated analysis is desirable as a computer-aided diagnosis tool.

To that effect, we suggest a mathematical decomposition of each heart beat on a specific family of regressors ("bumps"). Each bump has five adjustable parameters. Unlike conventional regressors (wavelet, RBF,...), bumps are designed to fit the usual cardiac "waves" that are defined by cardiologists; the shape and position of the waves are the basis of the experts' diagnostics. Since each wave is fitted by a single regressor (or possibly two), the number of parameters needed to model the relevant information is parsimonious, and the decomposition meets the intelligibility requirements of automated medical diagnostics tools.
Modeling a complete heart beat with N bumps requires N iterations of the following algorithm:

This approach was tested on several international databases, showing that the number of bumps needed to model accurately the medically significant waves is at most N = 6.

G. Dreyfus, Y. Oussar
Non-linear black-box model selection (invited plenary presentation)
Neural Networks for Signal Processing (2003).

Neural networks, and, more generally, nonlinear-in-their-parameters models, are recognized tools for engineers; the basic issues in the training of those models may be considered as essentially solved. However, there are more general issues, not specific to neural networks, which are still open. One of them has become all-important as new areas of applications for nonlinear modeling are opening up: the problem of model selection. That includes:

The presentation will emphasize a model design methodology and recent developments in model selection, illustrated by academic examples and by industrial applications.

A. Goulon-Sigwalt-Abram, A. Duprat, G. Dreyfus
Graph Machines and Their Applications to Computer-Aided Drug Design: a New Approach to Learning from Structured Data
Unconventional Computing 2006, Lecture Notes in Computer Science, vol. 4135, pp. 1-19, Springer (2006)

Abstract

The recent developments of statistical learning focused on vector machines, which learn from examples that are described by vectors of features. However, there are many fields where structured data must be handled; therefore, it would be desirable to learn from examples described by graphs. Graph machines learn real numbers from graphs. Basically, for each input graph, a separate learning machine is built, whose algebraic structure contains the same information as the graph. We describe the training of such machines, and show that virtual leave-one-out, a powerful method for assessing the generalization capabilities of conventional vector machines, can be extended to graph machines. Academic examples are described, together with applications to the prediction of pharmaceutical activities of molecules and to the classification of properties; the potential of graph machines for computer-aided drug design is highlighted.

Olivier Romain, Bruce Denby
Prototype of a software-defined broadcast media indexing engine
International Conference on Acoustics, Signal and Speech Processing (ICASSP), Honolulu (2007)

Abstract

The article compares two approaches to the description of ultrasound vocal tract images for application in a "silent speech interface": one based on tongue contour modeling, and a second, global coding approach in which images are projected onto a feature space of Eigentongues. A curvature-based lip profile feature extraction method is also presented. Extracted visual features are input to a neural network which learns the relation between the vocal tract configuration and line spectrum frequencies (LSFs) contained in a one-hour speech corpus. An examination of the quality of the LSFs derived from the two approaches demonstrates that the Eigentongues approach has a more efficient implementation and provides superior results based on a normalized mean squared error criterion.

G. Dreyfus
Random probes for variable selection
Multiple Simultaneous Hypothesis Testing, Paris (2007)

Abstract

The random probe method for variable selection is a recently developed method, which provides direct control on Type I error and indirect control of Type II errors. We describe the principle of the method, discuss its limitations, and provide some typical applications.
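
The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea, under several assumptions: the probes are Gaussian random columns, candidates and probes are ranked together by Gram-Schmidt orthogonal forward regression, and selection stops once the fraction of probes already ranked exceeds a chosen risk (an empirical stand-in for the Type I error control mentioned above). The names `ofr_ranking`, `random_probe_selection`, `n_probes` and `risk` are hypothetical.

```python
import numpy as np

def ofr_ranking(X, y):
    """Rank the columns of X by Gram-Schmidt orthogonal forward regression."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    remaining = list(range(X.shape[1]))
    order = []
    while remaining:
        cols = X[:, remaining]
        col_norms = np.linalg.norm(cols, axis=0)
        y_norm = np.linalg.norm(y)
        # Squared cosine of the angle between each remaining column and the output.
        cos2 = np.zeros(len(remaining))
        ok = col_norms > 1e-10
        if y_norm > 1e-10:
            cos2[ok] = (cols[:, ok].T @ y) ** 2 / (col_norms[ok] ** 2 * y_norm ** 2)
        chosen = remaining.pop(int(np.argmax(cos2)))
        order.append(chosen)
        q = X[:, chosen]
        q_norm = np.linalg.norm(q)
        if q_norm > 1e-10:
            q = q / q_norm
            # Project the output and the remaining columns onto the orthogonal complement.
            y = y - (q @ y) * q
            X[:, remaining] -= np.outer(q, q @ X[:, remaining])
    return order

def random_probe_selection(X, y, n_probes=100, risk=0.05, seed=0):
    """Keep candidates ranked before too many random probes appear (assumed stopping rule)."""
    rng = np.random.default_rng(seed)
    n_samples, n_candidates = X.shape
    probes = rng.normal(size=(n_samples, n_probes))
    order = ofr_ranking(np.column_stack([X, probes]), y)
    selected, probes_seen = [], 0
    for index in order:
        if index >= n_candidates:          # this column is a probe
            probes_seen += 1
            if probes_seen / n_probes > risk:
                break
        else:
            selected.append(index)
    return selected

# Synthetic illustration: only candidates 0 and 3 influence the output.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)
selected = random_probe_selection(X, y)
print(selected)
```

On this synthetic data the two relevant candidates are ranked first; how many irrelevant candidates slip in before the stopping rule fires depends on the chosen risk and on the random draw.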

T. Hueber, G. Chollet, B. Denby, M. Stone, L. Zouari
Ouisper: Corpus Based Synthesis Driven by Articulatory Data
International Conference on Phonetic Science (ICPhS), Saarbrücken, Germany (2007).

Abstract

Certain applications require the production of intelligible speech from articulatory data. This paper outlines a research program (Ouisper: Oral Ultrasound synthetIc SPEech souRce) to synthesize speech from ultrasound acquisition of the tongue movement and video sequences of the lips. Video data is used to search in a multistream corpus associating images of the vocal tract and lips with the audio signal. The search is driven by the recognition of phone units using Hidden Markov Models trained on video sequences. Preliminary results support the feasibility of this approach.
Keywords: clinical phonetics, pathophonetics, speech synthesis, automatic speech recognition

A. Goulon, A. Duprat, G. Dreyfus
Virtual Leave-One-Out Estimation of Generalization Error
International Learning Workshop, Puerto Rico (2007)

Virtual leave-one-out is an attractive alternative to cross-validation for estimating the prediction error of models, especially nonlinear ones: training is performed on the available data, and an estimate of the prediction error that would have been incurred on each example, if it had been withdrawn from the training set, is computed. For linear models, the virtual leave-one-out score reduces to the PRESS statistic. The computation of the leave-one-out score involves the computation of the leverage of each example, which indicates the influence of each example on the model.
There is a growing interest in graph machines and recursive networks, which learn from structured data, i.e. examples that are described by graphs instead of vectors. Graph machines encode the structure of the graphs and simultaneously provide a prediction of the properties of interest. Therefore the representation of the structured data is learnt together with the learning of the task, which exempts the model designer from finding a vector representation for the data.
We show how the computation of leverages, hence of the virtual leave-one-out score, can be extended to graph machines and recursive networks. We compare the real and virtual leave-one-out scores on several data sets. We describe examples of graph machine selection by virtual leave-one-out, and we show that, in addition, virtual leave-one-out can provide insight into the design of the predictors, i.e. the encoding of the input data into directed acyclic graphs. We illustrate these topics on several regression tasks, e.g. the estimation of the Gibbs free energy of solvation of molecules, the toxicity of halogenated aliphatic compounds, or the agonist activities of ecdysteroids.
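
For the linear case mentioned above, where the virtual leave-one-out score reduces to the PRESS statistic, the computation can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code; it does not cover the nonlinear or graph-machine extension, and the function names are hypothetical. It uses the standard identity that the leave-one-out residual of a least-squares fit equals the ordinary residual divided by one minus the leverage, and checks it against explicit refitting.

```python
import numpy as np

def virtual_loo_linear(X, y):
    """Leave-one-out residuals of a least-squares fit, obtained without refitting."""
    # Leverages are the diagonal of the hat matrix, computed here from a thin QR.
    Q, _ = np.linalg.qr(X)
    leverages = np.sum(Q ** 2, axis=1)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ theta
    return residuals / (1.0 - leverages), leverages

def explicit_loo_linear(X, y):
    """Leave-one-out residuals obtained by refitting once per withdrawn example."""
    n = X.shape[0]
    loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        theta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        loo[i] = y[i] - X[i] @ theta
    return loo

# Synthetic regression problem (assumed values, for illustration only).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30)

virtual, leverages = virtual_loo_linear(X, y)
explicit = explicit_loo_linear(X, y)
press = float(np.sum(virtual ** 2))
print(press, np.allclose(virtual, explicit))
```

The leverages sum to the number of parameters (three here), so an example whose leverage is close to one dominates the fit of at least one parameter.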

Chollet, G., Landais, R., Hueber, T., Bredin, H., Mokbel, C., Perrot, P., Zouari
Some Experiments in Audio-Visual Speech Processing
Advances in Nonlinear Speech Processing, vol 4885, pp. 28-56, Springer (2007).

Abstract

Natural speech is produced by the vocal organs of a particular talker. The acoustic features of the speech signal must therefore be correlated with the movements of the articulators (lips, jaw, tongue, velum, ...). For instance, hearing impaired people (and not only them) improve their understanding of speech by lip reading. This chapter is an overview of audiovisual speech processing with emphasis on some experiments concerning recognition, speaker verification, indexing and corpus based synthesis from tongue and lip movements.

B. Denby, Y. Oussar, I. Ahriz
Geolocalisation in Cellular Telephone Networks
Proceedings of NATO 2007 Advanced Study Institute on Mining Massive Data Sets for Security, F. Fogelman-Soulié, D. Perrotta, J. Piskorski & R. Steinberger, Eds., IOS Press, pp. 357-365, Amsterdam (2008).

Abstract

The paper gives an overview of GPS and radio interface based geolocalisation techniques for cellular telephone networks, including the E911 and E112 initiatives, Location Based Services, and law enforcement/security applications. An example of localisation using the Database Correlation Method is also presented.

T. Hueber, G. Aversano, G. Chollet, B. Denby, G. Dreyfus, Y. Oussar, P. Roussel, M. Stone
Eigentongue feature extraction for an ultrasound-based silent speech interface
International Conference on Acoustics, Signal and Speech Processing (ICASSP), Honolulu (2007)

Abstract

T. Hueber, G. Chollet, B. Denby, G. Dreyfus, M. Stone,
Continuous-Speech Phone Recognition from Ultrasound and Optical Images of the Tongue and Lips
Interspeech, Antwerp (2007).

Abstract

The article describes a video-only speech recognition system for a "silent speech interface" application, using ultrasound and optical images of the voice organ. A one-hour audio-visual speech corpus was phonetically labeled using an automatic speech alignment procedure and robust visual feature extraction techniques. HMM-based stochastic models were estimated separately on the visual and acoustic corpus. The performance of the visual speech recognition system is compared to a traditional acoustic-based recognizer.
Index Terms: speech recognition, audio-visual speech description, silent speech interface, machine learning

P. Bouchet, R. Dubois, C. Henry, P. Roussel, G. Dreyfus,
Machine learning for shock decision in implanted defibrillators,
International Learning Workshop, Tampa (2009).

Abstract

The discrimination of Ventricular Tachycardia (VT) from Supra-Ventricular Tachycardia (SVT) remains a major challenge for appropriate therapy delivery in Implantable Cardioverter Defibrillators (ICDs). Unlike SVT, VT is a life-threatening arrhythmia that may lead to sudden death unless an appropriate shock is delivered. The discrimination in ICDs is performed from endocardial measurements of the electrical activity of the heart (EGM). Historically, only time intervals extracted from EGMs were used for the diagnosis. In the last decade, an additional analysis of features extracted directly from the shape of a single EGM channel led to improved performances, especially in order to avoid inappropriate shocks, which are very painful and stressful for patients. A recent study shows that inappropriate shocks occurred in 11.5% of the prophylactic ICD patients and accounted for 31.2% of the total shock episodes [1].
The discrimination method proposed here relies on the simultaneous analysis of two different ventricular EGM channels, available in most common implanted defibrillators. To this end, we have designed a two-dimensional representation of both a far-field (RVp-Can) and a near-field (RVp-RVd) EGM signal (Figure 1), named "Spatial Projection Of Tachycardia electrograms" (SPOT) (Figure 2). The SPOT curve of a cardiac cycle is the plot of the amplitude of the far-field sensing signal versus the amplitude of the near-field sensing signal, with time as a parameter. Features extracted in this representation space allow curve comparison. The underlying assumption is that the morphology of an SVT SPOT curve is similar to that of the reference curve constructed from the patient's normal EGMs, while the SPOT curve for a VT is different (Figure 2): this is justified by the fact that the electrical signals pertaining to normal heartbeats and to SVT heartbeats originate from the atria and follow the same electrical conduction pathway to the ventricles, while VT electrical signals, originating from the ventricles, have a different activation pattern, leading to a change in the morphology of the signals received by the electrodes.
Morphological features are extracted from the curves, and candidate features for statistical classification, based on physiological prior knowledge, are computed to compare arrhythmia and reference SPOT curves: the average angle of the relative velocity vectors, the correlation coefficient between the norms of the velocity and the correlation coefficient between their curvatures. These three features, and two additional timing descriptors, form a set of candidate features, on which statistical feature selection was performed by the random probe method [2]. Classifiers of various types and complexities (Linear, Polynomial, Neural networks, Support Vector Machines) were subsequently trained. Model selection was carried out by leave-one-out.
SVM classification on a database of 93 VT and 26 SVT episodes from 73 patients provided 95.7% sensitivity and 92.3% specificity. Therefore, a substantial improvement in sensitivity and specificity is expected from SPOT-based algorithms for VT/SVT discrimination, which should result in a more comfortable therapy and an improved quality of life for defibrillator-implanted patients.
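
As a reading aid, the reported percentages can be related to counts. The abstract gives only the totals (93 VT, 26 SVT) and the rounded rates; the counts of correctly classified episodes found below are inferred from those figures, not stated by the authors, and the helper name `consistent_counts` is hypothetical.

```python
def consistent_counts(total, reported_percent, decimals=1):
    """Integer counts k such that k / total rounds to the reported percentage."""
    return [k for k in range(total + 1)
            if round(100 * k / total, decimals) == reported_percent]

# Totals and rates as reported in the abstract.
vt_counts = consistent_counts(93, 95.7)    # correctly detected VT (sensitivity)
svt_counts = consistent_counts(26, 92.3)   # correctly recognised SVT (specificity)
print(vt_counts, svt_counts)
```

Under this reading, each rate is compatible with a single count: 89 of the 93 VT episodes and 24 of the 26 SVT episodes.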

Hai-Ni Qu, Y. Oussar, G. Dreyfus, Weng Xu
Regularized Recurrent Least Squares Support Vector Machines
International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing, Shanghai, 2009

Abstract

Support vector machines are widely used for classification and regression tasks. They provide reliable static models, but their extension to the training of dynamic models is still an open problem. In the present paper, we describe Regularized Recurrent Support Vector Machines, which, in contrast to previous Recurrent Support Vector Machine models, allow the design of dynamical models while retaining the built-in regularization mechanism present in Support Vector Machines. The principle is validated on academic examples; it is shown that the results compare favorably to those obtained by unregularized Recurrent Support Vector Machines and by regularized, partially recurrent Support Vector Machines.

M. Pernot, E. Macé, R. Dubois, M. Couade, M. Fink, M. Tanter,
Mapping myocardial elasticity changes after RF-ablation using Supersonic Shear Imaging,
Computers in Cardiology, Park City (2009).

Abstract

Supersonic Shear Imaging (SSI) is a new ultrasound-based technique for imaging non-invasively and quantitatively the elastic modulus of soft tissues. Monitoring tissue stiffness changes during Radio-Frequency Ablation (RFA) may quantify the size and shape of the ablation necrosis and therefore assess whether the RFA is complete. We propose to apply SSI to monitor myocardial elasticity and to evaluate its correlation with the RFA necrosis size in both in vitro and in vivo experiments.

R. Dubois, P. Roussel, M. Hocini, F. Sacher, M. Haissaguerre, G. Dreyfus,
A Wavelet Transform for Atrial Fibrillation Cycle Length Measurements,
Computers in Cardiology, Park City (2009).

Abstract

We describe a new algorithm for the estimation of Cycle Lengths (CL) in the atria. In the spirit of wavelet transforms, the algorithm correlates the EGM signal to a set of functions that are specifically designed to extract the cycle length present in the signal. This provides a CL vs. time map, which is a highly informative representation of the electrical activation of the tissue. Subsequently, the information from this map is compressed into a histogram that reveals the distribution of the dominant CLs in a given time window. Finally, a sliding window automatically tracks the changes in CLs over a large time scale. Results on both synthetic and real data are presented. The correlation with known cycle lengths in the synthetic cases is strong, and the CL distributions on real data are similar to those obtained from manually annotated EGMs.

R. Dubois, P. Roussel, M. Vaglio, F. Extramiana, F. Badilini, P. Maison-Blanche, G. Dreyfus,
Efficient modeling of ECG waves for morphology tracking,
Computers in Cardiology, Park City (2009).

Abstract

We propose a new approach to fully automatic ECG wave extraction and morphology tracking. It is based on Generalized Orthogonal Forward Regression (GOFR), which allows decomposing a one-dimensional signal into a set of appropriate parameterized functions. Two applications of GOFR to ECG modeling are presented. First, in order to delineate ECG characteristic waves, we make use of a specific function, called the Gaussian Mesa function (GMF). Secondly, we track the evolution of the T-wave morphology by introducing a Bi-Gaussian function (BGF).

The approach was validated on three experimental settings; the results confirm that the combination of GOFR and of an appropriate parametric function is remarkably efficient for ECG wave modeling.

P. Bouchet, R. Dubois, C. Henry, P. Roussel, G. Dreyfus,
Spatial Projection of Tachycardia Electrograms for Morphology Discrimination in Implantable Cardioverter Defibrillator,
Computers in Cardiology, Park City (2009).

Abstract

Discrimination of Ventricular Tachycardia (VT) from Supra-Ventricular Tachycardia (SVT) remains a major challenge for appropriate therapy delivery in Implantable Cardioverter Defibrillators (ICDs), especially in single chamber devices. We propose here a new discrimination algorithm that analyzes, with a machine learning approach, the morphology of a two-dimensional representation of both a far-field and a near-field ventricular sensing channel. Features extracted in this representation allow comparisons between curves. Thus, arrhythmia discrimination is performed by comparing an arrhythmia curve to a reference curve. A statistical classifier was trained on a private database and tested on the standard Ann Arbor Electrogram Libraries.
Our discrimination algorithm demonstrated high sensitivity and specificity for VT/SVT discrimination. The requirements of this algorithm make it appropriate for implementation in the simplest ICD system.

T. Hueber, E. Benaroya, G. Chollet, B. Denby, G. Dreyfus, M. Stone
Visuo-Phonetic Decoding using Multi-Stream and Context-Dependent Models for an Ultrasound-based Silent Speech Interface,
Interspeech, Brighton (2009).

Abstract

Recent improvements are presented for phonetic decoding of continuous speech from ultrasound and optical observations of the tongue and lips in a silent speech interface application. In a new approach to this critical step, the visual streams are modeled by context-dependent multi-stream Hidden Markov Models (CD-MSHMM). Results are compared to a baseline system using context-independent modeling and a visual feature fusion strategy, with both systems evaluated on a one-hour, phonetically balanced English speech database. Tongue and lip images are coded using PCA-based feature extraction techniques. The uttered speech signal, also recorded, is used to initialize the training of the visual HMMs. Visual phonetic decoding performance is evaluated successively with and without the help of linguistic constraints introduced via a 2.5k-word decoding dictionary.

Keywords: silent speech interface, visual speech recognition, multi-stream modeling

M. Toukourou, A. Johannet, G. Dreyfus,
Flash Flood Forecasting by Statistical Learning in the Absence of Rainfall Forecast: a Case Study,
Engineering Applications of Neural Networks EANN 2009, London (2009).

Abstract

The feasibility of flash flood forecasting without making use of rainfall predictions is investigated. After a presentation of the "Cévenol flash floods", which caused 1.2 billion euros of economic damage and 22 fatalities in 2002, the difficulties incurred in the forecasting of such events are analyzed, with emphasis on the nature of the database and the origins of measurement noise. The high level of noise in water level measurements raises a real challenge. For this reason, two regularization methods have been investigated and compared: early stopping and weight decay. It appears that regularization by early stopping provides networks with lower complexity and more accurate predicted hydrographs than regularization by weight decay. Satisfactory results can thus be obtained up to a forecasting horizon of three hours, thereby allowing an early warning of the population.

I. Ahriz, Y. Oussar, B. Denby, G. Dreyfus
Carrier relevance study for indoor localization
Workshop on Positioning, Navigation and Communication, WPNC'10, Dresden (2010).

Abstract

A study is made of subsets of relevant GSM carriers for an indoor localization problem. A database was created containing power measurement scans of all available GSM carriers in 5 of 8 rooms of a second-storey laboratory in central Paris, France, and a statistical learning algorithm was developed to discriminate between rooms based on these carrier strengths. To optimize the system, carrier relevance was ranked using either Orthogonal Forward Regression or Support Vector Machine - Recursive Feature Elimination procedures, and a subset of relevant variables was obtained with cross-validation. Results show that the 60 most relevant carriers are sufficient to correctly localize 97% of scans in an independent test set.

P. Milpied, R. Dubois, P. Roussel, C. Henry, G. Dreyfus
Morphological Stability of Bipolar and Unipolar Endocardial Electrograms
Computing in Cardiology, Belfast (2010).

Abstract

Implantable Cardioverter Defibrillators (ICD) are widely used for sudden cardiac death prevention. In most ICD algorithms, decision making includes a morphological analysis of the unipolar and/or bipolar electrograms (EGM). The principle of such algorithms is to create a "normal" template by averaging normal sinus rhythm heartbeats, for comparison to each arrhythmic heartbeat.

The present study addresses the stability of unipolar and bipolar EGMs with respect to the posture of the patient, and the temporal evolution of the EGM shapes during sinus rhythm. We show that unipolar EGMs are slightly affected by position changes, while bipolar ones are unaffected. Moreover, the morphological variability of both EGMs is significant during the first post-implant month and very small after a few months.

Collectively, these findings provide important information for the design of a statistically valid template updating procedure for morphological algorithms in ICDs.

I. Ahriz, B. Denby, G. Dreyfus, R. Dubois, P. Roussel
The ARPEGEO Project: A New Look at Cellular RSSI Fingerprints for Localization
IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, Istanbul (2010).

Abstract

A new technique developed at ESPCI ParisTech should allow cellular received signal strength fingerprints to play an important role in localization systems for regions which are not well covered by GPS. The article describes the ARPEGEO project, initiated to evaluate the impact of full-band GSM fingerprints analyzed with modern machine learning techniques. Results on indoor localization, as well as techniques to facilitate practical implementation of the method, are presented.

P. Milpied, R. Dubois, P. Roussel, C. Henry, G. Dreyfus
Arrhythmia Classification using Spatial Projection of Tachycardia Electrograms
Cardiostim 2010, Nice (2010).

Abstract

Discrimination of Ventricular Tachycardia (VT) from Supra-Ventricular Tachycardia (SVT) remains a major challenge for appropriate therapy delivery in ICDs. Historically, only time intervals extracted from electrograms (EGMs) were used for diagnosis. Morphology algorithms were added to improve performances. We propose a new discrimination algorithm that analyses the morphology of a two-dimensional representation of EGMs, named "Spatial Projection Of Tachycardia" (SPOT). The SPOT curve of a cardiac cycle is the plot of the amplitude of the far-field EGM versus the amplitude of the near-field EGM, with time as a parameter.

Two morphological features are extracted from the comparison of arrhythmia and NSR reference SPOT curves: i) the average angle of the relative velocity vectors and ii) the correlation coefficient between the norms of the velocity. Each arrhythmia is classified as a VT or an SVT according to the values of these two features.

Decision thresholds on feature values were estimated using two databases: a private one, featuring 29 induced VT and 19 induced SVT from 32 patients (28 males, 57±15.5 yrs, 50% ischemic cardiomyopathy), and the standard Ann Arbor Electrogram Libraries (AAEL), featuring 64 induced VT and 7 induced SVT from 41 patients (34 males, 62±13.2 yrs, 73.1% CAD). The algorithm provides 100% sensitivity and 92% specificity on these databases. A spontaneous arrhythmia database from implanted patients is under construction: the preliminary results on the first 5 patients give 100% sensitivity and 94% specificity on 3 VT and 50 SVT.

The SPOT-based discrimination algorithm alone exhibits high sensitivity and specificity for VT/SVT discrimination on a small number of patients. This technique could significantly improve arrhythmia discrimination in VR-ICDs. A dedicated clinical evaluation will be conducted to confirm these preliminary results.
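
As an illustration only, the two SPOT features named in this abstract can be sketched as follows. The abstract does not give the exact definitions, so this sketch assumes that the "velocity vectors" are finite differences between successive points of the curve, that the angle is taken between time-aligned velocity vectors of the arrhythmia and reference curves, and that both curves have the same number of samples. The names `velocities`, `pearson` and `spot_features` are hypothetical.

```python
import math

def velocities(near, far):
    """Finite-difference velocity vectors along a SPOT curve (assumed definition)."""
    return [(near[i + 1] - near[i], far[i + 1] - far[i])
            for i in range(len(near) - 1)]

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def spot_features(curve, reference):
    """Return (mean angle in radians, correlation of velocity norms).

    curve and reference are (near_field, far_field) pairs of equal-length lists.
    """
    v = velocities(*curve)
    w = velocities(*reference)
    angles = []
    for (ax, ay), (bx, by) in zip(v, w):
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        if na == 0 or nb == 0:
            continue  # angle undefined for a zero velocity vector
        cosine = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
        angles.append(math.acos(cosine))
    mean_angle = sum(angles) / len(angles) if angles else 0.0
    norms_v = [math.hypot(x, y) for x, y in v]
    norms_w = [math.hypot(x, y) for x, y in w]
    return mean_angle, pearson(norms_v, norms_w)
```

With these definitions, a curve compared with itself gives a mean angle close to zero and a correlation close to one; how the decision thresholds on the two features are set is described only in the abstract and is not reproduced here.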

B. Degand, P. Milpied, R. Dubois, C. Henry, G. Dreyfus
Atrial Activity Extraction in Single-Chamber Implantable Defibrillators
Cardiostim 2010, Nice (2010).

Abstract

For a number of patients eligible for ICD implantation, a single chamber (VR) model is sufficient. To improve arrhythmia classification performance, however, a dual chamber (DR) ICD could be preferred, because VR systems lack the atrial sensing that DR algorithms require. This study uses a machine learning technique (Independent Component Analysis - ICA) to extract atrial activity from a single dual coil ventricular lead: it constructs a 'virtual EGM' (VEGM) from the two unipolar signals recorded between electrodes of the same nature and size: Right Ventricular Coil-Can (RVC) and Superior Vena Cava Coil-Can (SVCC).

During ICD implantation, 15s of EGM were recorded on both electrodes, and a VEGM was constructed with ICA. Both the SVCC EGM and the VEGM were used in turn as a surrogate of atrial signals in a DR-ICD algorithm. The VEGM being similar to a far-field signal, a P-refractory period and a P-wave confirmation window are added to avoid double counting.

14 patients with normal sinus rhythm were tested (12 males, 65±13.3 yrs, 6 dilated, 7 ischemic, 1 hypertrophic cardiomyopathy, LVEF: 30±11.5%). In terms of P wave detection, the device provides less under- and oversensing from the VEGM than from the SVCC EGM in 10 patients with the standard ICD settings. For the 4 others, due to a poor signal to noise ratio, the P wave detection threshold was adapted in the ICD. For these patients, the oversensing is much lower from the VEGM than from the SVCC EGM (9.2% vs 115.4%).

ICA applied to unipolar EGMs from a double coil lead provides encouraging results for atrial sensing. This technique could significantly improve arrhythmia discrimination in VR-ICDs.

M. Elgendi, B. Rebsamen, A. Cichocki, F. Vialatte, J. Dauwels
Real-Time Wireless Sonification of Brain Signals
3rd International Conference on Cognitive Neurodynamics, Hokkaido, Japan (2011)

Abstract

In this paper, an alternative representation of EEG is investigated, in particular, translation of EEG into sound; patterns in the EEG then correspond to sequences of notes. The aim is to provide an alternative tool for analysing and exploring brain signals, e.g., for diagnosis of neurological diseases. Specifically, a system is proposed that transforms EEG signals, recorded by a wireless headset, into sounds in real-time. In order to assess the resulting representation of EEG as sounds, the proposed sonification system is applied to EEG signals of Alzheimer's (AD) patients and healthy age-matched control subjects (recorded by a high-quality wired EEG system). Fifteen volunteers were asked to classify the sounds generated from the EEG of 5 AD patients and 5 healthy subjects; the volunteers labeled most sounds correctly, in particular, an overall sensitivity and specificity of 93.3% and 97.3% respectively was obtained, suggesting that the sound sequences generated by the sonification system contain relevant information about EEG signals and underlying brain activity.

G. Attuel, P. Attuel, N. Derval, L. Glass, and M. Haissaguerre
Statistical monitoring of atrial fibrillation?
11th Experimental Chaos and Complexity Conference, Lille (2011).

Abstract

It is an open question whether complex fragmented activity during fibrillation in the atrium might characterise the stage of the pathology. Eventually, this could be used as genuine monitoring during ablation. We address it by analysing the statistical properties of human endocavitary electrograms during ablation. Particular attention is given to the fluctuations of the potential, which are in general not considered relevant, for lack of clear interpretation. We believe that these are prototypical of non-equilibrium fluctuations, and that interpretation can be confidently envisaged from their statistical properties. A recent theoretical clarification on the probability distribution functions is a basic guideline for the study.

L. Crevier-Buchman, C. Gendrot, B. Denby, C. Pillot-Loiseau, P. Roussel, A. Colazo-Simon, G. Dreyfus, T. Hueber,
Articulatory Strategies for Lip and Tongue Movements in Silent versus Vocalized Speech,
International Congress of Phonetic Science 2011, Hong Kong (2011).

Abstract

In the context of Silent Speech Communication (SSC) development after total laryngectomy rehabilitation, tongue and lip movements were recorded with a portable ultrasound transducer and a CCD video camera respectively. A list of 60 French minimal pairs and a list of the 50 most frequent French words were pronounced in vocalized and silent mode by one speaker. Amplitude and timing of the articulatory movements were measured and compared in the two modes. This study showed, for silent speech, i) a reduced duration of words, ii) a general hypoarticulation for lips, but non-significant changes for tongue movements depending on the type of vowel and consonant.
Keywords: Silent speech, articulation, labial imaging, ultrasound imaging, tongue movement.

J. Cai, T. Hueber, B. Denby, E.-L. Benaroya, G. Chollet, P. Roussel, G. Dreyfus, L. Crevier-Buchman,
A Visual Speech Recognition System for an Ultrasound-Based Silent Speech Interface
International Conference on Phonetic Science, Hong Kong (2011).

Abstract

The development of a continuous visual speech recognizer for a silent speech interface has been investigated using a visual speech corpus of ultrasound and video images of the tongue and lips. By using high-speed visual data and tied-state cross-word triphone HMMs, and including syntactic information via domain-specific language models, word-level recognition accuracy as high as 72% was achieved on visual speech. Using the Julius system, it was also found that the recognition should be possible in nearly real-time.

Keywords: silent speech interface, visual speech recognition, vocal tract ultrasound imaging

M. Elgendi, F. Vialatte, A. Cichocki, J. Dauwels
Optimization of EEG Frequency Bands for Improved Diagnosis of Alzheimer Disease
33rd Engineering in Medicine and Biology Society Conference, Boston, USA (2011).

Abstract

Many clinical studies have shown that electroencephalograms (EEG) of Alzheimer patients (AD) often have an abnormal power spectrum. In this paper a frequency band analysis of AD EEG signals is presented, with the aim of improving the diagnosis of AD from EEG signals. Relative power in different EEG frequency bands is used as features to distinguish between AD patients and healthy control subjects. Many different frequency bands between 4 and 30Hz are systematically tested, besides the traditional frequency bands, e.g., theta band (4-8Hz). The discriminative power of the resulting spectral features is assessed through statistical tests (Mann-Whitney U test). Moreover, linear discriminant analysis is conducted with those spectral features. The optimized frequency ranges (4-7Hz, 8-15Hz, 19-24Hz) yield substantially better classification performance than the traditional frequency bands (4-8Hz, 8-12Hz, 12-30Hz); the frequency band 4-7Hz is the optimal frequency range for detecting AD, which is similar to the classical theta band. The frequency bands were also optimized as features through leave-one-out cross-validation, resulting in error-free classification. The optimized frequency bands may improve existing EEG based diagnostic tools for AD. Additional testing on larger AD datasets is required to verify the effectiveness of the proposed approach.
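
The "relative power in a frequency band" feature used above can be illustrated with a minimal sketch. It assumes that relative power means the power in the band divided by the power in the whole 4-30Hz range, that bands are half-open intervals, and that a plain discrete Fourier transform is an acceptable spectral estimate; the abstract specifies none of these, and the function names are hypothetical.

```python
import cmath
import math

def power_spectrum(x, fs):
    """Naive DFT power spectrum for non-negative frequencies (O(n^2), short signals only)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power

def relative_band_power(x, fs, band, total=(4.0, 30.0)):
    """Power in [band[0], band[1]) divided by power in [total[0], total[1]).

    The 4-30Hz normalisation range is an assumption made for this sketch.
    """
    freqs, power = power_spectrum(x, fs)
    in_band = sum(p for f, p in zip(freqs, power) if band[0] <= f < band[1])
    in_total = sum(p for f, p in zip(freqs, power) if total[0] <= f < total[1])
    return in_band / in_total if in_total > 0 else 0.0
```

For a test signal made of a 6Hz sinusoid of amplitude 1 and a 20Hz sinusoid of amplitude 0.5, the 4-7Hz band carries 80% of the 4-30Hz power under these conventions. Searching over candidate band edges, as the abstract describes, would amount to calling `relative_band_power` for each candidate band and scoring the resulting features.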

B. D. Lindsay, R. Dubois, A. Shah, C. Ramanathan, S. Zuckerman, M. Strom, B. George, N. Varma, M. Hocini, P. Jais, M. Haissaguerre,
Novel Directional Activation Map Using Local Propagation Between Adjacent Electrograms,
Heart Rhythm, San Francisco (2011).

Abstract

Introduction: Standard isochronal maps are derived from subjective annotation of individual electrograms. We evaluated a novel Directional Activation Mapping (DAM) in mapping atrial tachycardias (ATs).
Methods: The DAM calculates the time delay between adjacent electrograms and assigns local propagation vectors. 3D global epicardial activation was constructed by displaying composite vectors on epicardial maps calculated by CardioInsight Electrocardiographic Mapping (ECM) system from 252 body surface electrograms combined with CT anatomy. DAM was validated during pacing and endocardial mapping in 10 AT patients.
Results: DAM accurately mapped ATs in 9 of 10 patients with 5 focal and 5 macroreentrant ATs. One case was inevaluable due to extensive scarring and multiple ATs. Reentrant mechanisms included typical RA flutter and scar-related RA and LA flutter.
Conclusions: The spatial vectors derived from the new method accurately delineated 90% of ATs and facilitated analysis of ECM. Directional Activation Mapping offers a novel physiologic approach to mapping 3D propagation and activation.
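
The idea of deriving a local propagation direction from the time delay between two adjacent electrograms can be sketched as below. The abstract does not say how the delay is estimated; this sketch assumes the lag that maximises the cross-correlation of the two signals, works in 2D rather than on a 3D epicardial surface, and uses hypothetical function names.

```python
import math

def delay_samples(a, b, max_lag):
    """Lag (in samples) maximising the cross-correlation of a and b.

    A positive result means that b is a delayed copy of a.
    """
    n = len(a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def propagation_vector(pos_a, pos_b, lag):
    """Unit vector pointing from the earlier-activated site to the later one."""
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    norm = math.hypot(dx, dy)
    if lag == 0 or norm == 0:
        return (0.0, 0.0)  # simultaneous activation or coincident sites
    sign = 1.0 if lag > 0 else -1.0
    return (sign * dx / norm, sign * dy / norm)
```

Composite vectors such as those displayed on the epicardial maps would presumably combine several such pairwise vectors around each site; that step is not specified in the abstract and is left out here.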

B. Denby, J. Cai, P. Roussel, G. Dreyfus, L. Crevier-Buchman, C. Pillot-Loiseau, T. Hueber, G. Chollet,
Tests of an Interactive, Phrasebook-Style Post-Laryngectomy Voice-Replacement System,
International Congress of Phonetic Science 2011, Hong Kong (2011).

Abstract

The article presents the results of tests of a portable post-laryngectomy voice replacement system that allows a silently articulating speaker to select and play back short phrases contained in a 60-phrase phrasebook. Such a system could be a useful communication tool for post-laryngectomy patients unable to use tracheo-oesophageal speech. Experiments on two non-pathological speakers and one person having undergone a total laryngectomy in 1998 are presented. Results are promising and provide proof of principle for a more sophisticated system currently being developed.

T. Hueber, P. Badin, G. Bailly, A. Ben Youssef, F. Elesei, B. Denby, G. Chollet,
Statistical Mapping Between Articulatory and Acoustic Feedback. Application to Silent Speech Interface and Visual Articulatory Feedback,
1st International Workshop on Performative Speech and Singing Synthesis (P3S), Vancouver, Canada (2011).

Abstract

This paper reviews some theoretical and practical aspects of different statistical mapping techniques used to model the relationships between the articulatory gestures and the resulting speech sound. These techniques are based on the joint modeling of articulatory and acoustic data using Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM). These methods are implemented in two systems: (1) the silent speech interface developed at SIGMA and LTCI laboratories which converts tongue and lip motions, captured during silent articulation by ultrasound and video imaging, into audible speech, and (2) the visual articulatory feedback system, developed at GIPSA-lab, which automatically animates, from the speech sound, a 3D orofacial clone displaying all articulators (including the tongue). These mapping techniques are also discussed in terms of real-time implementation.

Y. Tomita, A. Gaume, H. Bakardjian, M. Maurice, A. Cichocki, Y. Yamaguchi, G. Dreyfus, F. Vialatte
Concatenation method for high temporal resolution SSVEP-BCI
International Conference on Neural Computation Theory and Applications, Paris (2011).

Abstract

Electroencephalographic (EEG) signals are generally non-stationary; however, nearly stationary brain responses, such as steady-state visually evoked potentials (SSVEP), can be recorded in response to repetitive stimuli. Although the Fourier transform has precise resolution with long time windows (5 or 10 s for instance) to extract the SSVEP response (1-100 Hz range), its resolution with shorter windows decreases due to the Heisenberg-Gabor uncertainty principle. Therefore, it is not easy to extract evoked responses such as SSVEP within short EEG epochs. This limits the information transfer rate of SSVEP-based brain-computer interfaces. In order to circumvent this limitation, we concatenate EEG signals recorded simultaneously from different channels, and we apply Fourier analysis to the resulting sequence. From this constructed signal, high frequency resolution can be obtained with time epochs as short as 1 s, which improves SSVEP classification. This method may be effective for high-speed brain-computer interfaces (BCI).
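
A toy sketch of the concatenation idea follows: the epochs recorded simultaneously on several channels are joined end to end and a single discrete Fourier transform is taken, so the frequency grid becomes finer by a factor equal to the number of channels. The function name, the naive DFT and the synthetic signal are assumptions for illustration; in this example all channels share the same phase, which real recordings need not do.

```python
import cmath
import math

def concatenated_spectrum(channels, fs):
    """Join equal-length channel epochs end to end and return (freqs, amplitudes).

    Uses a naive O(n^2) DFT, adequate only for short illustrative signals.
    """
    x = [sample for channel in channels for sample in channel]
    n = len(x)
    freqs = [k * fs / n for k in range(n // 2 + 1)]
    amps = [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]
    return freqs, amps
```

With a 1 s epoch, one channel gives a 1 Hz frequency step, while four concatenated channels give a 0.25 Hz step. Whether the finer grid translates into better separation of nearby stimulation frequencies depends on how consistent the response is across channels, which this sketch does not address.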

K. Hiyoshi-Taniguchi, F. Vialatte, M. Kawasaki, H. Fukuyama, A. Cichocki
Neurodynamics of emotional judgments in the human brain.
International Conference on Neural Computation Theory and Applications, Paris (2011).

Abstract

The purpose of this study is to clarify multi-modal brain processing related to human emotions. This study aimed to induce a controlled perturbation in the emotional system of the brain by multi-modal stimuli, and to investigate whether such emotional stimuli could induce reproducible and consistent changes in EEG signals. We exposed two subjects to auditory, visual, or combined audio-visual stimuli. Audio stimuli consisted of voice recordings of the Japanese word 'arigato' (thank you) pronounced with three different intonations (Angry - A, Happy - H or Neutral - N). Visual stimuli consisted of faces of women expressing the same emotional valences (A, H or N). Audio-visual stimuli were composed using either congruent combinations of faces and voices (e.g. H x H) or non-congruent ones (e.g. A x H). The data were collected with an EEG system, and analysis was performed by computing the topographic distributions of EEG signals in the theta, alpha and beta frequency ranges. We compared the stimulus conditions (A or H) vs. control (N), and congruent vs. non-congruent. Topographic maps of EEG power differed between those conditions in both subjects. The obtained results suggest that EEG could be used as a tool to investigate emotional valence and discriminate various emotions.

M. Elgendi, F. Vialatte, M. Constable, J. Dauwels
Immersive neurofeedback: a new paradigm
International Conference on Neural Computation Theory and Applications, Paris (2011).

Abstract

Healthcare organizations continue to pursue ways of offering higher-quality care to face the demand and expectations in promoting and maintaining health and in disease prevention. Currently, in neuroscience, there is an ongoing paradigm shift towards immersive neurofeedback mechanisms. This will improve the user's (or patient's) ability to control brain activity, medical diagnoses, and rehabilitation of neurological or psychiatric disorders. Indeed, several psychological and medical studies have confirmed that virtual immersive activity is enjoyable, stimulating, and can have a healing effect. The new paradigm consists of an immersive room and three input devices: Emotiv headset (wireless non-invasive acquisition of brain waves), Kinect camera (gesture recognition), and wireless microphone (voice/speech recognition), with the aim of immersive treatment and a better-quality health system in the near future.

B. Denby, J. Cai, T. Hueber, P. Roussel, G. Dreyfus, L. Crevier-Buchman, C. Pillot-Loiseau, G. Chollet, S. Manitsaris, M. Stone,
Towards a Practical Silent Speech Interface Based on Vocal Tract Imaging,
International Seminar on Speech Production 2011, Montréal, Canada (2011).

Abstract

The paper describes advances in the development of an ultrasound silent speech interface for use in silent communications applications or as a speaking aid for persons who have undergone a laryngectomy. It reports some first steps towards making such a device lightweight, portable, interactive, and practical to use. Simple experimental tests of an interactive silent speech interface for everyday applications are described. Possible future improvements including extension to continuous speech and real-time operation are discussed.

A. Bornancin Plantier, A. Johannet, P. Roussel, G. Dreyfus
Flash flood forecasting by neural networks without rainfall forecasts: model selection and generalization capability
European Geosciences Union Meeting, Vienna (2011).

Abstract

The necessity of developing efficient forecasting tools for flash floods has been highlighted by the recent occurrences of catastrophic floods in the south of France such as in Vaison-la-Romaine (1992), Nîmes (1988), Gardons (2002), Arles (2003), Var (2010). These disasters result from intense rainfalls on small (a few hundred km²) and high-slope watersheds, resulting in flows of thousands of m³/s with concentration times of a few hours only. The death toll (over 100) in these circumstances in the southeast of France, and the cost of more than 1.2 billion euros in 2002, showed that a reliable tool to forecast such phenomena was mandatory.
Real time flood forecasting is a complex task with a growing economic and societal impact. Its complexity arises from the coupling of atmospheric, hydrological and geo-hydrological models; furthermore, the experimental data often lack reliability, which is an additional factor of complexity. Gathering more accurate data, increasing the accuracy of the current physics-based models, and implementing them on increasingly powerful computers, are very useful efforts, but they have limitations. In this context, the FLASH project [FLASH 2010], funded by the French Agency of Research (ANR) proposes an alternative solution, which complements the above-mentioned approach by designing models in a machine-learning perspective.
The watershed of interest is the Gardon d'Anduze watershed (540 km²). Neural networks were developed to forecast the water level at Anduze for various forecasting horizons from 30 minutes to 5 hours. The database includes 17 flash floods, which occurred between 1994 and 2008. The experimental measurements were supplied by six rain gauges and three gauge stations.
As time plays a functional role in the rainfall-runoff relation, discrete-time dynamic models must be designed. In this work a nonlinear function implemented by a multilayer perceptron with time delays was chosen. The least squares cost function is optimized with respect to the parameters by Levenberg-Marquardt optimization, the gradient of the cost function being computed by backpropagation. The water level was predicted as a time series, with rainfalls as exogenous inputs: the model variables were past values of the water levels and of the rainfall. The water level was preferred to the water flow as predicted quantity, because it makes the prediction independent from the rating curve, which is not known accurately for high outflows.
Due to the high heterogeneity of rainfall, the rainfall forecasts are not yet available at the required small spatial and temporal scales. A specific model is thus adjusted for each forecasting horizon (half an hour to 5 hours) without future rainfall information.
As the model complexity control is a particularly critical issue, due (i) to the lack of accurate estimations of rainfalls, and (ii) to the high noise level in water level measurements, the traditional early stopping regularization method was used. Model complexity selection was performed by a variant of cross-validation [Dreyfus 2005] using various validation scores. The sliding window width for rainfalls and for past levels, as well as the number of hidden neurons, and the hyperparameters of the optimization algorithm were estimated similarly.
The quality of the generalization is assessed by performance criteria calculated on three test events (independent from the training and validation sets): September 2002, October 2008 and November 2008.
Hydrographs at several forecasting horizons are displayed. Very satisfactory results are obtained up to a forecasting horizon of three hours (Nash criterion ranging from 0.95 at half an hour to 0.50 at three hours), thereby allowing early warnings to be issued to the public.
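
Two ingredients of the approach described in this abstract lend themselves to a short sketch: building input/target pairs from sliding windows of past water levels and rainfall for a given forecasting horizon, and scoring a forecast with the Nash criterion. The sketch assumes a single rainfall series, windows counted in time steps, and the usual Nash-Sutcliffe definition of the criterion; the function names are hypothetical, and the neural network and its training are not shown.

```python
def make_examples(levels, rain, level_window, rain_window, horizon):
    """Pair past levels and rainfall up to time t with the level at t + horizon.

    No rainfall after time t is used, in line with forecasting without
    rainfall forecasts.
    """
    start = max(level_window, rain_window) - 1
    examples = []
    for t in range(start, len(levels) - horizon):
        inputs = (levels[t - level_window + 1:t + 1]
                  + rain[t - rain_window + 1:t + 1])
        examples.append((inputs, levels[t + horizon]))
    return examples

def nash_criterion(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect forecast, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread
```

One set of examples would be built per forecasting horizon, since the abstract states that a specific model is adjusted for each horizon. The window widths correspond to the sliding window widths that the abstract says were selected by cross-validation.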

T. Hueber, E.-L. Benaroya, B. Denby, G. Chollet,
Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface,
Interspeech 2011, Florence, Italy (2011).

Abstract

This paper presents recent developments on our "silent speech interface" which converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. We present here two approaches to model the relationships between the observed articulatory movements and the resulting speech sound, which are based on the joint modeling of visual and spectral features using respectively Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). The prediction of the voiced/unvoiced parameter from visual articulatory data only is also investigated using an artificial neural network (ANN). The proposed mapping techniques are evaluated on a continuous speech database containing one hour of high-speed ultrasound and video sequences.
Index Terms: silent speech interface, GMM, HMM, ultrasound, video, multimodal, statistical mapping

B. Happi, O. Romain, B. Denby, E.-L. Benaroya, F. De Dieuleveult, B. Granado, H. Kherimi, G. Chollet, D. Petrovska-Delecrétaz, R. Blouet,
Software Radio FM Broadcast Receiver for Audio Indexing Applications,
Industrial Conference on Industrial Technology 2012, Athens, Greece (2012).

Abstract

Broadcast radio is a rich but underexploited source of multimedia content. To make this content available to users, it will be indispensable to develop new types of navigators capable of searching the large quantities of information contained in the radio bands. The article introduces a prototype of a new software radio enabled broadcast media navigator implemented on an FPGA, which is able to demodulate simultaneously all channels in the FM band and perform audio indexing upon them, ultimately using a Graphics Processing Unit.

B. Happi, O. Romain, B. Denby, E.-L. Benaroya, S. Viateur,
FPGA-Based Radio-on-Demand Broadcast Receiver with Musical Genre Identification,
International Symposium on Industrial Electronics 2012, Hangzhou, China (2012).

Abstract

Broadcast radio is a rich yet underexploited source of multimedia content. To make this content available to users, it will be indispensable to develop new types of navigators capable of searching the large quantities of information contained in the radio bands. The article introduces a prototype of a new software radio enabled broadcast media navigator implemented on a Field Programmable Gate Array and multi-core processor, which is able to demodulate simultaneously all channels in the FM band and perform a real-time classification of the musical genre.

S. Manitsaris, F. Xavier, B. Denby, G. Dreyfus, P. Roussel,
An Open Source Speech Synthesis Module for a Visual-Speech Recognition System,
Acoustics 2012, Nantes (2012).

Abstract

A Silent Speech Interface (SSI) is a voice replacement technology that permits speech communication without vocalization. The visual-speech recognition engine of the proposed SSI is based on vocal tract imaging. The system aims to give the laryngectomised speaker the opportunity to speak with his/her original voice. This paper presents the speech synthesis module of an SSI that uses the open-source MaryTTS (Text-To-Speech). The visual-speech recognition engine of the SSI outputs a text sentence, which is passed to the speech synthesis module in order to synthesize speech in French or English. A new module of phonetic transcription has been developed and integrated into MaryTTS. In addition, English and French semi-HMM (Hidden Markov Models) model voices have been built. The SSI can be remotely controlled using a mobile device and the new voices are installed on a web server.

E. Gallego-Jutgla, M. Elgendi, F. Vialatte, J. Solé-Casals, A. Cichocki, C. Latchoumane, J. Jeong, J. Dauwels
Diagnosis of Alzheimer's Disease from EEG by Means of Synchrony Measures in Optimized Frequency Bands
34th Engineering in Medicine and Biology Society Conference, San Diego, USA (2012).

Abstract

Several clinical studies have reported that EEG synchrony is affected by Alzheimer's disease (AD). In this paper a frequency band analysis of AD EEG signals is presented, with the aim of improving the diagnosis of AD using EEG signals. Multiple synchrony measures are assessed through statistical tests (Mann-Whitney U test), including correlation, phase synchrony and Granger causality measures. Moreover, linear discriminant analysis (LDA) is conducted with those synchrony measures as features. For the data set at hand, the frequency range (5-6Hz) yields the best accuracy for diagnosing AD, which lies within the classical theta band (4-8Hz). The corresponding classification error is 4.88% for the directed transfer function (DTF) Granger causality measure. Interestingly, results show that EEG of AD patients is more synchronous than in healthy subjects within the optimized range 5-6Hz, which is in sharp contrast with the loss of synchrony in AD EEG reported in many earlier studies. This new finding may provide new insights about the neurophysiology of AD. Additional testing on larger AD datasets is required to verify the effectiveness of the proposed approach.

Cliquez ici pour obtenir le document / Click here to download the .pdf document

Y. Tian, B. Denby, I. Ahriz, P. Roussel, G. Dreyfus
Fast, Handset-Based GSM Fingerprints for Indoor Localization
9th International Symposium on Wireless Communications and Systems, Paris (2012)

Abstract

Accurately localizing users in indoor environments remains an important and challenging task. The article presents new results on room-level indoor localization, using cellular Received Signal Strength fingerprints collected with a standard cellular handset programmed to perform fast scans of the 900 and 1800 Megahertz GSM bands as a user explores an indoor environment at a normal walking pace. Support Vector Machines are used to deal with the high dimensionality of the fingerprints. The study demonstrates that an appropriately programmed standard cellular handset can provide a simple, inexpensive solution for accurate room-level indoor localization.

Cliquez ici pour obtenir le document / Click here to download the .pdf document

T. Hueber, G. Bailly, B. Denby
Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface,
Interspeech, Portland, USA (2012).

Abstract

The article presents an HMM-based mapping approach for converting ultrasound and video images of the vocal tract into an audible speech signal, for a silent speech interface application. The proposed technique is based on the joint modeling of articulatory and spectral features, for each phonetic class, using Hidden Markov Models (HMM) and multivariate Gaussian distributions with full covariance matrices. The articulatory-to-acoustic mapping is achieved in two steps:
1) finding the most likely HMM state sequence from the articulatory observations;
2) inferring the spectral trajectories from both the decoded state sequence and the articulatory observations.
The proposed technique is compared to our previous approach, in which only the decoded state sequence was used for the inference of the spectral trajectories, independently of the articulatory observations.
Both objective and perceptual evaluations show that this new approach leads to a better estimation of the spectral trajectories.
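
The two steps above can be sketched as follows, under strong simplifying assumptions that are not in the paper: one scalar articulatory feature x and one scalar spectral feature y per frame, one joint Gaussian per state, and a frame-by-frame conditional mean in place of the trajectory model with dynamic features. All names and parameter values are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(obs, log_init, log_trans, emit_logp):
    """Step 1: most likely state sequence given the observations."""
    n_states = len(log_init)
    score = [log_init[s] + emit_logp(s, obs[0]) for s in range(n_states)]
    backptr = []
    for x in obs[1:]:
        prev, step, score = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            step.append(best)
            score.append(prev[best] + log_trans[best][s] + emit_logp(s, x))
        backptr.append(step)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for step in reversed(backptr):  # follow back-pointers from the last frame
        state = step[state]
        path.append(state)
    return path[::-1]


def map_articulatory_to_spectral(obs, params, log_init, log_trans):
    """Decode states from x, then infer y from both the state and x."""
    emit = lambda s, x: log_gauss(x, params[s]["mu_x"], params[s]["var_x"])
    path = viterbi(obs, log_init, log_trans, emit)
    # Step 2: conditional mean of y given x under the state's joint Gaussian.
    y_hat = []
    for x, s in zip(obs, path):
        p = params[s]
        y_hat.append(p["mu_y"] + p["cov_xy"] / p["var_x"] * (x - p["mu_x"]))
    return path, y_hat


# Hypothetical two-state model and observation sequence (toy values).
toy_params = [
    {"mu_x": 0.0, "var_x": 1.0, "mu_y": 10.0, "cov_xy": 0.5},
    {"mu_x": 5.0, "var_x": 1.0, "mu_y": 20.0, "cov_xy": -0.5},
]
toy_log_init = [math.log(0.5), math.log(0.5)]
toy_log_trans = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
toy_path, toy_y = map_articulatory_to_spectral(
    [0.0, 1.0, 5.0, 4.0], toy_params, toy_log_init, toy_log_trans
)
```

Setting cov_xy to zero in every state makes the estimate depend on the decoded state only, which corresponds in spirit to the previous approach mentioned in the abstract.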

Index Terms: silent speech interface, handicap, HMM-based speech synthesis, audio visual speech processing.

Cliquez ici pour obtenir le document / Click here to download the .pdf document

A. Rathi, X. Zhang, F.B. Vialatte
FPGA implementation of SOBI to perform BSS in real time
International Conference on Neural Computation Theory and Applications, Barcelona (2012).

Abstract

Blind Source Separation (BSS) is an effective and powerful tool for source separation and artifact removal in EEG signals. For real-time applications such as Brain Computer Interfaces (BCI) or clinical neuro-monitoring, it is of prime importance that BSS be performed effectively in real time. The motivation for implementing BSS on a Field Programmable Gate Array (FPGA) comes from the hypothesis that the speed of the system could be significantly improved by the parallelism that hardware provides. In this paper, an FPGA is used to implement the SOBI algorithm for EEG signals in fixed-point arithmetic. The results obtained show that the FPGA implementation of SOBI reduces the computation time and thus has great potential for real-time use.
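
To give an idea of what fixed-point arithmetic involves, the following sketch quantizes values to 16-bit integers and computes one entry of a time-lagged correlation, the kind of second-order statistic on which SOBI relies. The Q1.15 format, the rounding and saturation choices, and all function names are assumptions made for illustration; the abstract does not state the word length or number format used on the FPGA.

```python
Q = 15                                  # fractional bits (assumed Q1.15 format)
SCALE = 1 << Q
INT_MIN, INT_MAX = -(1 << 15), (1 << 15) - 1


def to_fixed(x):
    """Quantize a float in [-1, 1) to a saturated 16-bit Q1.15 integer."""
    return max(INT_MIN, min(INT_MAX, int(round(x * SCALE))))


def to_float(q):
    """Convert a Q1.15 integer back to a float."""
    return q / SCALE


def fixed_mul(a, b):
    """Multiply two Q1.15 values with rounding and saturation."""
    product = a * b                                 # Q2.30 intermediate
    rounded = (product + (1 << (Q - 1))) >> Q       # back to Q1.15
    return max(INT_MIN, min(INT_MAX, rounded))


def lagged_correlation_fixed(x, y, lag):
    """Mean of x[t] * y[t + lag] over the overlap, using integers only."""
    n = len(x) - lag
    acc = sum(x[t] * y[t + lag] for t in range(n))  # wide accumulator
    return (acc // n + (1 << (Q - 1))) >> Q         # rescale to Q1.15


half = to_fixed(0.5)
quarter = fixed_mul(half, half)
```

In hardware, each multiply-accumulate in the sum can be mapped to a dedicated multiplier, which is where the parallelism mentioned in the abstract comes from.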

Cliquez ici pour obtenir le document / Click here to download the .pdf document

J. Thorey, P. Adibpour, Y. Tomit, A. Gaume, H. Bakardjian, G. Dreyfus, F.B. Vialatte
Fast BCI calibration: comparing methods to adapt BCI systems for new subjects
International Conference on Neural Computation Theory and Applications, Barcelona (2012).

Abstract

A Brain Computer Interface (BCI) is a system in which a direct connection is established between the brain and a computer, providing a subject with a new communication channel. Unfortunately, BCIs have many drawbacks: signal recording is problematic, brain signatures are not reproducible from individual to individual, etc. A dependent-BCI prototype, the BrainPC project, was developed in the SIGMA laboratory. Electroencephalographic (EEG) signals collected by a BrainAmp amplifier in response to flickering light stimuli (Steady State Visual Evoked Potentials, SSVEP) are converted into machine-readable commands. This system is coupled with a human-machine interface. We propose a solution for fast calibration of the automatic detection of SSVEP between experimental subjects. We tested different calibration methods; harmonic and electrode selections were shown to be the most efficient methods.

O. Romain, B. Happi Tietche, B. Denby
Prototype of a Radio-on-demand Broadcast Receiver with Real Time Musical Genre Classification,
Conference on Design and Architectures for Signal and Image Processing DASIP 2012, Karlsruhe, Germany (2012). (Best Demo Award)

Abstract

This demo will show a prototype of a new software-radio-enabled broadcast media navigator, implemented on an FPGA and a quad-core processor, which is able to demodulate all channels in the FM band simultaneously and to classify the musical genre in real time. This prototype represents the elementary component of a navigator capable of searching the large quantities of information contained in the radio bands.

N. Houmani, F. B. Vialatte, C. Latchoumane, J. Jeong, G. Dreyfus
Stationary Epoch-based Entropy Estimation for Early Diagnosis of Alzheimer's Disease
IEEE FTFC 2013, Paris (2013)

Several studies have shown that the EEG signal of Alzheimer's disease (AD) patients is less complex than that of healthy subjects. In this article, we propose to characterize the complexity of the EEG signal by an entropy measure based on local density estimation by a Hidden Markov Model. We first show that this measure leads to consistent results, both qualitatively and quantitatively (in terms of classification accuracy). Indeed, it discriminates AD patients, at an early stage of the disease, from healthy subjects: a classification accuracy of 80% is reached on a dataset including EEG data recorded in different conditions. Based on this measure, we also show that parietal and temporal regions are the first regions affected by complexity loss in the early stage of Alzheimer's disease.

Keywords - EEG signal; Complexity measure; Stationary epochs; Entropy; HMM; Alzheimer's disease.

J. Cai, T. Hueber, S. Manitsaris, P. Roussel, L. Crevier-Buchman, M. Stone, C. Pillot-Loiseau, G. Chollet, G. Dreyfus, B. Denby
Vocal Tract Imaging System for Post-Laryngectomy Voice Replacement
IEEE International Instrumentation and Measurement Technology Conference (I2MTC) (2013)

The article describes a system that uses real-time measurements of the vocal tract to drive a voice-replacement system for post-laryngectomy patients. Based on a thermoformed acquisition helmet, a miniature ultrasound machine, and a video camera, and incorporating Hidden Markov Model speech recognition, the device has been tested on three speakers, one of whom has undergone a total laryngectomy. Results show that the device obtains exploitable recognition rates, and that performance on normal and post-laryngectomy speakers is nearly identical. The technique can also enable voice communication for normal speakers in situations where silence must be maintained.

Keywords--vocal tract measurement; silent speech interface; voice replacement; handicapped speech; laryngectomy

Ye Tian, Bruce Denby, Iness Ahriz, Pierre Roussel, Rémi Dubois, Gérard Dreyfus
Practical Indoor Localization using Ambient RF
IEEE International Instrumentation and Measurement Technology Conference (I2MTC) (2013)

The article presents a simple, practical approach for indoor localization using Received Signal Strength fingerprints from the GSM network, including an analysis of the relationship between signal strength and location, and the evolution of localization performance over time. Support Vector Machine regression applied to very high dimensional fingerprints does not reveal any smooth functional relationship between fingerprints and position. Classification using Support Vector Machines, however, provides very good results in discriminating different rooms in an indoor environment, albeit with performance that degrades over time. Transductive inference, introduced as a means of updating models to overcome degradation over time, provides hints that accurate indoor localization can be achieved by applying classification methods to cellular Received Signal Strength fingerprints, with robustness maintained via model updating and refining.

K. Boukharouba, P. Roussel, G. Dreyfus, A. Johannet
Flash Flood Forecasting Using Support Vector Regression: An Event Clustering Based Approach
IEEE International Workshop on Machine Learning for Signal Processing, Southampton (2013).

We present a new machine learning approach to flash flood forecasting in the absence of rainfall forecasts, based on the agglomerative hierarchical clustering of flood events. Each cluster contains events whose models have similar behaviors. A specific Support Vector Regression model is then trained on each cluster. The test results show that a specific model may be more accurate than a general model trained on all floods present in the training database.
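
The cluster-then-train idea can be sketched as follows. This is an illustrative simplification, not the method of the paper: events are described by hypothetical descriptor vectors, average linkage is an assumed choice, and an ordinary least-squares line stands in for the Support Vector Regression model to keep the example dependency-free.

```python
def agglomerative_clusters(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    def dist(i, j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                link = sum(dist(i, j) for i in clusters[a] for j in clusters[b])
                link /= len(clusters[a]) * len(clusters[b])
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return clusters


def fit_line(xs, ys):
    """Least-squares line (slope, intercept); a stand-in for a per-cluster SVR."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def train_specific_models(descriptors, event_data, n_clusters):
    """Cluster events by descriptor, then fit one model on each cluster's data."""
    models = []
    for members in agglomerative_clusters(descriptors, n_clusters):
        xs = [x for i in members for x in event_data[i][0]]
        ys = [y for i in members for y in event_data[i][1]]
        models.append((members, fit_line(xs, ys)))
    return models
```

A general model would correspond to calling `fit_line` once on the pooled data of all events; comparing its test error with that of the specific models is the kind of comparison the abstract reports.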

Index Terms-- Flash flood forecasting, Support vector regression, Hierarchical clustering, NARX model, Thiessen polygon.