Announcement

Postdoc in Pathological Speech Processing at INRIA Bordeaux, France

18 February 2022


Category: Postdoctoral researcher


Keywords: Pathological speech processing, Glottal source estimation, Inverse filtering, Machine learning, Parkinsonian disorders, Respiratory diseases

Contact and Supervisor: Khalid Daoudi (khalid.daoudi@inria.fr)

INRIA team: GEOSTAT (geostat.bordeaux.inria.fr)

Duration: 14 months (could be extended)

Starting date: between 01/04/2022 and 01/06/2022 (depending on the candidate's availability)

Application: via https://recrutement.inria.fr/public/classic/en/offres/2022-04481

Salary: 2653€/month (before taxes, net salary 2132€)

Profile: PhD thesis in signal/speech processing (or solid post-thesis experience in the field)

Required knowledge and background: Solid knowledge of speech/signal processing; basics of machine learning; programming in Matlab and Python.


 

 



Title: Glottal source inverse filtering for the analysis and classification of pathological speech

Scientific research context

During this century, there has been an ever-increasing interest in the development of objective vocal biomarkers to assist in the diagnosis and monitoring of neurodegenerative diseases and, more recently, respiratory diseases because of the Covid-19 pandemic. The literature is now relatively rich in methods for the objective analysis of dysarthria, a class of motor speech disorders [1], with most of the effort devoted to speech impaired by Parkinson’s disease. However, relatively few studies have addressed the challenging problem of discriminating between subgroups of Parkinsonian disorders that share similar clinical symptoms, particularly in early disease stages [2]. The analysis of speech impaired by respiratory diseases is a relatively new field (with existing developments in very specialized areas), but it has attracted considerable attention since the beginning of the pandemic.

The speech production mechanism is essentially governed by five subsystems: respiratory, phonatory, articulatory, nasal and prosodic. In the framework of pathological speech, the phonatory subsystem is the most studied one, usually using sustained phonation (prolonged vowels). Phonatory measurements are generally based on perturbation and/or cepstral features. Though these features are widely used and accepted, they are limited by the fact that the produced speech can result from some or all of the other subsystems, which therefore all contribute to the measured phonatory performance. An appealing way to bypass this problem is to extract the glottal source from speech in order to isolate the phonatory contribution. This framework is known as glottal source inverse filtering (GSIF) [3]. The primary objective of this proposal is to investigate GSIF methods in pathological speech impaired by dysarthria and respiratory deficit. The second objective is to use the resulting glottal parameterizations as inputs to basic machine learning algorithms in order to assist in the discrimination between subgroups of Parkinsonian disorders (Parkinson’s disease, Multiple-System Atrophy, Progressive Supranuclear Palsy) and in the monitoring of respiratory diseases (Covid-19, Asthma, COPD).

 

Both objectives benefit from a rich dataset of speech and other biosignals recently collected in the framework of two clinical studies in partnership with university hospitals in Bordeaux and Toulouse (for Parkinsonian disorders) and in Paris (for respiratory diseases).

Work description

GSIF consists in building a model to filter out the effect of the vocal tract and lip radiation from the recorded speech signal. This problem, difficult even for healthy speech, becomes more challenging for pathological speech. We will first investigate time-domain methods for the parameterization of the glottal excitation using glottal opening and closure instants. This implies the development of a robust technique to estimate these critical time instants from dysarthric speech. We will then explore the alternative approach of learning a parametric model of the entire glottal flow. Finally, we will investigate frequency-domain methods to determine relationships between different spectral measures and the glottal source. These algorithmic developments will be evaluated and validated using a rich set of biosignals obtained from patients with Parkinsonian disorders and from healthy controls. The biosignals are electroglottography and aerodynamic measurements of oral and nasal airflow as well as intra-oral and sub-glottic pressure.
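
As a rough illustration of the inverse-filtering idea described above, the following minimal Python sketch builds a toy source-filter signal and then undoes it. Everything in it is an assumption made for illustration and not the method to be developed in the project: the excitation is a plain impulse train rather than a realistic glottal pulse, the vocal tract is a single all-pole resonance, lip radiation is a first-order differentiator cancelled by a leaky integrator, and the vocal tract is estimated by autocorrelation-method linear prediction. All function and variable names are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the linear-prediction normal equations.
    Returns the inverse-filter coefficients [1, a1, ..., ap] of A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error
    return a

def fir_filter(b, x):
    """y[n] = sum_k b[k] * x[n - k], with zero initial conditions."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]

def leaky_integrator(x, rho):
    """y[n] = x[n] + rho * y[n - 1]; cancels the differentiator 1 - rho * z^-1."""
    y, prev = [], 0.0
    for v in x:
        prev = v + rho * prev
        y.append(prev)
    return y

# Toy "speech" (all values are assumptions): excitation -> vocal tract -> lip radiation
FS, PERIOD, N = 8000, 80, 800               # 8 kHz sampling, 100 Hz pitch, 0.1 s
R, FORMANT_HZ, RHO = 0.9, 700.0, 0.99       # pole radius, formant, lip-radiation zero
TRUE_A = [1.0, -2 * R * math.cos(2 * math.pi * FORMANT_HZ / FS), R * R]

excitation = [1.0 if n % PERIOD == 0 else 0.0 for n in range(N)]
tract_out = []
for n, e in enumerate(excitation):          # all-pole filter 1 / A(z)
    y = e
    if n >= 1:
        y -= TRUE_A[1] * tract_out[n - 1]
    if n >= 2:
        y -= TRUE_A[2] * tract_out[n - 2]
    tract_out.append(y)
speech = fir_filter([1.0, -RHO], tract_out)  # lip radiation as a differentiator

# Inverse filtering: cancel lip radiation, estimate A(z), apply it as an FIR filter
no_lip = leaky_integrator(speech, RHO)
a_est = levinson_durbin(autocorrelation(no_lip, 2), 2)
source_est = fir_filter(a_est, no_lip)

print("true A(z):     ", [round(c, 3) for c in TRUE_A])
print("estimated A(z):", [round(c, 3) for c in a_est])
```

On this idealized signal the estimated filter is close to the true one and `source_est` is close to the original impulse train. Real speech, and pathological speech in particular, violates these assumptions (the glottal pulse has its own spectral tilt, and the tract has several time-varying resonances), which is precisely why more robust methods are needed.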

After the GSIF analysis of dysarthric speech, we will study its adaptation/generalization to speech impaired by respiratory deficits. The developments will be evaluated using manual annotations, made by an expert phonetician, of speech signals obtained from patients with respiratory deficit and from healthy controls.

The second aspect of the work consists in applying machine learning algorithms (LDA, logistic regression, decision trees, SVM…) using standard tools (such as Scikit-Learn). The goal here will be to study the discriminative power of the resulting speech features/measures and their complementarity with other features related to different speech subsystems. The ultimate goal is to design robust algorithms to assist, first, in the discrimination between Parkinsonian disorders and, second, in the monitoring of respiratory deficit.
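
For orientation, a comparison of the classifiers named above could look like the following Scikit-Learn sketch. The data are purely synthetic (generated with `make_classification`), and the three classes, the six features, the standardization step, the 5-fold stratified cross-validation and the balanced-accuracy metric are all assumptions chosen for illustration; none of them reflect the actual clinical datasets or the evaluation protocol of the project.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a feature table: 120 "speakers", 6 "glottal features",
# 3 classes (hypothetical, e.g. three subgroups of Parkinsonian disorders).
X, y = make_classification(
    n_samples=120, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", C=1.0),
}

# Stratified folds keep class proportions similar across train/test splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, clf in classifiers.items():
    # Scaling lives inside the pipeline so it is fitted on training folds only.
    model = make_pipeline(StandardScaler(), clf)
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="balanced_accuracy")
    scores[name] = fold_scores.mean()

for name, score in scores.items():
    print(f"{name}: mean balanced accuracy = {score:.2f}")
```

Placing the scaler inside the pipeline avoids leaking test-fold statistics into training, which matters with the small sample sizes typical of clinical studies.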

Work synergy

- The postdoc will interact closely with an engineer who is developing an open-source software architecture dedicated to pathological speech processing. The validated algorithms will be implemented in this architecture by the engineer, under the co-supervision of the postdoc.

- Given the multidisciplinary nature of the proposal, the postdoc will interact with the clinicians participating in the two clinical studies.

References:

[1] J. Duffy. Motor Speech Disorders: Substrates, Differential Diagnosis, and Management. Elsevier, 2013.

[2] J. Rusz et al. Speech disorders reflect differing pathophysiology in Parkinson's disease, progressive supranuclear palsy and multiple system atrophy. Journal of Neurology, 262(4), 2015.

[3] P. Alku. Glottal inverse filtering analysis of human voice production – A review of estimation and parameterization methods of the glottal excitation and their applications. Sadhana – Academy Proceedings in Engineering Sciences. Vol. 36, Part 5, pp. 623-650, 2011.

 
