PhD defense of Youssef Tamaazousti
24 May 2018
Category: PhD defense
Youssef Tamaazousti will defend his thesis, entitled "On the Universality of Visual and Multimodal Representations". The defense will take place on 1 June 2018 at 2 pm at Nano-INNOV (avenue de la Vauve, 91120 Palaiseau; amphi 33, Bâtiment 862), before the following jury:
Matthieu Cord (Prof., Sorbonne Univ.)
Céline Hudelot (Prof., CentraleSupelec) -- thesis director
Philippe-Henri Gosselin (Prof., ENSEA) -- reviewer
Iasonas Kokkinos (Univ. College London and FAIR)
Hervé Le Borgne (CEA LIST) -- advisor
Florent Perronnin (Scientific Director, Naver Labs Europe) -- reviewer
Pablo Piantanida (Assoc. Prof., CentraleSupelec)
Because of its key societal, economic and cultural stakes, Artificial Intelligence (AI) is a hot topic. One of its main goals is to develop systems that facilitate the daily life of humans, with applications such as household robots, autonomous vehicles and much more. The rise of AI is largely due to the emergence of tools based on deep neural networks, which make it possible to learn simultaneously the representation of the data and the task to solve (traditionally learned with separate statistical models). This resulted from the conjunction of theoretical advances, growing computational capacity and the availability of large amounts of annotated data. A long-standing goal of AI is to design machines inspired by humans, capable of perceiving the world and interacting with humans in an evolutionary way.
In this Thesis, we categorize the works around AI into the following two learning approaches:
(i) Specialization: learning representations from a few specific tasks, with the goal of carrying out those specific tasks (specialized in a certain field) with a very high level of performance;
(ii) Universality: learning representations from several general tasks, with the goal of performing as many tasks as possible in different contexts.
While specialization has been extensively explored by the deep-learning community, only a few implicit attempts have been made towards universality. The goal of this Thesis is thus to explicitly address the problem of improving universality with deep-learning methods, for image and text data. We addressed this topic in two forms: through the design of methods that improve universality ("universalizing methods"); and through the establishment of a protocol to quantify the universality of a representation.
Concerning universalizing methods, we proposed three technical contributions: (i) in the context of large semantic representations, a method to reduce redundancy between detectors through adaptive thresholding and the relations between concepts; (ii) in the context of neural-network representations, an approach that increases the number of detectors without increasing the amount of annotated data; (iii) in the context of multimodal representations, a method to preserve the semantics of unimodal representations in multimodal ones.
Regarding the quantification of universality, we proposed to evaluate universalizing methods in a transfer-learning scheme, since transfer learning is a relevant way to assess the universal ability of a representation. This also led us to propose a new quantitative evaluation criterion for universalizing methods.
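To make the transfer-learning evaluation idea concrete, the sketch below illustrates the general protocol (not the thesis's actual criterion): a representation is frozen, a minimal classifier is trained on top of it for each target task, and the average held-out score across tasks serves as a proxy for universality. All names (`represent`, `nearest_centroid_accuracy`, the random projection standing in for a pretrained backbone, the synthetic tasks) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pretrained" representation: a fixed random projection
# stands in for a frozen deep-network backbone.
W = rng.normal(size=(16, 8))

def represent(x):
    """Map raw inputs into the frozen representation space."""
    return np.tanh(x @ W)

def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Transfer step: fit a minimal classifier (nearest class centroid)
    on top of the frozen features, then score it on held-out data."""
    feats = represent(train_x)
    classes = np.unique(train_y)
    centroids = np.stack([feats[train_y == c].mean(axis=0) for c in classes])
    test_feats = represent(test_x)
    # Distance of each test point to each class centroid.
    dists = np.linalg.norm(test_feats[:, None, :] - centroids[None], axis=2)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == test_y).mean())

# Simulate several binary target tasks; universality is proxied by the
# average transfer score over all of them.
scores = []
for task in range(3):
    centers = rng.normal(scale=2.0, size=(2, 16))
    x = np.vstack([centers[c] + rng.normal(size=(50, 16)) for c in (0, 1)])
    y = np.repeat([0, 1], 50)
    idx = rng.permutation(100)
    tr, te = idx[:70], idx[70:]
    scores.append(nearest_centroid_accuracy(x[tr], y[tr], x[te], y[te]))

universality_score = float(np.mean(scores))
```

A better representation, evaluated this way, is one whose average score over a diverse pool of target tasks is higher; a stronger classifier head (e.g. a linear SVM) could replace the nearest-centroid step without changing the protocol.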