
Scaling up the search and mining of multimedia content

Date: November 3, 2010
Venue: Conservatoire National des Arts et Métiers
Amphi Z (R. Faure)
292 rue St Martin, 75003 Paris

Scientific themes:
  • B - Image and Vision

As a reminder: to guarantee that all registrants can be accommodated in the meeting rooms, registration for meetings is free but mandatory.






Joint meeting of GDR ISIS (Axis 3 - Information retrieval and massive image and video data, theme B - Image and Vision) and GDR I3 (theme 4 - Massive data and information access).

Applications such as multimedia search engines, the structuring of audio-visual collections, and the mining of aerial image databases motivate work on techniques for structuring multimedia content databases and for content-based retrieval in such databases. Beyond requirements on the quality of the results, the applicative potential of the methods developed depends on their ability to process very large databases efficiently.
After a brief reminder of the context, we will examine the main difficulties encountered in scaling up retrieval and mining methods. We will then cover current approaches to solving these problems. Finally, we will discuss how present challenges are evolving and which new challenges may emerge.

Send your talk proposals by e-mail to Laurent Amsaleg, Michel Crucianu and Frédéric Precioso before October 15, 2010, including the following information: title, authors with affiliations, an abstract (10-15 lines) and, if available, a link to work accessible online.

Organizers: Laurent Amsaleg, Michel Crucianu, Frédéric Precioso


9:30 - 10:00: Welcome and opening

10:00:

  • "SALSAS : Sub-linear Active Learning Strategy with Approximate kNN Search for retrieval", Frédéric Précioso (ENSEA), Matthieu Cord (LIP6, UPMC)
  • "Random Maximum Margin Hashing", Alexis Joly (Projet IMEDIA, INRIA Rocquencourt)
  • "Similarity Search at a Very Large Scale Using Hadoop and HBase", Stanislav Barton (European Web Archive & Wisdom), Philippe Rigaux (CEDRIC & Wisdom)
  • "Approximate search as a source coding problem, with application to large scale image retrieval", Hervé Jégou (INRIA)

12:00 - 14:00: Lunch

14:00:

  • "Using correspondence analysis for indexing and information retrieval in a large image database", Annie Morin (IRISA), Nguyen-Khang Pham (IRISA - Université de Cantho)
  • "Dynamic time warping for comparing audio or video excerpts: a good idea?", Romain Tavenard (Université de Rennes 1 / IRISA), Laurent Amsaleg (CNRS / IRISA)
  • "Scalability issues for Earth Observation Image Mining", Mihai Datcu (German Aerospace Center DLR)

16:00 - 16:20: Coffee break

16:20: Information and discussion on scalability evaluation campaigns

17:00: Closing

Abstracts of the contributions

SALSAS : Sub-linear Active Learning Strategy with Approximate kNN Search for retrieval

Frédéric Précioso (ENSEA), Matthieu Cord (LIP6, UPMC)

In Content-Based Image Retrieval (CBIR) systems, a classic scenario is to formulate the user query with a single example (i.e. one image). However, bridging the semantic gap between the (semantic) concept(s) the user is looking for and the (digital) content of this unique image is quite difficult. Active learning is a powerful technique that allows a query concept to be interactively refined (through relevance feedback loops) by asking the user whether selected images are relevant or not. To be effective, an active learning strategy must be fast and efficient, using as few relevance feedback loops as possible. However, the complexity of such methods is still linear in the size of the database, which slows down retrieval systems. Scalability is thus the major problem for such on-line learning methods. In this presentation, we propose a strategy to overcome this scalability limitation. Our technique exploits the approximate k-nearest-neighbor Locality Sensitive Hashing (LSH) method and combines it with an active learning strategy dedicated to very large databases. While the Euclidean Locality Sensitive Hashing algorithm, which approximates k-NN search in a Euclidean space with sub-linear complexity, is probably the most popular approach, the Euclidean metric does not always provide results as accurate and relevant as the Chi2 distance. We therefore define a new LSH scheme adapted to the Chi2 distance. We evaluate our interactive retrieval system on several databases, showing that its complexity is almost constant in the size of the database.
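As a rough illustration of the Euclidean LSH building block the talk starts from (the Chi2-adapted scheme itself is not reproduced here; function names and parameters below are ours, not from the paper):

```python
import numpy as np

def make_e2lsh(dim, n_bits, w=4.0, seed=0):
    """One Euclidean LSH function g(x) = (floor((a_i . x + b_i) / w))_i."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_bits, dim))   # random Gaussian projections
    b = rng.uniform(0.0, w, size=n_bits)     # random offsets in [0, w)
    def g(x):
        return tuple(np.floor((A @ x + b) / w).astype(int))
    return g

def build_index(X, g):
    """Hash every database vector into a bucket keyed by its code."""
    buckets = {}
    for i, x in enumerate(X):
        buckets.setdefault(g(x), []).append(i)
    return buckets

def query(q, X, g, buckets):
    """Probe only the query's bucket instead of scanning the whole base."""
    cand = buckets.get(g(q), [])
    if not cand:
        return None
    d = [np.linalg.norm(X[i] - q) for i in cand]
    return cand[int(np.argmin(d))]
```

Because a query only inspects the candidates colliding with it, the cost per feedback loop is sub-linear in practice, which is the property the active learning strategy builds on.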

Random Maximum Margin Hashing

Alexis Joly (Projet IMEDIA, INRIA Rocquencourt)

Following the success of hashing methods for indexing purposes, recent research works are interested in using them for approximate distance computations (or sketch distances). Such approaches are not an alternative to using index structures but a complementary way to reduce both memory usage and the cost of distance computations. Several data-dependent hash functions have notably been proposed to closely fit the data distribution and provide better selectivity than usual random-projection-based methods such as LSH. They also allow more genericity in terms of input data, notably kernelized data through the recently proposed KLSH method. However, as discussed in this talk, improvements occur only for relatively small hash code sizes, up to 64 or 128 bits. For larger hash code sizes, which may be required in many applications, random-projection-based methods are usually still better. Our claim is that the main issue of data-dependent hash functions is the lack of independence between the hash functions. Even slight dependences between hash functions may indeed severely affect the uniformity of the partition and the resulting selectivity. We introduce a new hash function family that attempts to solve this issue by considering uniformity as a primary objective. The underlying principle is to train independent balanced partitions by randomly sampling positive and negative samples from the dataset, regardless of their closeness. To maximize the collision probability of close points, we suggest using maximum margin hyperplanes as classifiers, since they are known for their good capacity. Experiments show that our new Random Maximum Margin Hashing (RMMH) scheme outperforms all tested hash functions for both indexing and sketching.
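The principle can be sketched in a few lines, assuming a toy hinge-loss subgradient solver as a stand-in for an exact max-margin classifier (all names and parameters are ours, not from the paper):

```python
import numpy as np

def fit_max_margin(X, y, epochs=200, lam=0.01, lr=0.1):
    """Tiny linear SVM via subgradient descent on the regularized hinge loss
    (a simplified stand-in for an exact maximum margin solver)."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1.0:          # margin violated
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                                # only regularize
                w -= lr * lam * w
    return w, b

def rmmh_hash(X, n_bits, m=16, seed=0):
    """One max-margin hyperplane per bit, trained on m randomly sampled
    points with a random balanced +/-1 labeling, regardless of closeness."""
    rng = np.random.default_rng(seed)
    planes = []
    for _ in range(n_bits):
        idx = rng.choice(len(X), size=m, replace=False)
        y = np.array([1] * (m // 2) + [-1] * (m // 2))
        rng.shuffle(y)                           # random balanced labels
        planes.append(fit_max_margin(X[idx], y))
    def h(x):
        return tuple(int(x @ w + b > 0) for w, b in planes)
    return h
```

The balanced random labeling is what targets uniformity; the max-margin choice is what keeps close points on the same side of each hyperplane.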

Similarity Search at a Very Large Scale Using Hadoop and HBase

Stanislav Barton (European Web Archive & Wisdom), Philippe Rigaux (CEDRIC & Wisdom)

A brief recapitulation of an access structure for approximate similarity search in metric spaces will be given. The access structure takes advantage of what we call the locality phenomenon for extensive collection pruning, which allows the whole index to be stored in secondary memory consisting of cheap disks. With this approach, a traditional data organization method such as an RDBMS has been successfully deployed. In order to search really large data collections, Hadoop, as a framework for easy distribution of data-intensive tasks, and HBase, as a seamlessly distributed key/value store, were tested with this access structure. In this talk, preliminary results obtained with the proposed technologies on a data set containing 2 billion data objects will be presented.
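The pruning idea underlying such metric access structures can be illustrated with a single pivot and the triangle inequality (a simplified in-memory numpy sketch; in the talk's setting the precomputed table would be distributed over HBase, and several pivots would be used):

```python
import numpy as np

def build_pivot_table(X, pivot):
    """Precompute d(o, pivot) for every object o; this table is the part
    that can live in a key/value store instead of main memory."""
    return np.linalg.norm(X - pivot, axis=1)

def range_search(q, r, X, pivot, table):
    """Triangle inequality: |d(q,p) - d(o,p)| > r implies d(q,o) > r,
    so object o is pruned without ever computing d(q,o)."""
    dqp = np.linalg.norm(q - pivot)
    cand = np.nonzero(np.abs(table - dqp) <= r)[0]   # survivors of pruning
    dists = np.linalg.norm(X[cand] - q, axis=1)      # exact check on survivors
    return cand[dists <= r]
```

The pruning step never discards a true answer (no false dismissals), so the final distance check returns exactly the range-query result while touching only a fraction of the collection.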

Approximate search as a source coding problem, with application to large scale image retrieval

Hervé Jégou (INRIA)

Handling large amounts of data requires the use of approximate nearest neighbor search techniques. Recently, Hamming embedding methods such as spectral hashing have addressed the problem of obtaining compact binary codes optimizing the trade-off between the memory usage and the probability of finding the correct neighbors.
In this talk, I will show that this problem can be cast into a source coding framework, where the database vectors are approximated by quantization. The code associated with a vector is simply its quantization index. The choice of the quantizer is critical to obtain a precise yet compact representation. We adopt a product quantizer that decomposes the space into a Cartesian product of low-dimensional subspaces. The Euclidean distance between two vectors is obtained from their codes. An asymmetric version increases precision, as it computes the approximate distance between a vector and a code. Finally, the method is advantageously combined with an inverted file to avoid exhaustive search. Experiments conducted on a set of 2 billion vectors, on a single machine, demonstrate the interest of this approach, which significantly outperforms state-of-the-art embedding methods.
I will finally present an application to large-scale image search, with a live demo on 10 million images running on my laptop.
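A minimal sketch of product quantization with asymmetric distance computation, assuming a toy Lloyd k-means in place of a real clustering library (names and parameters are ours, not from the paper):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd iterations (a stand-in for a proper k-means trainer)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                C[j] = X[assign == j].mean(axis=0)
    return C

def pq_train(X, m, k):
    """One codebook per subspace: the space is a Cartesian product
    of m low-dimensional subspaces, quantized independently."""
    d = X.shape[1] // m
    return [kmeans(X[:, i * d:(i + 1) * d], k, seed=i) for i in range(m)]

def pq_encode(x, books):
    """The code of a vector is the tuple of its subspace centroid indices."""
    d = len(x) // len(books)
    return [int(np.argmin(((C - x[i * d:(i + 1) * d]) ** 2).sum(1)))
            for i, C in enumerate(books)]

def adc(q, code, books):
    """Asymmetric distance: exact query subvectors against quantized
    database subvectors, summed over subspaces."""
    d = len(q) // len(books)
    return sum(((books[i][c] - q[i * d:(i + 1) * d]) ** 2).sum()
               for i, c in enumerate(code))
```

With m subspaces of k centroids each, a code costs m*log2(k) bits but addresses k^m implicit centroids, which is the memory/precision trade-off the abstract refers to.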

Using correspondence analysis for indexing and information retrieval in a large image database

Annie Morin (IRISA), Nguyen-Khang Pham (IRISA - Université de Cantho)

Correspondence analysis (CA) is a factorial method used to process large textual data tables. In this talk, we present the use of CA for indexing and information retrieval in image databases, emphasizing the importance of the CA indicators (contribution to inertia and quality of the projection). Using these indicators together with inverted files makes it possible to filter out irrelevant images during retrieval. In addition, block decomposition and parallelization of the computations on GPU make it possible to process a large number of images and to reduce computation time.
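As a rough illustration, a minimal correspondence analysis can be computed as an SVD of the matrix of standardized residuals of a contingency table, e.g. images by visual words (numpy only; variable names are ours):

```python
import numpy as np

def correspondence_analysis(N):
    """CA of a contingency table N: SVD of the standardized residuals."""
    P = N / N.sum()                               # correspondence matrix
    r = P.sum(axis=1)                             # row masses
    c = P.sum(axis=0)                             # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * s) / np.sqrt(r)[:, None]          # row principal coordinates
    cols = (Vt.T * s) / np.sqrt(c)[:, None]       # column principal coordinates
    inertia = s ** 2                              # inertia carried by each axis
    return rows, cols, inertia
```

The filtering indicators mentioned in the abstract follow directly: the contribution of row i to axis k is r[i] * rows[i, k]**2 / inertia[k], and its projection quality (squared cosine) on axis k is rows[i, k]**2 divided by the squared norm of its full profile coordinates.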

Dynamic time warping for comparing audio or video excerpts: a good idea?

Romain Tavenard (Université de Rennes 1 / IRISA), Laurent Amsaleg (CNRS / IRISA)

This talk presents preliminary results questioning the usefulness of dynamic time warping (DTW) as a similarity measure for multimedia sequences. DTW is compared, on the one hand, to a bag-of-words system in which temporal ordering is ignored and, on the other hand, to an improved bag-of-words approach using n-grams. Experiments on real audio and video data evaluate the respective advantages of the three approaches in terms of query response time and quality of the returned results.
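The two ends of the comparison can be sketched as follows: the classic quadratic DTW recurrence, and an order-free histogram baseline standing in for the bag-of-words system (a simplified sketch; names are ours):

```python
import numpy as np
from collections import Counter

def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Classic O(len(a) * len(b)) dynamic time warping cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = dist(a[i - 1], b[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def bow_distance(a, b):
    """Order-free baseline: L1 distance between symbol histograms."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[s] - cb[s]) for s in set(ca) | set(cb))
```

The quadratic cost of the DTW recurrence, against the linear cost of histogram comparison, is precisely what makes the question in the title worth asking at scale.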

Scalability issues for Earth Observation Image Mining

Mihai Datcu (German Aerospace Center DLR)

Very large volumes of heterogeneous data, such as multimedia, Earth Observation (EO) images, and scientific and engineering measurements, are continuously generated and stored. A typical case is the field of Earth Observation. The widespread availability of very high resolution satellite and airborne images not only explodes the volume of data, but also brings orders of magnitude more image detail, thus enormously increasing the information content. A single satellite records about 1000 images daily, with a total volume of 100 GB, so EO archives may store petabytes of data. In addition, EO images are instrumental measurements, i.e. sensor data of physical parameters, so very specific issues must be addressed. Due to the high complexity and volume of these data, today's concepts and technologies are still limited in communicating the information content to people for use in real-life applications. In this presentation, we overview the scalability issues of knowledge-driven image information mining and analyze and evaluate them from various perspectives: information theory, signal processing, active learning, auto-annotation, and human-machine communication. Communication at a semantic level of representation will also be addressed.