Announcement

27 January 2020

Real-time and robust template matching for the detection of partial-copy videos


Category: PhD student


Description of the position
This PhD position is part of the grant program of Tours University (UT), France (2020 call). The host institution at UT will be the LIFAT Laboratory, in the PRIA Group. Details of the position:
• Starting date: regular PhD registration at UT is in September 2020.
• Length: a full-time PhD contract of 3 years (36 months).
• Salary: 1500 € per month (take-home pay).
• Extra costs: all extra costs related to equipment (laptop, phone, office ...) and travel (conference publications, research stays ...) will be covered by the PRIA Group.
• University fees: 615 € per year, paid by the candidate. These fees cover social security and mutual insurance.
• Social help: the candidate will be registered as a student of UT and will be able to apply for French financial assistance for accommodation (up to 40% of the total cost).

Contact information
Please contact the following people with any additional questions about the position:
• Dr Donatello Conte, donatello.conte@univ-tours.fr, +33 247 361 269, LIFAT Laboratory, 64 avenue Jean Portalis, 37200 Tours, France.
• Dr Mathieu Delalandre, mathieu.delalandre@univ-tours.fr, +33 247 361 432, LIFAT Laboratory, 64 avenue Jean Portalis, 37200 Tours, France.

 

 

Subject

Key-words: detection, near-duplicate, image, partial-copy video, key-frame, real-time, online, live TV, template matching, optimization, template selection, matchability prediction, robust matching

This PhD addresses the problem of near-duplicate image detection. Near-duplicate images differ slightly in content. The differences can result from digitization, camera and streaming capture, perspective distortion, cropping and resampling, etc. Considerable research has been conducted on this problem [1]. The selection of a suitable method depends on the application use case to be addressed.
In this PhD, we will focus in particular on partial-copy video detection [2, 3, 4]. The detection of partial-copy videos relies on representative key-frames. These key-frames are compared to the video content in order to detect near-duplicate frames. The detection can thus handle specific sequences in videos that correspond to partial copies. This differs from the copy detection of full videos, which relies on global signatures [5, 6].

The detection of partial copies can be done in real time for online videos or live TV. This has practical applications such as copyright protection within social networks [7, 8], data journalism and analytics [11], commercial detection in TV broadcasting [9, 10], etc. In that case, the detection must cope with real-time constraints in addition to the image degradation problem. A large number of features have been investigated for the comparison of frames [12]. Under real-time constraints, low-level features are preferred for the comparison [13].

In this PhD, we propose to investigate template matching to address this problem. Template matching is a well-known topic in image processing and computer vision [14]. It evaluates the similarity between templates and image areas using appropriate metrics and measures. Template matching is an established approach for detecting near-duplicate frames within partial-copy videos [2, 13, 15, 16, 17]. However, it suffers from high computational complexity and starts to fail when faced with complex image transformations. To deal with these issues, we will address several open problems in this PhD.

Fast template matching: in the literature, template matching has mainly been applied to object detection. Different approaches have been proposed to optimize the matching for object localization within the image search space. These include partial elimination conditions [19], upper bounding functions [18] and matching with integral images [20, 21]. However, template matching for partial-copy video detection can also involve a huge search space of template models. This calls for specific approaches that prune the template search space [22, 23] and achieve significant savings in computation and memory without sacrificing detection accuracy. The search algorithm must be tied to the metrics used for matching and to the search criterion (exact or approximate).
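
As an illustration of the partial elimination idea mentioned above, here is a minimal Python sketch of an exhaustive SAD search that abandons a candidate position as soon as its partial sum can no longer beat the best score found so far. It is only a sketch under stated assumptions: frames and templates are nested lists of grayscale values, and the names sad_with_cutoff and find_best_match are hypothetical. It is not the method of any cited reference.

```python
def sad_with_cutoff(frame, template, top, left, best_so_far):
    """Sum of absolute differences at (top, left).

    Returns None as soon as the partial sum reaches best_so_far,
    because this position can no longer improve on the best match.
    """
    total = 0
    for r, template_row in enumerate(template):
        frame_row = frame[top + r]
        for c, t in enumerate(template_row):
            total += abs(frame_row[left + c] - t)
        if total >= best_so_far:
            return None  # pruned after this row
    return total


def find_best_match(frame, template):
    """Scan every position; return (score, top, left) of the lowest SAD."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best = (float("inf"), -1, -1)
    for top in range(fh - th + 1):
        for left in range(fw - tw + 1):
            score = sad_with_cutoff(frame, template, top, left, best[0])
            if score is not None:
                best = (score, top, left)
    return best


# Toy data (assumption): a blank 5x6 frame with the template pasted at row 2, column 3.
frame = [[0] * 6 for _ in range(5)]
template = [[9, 8], [7, 6]]
for r in range(2):
    for c in range(2):
        frame[2 + r][3 + c] = template[r][c]

result = find_best_match(frame, template)
print(result)
```

The pruning keeps the search exact: a position is discarded only when its partial sum already equals or exceeds the current best, so the returned minimum is the same as for a full scan.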

Template selection: a key problem with template matching is selecting discriminative templates from the reference images / key-frames. This is referred to as the template selection problem in the literature. Templates can be selected manually [24] or automatically with hand-crafted features [25], noise estimators [26] or learning-based methods [27].
Learning-based methods [27] are preferred to improve the matchability of templates in the presence of multiple sources of noise and deformation. However, the detection must operate incrementally, as new video content appears periodically. No-reference methods, such as noise estimators [26], can operate in such a situation. An alternative is to use matchability prediction with learning-based methods [28]. A key problem is combining the matchability and optimization criteria for selection. Indeed, the properties of the selected templates can have a significant impact on the optimization of the matching process. The selection method must handle a trade-off between matchability and optimization. These aspects are rarely discussed in the literature.
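
To make the notion of automatic template selection concrete, the sketch below ranks candidate patches of a key-frame with a very simple hand-crafted score, the intensity variance, and keeps the top k. This is an illustrative stand-in chosen for brevity, not one of the selection methods cited above. The function names, the non-overlapping grid of candidates and the variance criterion are all assumptions.

```python
def patch_variance(patch):
    """Population variance of the grayscale values in a patch."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_templates(key_frame, size, k):
    """Rank non-overlapping size x size patches by variance.

    Returns the k highest-scoring candidates as (score, top, left).
    """
    h, w = len(key_frame), len(key_frame[0])
    candidates = []
    for top in range(0, h - size + 1, size):
        for left in range(0, w - size + 1, size):
            patch = [row[left:left + size] for row in key_frame[top:top + size]]
            candidates.append((patch_variance(patch), top, left))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:k]


# Toy key-frame (assumption): two flat quadrants, one textured, one mildly textured.
key_frame = [
    [5, 5, 0, 10],
    [5, 5, 10, 0],
    [1, 2, 0, 0],
    [3, 4, 0, 0],
]
selected = select_templates(key_frame, size=2, k=2)
print(selected)
```

Flat patches score zero and are ranked last, which reflects the intuition that uniform regions make poor templates. A real selection criterion would also have to account for the matching cost, as discussed above.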

Robust matching: classic template matching methods use metrics such as the sum of absolute differences (SAD), the sum of squared differences (SSD) or normalized cross-correlation (NCC). These metrics start to fail when faced with complex image transformations and degradations. In recent years, numerous methods have been proposed to overcome these limitations [27, 29, 30, 31, 32, 33]. These methods aim to improve the robustness of the matching to noise, geometric and non-rigid transformations, and occlusions. However, they increase the complexity. The interpretability of deep learning models for template matching [34, 35] could help in designing time-efficient and robust methods and systems.
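
The three classic metrics named above can be sketched in a few lines of Python. The example assumes that patches are flattened into lists of grayscale values and uses the zero-mean variant of NCC, which is one common formulation among several. It shows how a simple brightness and contrast change affects SAD and SSD while leaving the zero-mean NCC score unchanged.

```python
import math


def sad(a, b):
    """Sum of absolute differences (0 means identical patches)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences (0 means identical patches)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Zero-mean normalized cross-correlation, in [-1, 1].

    Returns 0.0 when either patch is flat, since the score is undefined there.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


# Toy patches (assumption): the second is the first with doubled contrast and an offset.
template = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in template]

print(sad(template, brighter), ssd(template, brighter), ncc(template, brighter))
```

SAD and SSD report a large distance for what is visually the same pattern, whereas the zero-mean NCC still reports a perfect correlation. None of the three handles geometric or non-rigid transformations, which is where the more recent methods cited above come in.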

Host institution and place
The LIFAT Laboratory comprises 47 faculty members, including Professors, Assistant Professors, Research Fellows and PhD students. The Laboratory is organized into three research teams/groups working on specific topics: (1) Databases and Natural Language Processing, (2) Operations Research, Scheduling, and Transportation, (3) Pattern Recognition and Image Analysis. The scientific challenges addressed at the LIFAT Laboratory include designing and developing models, methods and algorithms, and providing resources and software, in order to extract information, to infer knowledge from data by integrating human-computer interaction, and to solve combinatorial optimization problems with the goal of achieving good results in reasonable computation time.
The city of Tours is rich in history and has a well-preserved heritage. The urban area of Tours (nearly 300,000 inhabitants) plays a leading role in the Loire Valley. It lies at the crossroads of the North-South and East-West communication lines of Europe and is only one hour from Paris by high-speed train. The cost of living is attractive (see journaldunet).

Supervisors
Donatello Conte (PhD Director) received the Ph.D. degree in information engineering from the University of Salerno, Fisciano, Italy, in 2006. He is currently an Associate Professor of Computer Science at the University of Tours, France, in the LIFAT Laboratory. His current research interests include structural pattern recognition, real-time video analysis, and document image processing. He is the author of several research papers on these subjects. D. Conte has been a member of the International Association for Pattern Recognition (IAPR) and of the IAPR Technical Committee 15 (Graph-based Representations in Pattern Recognition) since 2002. He is a reviewer for several international conferences and journals.
Mathieu Delalandre (PhD Supervisor) obtained his Ph.D. degree in 2005 at Rouen University (Rouen, France). From 2006 to 2009, he worked as a research fellow in different laboratories and institutes across Europe (Nottingham, UK; La Rochelle, France; Barcelona, Spain). Since September 2009, he has been an Assistant Professor at the LIFAT laboratory (Tours, France) in the PRIA group. His ongoing research activities are in image processing, with a focus on local detectors, template matching and processing in the transform domain. His application domains relate to video and document image analysis, including scene text detection, video copy detection over streaming, document image networking, comics copyright protection and symbol/logo detection and recognition. Mathieu Delalandre has contributed to around ten national and international research projects. He is the co-founder of the ToodTV startup.

Profile of the candidate
The candidate profile and requirements are:
• a Master's degree in Computer Science and/or Electrical Engineering,
• good mathematics and programming skills,
• previous experience in image processing and pattern recognition,
• good communication skills,
• fluency in English,
• some knowledge of French is a plus, but not mandatory.

Application
The application process will be handled in three main steps.
• Registration: the candidate must send a CV and a motivation letter (in English, in PDF format) to mathieu.delalandre@univ-tours.fr before the 22nd of May 2020 (hard deadline).
• First interview: based on the CVs and motivation letters, candidates will be selected for a first interview with the PhD supervisors. A short list of candidates will be established for the final interview.
• Final interview: the final interview will be scheduled in June 2020 with a full Jury of Tours University. The candidates will then be notified of their acceptance.

 

References
[1] L. Morra et al. Benchmarking Unsupervised Near-Duplicate Image Detection. Expert Systems with Applications (2019).
[2] Z.J. Guzman-Zavaleta. Towards a Video Passive Content Fingerprinting Method for Partial-Copy Detection Robust Against Non-Simulated Attacks. PLOS One (2016).
[3] S. Jia et al. Coarse-to-Fine Copy-Move Forgery Detection for Video Forensics. IEEE Access (2018).
[4] Y.G. Jiang et al. Partial Copy Detection in Videos: A Benchmark and An Evaluation of Popular Methods. TBD (2016).
[5] Y. Himeur et al. Robust Video Copy Detection based on Ring Decomposition based Binarized Statistical Image Features and Invariant Color Descriptor (RBSIF-ICD). Multimedia Tools and Applications (2017).
[6] Z. Zhou. Video Copy Detection Using Spatio-Temporal CNN Features. IEEE Access (2019).
[7] Y. Xu et al. Caught Red-Handed: Toward Practical Video-based Subsequences Matching in the Presence of Real-World Transformations. CVPRW (2017).
[8] E. Gadeski et al. Fast and robust duplicate image detection on the web. Multimedia Tools and Applications (2017).
[9] M. Li et al. CNN-based Commercial Detection in TV Broadcasting. ICNCC (2017).
[10] A. Gomes et al. Automatic Detection of TV Commercial Blocks: A New Approach Based on Digital On-Screen Graphics Classification. ICSPCS (2017).
[11] G. Kordopatis-Zilos et al. FIVR: Fine-grained Incident Video Retrieval. arXiv (2019).
[12] D.A. Phalke et al. A Systematic Review of Near Duplicate Video Retrieval Techniques. IJPAM (2018).
[13] Z.J. Guzman-Zavaleta et al. Partial-Copy Detection of Non-Simulated Videos Using Learning at Decision Level. Multimedia Tools and Applications (2019).
[14] R. Brunelli. Template Matching in Computer Vision. Wiley (2009).
[15] Y. Zhang. Effective Real-Scenario Video Copy Detection. ICPR (2016).
[16] S. Lameri et al. Near-Duplicate Video Detection Exploiting Noise Residual Traces. ICIP (2017).
[17] M. Delalandre. A Workstation for Real-Time Processing of Multi-Channel TV. AI4TV (2019).
[18] W. Ouyang et al. Performance Evaluation of Full Search Equivalent Pattern Matching Algorithms. PAMI (2012).
[19] A. Mahmood. Optimizing Auto-correlation for Fast Target Search in Large Search Space. arXiv (2015).
[20] G. Facciolo et al. Integral Images for Block Matching. IPOL (2013).
[21] T. Wu. Speed-up Template Matching through Integral Image based Weak Classifiers. JPRR (2014).
[22] N. Ben-Zrihem et al. Approximate Nearest Neighbor Fields in Video. CVPR (2015).
[23] Y. Wu. Computationally Efficient Template-Based Face Recognition. ICPR (2016).
[24] W.D. Chang. Enhanced Template Matching Using Dynamic Positional Warping for Identification of Specific Patterns in Electroencephalogram. Journal of Applied Mathematics (2014).
[25] Y. Zhang et al. Template Selection based Superpixel Earth Mover’s Distance Algorithm for Hand Gesture Recognition. ICSP (2016).
[26] J. Tan et al. Target Recognition of SAR Images via Matching Attributed Scattering Centers with Binary Target Region. Sensors (2018).
[27] J.P. Mercier et al. Deep Object Ranking for Template Matching. WACV (2017).
[28] A. Penate-Sanchez et al. Matchability Prediction for Full-Search Template Matching Algorithms. 3DV (2016).
[29] J. Cheng et al. QATM: Quality-Aware Template Matching For Deep Learning. CVPR (2019).
[30] J. Kim et al. Robust Template Matching Using Scale-Adaptive Deep Convolutional Features. APSIPA (2017).
[31] D. Buniatyan et al. Weakly Supervised Deep Metric Learning for Template Matching. CVC (2019).
[32] S. Oron et al. Best-Buddies Similarity - Robust Template Matching using Mutual Nearest Neighbors. PAMI (2016).
[33] H.C. Shih et al. SPiraL Aggregation Map (SPLAM): A new descriptor for robust template matching with fast algorithm. PR (2015).
[34] Y. Cao et al. Template Matching Based on Geometric Invariance in Deep Neural Network. IEEE Access (2019).
[35] X. Dong et al. How deep learning works - The geometry of deep learning. arXiv (2017).


(c) GdR 720 ISIS - CNRS - 2011-2020.