Enforcing hierarchical label structure for multiple object detection in remote sensing imagery

Laura Wendelberger, First Author and Presenting Author
Lawrence Livermore National Laboratory

Cindy Gonzales, Co-Author
Lawrence Livermore National Laboratory

Wesam Sakla, Co-Author
Lawrence Livermore National Laboratory
Tuesday, Aug 8: 8:50 AM - 9:05 AM
3517 
Contributed Papers 
Metro Toronto Convention Centre 
Automatic object detection in remote sensing imagery flags objects of interest in a scene. We are interested in multi-label classification of images in the FAIR1M benchmark dataset, which contains "ground truth" images labeled with 5 coarse and 37 fine hierarchical object classes summarizing scene content. We propose using a SwinV2 visual backbone that feeds into a transformer to produce multi-label taxonomic classifications. The flexibility of deep learning models for computer vision makes it possible to identify targets that would otherwise be difficult to explicitly quantify. However, with naïve application, the classifications from a multi-task model may not uphold the known hierarchical structure. Common approaches to acknowledging the hierarchical structure of labels include adding penalties for inconsistent predictions to the cost function, using different features to predict coarse and fine classes, and completing classification tasks of different granularities at different depths in the model. We implement the proposed model with additional hierarchy-aware modifications and compare it to naïve flat classification.
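As a concrete illustration of the first hierarchy-aware modification mentioned above (a penalty for inconsistent predictions in the cost function), the sketch below shows one common way such a penalty could be added to a multi-label loss. This is a minimal PyTorch sketch, not the implementation described in the talk; the HierarchyConsistencyLoss class, the fine_to_coarse parent mapping, and the penalty_weight parameter are illustrative assumptions. The idea is that a predicted fine class implies its coarse parent, so any probability mass where p(fine) exceeds p(parent) is penalized.

import torch
import torch.nn as nn
import torch.nn.functional as F


class HierarchyConsistencyLoss(nn.Module):
    """Multi-label BCE loss plus a penalty for parent-child inconsistency.

    Hypothetical sketch: a fine-class prediction implies its coarse parent,
    so probability mass where p(fine) > p(parent) incurs a penalty.
    """

    def __init__(self, fine_to_coarse: torch.Tensor, penalty_weight: float = 1.0):
        super().__init__()
        # fine_to_coarse[j] = index of the coarse parent of fine class j
        self.register_buffer("fine_to_coarse", fine_to_coarse)
        self.penalty_weight = penalty_weight

    def forward(self, coarse_logits, fine_logits, coarse_targets, fine_targets):
        # Standard (flat) multi-label losses at both granularities.
        bce = F.binary_cross_entropy_with_logits(coarse_logits, coarse_targets) \
            + F.binary_cross_entropy_with_logits(fine_logits, fine_targets)

        # Consistency penalty: a fine probability should not exceed its parent's.
        p_coarse = torch.sigmoid(coarse_logits)       # (batch, n_coarse)
        p_fine = torch.sigmoid(fine_logits)           # (batch, n_fine)
        p_parent = p_coarse[:, self.fine_to_coarse]   # parent probability per fine class
        penalty = F.relu(p_fine - p_parent).mean()

        return bce + self.penalty_weight * penalty


# Example with FAIR1M-sized label spaces (5 coarse, 37 fine); the parent
# mapping here is random and purely illustrative.
if __name__ == "__main__":
    fine_to_coarse = torch.randint(0, 5, (37,))
    criterion = HierarchyConsistencyLoss(fine_to_coarse, penalty_weight=0.5)
    coarse_logits, fine_logits = torch.randn(8, 5), torch.randn(8, 37)
    coarse_targets = torch.randint(0, 2, (8, 5)).float()
    fine_targets = torch.randint(0, 2, (8, 37)).float()
    print(criterion(coarse_logits, fine_logits, coarse_targets, fine_targets))

A penalty of this form leaves the flat classifier unchanged when its predictions already respect the hierarchy, and pushes inconsistent coarse/fine predictions toward agreement; the weight on the penalty is a tuning choice.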

Keywords

deep learning

hierarchical

computer vision

remote sensing

penalization

classification 

Main Sponsor

Section on Statistics in Defense and National Security

Tracks

Machine Learning