Abstract
In multi-task learning (MTL) for visual scene understanding, it is crucial to transfer useful information between tasks with minimal interference. In this paper, we propose a novel architecture that transfers informative features by applying an attention mechanism to the multi-scale features of the tasks. Since applying the attention module directly to all possible combinations of scale and task incurs high computational complexity, we propose to apply it sequentially, first across tasks and then across scales. The cross-task attention module (CTAM) is applied first to facilitate the exchange of relevant information between the features of different tasks at the same scale. The cross-scale attention module (CSAM) then aggregates useful information from feature maps at different resolutions within the same task. We also attempt to capture long-range dependencies through a self-attention module in the feature extraction network. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the NYUD-v2 and PASCAL-Context datasets. Our code is available at https://github.com/kimsunkyung/SCA-MTL.
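The sequential scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes plain scaled dot-product attention with a residual connection and no learned projections, and it represents each feature map as a flat list of token vectors. The names `attend`, `cross_task`, `cross_scale`, `sequential_transfer`, and `score_counts` are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, context):
    """Scaled dot-product attention of query tokens over context tokens.

    Assumption: no learned Q/K/V projections; the attended mix is added
    back to the query as a residual. Both inputs are lists of d-dim vectors.
    """
    d = len(query[0])
    out = []
    for q in query:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        mixed = [sum(w * k[c] for w, k in zip(weights, context)) for c in range(d)]
        out.append([qc + mc for qc, mc in zip(q, mixed)])
    return out

def cross_task(feats):
    """CTAM-like step: at each scale, a task attends to the other tasks."""
    tasks = list(feats)
    out = {t: [] for t in tasks}
    for s in range(len(feats[tasks[0]])):
        for t in tasks:
            context = [tok for o in tasks if o != t for tok in feats[o][s]]
            out[t].append(attend(feats[t][s], context))
    return out

def cross_scale(feats):
    """CSAM-like step: within each task, a scale attends to the other scales."""
    out = {}
    for t, scales in feats.items():
        out[t] = []
        for s, tokens in enumerate(scales):
            context = [tok for r, other in enumerate(scales) if r != s for tok in other]
            out[t].append(attend(tokens, context))
    return out

def sequential_transfer(feats):
    # Cross-task first, then cross-scale (needs >= 2 tasks and >= 2 scales).
    return cross_scale(cross_task(feats))

def score_counts(num_tasks, tokens_per_scale):
    """Query-key scores computed: joint attention vs. the sequential scheme."""
    n = sum(tokens_per_scale)
    joint = (num_tasks * n) ** 2
    ctam = sum(num_tasks * k * (num_tasks - 1) * k for k in tokens_per_scale)
    csam = num_tasks * sum(k * (n - k) for k in tokens_per_scale)
    return joint, ctam + csam

# Toy input: feats[task][scale] is a list of 2-dim tokens (4 tokens, then 1).
feats = {
    "seg":   [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]], [[0.2, 0.8]]],
    "depth": [[[0.0, 1.0], [1.0, 0.0], [0.5, 0.0], [0.0, 0.5]], [[0.9, 0.1]]],
}
out = sequential_transfer(feats)
joint_cost, sequential_cost = score_counts(2, [4, 1])
```

In this toy setting the output keeps the per-task, per-scale token layout of the input, and `score_counts` gives a rough sense of why the sequential ordering is cheaper than attending jointly over every task and scale. The counts cover attention scores only and say nothing about the cost of the actual model.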
Original language | English |
---|---|
Title of host publication | 2022 IEEE International Conference on Image Processing, ICIP 2022 - Proceedings |
Publisher | IEEE Computer Society |
Pages | 2311-2315 |
Number of pages | 5 |
ISBN (Electronic) | 9781665496209 |
State | Published - 2022 |
Event | 29th IEEE International Conference on Image Processing, ICIP 2022 - Bordeaux, France. Duration: 16 Oct 2022 → 19 Oct 2022 |
Publication series
Name | Proceedings - International Conference on Image Processing, ICIP |
---|---|
ISSN (Print) | 1522-4880 |
Conference
Conference | 29th IEEE International Conference on Image Processing, ICIP 2022 |
---|---|
Country/Territory | France |
City | Bordeaux |
Period | 16/10/22 → 19/10/22 |
Bibliographical note
Publisher Copyright: © 2022 IEEE.
Keywords
- Multi-task learning
- cross attention
- monocular depth estimation
- self-attention
- semantic segmentation