Few-shot learning considers the problem of learning unseen categories given only a few labeled samples. As one of the most popular few-shot learning approaches, Prototypical Networks have received considerable attention owing to their simplicity and efficiency. However, a class prototype is typically obtained by averaging a few labeled samples belonging to the same class, which treats the samples as equally important and is thus prone to learning redundant features. Herein, we propose a self-attention based prototype enhancement network (SAPENet) to obtain a more representative prototype for each class. SAPENet utilizes multi-head self-attention mechanisms to selectively augment discriminative features in each sample feature map, and generates channel attention maps between intra-class sample features to attentively retain informative channel features for that class. The augmented feature maps and attention maps are finally fused to obtain representative class prototypes. Thereafter, a local descriptor-based metric module is employed to fully exploit the channel information of the prototypes by searching k similar local descriptors of the prototype for each local descriptor in the unlabeled samples for classification. We performed experiments on multiple benchmark datasets: miniImageNet, tieredImageNet, and CUB-200-2011. The experimental results on these datasets show that SAPENet achieves a considerable improvement compared to Prototypical Networks and also outperforms related state-of-the-art methods.
|State||Published - Mar 2023|
- Few-shot learning
- Image classification
- k-Nearest neighbor
- Multi-head self-attention mechanism