Adapting Grad-CAM for Embedding Networks

The gradient-weighted class activation mapping (Grad-CAM) method can faithfully highlight the image regions that matter most to a deep model's prediction in image classification, image captioning and many other tasks. It uses the gradients from back-propagation as weights (grad-weights) to explain network decisions. However, applying Grad-CAM to embedding networks raises significant challenges because embedding networks are trained on millions of dynamically paired examples (e.g. triplets). To overcome these challenges, we propose an adaptation of the Grad-CAM method for embedding networks. First, we aggregate grad-weights from multiple training examples to improve the stability of Grad-CAM. Then, we develop an efficient weight-transfer method to explain decisions for any image without back-propagation. We extensively validate the method on the standard CUB200 dataset, where it produces more accurate visual attention than the original Grad-CAM method. We also apply the method to house price estimation from images, where it produces convincing qualitative results that showcase the practicality of our approach.
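
The two steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array shapes, the plain averaging of grad-weights, and the nearest-neighbour lookup used for weight transfer are all assumptions made for illustration. In practice the feature maps and gradients would come from a trained embedding network; here they are passed in as arrays.

```python
import numpy as np

def grad_weights(grads):
    """Standard Grad-CAM channel weights: spatial mean of the gradients.
    grads has shape (K, H, W); the result has shape (K,)."""
    return grads.mean(axis=(1, 2))

def aggregate_weights(grads_list):
    """Average the grad-weights obtained from several training examples
    (e.g. several triplets). Plain averaging is an assumption here."""
    return np.mean([grad_weights(g) for g in grads_list], axis=0)

def cam(activations, weights):
    """Weighted sum of feature maps followed by ReLU.
    activations has shape (K, H, W); the map has shape (H, W)."""
    return np.maximum(np.tensordot(weights, activations, axes=1), 0.0)

def transfer_weights(query_emb, train_embs, train_weights):
    """Hypothetical weight transfer: reuse the stored grad-weights of the
    nearest training embedding, so no back-propagation is needed for the
    query image. Nearest-neighbour lookup is an assumption."""
    idx = np.argmin(np.linalg.norm(train_embs - query_emb, axis=1))
    return train_weights[idx]
```

With grad-weights precomputed and stored for the training set, a new image would only need a forward pass: compute its embedding and feature maps, call transfer_weights to fetch weights, then call cam to obtain the attention map.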

Related Research

Unveiling the Role of Computer Vision in Financial Services

J. He.

Computer Vision

Research

CVPR 2023 Recommended Reading List

*R. Aoki, *R. Deng, *M. Zhai, J. He, *H. Zhao, *E. Smith, *V. Bhaskara, and *H. Sharifi.

Computer Vision

Research

Ranking Regularization for Critical Rare Classes: Minimizing False Positives at a High True Positive Rate

H. Zhao.

Computer Vision

Research
