Neural Processes (NPs) are popular meta-learning methods for efficiently modelling predictive uncertainty. Recent state-of-the-art methods, however, leverage expensive attention mechanisms, limiting their applications, particularly in low-resource settings. In this work, we propose Constant Memory Attention Block (CMAB), a novel general-purpose attention block that (1) is permutation invariant, (2) computes its output in constant memory, and (3) performs updates in constant computation. Building on CMAB, we propose Constant Memory Attentive Neural Processes (CMANPs), an NP variant which only requires constant memory. Empirically, we show CMANPs achieve state-of-the-art results on popular NP benchmarks (meta-regression and image completion) while being significantly more memory efficient than prior methods.
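The abstract describes CMAB only at the level of its properties, and the official implementation is not reproduced here. As a rough, hypothetical illustration of how an attention block can produce a fixed-size, permutation-invariant output regardless of the number of context points, below is a minimal PyTorch sketch of a latent-bottleneck block in the Perceiver style: a learned set of latent vectors cross-attends to the context and then self-attends. The class name, dimensions, and layer layout are assumptions for illustration only; CMAB's exact constant-memory computation and constant-computation update mechanisms are not implemented in this sketch.

# Hypothetical sketch, not the authors' CMAB code.
import torch
import torch.nn as nn

class LatentBottleneckAttention(nn.Module):
    def __init__(self, dim=64, num_latents=8, num_heads=4):
        super().__init__()
        # Fixed-size learned latent set: the block's state is (num_latents, dim)
        # regardless of how many context points are provided.
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, context):
        # context: (batch, num_context_points, dim), e.g. embedded (x, y) pairs
        batch = context.shape[0]
        latents = self.latents.unsqueeze(0).expand(batch, -1, -1)
        # Cross-attention: the latent vectors act as queries over the context.
        h, _ = self.cross_attn(latents, context, context)
        h = self.norm1(h + latents)
        # Self-attention among the constant number of latents.
        out, _ = self.self_attn(h, h, h)
        return self.norm2(out + h)

block = LatentBottleneckAttention()
ctx = torch.randn(2, 100, 64)    # 100 context points
print(block(ctx).shape)          # torch.Size([2, 8, 64]): constant-size output

In this sketch, because the latents are the queries and no positional information is attached to the context, the output is invariant to permutations of the context points, and its size depends only on num_latents rather than on the number of inputs.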
Bibtex
@misc{feng2024memory,
title={Memory Efficient Neural Processes via Constant Memory Attention Block},
author={Leo Feng and Frederick Tung and Hossein Hajimirsadeghi and Yoshua Bengio and Mohamed Osama Ahmed},
year={2024},
eprint={2305.14567},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Related Research
- Jump Starting Bandits with LLM-Generated Prior Knowledge. P. A. Alamdari, Y. Cao, and K. Wilson. Conference on Empirical Methods in Natural Language Processing.
- AdaFlood: Adaptive Flood Regularization. W. Bae, Y. Ren, M. O. Ahmed, F. Tung, D. J. Sutherland, and G. Oliveira. Transactions on Machine Learning Research (TMLR).