Efficient Queries Transformer Neural Processes

Neural Processes (NPs) are popular methods in meta-learning that can estimate predictive uncertainty on target datapoints by conditioning on a context dataset. The previous state-of-the-art method, Transformer Neural Processes (TNPs), achieves strong performance but requires computation that is quadratic in the number of context datapoints, which significantly limits its scalability. Conversely, existing sub-quadratic NP variants perform significantly worse than TNPs. To tackle this issue, we propose Latent Bottlenecked Attentive Neural Processes (LBANPs), a new computationally efficient sub-quadratic NP variant whose querying computational complexity is independent of the number of context datapoints. The model encodes the context dataset into a constant number of latent vectors on which self-attention is performed. When making predictions, the model retrieves higher-order information from the context dataset via multiple cross-attention mechanisms on the latent vectors. We empirically show that LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits. We demonstrate that LBANPs can trade off computational cost against performance according to the number of latent vectors. Finally, we show that LBANPs can scale beyond existing attention-based NP variants to larger dataset settings.
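
The latent bottleneck described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses single-head dot-product attention with no learned projections, residual connections, or feed-forward layers, and the function names (attend, encode_context, query), the layer count, and the random inputs are all hypothetical choices made for illustration. It also assumes that each encoder layer's latents are cached and that targets attend to them layer by layer, which is one plausible reading of "multiple cross-attention mechanisms on the latent vectors". The point it shows is that, once the context has been encoded, answering a query touches only the L latent vectors and never the N context datapoints.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Single-head dot-product attention without learned projections.

    Each query row becomes a weighted average of the rows in keys_values.
    """
    dim = len(keys_values[0])
    out = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, kv)) / math.sqrt(dim)
            for kv in keys_values
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * kv[j] for w, kv in zip(weights, keys_values)) for j in range(dim)]
        )
    return out


def encode_context(context, latents, num_layers=2):
    """Compress N context rows into L latent rows, keeping each layer's latents."""
    per_layer = []
    for _ in range(num_layers):
        latents = attend(latents, context)  # cross-attention, cost grows with N * L
        latents = attend(latents, latents)  # self-attention, cost grows with L * L
        per_layer.append(latents)
    return per_layer


def query(targets, per_layer_latents):
    """Answer M target queries from the cached latents only (cost grows with M * L)."""
    for latents in per_layer_latents:
        targets = attend(targets, latents)
    return targets


def rand_rows(n, dim):
    """Hypothetical stand-in for embedded datapoints."""
    return [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


random.seed(0)
dim, num_latents = 4, 8
context = rand_rows(200, dim)          # N = 200 context datapoints
latents = rand_rows(num_latents, dim)  # L = 8 latent vectors
targets = rand_rows(5, dim)            # M = 5 target queries

cache = encode_context(context, latents)  # done once per context dataset
preds = query(targets, cache)             # independent of N
```

In this sketch, increasing num_latents raises both the encoding and querying cost while giving the bottleneck more capacity, which mirrors the cost and performance trade-off mentioned in the abstract.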

Related Research

Speeding up Inference in Transformers

S. Prince.

Transformers

Research
Latent Bottlenecked Attentive Neural Processes

L. Feng, H. Hajimirsadeghi, Y. Bengio, and M. O. Ahmed. International Conference on Learning Representations (ICLR)

Transformers

Publications
Scaleformer: Iterative Multi-scale Refining Transformers for Time Series Forecasting

M. Amin Shabani, A. Abdi, L. Meng, and T. Sylvain. International Conference on Learning Representations (ICLR)

Time series Modelling; Transformers

Publications

