October 17, 2024
By Adín Ramírez Rivera, Professor at SFI Visual Intelligence/University of Oslo
Artificial Intelligence (AI) has seen remarkable advancements, especially in how machines learn from visual data. Picture a computer teaching itself to recognize objects in images without any human help---that's the magic of self-supervised learning (SSL). However, this process isn't always smooth, often running into hurdles that make training these AI models tricky and resource-intensive. Enter MaSSL, a new approach that seeks to revolutionize SSL by borrowing a page from human memory.
SSL has become the gold standard in helping AI understand visual data by clustering similar images together. But there's a catch---teaching machines this way can lead to what's known as "training collapse," where the model ends up lumping all images into a single group. Current solutions to this issue involve adding extra rules or "regularizers" to the training process, making it more complex and resource-hungry.
Even advanced techniques using Vision Transformers (ViTs) aren't free from this problem. While these methods do enhance performance, they also significantly increase the computational load and the time required for training.
Inspired by how human memory works, our new approach, Memory Augmented Self-Supervised Learning (MaSSL), brings a fresh twist to SSL. Imagine if the AI had its own memory to keep track of what it has seen before. That's precisely what MaSSL does---it gives the neural network a memory component that stores recent visual experiences.
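To make the memory idea concrete, here is a minimal, hypothetical sketch of such a component: a small bank of recent image representations, overwritten first-in-first-out, that each new view can be compared against. The class name, sizes, and similarity-based comparison are illustrative assumptions, not the MaSSL implementation itself.

```python
# Illustrative sketch only -- not the MaSSL code. It keeps a FIFO memory of
# recent image representations and expresses each new view as a distribution
# of similarities over that memory.
import torch
import torch.nn.functional as F

class MemoryBank:
    def __init__(self, memory_size: int, dim: int):
        # Memory of past representations, overwritten in FIFO order.
        self.memory = F.normalize(torch.randn(memory_size, dim), dim=1)
        self.ptr = 0

    @torch.no_grad()
    def update(self, embeddings: torch.Tensor) -> None:
        # Write the newest (detached) embeddings over the oldest slots.
        embeddings = F.normalize(embeddings, dim=1)
        idx = (self.ptr + torch.arange(embeddings.size(0))) % self.memory.size(0)
        self.memory[idx] = embeddings
        self.ptr = (self.ptr + embeddings.size(0)) % self.memory.size(0)

    def compare(self, embeddings: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
        # Similarity of each embedding to every stored representation,
        # normalized into a probability distribution over memory slots.
        logits = F.normalize(embeddings, dim=1) @ self.memory.T
        return F.softmax(logits / temperature, dim=1)
```

In such a sketch, training would call `update` with each batch's embeddings and `compare` to relate the current views to what the network has recently seen.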
Here's how MaSSL stands out:
MaSSL simplifies the learning process dramatically. By removing the need for extra regularizers and streamlining how Vision Transformers are trained, it cuts down on both training time and computational cost. And the benefits don't stop there.
We put MaSSL through rigorous testing on a wide range of vision tasks and datasets. Not only did it outperform traditional methods, it also delivered a more stable and resource-efficient training process.
July 29, 2024
Thalles Silva, Helio Pedrini, Adín Ramírez Rivera
This paper introduces a novel approach to improving the training stability of self-supervised learning (SSL) methods by leveraging a non-parametric memory of seen concepts. The proposed method augments a neural network with a memory component used to stochastically compare current image views with previously encountered concepts. Additionally, we introduce stochastic memory blocks to regularize training and enforce consistency between image views. We extensively benchmark our method on vision tasks such as linear probing, transfer learning, few-shot classification, and image retrieval across many datasets. The experimental results confirm the effectiveness of the proposed approach in achieving stable SSL training without additional regularizers, while learning highly transferable representations and requiring less computing time and fewer resources.
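As a rough illustration of the stochastic memory blocks and cross-view consistency described above, the sketch below samples a random block of memory slots each step, projects both views of an image onto that block, and encourages the two resulting distributions to agree. The block size, temperature, and symmetric cross-entropy loss are assumptions made for illustration, not the paper's exact formulation.

```python
# Hedged sketch of stochastic memory blocks -- assumed shapes and loss form.
import torch
import torch.nn.functional as F

def stochastic_block_consistency(z1, z2, memory, block_size=1024, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two views; memory: (num_slots, dim)."""
    # Draw a random block (subset) of memory slots for this training step.
    idx = torch.randperm(memory.size(0))[:block_size]
    block = F.normalize(memory[idx], dim=1)

    # Distribution of each view over the sampled block of seen concepts.
    p1 = F.softmax(F.normalize(z1, dim=1) @ block.T / temperature, dim=1)
    p2 = F.softmax(F.normalize(z2, dim=1) @ block.T / temperature, dim=1)

    # Symmetric cross-view consistency: each view predicts the other's
    # assignment over the shared memory block.
    loss = -0.5 * ((p1.detach() * torch.log(p2 + 1e-8)).sum(dim=1)
                   + (p2.detach() * torch.log(p1 + 1e-8)).sum(dim=1))
    return loss.mean()
```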
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
Thalles Silva, Helio Pedrini, Adín Ramírez Rivera
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:45451-45467, 2024