Self-supervised learning 

What is self-supervised learning?

Self-supervised learning refers to a type of unsupervised learning that is posed as a supervised learning problem. It is a relatively recent technique in which the training data is labelled autonomously, with the labels derived from the data itself.

In self-supervised learning, the system learns to predict one part of its input from other parts of its input. In other words, a portion of the input to the network is used as a supervisory signal for a predictor fed with the remaining portion.
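As a toy illustration of this idea, consider predicting a held-out word from its surrounding words. The sketch below (a minimal, hypothetical example in pure Python; the corpus and function names are invented for illustration) shows how the labels come from the raw text itself rather than from a human annotator:

```python
from collections import Counter, defaultdict

# Hypothetical unlabelled corpus: raw sentences, no manual labels.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat lay on the mat",
    "a dog lay on the rug",
]

# Pretext task: hold out each word in turn and treat it as the label
# to predict from the remaining words. The supervision is generated
# automatically from the data itself.
context_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, target in enumerate(words):
        for j, context_word in enumerate(words):
            if j != i:
                context_counts[context_word][target] += 1

def predict_masked(sentence):
    """Predict the word at the '_' position from its context."""
    words = sentence.split()
    idx = words.index("_")
    scores = Counter()
    for j, w in enumerate(words):
        if j != idx:
            scores.update(context_counts[w])
    # Rule out words already present in the context.
    candidates = {t: s for t, s in scores.items() if t not in words}
    return max(candidates, key=candidates.get)

print(predict_masked("the cat sat on the _"))  # predicts "mat"
```

Real systems replace the co-occurrence counts with a neural network and far larger corpora, but the principle is the same: part of the input serves as the target for a predictor that sees the rest.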

This type of learning uses many more supervisory signals than supervised learning, and far more than reinforcement learning. That is why calling it "unsupervised" is misleading.

More can be learned about the structure of the world through self-supervised learning than through the other two paradigms. The main reason: the data is effectively unlimited, and the amount of feedback each example provides is enormous.

Supervised learning is an onerous paradigm: it requires collecting massive amounts of data, cleaning it up, labelling it manually, training and fine-tuning a model designed specifically for the classification or regression problem you wish to solve, and then using it to predict labels for unseen data. For example, with images, we collect a large image dataset, manually label the objects in each image, train the network, and then use it for a specific use case.
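That pipeline can be sketched end to end with scikit-learn. Here the bundled digits dataset stands in for the expensive collect-and-annotate step, purely for illustration:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A small, already-labelled image dataset stands in for the costly
# collection and manual labelling described above.
X, y = load_digits(return_X_y=True)

# Hold out part of the labelled data to measure generalisation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train a model designed for this specific classification use case...
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# ...then use it to predict labels for unseen data.
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Every step after data loading depends on those human-provided labels, which is exactly the cost self-supervised learning tries to avoid.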

This type of learning, while easy to understand, is far from the way a person learns. We learn mainly in an unsupervised way and through reinforcement, driven by curiosity and trial and error. We also learn in a supervised way, but from far fewer samples: if there is one thing that distinguishes human beings, it is that we are quite good at generalising and abstracting information.

Self-supervised learning resembles unsupervised learning in that the system learns without explicitly provided labels. But it also differs: the goal is not to uncover the inherent structure of the data for its own sake. Unlike unsupervised learning, self-supervised learning does not focus on clustering, dimensionality reduction, recommendation engines, density estimation or anomaly detection.

Self-supervised learning has been extremely successful in natural language processing. For example, Google's BERT model and similar techniques produce excellent text representations.
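To see BERT's self-supervised pretext task in action, one option is the Hugging Face `transformers` library (an assumption here, not something the text prescribes; the snippet downloads the pre-trained `bert-base-uncased` weights on first use):

```python
from transformers import pipeline

# Load a pre-trained BERT with its masked-language-model head.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT was pre-trained to predict the word hidden behind [MASK]
# from the rest of the sentence -- labels generated from raw text,
# with no human annotation involved.
predictions = fill_mask("The cat sat on the [MASK].")
for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```

Each prediction is a candidate word with its probability; the representations learned this way transfer well to downstream tasks such as classification or question answering.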