Preparing a transfer learning model involves two main stages:
In the pre-training stage, the model is trained on a large collection of unlabeled data. Learning in an unsupervised way, it picks up patterns within the data that will prove useful later on. This is like being immersed for an extended period in an environment where a foreign language is spoken. Although you are not actively trying to learn the language, you become familiar with its sounds, intonations, and common phrases simply through exposure.
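To make the idea of learning from unlabeled data a little more concrete, here is a minimal toy sketch in Python. It is not how real pre-training is implemented (that typically involves neural networks and far larger corpora); it only illustrates that the raw text can supply its own training signal, here by counting which word tends to follow which. The function names and the tiny corpus are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def pretrain_bigram_counts(corpus):
    """Count which word follows which in raw, unlabeled sentences."""
    following = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        # The "label" for each word is simply the next word in the text,
        # so no human annotation is needed.
        for current_word, next_word in zip(tokens, tokens[1:]):
            following[current_word][next_word] += 1
    return following


def most_likely_next(following, word):
    """Return the most frequently observed continuation, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpus: plain sentences with no labels attached.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the cat slept on the sofa",
]

patterns = pretrain_bigram_counts(corpus)
print(most_likely_next(patterns, "the"))  # prints "cat"
```

Just as with passive exposure to a foreign language, nothing here tells the program what any word means. It simply becomes familiar with which phrases are common, and that familiarity is what a later stage can build on.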