In this section, we will discuss how to conduct fine-tuning on pre-trained transformer models. Here, we mainly focus on the fully pre-trained BERT model, and we will fine-tune it on the SQuAD 2.0 dataset.
The complete code base for running custom training on the BERT model can be found on the Hugging Face website (https://huggingface.co/transformers/custom_datasets.html#qa-squad). Our previous model parallelism implementation can be applied directly to this code base to speed up both model training and serving.
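Before walking through the workflow, the following is a minimal sketch of how a pre-trained BERT checkpoint and a (question, context) pair can be prepared for question-answering fine-tuning with Hugging Face Transformers. The checkpoint name, sequence length, and example texts here are illustrative assumptions, not the exact settings used in the referenced code base:

```python
# Minimal sketch (illustrative, not the full SQuAD 2.0 training script):
# load a fully pre-trained BERT checkpoint with a question-answering head
# and encode one (question, context) pair.
from transformers import BertTokenizerFast, BertForQuestionAnswering

# Assumed checkpoint name; any compatible BERT checkpoint can be used.
checkpoint = "bert-base-uncased"
tokenizer = BertTokenizerFast.from_pretrained(checkpoint)
model = BertForQuestionAnswering.from_pretrained(checkpoint)

# Tokenize one (question, context) pair. During fine-tuning on SQuAD 2.0,
# the answer spans are converted into start/end token positions as labels.
encoding = tokenizer(
    "Who introduced BERT?",                          # question (example)
    "BERT was introduced by Devlin et al. in 2018.",  # context (example)
    truncation=True,
    padding="max_length",
    max_length=384,
    return_tensors="pt",
)

# The QA head predicts start and end logits over the input tokens.
outputs = model(**encoding)
print(outputs.start_logits.shape, outputs.end_logits.shape)
```

In the full fine-tuning loop, these encoded batches are fed to the model together with the ground-truth start/end positions, and the loss is backpropagated to update the pre-trained weights.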
Here, we highlight the important steps in the workflow of fine-tuning BERT on SQuAD 2.0. The overview is shown in the following figure:
Figure 7.11 – Fine-tuning the transformer on downstream tasks
As shown in the preceding figure, the whole fine-tuning process involves three steps, as follows: