BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering). BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP.

1 SQuAD 1.1 with TensorFlow BERT-BASE

1.1 About the application and benchmarks

This guide is intended as a starting point. It does not provide detailed guidance on optimizations and additional tuning. Please follow the guidelines in the Competition Limits section of this document.

1.1.1 About BERT-BASE

BERT, or Bidirectional Encoder Representations from Transformers, is a new method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks.

...

For the purposes of this challenge, we will be using BERT-BASE.

1.1.2 About SQuAD 1.1

The Stanford Question Answering Dataset (SQuAD) is a popular question answering benchmark dataset. BERT (at the time of the release) obtains state-of-the-art results on SQuAD with almost no task-specific network architecture modifications or data augmentation. However, it does require semi-complex data pre-processing and post-processing to deal with (a) the variable-length nature of SQuAD context paragraphs, and (b) the character-level answer annotations which are used for SQuAD training. This processing is implemented and documented in run_squad.py.
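The handling of variable-length context paragraphs mentioned above can be sketched as a sliding window over the tokenized context. The following is a minimal illustration modelled on the windowing approach in run_squad.py, not a copy of it; the function name split_into_spans and the example token counts are assumptions made for this sketch. The window size is taken to be the maximum sequence length minus the query length and three special tokens, and the step between windows corresponds to the doc_stride setting.

```python
from collections import namedtuple

# A window over the tokenized context: where it starts and how many tokens it covers.
DocSpan = namedtuple("DocSpan", ["start", "length"])


def split_into_spans(num_doc_tokens, max_tokens_for_doc, doc_stride):
    """Split a long context into overlapping windows (illustrative sketch).

    num_doc_tokens     -- number of tokens in the context paragraph
    max_tokens_for_doc -- tokens available for context in one input sequence
    doc_stride         -- how far the window advances between spans
    """
    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(num_doc_tokens - start, max_tokens_for_doc)
        spans.append(DocSpan(start, length))
        if start + length == num_doc_tokens:
            break  # the last window reaches the end of the context
        start += min(length, doc_stride)
    return spans


# Hypothetical example: 500 context tokens, a 30-token query,
# max_seq_length=384 and doc_stride=128.
max_tokens_for_doc = 384 - 30 - 3  # 3 special tokens: [CLS], [SEP], [SEP]
print(split_into_spans(500, max_tokens_for_doc, 128))
# -> [DocSpan(start=0, length=351), DocSpan(start=128, length=351), DocSpan(start=256, length=244)]
```

Each span becomes a separate model input, and the post-processing step picks the best-scoring answer across all spans of the same paragraph.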

1.2 Running SQuAD 1.1 fine tuning and inference

1.2.1 Using Docker and NVIDIA Docker Image

...

https://github.com/lambdal/bert (This is a fork of the original (Google's) BERT implementation, with added Multi-GPU support with Horovod)

1.2.3 Download BERT-BASE model file

The BERT-BASE, Uncased model has 12 layers, a hidden size of 768, 12 attention heads, and 110M parameters. The download link can be found at https://github.com/google-research/bert
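As a rough cross-check of the 110M figure, the parameter count can be derived from the model dimensions. The sketch below assumes values that are not stated in this guide but are expected in the bert_config.json of the uncased BERT-BASE model: a vocabulary of 30522 tokens, 512 position embeddings, 2 segment types, and a feed-forward size of 3072. The function name bert_param_count is made up for this example. The number of attention heads does not change the total, because the heads split the same hidden-size projections.

```python
def bert_param_count(vocab_size=30522, hidden=768, layers=12,
                     intermediate=3072, max_positions=512, type_vocab=2):
    """Count weights and biases of a BERT encoder plus pooler (sketch)."""
    layer_norm = 2 * hidden  # scale and bias

    # Token, position and segment embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Query, key, value and output projections, followed by one LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm

    # Two dense layers (hidden -> intermediate -> hidden), followed by one LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)

    # Dense layer applied to the [CLS] token.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


print(bert_param_count())  # -> 109482240, i.e. roughly 110M
```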

...

Code Block

root@tessa002:/workspace/nvidia-examples/bert/data# mkdir download
root@tessa002:/workspace/nvidia-examples/bert/data# cd download

root@tessa002:/workspace/nvidia-examples/bert/data/download# mkdir google_pretrained_weights

root@tessa002:/workspace/nvidia-examples/bert/data/download# cd google_pretrained_weights/
root@tessa002:/workspace/nvidia-examples/bert/data/download/google_pretrained_weights# wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip

root@tessa002:/workspace/nvidia-examples/bert/data/download/google_pretrained_weights# unzip uncased_L-12_H-768_A-12.zip
Archive:  uncased_L-12_H-768_A-12.zip
   creating: uncased_L-12_H-768_A-12/
  inflating: uncased_L-12_H-768_A-12/bert_model.ckpt.meta
  inflating: uncased_L-12_H-768_A-12/bert_model.ckpt.data-00000-of-00001
  inflating: uncased_L-12_H-768_A-12/vocab.txt
  inflating: uncased_L-12_H-768_A-12/bert_model.ckpt.index
  inflating: uncased_L-12_H-768_A-12/bert_config.json
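
Before moving on, it can be worth confirming that the archive extracted completely. The helper below is a small sketch that checks for the five files listed in the unzip output above; the function name missing_model_files is an assumption for this example.

```python
from pathlib import Path

# File names taken from the unzip output shown above.
EXPECTED_FILES = [
    "bert_config.json",
    "vocab.txt",
    "bert_model.ckpt.index",
    "bert_model.ckpt.meta",
    "bert_model.ckpt.data-00000-of-00001",
]


def missing_model_files(model_dir):
    """Return the expected model files that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]
```

Calling missing_model_files("uncased_L-12_H-768_A-12") from the google_pretrained_weights directory should return an empty list if the extraction succeeded.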

1.2.4 Download the SQuAD 1.1 dataset

To run on SQuAD, you will first need to download the dataset. The SQuAD website does not seem to link to the v1.1 datasets any longer, but the necessary files can be found here:

...

Code Block

root@tessa002:/workspace/nvidia-examples/bert/data/download# mkdir squad
root@tessa002:/workspace/nvidia-examples/bert/data/download# cd squad
root@tessa002:/workspace/nvidia-examples/bert/data/download/squad# mkdir v1.1

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad# cd v1.1/

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# wget https://github.com/allenai/bi-att-flow/archive/master.zip

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# unzip master.zip

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# cd bi-att-flow-master/

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1/bi-att-flow-master# cd squad

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1/bi-att-flow-master/squad# cp evaluate-v1.1.py /workspace/nvidia-examples/bert/data/download/squad/v1.1/

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1/bi-att-flow-master/squad# cd /workspace/nvidia-examples/bert

1.2.5 Start fine tuning

BERT representations can be fine-tuned with just one additional output layer to produce a state-of-the-art question answering system. From within the container, you can use the following script to run fine-tuning for SQuAD.

...

Code Block
bash scripts/run_squad.sh 10 5e-6 fp16 true 4 384 128 base 1.1 data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/bert_model.ckpt 1.1

1.2.6 Verify results

Code Block

INFO:tensorflow:-----------------------------
I0326 01:25:43.144953 140630939256640 run_squad.py:1127] -----------------------------
INFO:tensorflow:Total Inference Time = 88.62 for Sentences = 10840
I0326 01:25:43.145423 140630939256640 run_squad.py:1129] Total Inference Time = 88.62 for Sentences = 10840
INFO:tensorflow:Total Inference Time W/O Overhead = 75.86 for Sentences = 10824
I0326 01:25:43.145554 140630939256640 run_squad.py:1131] Total Inference Time W/O Overhead = 75.86 for Sentences = 10824
INFO:tensorflow:Summary Inference Statistics
I0326 01:25:43.145649 140630939256640 run_squad.py:1132] Summary Inference Statistics
INFO:tensorflow:Batch size = 8
I0326 01:25:43.145738 140630939256640 run_squad.py:1133] Batch size = 8
INFO:tensorflow:Sequence Length = 384
I0326 01:25:43.145867 140630939256640 run_squad.py:1134] Sequence Length = 384
INFO:tensorflow:Precision = fp16
I0326 01:25:43.145962 140630939256640 run_squad.py:1135] Precision = fp16
INFO:tensorflow:Latency Confidence Level 50 (ms) = 55.79
I0326 01:25:43.146052 140630939256640 run_squad.py:1136] Latency Confidence Level 50 (ms) = 55.79
INFO:tensorflow:Latency Confidence Level 90 (ms) = 57.03
I0326 01:25:43.146145 140630939256640 run_squad.py:1137] Latency Confidence Level 90 (ms) = 57.03
INFO:tensorflow:Latency Confidence Level 95 (ms) = 57.29
I0326 01:25:43.146225 140630939256640 run_squad.py:1138] Latency Confidence Level 95 (ms) = 57.29
INFO:tensorflow:Latency Confidence Level 99 (ms) = 58.62
I0326 01:25:43.146308 140630939256640 run_squad.py:1139] Latency Confidence Level 99 (ms) = 58.62
INFO:tensorflow:Latency Confidence Level 100 (ms) = 286.80
I0326 01:25:43.146387 140630939256640 run_squad.py:1140] Latency Confidence Level 100 (ms) = 286.80
INFO:tensorflow:Latency Average (ms) = 56.07
I0326 01:25:43.146471 140630939256640 run_squad.py:1141] Latency Average (ms) = 56.07
INFO:tensorflow:Throughput Average (sentences/sec) = 142.68
I0326 01:25:43.146564 140630939256640 run_squad.py:1142] Throughput Average (sentences/sec) = 142.68
INFO:tensorflow:-----------------------------
I0326 01:25:43.146645 140630939256640 run_squad.py:1143] -----------------------------
INFO:tensorflow:Writing predictions to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/predictions.json
I0326 01:25:43.146801 140630939256640 run_squad.py:431] Writing predictions to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/predictions.json
INFO:tensorflow:Writing nbest to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/nbest_predictions.json
I0326 01:25:43.146886 140630939256640 run_squad.py:432] Writing nbest to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/nbest_predictions.json
{"exact_match": 78.0321665089877, "f1": 86.34229152935384}
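
The throughput figure in the log above can be cross-checked from the other reported values. The sketch below uses only numbers copied from this log; the function names are made up for the example. Both the sentence count divided by the inference time without overhead and the batch size divided by the average latency come out at the reported average.

```python
def throughput_from_time(sentences, seconds):
    """Sentences processed per second over the whole run."""
    return sentences / seconds


def throughput_from_latency(batch_size, latency_ms):
    """Sentences per second implied by the average per-batch latency."""
    return batch_size / (latency_ms / 1000.0)


# Values copied from the log above.
print(round(throughput_from_time(10824, 75.86), 2))   # -> 142.68
print(round(throughput_from_latency(8, 56.07), 2))    # -> 142.68
```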

...

{"exact_match": 78.0321665089877, "f1": 86.34229152935384}
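
The exact_match and f1 values have the form printed by the SQuAD v1.1 evaluation script (evaluate-v1.1.py, copied during the dataset step). As a simplified sketch of how those two metrics are computed per question, assuming the usual normalization (lowercasing, removing punctuation and the articles a/an/the, collapsing whitespace): exact match compares the normalized strings, and F1 measures token overlap. The evaluation script additionally takes the best score over all reference answers for a question and reports the average over all questions as a percentage; that part is omitted here.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """True if prediction and reference are identical after normalization."""
    return normalize_answer(prediction) == normalize_answer(truth)


def f1(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


# A partially correct prediction gets no exact-match credit but a non-zero F1.
print(exact_match("Larry Page", "Larry Page and Sergey Brin"))   # -> False
print(round(f1("Larry Page", "Larry Page and Sergey Brin"), 3))  # -> 0.571
```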

1.2.7 (Optional) Alternative method with Lambda Labs

...

{"exact_match": 78.1929990539262, "f1": 86.51319484763773}

1.2.8 Example: predict Q&A on real data with github.com/google-research/bert

...

Code Block
root@tessa002:/workspace/nvidia-examples/bert# cd /workspace
root@tessa002:/workspace# git clone https://github.com/google-research/bert.git
root@tessa002:/workspace# cd bert

1.2.9 Create a sample input file in JSON format (note the "id" values to reference later).

...

The vi editor should handle the JSON formatting automatically; otherwise, switch to paste mode (:set paste -> [paste text] -> :set nopaste):

Code Block

{
    "version": "v1.1",
    "data": [
        {
            "title": "your_title",
            "paragraphs": [
                {
                    "qas": [
                        {
                            "question": "Who is current CEO?",
                            "id": "56ddde6b9a695914005b9628",
                            "is_impossible": ""
                        },
                        {
                            "question": "Who founded google?",
                            "id": "56ddde6b9a695914005b9629",
                            "is_impossible": ""
                        },
                        {
                            "question": "when did IPO take place?",
                            "id": "56ddde6b9a695914005b962a",
                            "is_impossible": ""
                        }
                    ],
                    "context": "Google was founded in 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University in California. Together they own about 14 percent of its shares and control 56 percent of the stockholder voting power through supervoting stock. They incorporated Google as a privately held company on September 4, 1998. An initial public offering (IPO) took place on August 19, 2004, and Google moved to its headquarters in Mountain View, California, nicknamed the Googleplex. In August 2015, Google announced plans to reorganize its various interests as a conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and will continue to be the umbrella company for Alphabet's Internet interests. Sundar Pichai was appointed CEO of Google, replacing Larry Page who became the CEO of Alphabet."                
                }
            ]
        }
    ]
}
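
As an alternative to editing the file by hand, the same structure can be generated with a short script, which avoids paste and indentation problems. This is a sketch only: the function names make_squad_input and write_squad_input are made up for this example, and the "is_impossible" field is simply mirrored from the sample above.

```python
import json


def make_squad_input(title, context, questions, version="v1.1"):
    """Build a SQuAD-style dict for one paragraph.

    questions maps each question id to its question text.
    """
    qas = [
        {"question": text, "id": qid, "is_impossible": ""}
        for qid, text in questions.items()
    ]
    return {
        "version": version,
        "data": [
            {
                "title": title,
                "paragraphs": [{"qas": qas, "context": context}],
            }
        ],
    }


def write_squad_input(path, title, context, questions):
    """Write the structure to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(make_squad_input(title, context, questions), handle, indent=4)
```

For example, write_squad_input("test_input.json", "your_title", context, {"56ddde6b9a695914005b9628": "Who is current CEO?"}) would produce a file with the same layout as the sample, where context holds the paragraph text.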

1.2.10 Run run_squad.py with --do_predict=True using the fine-tuned model checkpoint:

...

Code Block

root@tessa002:/workspace/lambdal/bert# python3 run_squad.py \
    --vocab_file=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/vocab.txt \
    --bert_config_file=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/bert_config.json \
    --init_checkpoint=/results/lambdal/squad1/squad_base/model.ckpt-3649 \
    --do_train=False \
    --max_query_length=30 \
    --do_predict=True \
    --predict_file=test_input.json \
    --predict_batch_size=16 \
    --max_seq_length=384 \
    --doc_stride=128 \
    --output_dir=/results/lambdal/squad1/squad_test/

1.2.11 You should see output similar to the following:

Code Block

I0326 02:11:40.096473 140685488179008 run_squad.py:1259] Processing example: 0
INFO:tensorflow:prediction_loop marked as finished
I0326 02:11:40.165820 140685488179008 error_handling.py:101] prediction_loop marked as finished
INFO:tensorflow:prediction_loop marked as finished
I0326 02:11:40.166095 140685488179008 error_handling.py:101] prediction_loop marked as finished
INFO:tensorflow:Writing predictions to: /results/squad1/squad_test/predictions.json
I0326 02:11:40.166555 140685488179008 run_squad.py:745] Writing predictions to: /results/squad1/squad_test/predictions.json
INFO:tensorflow:Writing nbest to: /results/squad1/squad_test/nbest_predictions.json
I0326 02:11:40.166669 140685488179008 run_squad.py:746] Writing nbest to: /results/squad1/squad_test/nbest_predictions.json

1.2.12 Check correctness in file: predictions.json

Code Block

{
    "56ddde6b9a695914005b9628": "Sundar Pichai",
    "56ddde6b9a695914005b9629": "Larry Page and Sergey Brin",
    "56ddde6b9a695914005b9630": "September 4, 1998",
    "56ddde6b9a695914005b9631": "CEO",
    "56ddde6b9a695914005b9632": "Alphabet Inc"
}
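
To review the answers next to their questions, the predictions can be joined with the input file on the "id" field. The sketch below works on already-loaded dictionaries; the function name answers_by_question is an assumption for this example. An id that has no entry in the predictions is reported as None rather than raising an error.

```python
def answers_by_question(input_data, predictions):
    """Pair each question in a SQuAD-style input with its predicted answer.

    input_data  -- dict loaded from the input JSON file
    predictions -- dict loaded from predictions.json (id -> answer text)
    """
    rows = []
    for article in input_data["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                rows.append((qa["question"], predictions.get(qa["id"])))
    return rows


# Small in-memory example using one id and answer from this guide.
sample_input = {
    "data": [{"paragraphs": [{"qas": [
        {"question": "Who is current CEO?", "id": "56ddde6b9a695914005b9628"},
    ]}]}]
}
sample_predictions = {"56ddde6b9a695914005b9628": "Sundar Pichai"}
print(answers_by_question(sample_input, sample_predictions))
# -> [('Who is current CEO?', 'Sundar Pichai')]
```

In practice both dictionaries would come from json.load on the input file and on predictions.json in the output directory.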

1.2.13 Check accuracy in file: nbest_predictions.json

Code Block

{
    "56ddde6b9a695914005b9628": [
        {
            "text": "Sundar Pichai",
            "probability": 0.6877274611974046,
            "start_logit": 7.016119003295898,
            "end_logit": 6.917689323425293
        },
        {
            "text": "Sundar Pichai was appointed CEO of Google, replacing Larry Page",
            "probability": 0.27466839794889614,
            "start_logit": 7.016119003295898,
            "end_logit": 5.999861240386963
        },
        {
            "text": "Larry Page",
            "probability": 0.02874494871571203,
            "start_logit": 4.759016513824463,
            "end_logit": 5.999861240386963
        },

...
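
The probability values in the listing are consistent with a softmax over the sum of start_logit and end_logit across all n-best candidates for a question. The sketch below illustrates that relationship under this assumption; the function name nbest_probabilities is made up for the example. Because the listing above is truncated, a softmax over only the three visible entries gives renormalized values that are slightly higher than those shown, but the ratios between candidates are unchanged.

```python
import math


def nbest_probabilities(entries):
    """Softmax over (start_logit + end_logit) for a list of candidates."""
    scores = [e["start_logit"] + e["end_logit"] for e in entries]
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# The three candidates visible in the truncated listing above.
visible = [
    {"text": "Sundar Pichai",
     "start_logit": 7.016119003295898, "end_logit": 6.917689323425293},
    {"text": "Sundar Pichai was appointed CEO of Google, replacing Larry Page",
     "start_logit": 7.016119003295898, "end_logit": 5.999861240386963},
    {"text": "Larry Page",
     "start_logit": 4.759016513824463, "end_logit": 5.999861240386963},
]

probs = nbest_probabilities(visible)
# The ratio between the first two candidates matches the ratio of the
# probabilities in the file (0.2746... / 0.6877..., roughly 0.399).
print(round(probs[1] / probs[0], 3))  # -> 0.399
```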

Version	Old Version 11	New Version 12
Changes made by	Scot Schultz	Scot Schultz
Saved on	Apr 02, 2020	Apr 02, 2020
