...

  • BERT-LARGE (L=24, H=1024, A=16, Total Parameters=340M).

 

BERT-BASE contains 110M parameters and BERT-LARGE contains 340M parameters.

For the purposes of this challenge, we will be using BERT-BASE.
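The parameter totals quoted above can be reproduced approximately from the layer count L and the hidden size H. The sketch below is a back-of-the-envelope count, not code from any BERT repository. It assumes the uncased vocabulary of 30,522 tokens, 512 positions, 2 segment types, a feed-forward size of 4H, and BERT-BASE dimensions of L=12, H=768 (read off the checkpoint name uncased_L-12_H-768_A-12). The function name is hypothetical.

```python
def bert_param_count(num_layers, hidden, vocab=30522, max_pos=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder plus pooler (no task head)."""
    ffn = 4 * hidden  # assumption: intermediate size is 4x the hidden size
    # Token, position and segment embeddings, plus one LayerNorm (scale + bias).
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden
    # Query, key, value and output projections, each with weights and biases.
    attention = 4 * (hidden * hidden + hidden)
    # The two dense layers of the feed-forward block.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two LayerNorms per encoder layer.
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms
    pooler = hidden * hidden + hidden
    return embeddings + num_layers * per_layer + pooler


base = bert_param_count(num_layers=12, hidden=768)
large = bert_param_count(num_layers=24, hidden=1024)
print(f"BERT-BASE:  {base / 1e6:.1f}M parameters")
print(f"BERT-LARGE: {large / 1e6:.1f}M parameters")
```

Under these assumptions the count comes to about 109.5M and 335.1M, close to the rounded 110M and 340M figures. The number of attention heads A does not change the total, because the heads split the same H x H projections.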

1.1.2 About SQuAD 1.1

The Stanford Question Answering Dataset (SQuAD) is a popular question answering benchmark dataset. At the time of its release, BERT obtained state-of-the-art results on SQuAD with almost no task-specific network architecture modifications or data augmentation. However, it does require moderately complex data pre-processing and post-processing to deal with (a) the variable-length nature of SQuAD context paragraphs, and (b) the character-level answer annotations used for SQuAD training. This processing is implemented and documented in run_squad.py.
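The two pre-processing issues named above can be illustrated with a small sketch. It is modeled loosely on the approach taken in run_squad.py but is not that script's code: long contexts are cut into overlapping windows (the stride corresponds to the --doc_stride flag), and character-level answer offsets are mapped to word indices. Function names and the toy context are assumptions made for illustration, and whitespace splitting stands in for the real WordPiece tokenizer.

```python
def doc_spans(num_tokens, max_tokens_per_span, doc_stride):
    """Split a long context into overlapping (start, length) windows."""
    spans = []
    start = 0
    while start < num_tokens:
        length = min(num_tokens - start, max_tokens_per_span)
        spans.append((start, length))
        if start + length == num_tokens:
            break  # the last window reaches the end of the context
        start += min(length, doc_stride)
    return spans


def char_to_word_offsets(context):
    """Map every character position to the index of the word containing it."""
    words, offsets = [], []
    previous_was_space = True
    for ch in context:
        if ch.isspace():
            previous_was_space = True
        else:
            if previous_was_space:
                words.append(ch)   # start a new word
            else:
                words[-1] += ch    # extend the current word
            previous_was_space = False
        offsets.append(len(words) - 1)
    return words, offsets


words, offsets = char_to_word_offsets("Google was founded in 1998")
answer_start = 22  # character offset of "1998" in the toy context
print(words[offsets[answer_start]])
print(doc_spans(num_tokens=10, max_tokens_per_span=4, doc_stride=2))
```

Each window becomes its own input feature, which is why a log can report more "sentences" than there are questions in the dataset.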

1.2  Running SQuAD 1.1 fine tuning and inference

1.2.1  Using Docker and NVIDIA Docker Image

 

Code Block
docker pull nvcr.io/nvidia/tensorflow:20.02-tf1-py3
docker images
REPOSITORY                                                       TAG                 IMAGE ID            CREATED             SIZE
nvcr.io/nvidia/tensorflow                                        20.02-tf1-py3       0c7b70421b78        7 weeks ago         9.49GB

...

NVIDIA's BERT code is a publicly available implementation of BERT. Its fine-tuning code uses Horovod with NCCL to implement efficient multi-GPU training.

Code Block
[~]# git clone https://github.com/NVIDIA/DeepLearningExamples.git

You may use other implementations, and you may optimize and tune them, but you must use the BERT-Base uncased pre-trained model for the purposes of this challenge.

...

/workspace/nvidia-examples/bert/data/download/google_pretrained_weights

Code Block
root@tessa002:/workspace/nvidia-examples/bert/data# mkdir download
root@tessa002:/workspace/nvidia-examples/bert/data# cd download

root@tessa002:/workspace/nvidia-examples/bert/data/download# mkdir google_pretrained_weights

root@tessa002:/workspace/nvidia-examples/bert/data/download# cd google_pretrained_weights/
root@tessa002:/workspace/nvidia-examples/bert/data/download/google_pretrained_weights# wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip

root@tessa002:/workspace/nvidia-examples/bert/data/download/google_pretrained_weights# unzip uncased_L-12_H-768_A-12.zip
Archive:  uncased_L-12_H-768_A-12.zip
   creating: uncased_L-12_H-768_A-12/
  inflating: uncased_L-12_H-768_A-12/bert_model.ckpt.meta
  inflating: uncased_L-12_H-768_A-12/bert_model.ckpt.data-00000-of-00001
  inflating: uncased_L-12_H-768_A-12/vocab.txt
  inflating: uncased_L-12_H-768_A-12/bert_model.ckpt.index
  inflating: uncased_L-12_H-768_A-12/bert_config.json

1.2.4   Download the SQuAD 1.1 dataset

...

We will download these to: /workspace/nvidia-examples/bert/data/download/squad/v1.1

Code Block
root@tessa002:/workspace/nvidia-examples/bert/data/download# mkdir squad
root@tessa002:/workspace/nvidia-examples/bert/data/download# cd squad
root@tessa002:/workspace/nvidia-examples/bert/data/download/squad# mkdir v1.1

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad# cd v1.1/

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# wget https://github.com/allenai/bi-att-flow/archive/master.zip

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# unzip master.zip

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1# cd bi-att-flow-master/

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1/bi-att-flow-master# cd squad

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1/bi-att-flow-master/squad# cp evaluate-v1.1.py /workspace/nvidia-examples/bert/data/download/squad/v1.1/

root@tessa002:/workspace/nvidia-examples/bert/data/download/squad/v1.1/bi-att-flow-master/squad# cd /workspace/nvidia-examples/bert

1.2.5   Start fine tuning

...

For SQuAD 1.1 FP16 training with XLA on four T4 16GB GPUs, run:

Code Block
bash scripts/run_squad.sh 10 5e-6 fp16 true 4 384 128 base 1.1 data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/bert_model.ckpt 1.1
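The positional arguments in the command above are easy to misread, so the sketch below labels them and derives two numbers that can be cross-checked against a run. The argument names are an assumption based on the order used by scripts/run_squad.sh in the NVIDIA repository; confirm them against your copy of the script. The training-set size of 87,599 questions and the step formula are likewise assumptions.

```python
# Positional arguments exactly as passed in the command above.
# The names are assumed from the order used by scripts/run_squad.sh.
arg_names = ["batch_size", "learning_rate", "precision", "use_xla", "num_gpu",
             "seq_length", "doc_stride", "bert_model", "squad_version",
             "init_checkpoint", "epochs"]
arg_values = ("10 5e-6 fp16 true 4 384 128 base 1.1 "
              "data/download/google_pretrained_weights/"
              "uncased_L-12_H-768_A-12/bert_model.ckpt 1.1").split()
args = dict(zip(arg_names, arg_values))

# Per-GPU batch size times number of GPUs gives the global batch size.
global_batch_size = int(args["batch_size"]) * int(args["num_gpu"])

# Assumption: train-v1.1.json holds 87,599 questions, and the number of
# optimizer steps is examples / global batch * epochs, truncated.
train_questions = 87599
train_steps = int(train_questions / global_batch_size * float(args["epochs"]))

for name in arg_names:
    print(f"{name:16s} {args[name]}")
print("global batch size:", global_batch_size)
print("training steps:   ", train_steps)
```

Under these assumptions the global batch size is 40 and the run takes 2408 steps, which is consistent with the gbs40 suffix in the results directory and the model.ckpt-2408 checkpoint that appear later in this guide.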

1.2.6  Verify results

Code Block
INFO:tensorflow:-----------------------------
I0326 01:25:43.144953 140630939256640 run_squad.py:1127] -----------------------------
INFO:tensorflow:Total Inference Time = 88.62 for Sentences = 10840
I0326 01:25:43.145423 140630939256640 run_squad.py:1129] Total Inference Time = 88.62 for Sentences = 10840
INFO:tensorflow:Total Inference Time W/O Overhead = 75.86 for Sentences = 10824
I0326 01:25:43.145554 140630939256640 run_squad.py:1131] Total Inference Time W/O Overhead = 75.86 for Sentences = 10824
INFO:tensorflow:Summary Inference Statistics
I0326 01:25:43.145649 140630939256640 run_squad.py:1132] Summary Inference Statistics
INFO:tensorflow:Batch size = 8
I0326 01:25:43.145738 140630939256640 run_squad.py:1133] Batch size = 8
INFO:tensorflow:Sequence Length = 384
I0326 01:25:43.145867 140630939256640 run_squad.py:1134] Sequence Length = 384
INFO:tensorflow:Precision = fp16
I0326 01:25:43.145962 140630939256640 run_squad.py:1135] Precision = fp16
INFO:tensorflow:Latency Confidence Level 50 (ms) = 55.79
I0326 01:25:43.146052 140630939256640 run_squad.py:1136] Latency Confidence Level 50 (ms) = 55.79
INFO:tensorflow:Latency Confidence Level 90 (ms) = 57.03
I0326 01:25:43.146145 140630939256640 run_squad.py:1137] Latency Confidence Level 90 (ms) = 57.03
INFO:tensorflow:Latency Confidence Level 95 (ms) = 57.29
I0326 01:25:43.146225 140630939256640 run_squad.py:1138] Latency Confidence Level 95 (ms) = 57.29
INFO:tensorflow:Latency Confidence Level 99 (ms) = 58.62
I0326 01:25:43.146308 140630939256640 run_squad.py:1139] Latency Confidence Level 99 (ms) = 58.62
INFO:tensorflow:Latency Confidence Level 100 (ms) = 286.80
I0326 01:25:43.146387 140630939256640 run_squad.py:1140] Latency Confidence Level 100 (ms) = 286.80
INFO:tensorflow:Latency Average (ms) = 56.07
I0326 01:25:43.146471 140630939256640 run_squad.py:1141] Latency Average (ms) = 56.07
INFO:tensorflow:Throughput Average (sentences/sec) = 142.68
I0326 01:25:43.146564 140630939256640 run_squad.py:1142] Throughput Average (sentences/sec) = 142.68
INFO:tensorflow:-----------------------------
I0326 01:25:43.146645 140630939256640 run_squad.py:1143] -----------------------------
INFO:tensorflow:Writing predictions to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/predictions.json
I0326 01:25:43.146801 140630939256640 run_squad.py:431] Writing predictions to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/predictions.json
INFO:tensorflow:Writing nbest to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/nbest_predictions.json
I0326 01:25:43.146886 140630939256640 run_squad.py:432] Writing nbest to: /results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/nbest_predictions.json
{"exact_match": 78.0321665089877, "f1": 86.34229152935384}
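The summary statistics in the log above can be sanity-checked against each other. The sketch below is one possible check, not part of run_squad.py: it pulls a few values out of the log text with regular expressions, recomputes the throughput two ways, and parses the final JSON score line. The helper name grab is hypothetical, and the log excerpt is copied from the output above.

```python
import json
import re

log_tail = """\
INFO:tensorflow:Total Inference Time W/O Overhead = 75.86 for Sentences = 10824
INFO:tensorflow:Batch size = 8
INFO:tensorflow:Latency Average (ms) = 56.07
INFO:tensorflow:Throughput Average (sentences/sec) = 142.68
{"exact_match": 78.0321665089877, "f1": 86.34229152935384}
"""


def grab(pattern, text):
    """Return the first captured group of pattern as a float."""
    return float(re.search(pattern, text).group(1))


seconds = grab(r"W/O Overhead = ([\d.]+)", log_tail)
sentences = grab(r"W/O Overhead = [\d.]+ for Sentences = (\d+)", log_tail)
batch = grab(r"Batch size = (\d+)", log_tail)
latency_ms = grab(r"Latency Average \(ms\) = ([\d.]+)", log_tail)
reported = grab(r"Throughput Average \(sentences/sec\) = ([\d.]+)", log_tail)

# Throughput from total sentences and total time, and from batch size and latency.
from_totals = sentences / seconds
from_latency = batch / (latency_ms / 1000.0)

# The last line of the log is the JSON printed by the evaluation script.
scores = json.loads(log_tail.strip().splitlines()[-1])

print(f"reported {reported:.2f}, from totals {from_totals:.2f}, "
      f"from latency {from_latency:.2f}")
print(f"EM {scores['exact_match']:.2f}, F1 {scores['f1']:.2f}")
```

Both recomputed throughputs agree with the reported average to within rounding, which is a quick way to confirm that a log has not been truncated or mixed with another run.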

...

(https://github.com/lambdal/bert )

Code Block
root@tessa002:/workspace# mkdir lambdal
root@tessa002:/workspace# cd lambdal
root@tessa002:/workspace/lambdal# git clone https://github.com/lambdal/bert
root@tessa002:/workspace/lambdal# cd bert


root@tessa002:/workspace/lambdal/bert# mpirun -np 4 -H localhost:4 -bind-to none -map-by slot -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH -mca pml ob1 -mca btl ^openib --allow-run-as-root python3 run_squad_hvd.py --vocab_file=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/vocab.txt   --bert_config_file=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/bert_config.json   --init_checkpoint=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/bert_model.ckpt  --do_train=True   --train_file=/workspace/nvidia-examples/bert/data/download/squad/v1.1/train-v1.1.json   --do_predict=True   --predict_file=/workspace/nvidia-examples/bert/data/download/squad/v1.1/dev-v1.1.json --train_batch_size=12 --learning_rate=3e-5   --num_train_epochs=2.0   --max_seq_length=384   --doc_stride=128   --output_dir=/results/lambdal/squad1/squad_base/ --horovod=true

Look for output similar to the following:

Code Block
I0326 05:55:19.917063 140421161031488 run_squad_hvd.py:747] Writing predictions to: /results/lambdal/squad1/squad_base/predictions.json
INFO:tensorflow:Writing nbest to: /results/lambdal/squad1/squad_base/nbest_predictions.json
I0326 05:55:19.917179 140421161031488 run_squad_hvd.py:748] Writing nbest to: /results/lambdal/squad1/squad_base/nbest_predictions.json

To check score:

Code Block
root@tessa002:/workspace/lambdal/bert# python /workspace/nvidia-examples/bert/data/download/squad/v1.1/evaluate-v1.1.py /workspace/nvidia-examples/bert/data/download/squad/v1.1/dev-v1.1.json /results/lambdal/squad1/squad_base/predictions.json
{"exact_match": 78.1929990539262, "f1": 86.51319484763773}
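For readers who want to see what the two reported numbers mean, the sketch below shows exact match and token-level F1 for a single prediction. It follows the general approach of the SQuAD v1.1 evaluation script (lowercase, strip punctuation and articles, compare tokens) but is a simplified illustration, not a replacement for evaluate-v1.1.py, which also takes the best score over several reference answers and averages over all questions.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    return normalize_answer(prediction) == normalize_answer(truth)


def f1_score(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    common = collections.Counter(pred_tokens) & collections.Counter(truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The CEO.", "CEO"))
print(round(f1_score("Larry Page", "Larry Page and Sergey Brin"), 3))
```

A partially correct span earns partial F1 credit but no exact-match credit, which is why the F1 figure is always at least as high as the exact-match figure.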
  • Note: part of your final score includes these results:

...

Note: This is the method the judges will use to score unseen data.

Code Block
root@tessa002:/workspace/nvidia-examples/bert# cd /workspace
root@tessa002:/workspace# git clone https://github.com/google-research/bert.git

root@tessa002:/workspace# cd bert

...

1.2.9  Create a sample input file in json format (note the "id" to reference later).
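One way to produce such a file is to build the SQuAD v1.1 JSON structure in a short script. The sketch below is an illustration under assumptions: the function name, title, context, question, and id are placeholders, and the layout (data, paragraphs, context, qas, id, question) reflects the SQuAD v1.1 fields that run_squad.py reads in prediction mode as far as I understand it. Answers are omitted because they are only needed for training.

```python
import json


def build_squad_input(title, context, questions):
    """Build a SQuAD v1.1 style dict for one context paragraph.

    questions is a list of (id, question) pairs; the id is what later
    appears as the key in predictions.json.
    """
    qas = [{"id": qid, "question": question} for qid, question in questions]
    return {
        "version": "1.1",
        "data": [{"title": title,
                  "paragraphs": [{"context": context, "qas": qas}]}],
    }


# Placeholder content; replace with your own paragraph and questions.
sample = build_squad_input(
    title="Example",
    context="Placeholder context paragraph. Replace it with your own text.",
    questions=[("example-id-0001", "Placeholder question?")],
)
print(json.dumps(sample, indent=2))
```

Saving the printed JSON as test_input.json gives a file that can be passed to --predict_file. Every id must be unique, since predictions are keyed by it.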

...

Code Block
root@tessa002:/workspace/bert# python3 run_squad.py --vocab_file=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/vocab.txt --bert_config_file=/workspace/nvidia-examples/bert/data/download/google_pretrained_weights/uncased_L-12_H-768_A-12/bert_config.json   --init_checkpoint=/results/tf_bert_finetuning_squad_base_fp16_gbs40_200326010711/model.ckpt-2408  --do_train=False --max_query_length=30 --do_predict=True   --predict_file=test_input.json --predict_batch_size=16 --max_seq_length=384 --doc_stride=128 --output_dir=/results/squad1/squad_test/


Note: If you are using the alternative method from Lambda Labs (lambdal/bert), you will need to use the checkpoint produced by that run:

...

1.2.11  You should see output similar to the following:

Code Block
I0326 02:11:40.096473 140685488179008 run_squad.py:1259] Processing example: 0
INFO:tensorflow:prediction_loop marked as finished
I0326 02:11:40.165820 140685488179008 error_handling.py:101] prediction_loop marked as finished
INFO:tensorflow:prediction_loop marked as finished
I0326 02:11:40.166095 140685488179008 error_handling.py:101] prediction_loop marked as finished
INFO:tensorflow:Writing predictions to: /results/squad1/squad_test/predictions.json
I0326 02:11:40.166555 140685488179008 run_squad.py:745] Writing predictions to: /results/squad1/squad_test/predictions.json
INFO:tensorflow:Writing nbest to: /results/squad1/squad_test/nbest_predictions.json
I0326 02:11:40.166669 140685488179008 run_squad.py:746] Writing nbest to: /results/squad1/squad_test/nbest_predictions.json

1.2.12  Check correctness in file: predictions.json

Code Block
{
    "56ddde6b9a695914005b9628": "Sundar Pichai",
    "56ddde6b9a695914005b9629": "Larry Page and Sergey Brin",
    "56ddde6b9a695914005b9630": "September 4, 1998",
    "56ddde6b9a695914005b9631": "CEO",
    "56ddde6b9a695914005b9632": "Alphabet Inc"
}

1.2.13  Check accuracy in file: nbest_predictions.json

...

Scores will be derived from the nbest_predictions.json output for each question on the context.
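To inspect the candidates for each question, the file can be loaded and ranked with a few lines of Python. The sketch below assumes the layout that run_squad.py writes as far as I understand it: a mapping from question id to a list of candidates, each with text, probability, start_logit, and end_logit fields. The ids and values in the inline sample are invented for illustration and are not results from any run; the function name is hypothetical.

```python
import json

# Invented sample in the assumed nbest_predictions.json layout.
nbest_json = """
{
  "example-id-0001": [
    {"text": "second candidate", "probability": 0.06,
     "start_logit": 4.1, "end_logit": 3.9},
    {"text": "first candidate", "probability": 0.91,
     "start_logit": 7.2, "end_logit": 6.8}
  ]
}
"""


def top_candidates(nbest, k=1):
    """Return the k most probable candidates for every question id."""
    return {
        qid: sorted(cands, key=lambda c: c["probability"], reverse=True)[:k]
        for qid, cands in nbest.items()
    }


nbest = json.loads(nbest_json)
for qid, cands in top_candidates(nbest, k=2).items():
    for rank, cand in enumerate(cands, start=1):
        print(f"{qid} #{rank}: {cand['text']!r} (p={cand['probability']:.2f})")
```

For a real run, replace the inline string with the contents of the nbest_predictions.json file in your output directory, for example via json.load on the opened file.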

1.3  Competition limits:

Teams must use the pre-defined model (BERT-Base, Uncased).

...

You must provide all scripts and the methodology used to achieve your results.

 

1.4  Teams must produce:

Training scripts with their full training routine and command lines and output

...

The predictions.json and nbest_predictions.json files produced by run_squad.py

 

 

1.5  Method of final scoring procedure:

...

Final scores are computed on unseen data containing multiple questions; predictions are generated from a file using the standard run_squad.py.

1.6  Important: Final evaluation scripts and ckpt directories and files must be submitted for approval 90 minutes before the end of the competition.

...