ObjectNet Challenge Documentation
These instructions describe how to build a docker image using the TensorFlow deep learning framework for the ObjectNet Challenge. They assume you already have a pre-trained TensorFlow model which you intend to submit for evaluation to the ObjectNet Challenge.
If your model is built using a different framework, go to the relevant documentation:
These instructions are split into two sections:
The following section provides example code and a baseline model for the ObjectNet Challenge. The code is structured such that most existing TensorFlow models can be plugged into the example with minimal code changes necessary.
Note: The example code uses batching and parallel data loading to improve inference efficiency. If you are building your own customized docker image with your own code, it is highly recommended to use similar optimized inference techniques so that your submission completes within the time limit set by the challenge organisers.
The following libraries are required to run this example and must be installed on the local test machine. The same libraries will be automatically installed into the Docker image when the image is built.
For example, you could set up a conda environment with the necessary requirements with a few simple lines. This environment would be named objectnet_env.
conda create -n objectnet_env python=3.7 cudatoolkit=11.0
conda activate objectnet_env
pip install --upgrade pip
pip install tensorflow pillow
Alternatively, you can follow the instructions here to start running TensorFlow in a docker image.
If your local machine has NVIDIA-capable GPUs and you want to test your docker image locally using these GPUs then you will need to ensure the NVIDIA drivers have been installed on your test machine.
Instructions on how to install CUDA toolkit and NVIDIA drivers can be found here, as well as instructions for cuDNN here. Be sure to match the versions of CUDA/NVIDIA installed with the version of TensorFlow and CUDA used to build your docker image - see Building the docker image.
Clone the following git repo to a machine which has docker installed:
$ git clone https://github.com/abarbu/objectnet-template-tensorflow.git
This repo comes with python scripts to perform batch inference using a sample model and to validate and score the inferences. It also contains a set of test images (input/images) and a file containing ground truth data for those images (input/answers/answers-test.json). You will need to download the sample model (ResNet50) used in this example (see 1.6 Testing the example).
objectnet_eval.py is the main entry point for running this example; it essentially performs batch inference against all images in a supplied input directory (images-dir).
Full help is available using objectnet_eval.py --help:
usage: objectnet_eval.py [-h] [--gpus N] [--workers N] [--batch_size N]
                         [--softmax T/F] [--convert_outputs_mode N]
                         images-dir output-file model-class-name
                         model-checkpoint

Evaluate a TensorFlow model on ObjectNet images and output predictions to a
CSV file.

positional arguments:
  images-dir            path to dataset
  output-file           path to predictions output file
  model-class-name      model class name in model_description.py
  model-checkpoint      path to model checkpoint

optional arguments:
  -h, --help            show this help message and exit
  --gpus N              number of GPUs to use
  --workers N           number of data loading workers (default: total num
                        CPUs)
  --batch_size N        mini-batch size (default: 64), this is the batch size
                        of each GPU on the current node when using Data
                        Parallel or Distributed Data Parallel
  --softmax T/F         apply a softmax function to network outputs to convert
                        output magnitudes to confidence values (default:True)
  --convert_outputs_mode N
                        0: no conversion of prediction IDs, 1: convert from ImageNet prediction IDs to ObjectNet
                        prediction IDs (default:1)
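The help text above corresponds closely to a standard argparse definition. The following is a minimal sketch reconstructed only from that help output; it is not the repo's actual parser. The helper names build_parser and str2bool, the accepted spellings of T/F, and the default for --gpus (which the help text does not state) are assumptions made for illustration.

```python
import argparse
import os


def str2bool(value):
    """Parse the T/F style value shown in the help text (spellings are assumed)."""
    lowered = value.lower()
    if lowered in ("t", "true", "1", "yes"):
        return True
    if lowered in ("f", "false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected T or F, got %r" % value)


def build_parser():
    """Hypothetical parser that mirrors the objectnet_eval.py help output."""
    parser = argparse.ArgumentParser(
        prog="objectnet_eval.py",
        description="Evaluate a TensorFlow model on ObjectNet images and "
                    "output predictions to a CSV file.")
    # Positional arguments, in the order shown in the usage line.
    parser.add_argument("images", metavar="images-dir", help="path to dataset")
    parser.add_argument("output_file", metavar="output-file",
                        help="path to predictions output file")
    parser.add_argument("model_class_name", metavar="model-class-name",
                        help="model class name in model_description.py")
    parser.add_argument("model_checkpoint", metavar="model-checkpoint",
                        help="path to model checkpoint")
    # Optional arguments. The --gpus default is not given in the help text.
    parser.add_argument("--gpus", type=int, metavar="N", default=None,
                        help="number of GPUs to use")
    parser.add_argument("--workers", type=int, metavar="N",
                        default=os.cpu_count(),
                        help="number of data loading workers")
    parser.add_argument("--batch_size", type=int, metavar="N", default=64,
                        help="mini-batch size")
    parser.add_argument("--softmax", type=str2bool, metavar="T/F", default=True,
                        help="apply a softmax function to network outputs")
    parser.add_argument("--convert_outputs_mode", type=int, metavar="N",
                        default=1, choices=(0, 1),
                        help="0: no conversion, 1: ImageNet to ObjectNet IDs")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    for name, value in vars(args).items():
        print(name, value)
```

The attribute names chosen here (images, output_file, model_class_name, model_checkpoint) match the keys printed in the params banner when the script runs.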
There follows a description of the code structure used in this repo.
./objectnet_eval.py:
objectnet_eval.py -> load_pretrained_net()
by model.load_state_dict
output-file)
./objectnet_iterator.py:
keras.utils.Sequence class for parallel data loading
images-dir folder and makes a list of files. It ignores any subdirectory folder structures
data_transform_description.py and crops out 2 pixel red border on ObjectNet images
Inside of the model directory (this is the only code that you will have to modify):
./model/model_description.py:
keras.utils.Sequence to implement any neural net model
create_model() method
./model/data_transform_description.py:
transforms() takes a PIL image as input, performs transformations, and returns the transformed image
./input/images:
./input/answers/answers-test.json:
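A keras.utils.Sequence subclass supplies batches through two methods: __len__ returns the number of batches and __getitem__(index) returns one batch. The sketch below illustrates only the bookkeeping such an iterator needs, without importing TensorFlow: listing the image files, slicing them into batches, and computing the crop box that trims the 2 pixel border. It is not the code in objectnet_iterator.py. The function names and the extension list are hypothetical, and it assumes that "ignores any subdirectory folder structures" means files are collected from every level under images-dir.

```python
import math
import os

# Assumed set of image extensions; the repo may use a different list.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def list_images(images_dir):
    """Return a sorted list of image paths found at any depth under images_dir."""
    found = []
    for root, _dirs, files in os.walk(images_dir):
        for name in files:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                found.append(os.path.join(root, name))
    return sorted(found)


def num_batches(num_files, batch_size):
    """What a Sequence's __len__ would return: batches needed to cover all files."""
    return math.ceil(num_files / batch_size)


def batch_slice(files, batch_size, index):
    """What a Sequence's __getitem__ would load: the files of batch `index`."""
    start = index * batch_size
    return files[start:start + batch_size]


def border_crop_box(width, height, border=2):
    """Box (left, upper, right, lower) that trims `border` pixels from each side.

    The tuple order matches what PIL's Image.crop() expects.
    """
    return (border, border, width - border, height - border)
```

With PIL installed, the border would then be removed with image.crop(border_crop_box(*image.size)) before the transformations are applied.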
Before executing the example for the first time you must download the sample model as shown below:
# Download the model:
$ cd objectnet-template-tensorflow
$ mkdir downloads
$ cd downloads
$ wget https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5
$ cp resnet50_weights_tf_dim_ordering_tf_kernels.h5 ../model
$ cd ..
Note: The downloads/ directory is used to store downloaded models so they only need to be downloaded once. If you want to use a model checkpoint which is in downloads/, make sure to copy it to model/ as shown in the second-to-last line above. This way, model/ holds only one active model at a time, and downloads/ can be used as storage for all models.
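The downloads/ and model/ convention can also be scripted. The following is a minimal sketch, not part of the repo: the function name activate_checkpoint is hypothetical, and it assumes checkpoints are identified by the .h5 suffix so that other files in model/ (such as model_description.py) are left untouched.

```python
import shutil
from pathlib import Path


def activate_checkpoint(name, downloads_dir="downloads", model_dir="model",
                        suffix=".h5"):
    """Copy downloads_dir/name into model_dir and remove other checkpoints there.

    Only files ending in `suffix` are treated as checkpoints (an assumption),
    so Python sources in model_dir are not deleted.
    """
    source = Path(downloads_dir) / name
    if not source.is_file():
        raise FileNotFoundError(str(source))
    target_dir = Path(model_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep model_dir down to a single active checkpoint.
    for old in target_dir.glob("*" + suffix):
        if old.name != name:
            old.unlink()
    return Path(shutil.copy2(str(source), str(target_dir / name)))
```

For the sample model this would be called as activate_checkpoint("resnet50_weights_tf_dim_ordering_tf_kernels.h5") from the root of the repo.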
Use DemoResNet50 as the model-class-name argument and model/resnet50_weights_tf_dim_ordering_tf_kernels.h5 as the model-checkpoint argument to the objectnet_eval.py script to test the example model:
# Perform batch inference:
$ python3 objectnet_eval.py input/images output/predictions.csv DemoResNet50 model/resnet50_weights_tf_dim_ordering_tf_kernels.h5
**** params ****
images input/images
output_file output/predictions.csv
model_class_name DemoResNet50
model_checkpoint model/resnet50_weights_tf_dim_ordering_tf_kernels.h5
gpus 2
workers 1
batch_size 64
softmax True
convert_outputs_mode 1
****************
Number of devices: 2
Model: "resnet50"
Done. Number of predictions: 10
Results will be written to the predictions.csv file in the output/ directory. Check that the output conforms to the format expected by the ObjectNet Challenge.
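The softmax True entry in the parameters above means the raw network outputs are converted to confidence values before they are written out. As a rough illustration of that conversion, here is a minimal pure-Python sketch of a numerically stable softmax and a top-k selection. The function names are hypothetical, this is not the repo's implementation, and the layout of predictions.csv itself is defined by the challenge rather than by this sketch.

```python
import math


def softmax(logits):
    """Convert raw network outputs into confidences that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k(confidences, k=5):
    """Return (class_index, confidence) pairs for the k highest confidences."""
    ranked = sorted(enumerate(confidences), key=lambda pair: pair[1],
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    confidences = softmax([2.0, 1.0, 0.1])
    for class_index, confidence in top_k(confidences, k=2):
        print(class_index, round(confidence, 3))
```

Softmax preserves the ranking of the outputs, so switching it off changes the reported confidence values but not which classes come first.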
You can plug your own existing TensorFlow model into the template. There are a few considerations to keep in mind, which are listed below.
If you want to use a python package that is not included in the default TensorFlow Docker container, then it needs to be listed in the requirements.txt file so that it is 'pip installed' when the docker image is built. Include it as follows:
# This file specifies python dependencies which are to be installed into the Docker image.
# List one library per line (not as a comment)
# e.g.
#numpy
scipy
The only code changes necessary when incorporating your TensorFlow model should be in the model/ directory.
Remove the sample model from the model/ directory. For example:
$ rm -rf model/resnet50_weights_tf_dim_ordering_tf_kernels.h5
Copy your own model checkpoint into model/. For example:
$ cp my_model.h5 model/
Add a class describing your model to model/model_description.py. The class name will be used as the model-class-name argument to objectnet_eval.py.
Update the transformations in model/data_transform_description.py to match those that your model was trained on.
Test your model from the root of the objectnet-template-tensorflow repo:
$ python3 objectnet_eval.py input/images output/predictions.csv MyModel model/my_model.h5
**** params ****
images input/images
output_file output/predictions.csv
model_class_name MyModel
model_checkpoint model/my_model.h5
workers 16
gpus 2
batch_size 96
softmax True
convert_outputs_mode 1
****************
Number of devices: 2
Model: "MyModel"
Done. Number of predictions: 10
$ rm output/predictions.csv
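Because the model is selected by passing its class name on the command line, the evaluation script has to turn that string into a class defined in model_description.py. One common way to do this in Python is a getattr lookup on the module, sketched below. This is an illustration under assumptions, not the repo's code: resolve_model_class is a hypothetical helper, and a stand-in module object is built inline so the example runs on its own.

```python
import types


def resolve_model_class(module, class_name):
    """Look up `class_name` in `module`, with a readable error if it is missing."""
    candidate = getattr(module, class_name, None)
    if not isinstance(candidate, type):
        available = sorted(name for name, value in vars(module).items()
                           if isinstance(value, type))
        raise ValueError("no class %r in %s; available classes: %s"
                         % (class_name, module.__name__, available))
    return candidate


# Stand-in for model/model_description.py so that the sketch is self-contained.
model_description = types.ModuleType("model_description")


class MyModel:
    """Placeholder model class; the real one would build your network."""

    def create_model(self):
        # Hypothetical body: build and return your Keras model here.
        raise NotImplementedError


model_description.MyModel = MyModel

if __name__ == "__main__":
    print(resolve_model_class(model_description, "MyModel").__name__)
```

A lookup like this fails early, with a list of the classes that do exist, when the model-class-name argument is misspelled.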
To ensure that the predictions.csv file is structured according to the ObjectNet Challenge specifications, validate the output using the validate_and_score.py script provided in the objectnet-template-tensorflow repo.
Once your model has executed successfully, run the following command to validate your output:
$ python3 validate_and_score.py -a input/answers/answers-test.json -f output/predictions.csv
Note the usage of the -a and -f flags as specified in validate_and_score.py --help below.
usage: validate_and_score.py [-h] --answers ANSWERS --filename FILENAME
                             [--no-range-check]

optional arguments:
  -h, --help            show this help message and exit
  --answers ANSWERS, -a ANSWERS
                        ground truth/answer file
  --filename FILENAME, -f FILENAME
                        users result file
  --no-range-check, -n  allow entries that have out-of-range label indices
Proceed to Section 2 if you receive an output of "prediction_file_status": "VALIDATED".
If you receive an error when running this command, ensure that you have entered the correct file locations for the answer file and the result file. For clarification on the result file structure, refer to the evaluation criteria on the challenge page.
To build and test a docker image locally you will first need to install the docker engine. Follow the instructions on installing docker, along with a quick start guide.
Prior to uploading the docker image to the competition portal for evaluation you should test your docker image locally. If your local machine has NVIDIA-capable GPUs and you wish to test inference using GPUs then you will first need to install the NVIDIA drivers on your machine. See section 1.2 Install NVIDIA drivers above.
Ensure you have been able to successfully test your model on the local host using the objectnet_eval.py example code - see section 1.8 Validating the predictions of your model for more details.
Docker images are built from a series of statements contained in a Dockerfile. A template Dockerfile is provided for models built using the TensorFlow deep learning framework.
The TensorFlow docker image template for the ObjectNet Challenge uses one of the official TensorFlow docker images as its base image. These TensorFlow images come with built-in GPU support and with python 3 pre-loaded.
Note: Docker images submitted to the ObjectNet Challenge must be based on a GPU enabled base image and use GPUs for inferencing.
To improve performance, the example code batches up inferencing of the ObjectNet images and executes a number of streams (or workers) in parallel.
You can further customise the build of your docker container by specifying the following arguments at docker build time:
The name of your model class in the model_description.py file. This is passed as the model-class-name argument to the objectnet_eval.py module. For example "my_model".
The model checkpoint file. This is passed as the model-checkpoint argument to the objectnet_eval.py module. For example "my_model.h5".
A bash script, build-docker-submission.sh, has been created to build the Docker image for you. The script has the following inputs:
This command runs builds your model into a Docker Image
Docker Image will be set to IMAGE:TAG
Default
TAG="latest"
TENSORFLOW_VERSION="2.3.0"
options:
-h, --help show brief help
-v, --tensorflow-version=TF_VERSION specify a tensorflow version to use
-n, --model-class-name=NAME specify a model class name to use
-c, --model-checkpoint=CHECKPOINT specify the path to a model checkpoint to use
-i, --image=IMAGE specify your Docker image
-t, --tag=TAG specify your Docker image tag
-nc, --no-cache bypass cache for docker build
Create your image by running:
./build-docker-submission.sh -i IMAGE -t TAG -n NAME -c CHECKPOINT -v TF_VERSION
For example, to build a docker image (called 'my_model' with a tag of 'version1') containing the model parameters specified above:
Note: To save space in the built docker image:
model/ directory when building the image, and
build-docker-submission.sh command).
downloads/ is excluded from the docker build context and can be used to store files which are not needed for the image being built.
Once the build is complete, your newly built docker image can be listed using the command:
$ docker images
If the docker image was built without a version tag, it is given the default tag latest.
Test the docker image locally before submitting it to the challenge. For example, a docker image called my_model:version1 is run by:
# First remove the output file
$ rm output/predictions.csv
# Now run the docker image
$ docker run -ti --rm --gpus=all -v $PWD/input/images:/input/ -v $PWD/output:/output my_model:version1
**** params ****
images /input
output_file /output/predictions.csv
model_class_name MyModel
model_checkpoint /workspace/model/my_model.h5
workers 16
gpus 2
batch_size 96
softmax True
convert_outputs_mode 1
****************
initializing model ...
loading pretrained weights from disk ...
Done. Number of predictions: 10
The -v $PWD/input/images:/input option mounts a directory of test images from the local path into /input within the docker container. Similarly, -v $PWD/output:/output mounts an output directory from the local path into /output of the container. Add the --gpus=all parameter to the docker run command in order to utilise your GPUs.
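If you run this test repeatedly, the command line can be assembled programmatically. The sketch below builds the same docker run argument list as shown above; the function name docker_run_command and its parameters are hypothetical. It resolves the host directories to absolute paths, which is what $PWD achieves in the shell command, because docker interprets a non-absolute host side of -v as a volume name rather than a directory.

```python
import os
import shlex


def docker_run_command(image, images_dir="input/images", output_dir="output",
                       use_gpus=True, interactive=True):
    """Build the argument list for the docker run test described above."""
    cmd = ["docker", "run"]
    if interactive:
        cmd.append("-ti")  # needs a terminal; drop it when running unattended
    cmd.append("--rm")
    if use_gpus:
        cmd.append("--gpus=all")
    # Bind mounts need absolute host paths, hence os.path.abspath().
    cmd += ["-v", os.path.abspath(images_dir) + ":/input/"]
    cmd += ["-v", os.path.abspath(output_dir) + ":/output"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    command = docker_run_command("my_model:version1")
    # Print a copy-and-paste friendly version of the command.
    print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed to subprocess.run(command, check=True) to launch the container from a script.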
A successful run will result in a predictions.csv file written to the $PWD/output path.
If there are errors during the previous step then you will need to debug your docker container. If you make changes to your code, there is no need to rebuild the docker image. To quickly test your new code, simply mount the root path of this repo as a volume when you run the container. For example:
$ docker run -ti --rm --gpus=all -v $PWD:/workspace -v $PWD/input/images:/input/ -v $PWD/output:/output --entrypoint /bin/bash my_model:version1
When the docker container is run, the local $PWD will be mounted over the /workspace directory within the container, which effectively means that any code or model changes made since the last docker build command will be available in the running container.
To ensure that the predictions.csv file is structured according to the ObjectNet Challenge specifications, validate it with the validate_and_score.py script. Run the following command:
$ python3 validate_and_score.py -a input/answers/answers-test.json -f output/predictions.csv
{
  "accuracy": 20.0,
  "images_scored": 10,
  "prediction_file_errors": [],
  "prediction_file_status": "VALIDATED",
  "top5_accuracy": 40.0,
  "total_images": 10
}
A correctly formatted predictions.csv file will result in an output of "prediction_file_status": "VALIDATED". Otherwise, refer back to 1.8 Validating the predictions to handle any errors.
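The status check can be automated by parsing the JSON report. The following is a minimal sketch that assumes the text handed to it is exactly the JSON object printed by validate_and_score.py, as in the sample output above; the function name check_validation is hypothetical.

```python
import json


def check_validation(report_text):
    """Return (is_validated, errors) from the JSON report text."""
    report = json.loads(report_text)
    is_validated = report.get("prediction_file_status") == "VALIDATED"
    return is_validated, report.get("prediction_file_errors", [])


# The sample report shown in this section.
SAMPLE_REPORT = """
{
  "accuracy": 20.0,
  "images_scored": 10,
  "prediction_file_errors": [],
  "prediction_file_status": "VALIDATED",
  "top5_accuracy": 40.0,
  "total_images": 10
}
"""

if __name__ == "__main__":
    ok, errors = check_validation(SAMPLE_REPORT)
    print("validated" if ok else "not validated", errors)
```

In a script, the report text could come from capturing the standard output of the validation command, for example with subprocess.run(..., capture_output=True, text=True).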
Once the output of your model has been validated, you are ready to submit the docker image to the ObjectNet Challenge.