ObjectNet

ObjectNet Challenge Documentation


Challenge Portal

Visit the ObjectNet Challenge Portal to register for the challenge.

ObjectNet Support

Experiencing a problem, or just have a general question? See ObjectNet Support.

ObjectNet Challenge: Creating your Docker image from the PyTorch template

These instructions describe how to build a docker image using the PyTorch deep learning framework for the ObjectNet Challenge. They assume you already have a pre-trained PyTorch model which you intend to submit for evaluation to the ObjectNet Challenge.

If your model is built using a different framework, refer to the documentation for that framework instead.

These instructions are split into two sections:

Section 1: ObjectNet competition example model and code

The following section provides example code and a baseline model for the ObjectNet Challenge. The code is structured such that most existing PyTorch models can be plugged into the example with minimal code changes.

Note: The example code uses batching and parallel data loading to improve inference efficiency. If you are building your own customised docker image with your own code, it is highly recommended to use similarly optimised inference techniques to ensure your submission completes within the time limit set by the challenge organisers.
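
For illustration, below is a minimal sketch of that pattern. It is hypothetical (the template's own objectnet_eval.py and objectnet_pytorch_dataloader.py already implement this for you) and assumes a checkpoint saved with torch.save(model, path):

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Hypothetical sketch of batched, parallel-loading inference.
# Note: ImageFolder expects class subdirectories; the template ships its own
# ObjectNet data loader (objectnet_pytorch_dataloader.py) for ObjectNet images.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("input/images", transform=transform)
loader = DataLoader(dataset, batch_size=96, num_workers=16, pin_memory=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load("model/checkpoint.pth", map_location=device)  # hypothetical path
model.eval()

with torch.no_grad():
    for images, _ in loader:
        outputs = model(images.to(device, non_blocking=True))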

1.1 Requirements

The libraries used in this example (PyTorch, torchvision, the CUDA toolkit and tqdm) must be installed on the local test machine. The same libraries will be automatically installed into the Docker image when the image is built.

For example, you could set up a conda environment, here named objectnet_env, with the necessary requirements in a few simple lines:
conda create -n objectnet_env python=3.7
conda activate objectnet_env
conda install pytorch torchvision cudatoolkit=11.0 -c pytorch
conda install tqdm

1.2 Install NVIDIA drivers

If your local machine has NVIDIA-capable GPUs and you want to test your docker image locally using these GPUs then you will need to ensure the NVIDIA drivers have been installed on your test machine.

Instructions on how to install the CUDA toolkit and NVIDIA drivers can be found here. Be sure to match the CUDA/NVIDIA versions installed locally with the versions of PyTorch and CUDA used to build your docker image - see Building the docker image.
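
As a quick sanity check, you can confirm which CUDA version your local PyTorch build was compiled against and whether it can see your GPUs:

import torch

print(torch.__version__)          # PyTorch version, e.g. 1.7.0
print(torch.version.cuda)         # CUDA version this PyTorch build targets
print(torch.cuda.is_available())  # True only if a working NVIDIA driver is present
print(torch.cuda.device_count())  # number of visible GPUs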

1.3 Clone git repository containing example

Clone the following git repo to a machine which has docker installed:

$ git clone https://github.com/abarbu/objectnet-template-pytorch.git

This repo comes with python scripts to perform batch inference using a sample model and to validate and score the resulting predictions. It also contains a set of test images (input/images) and a file containing ground truth data for those images (input/answers/answers-test.json). You will need to download the sample model (resnext101_32x48d) used in this example; see 1.6 Testing the example.

1.4 Running objectnet_eval.py

objectnet_eval.py is the main entry point for running this example; it essentially performs batch inference against all images in a supplied input directory (images-dir). Full help is available using objectnet_eval.py --help:

usage: objectnet_eval.py [-h] [--workers N] [--gpus N] [--batch_size N]
                         [--softmax T/F] [--convert_outputs_mode N]
                         images-dir output-file model-class-name
                         model-checkpoint

Evaluate a PyTorch model on ObjectNet images and output predictions to a CSV
file.

positional arguments:
  images-dir            path to dataset
  output-file           path to predictions output file
  model-class-name      model class name in model_description.py
  model-checkpoint      path to model checkpoint

optional arguments:
  -h, --help            show this help message and exit
  --workers N           number of data loading workers (default: total num
                        CPUs)
  --gpus N              number of GPUs to use
  --batch_size N        mini-batch size (default: 96), this is the batch size
                        of each GPU on the current node when using Data
                        Parallel or Distributed Data Parallel
  --softmax T/F         apply a softmax function to network outputs to convert
                        output magnitudes to confidence values (default:True)
  --convert_outputs_mode N
                        0: no conversion of prediction IDs, 1: convert from
                        pytorch ImageNet prediction IDs to ObjectNet
                        prediction IDs (default:1)

Note: The default values for `workers` and `batch_size` are tuned for this example. Please do not modify these properties when making an ObjectNet submission using the sample code.
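
For reference, --softmax True applies the standard softmax function to each image's raw outputs (logits), converting them into confidence values that sum to 1. A minimal illustration:

import torch
import torch.nn.functional as F

logits = torch.tensor([[4.0, 1.0, 0.5]])  # raw outputs for one image over 3 classes
confidences = F.softmax(logits, dim=1)    # roughly tensor([[0.93, 0.05, 0.03]])
print(confidences.sum())                  # tensor(1.)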

1.5 Code structure

The code in this repo is structured as follows:

./objectnet_eval.py: the main entry point; performs batch inference against all images in the supplied input directory (see 1.4).

./objectnet_pytorch_dataloader.py: the data loader that feeds batches of ObjectNet images to the model.

Inside the model directory (this is the only code that you will have to modify):

./model/model_description.py: contains your model's class definition; the class name is passed as the model-class-name argument to objectnet_eval.py.

./model/data_transformation_description.py: describes the input transformation parameters (input size, mean, std) applied to images before inference.

./input/images: a set of test images.

./input/answers/answers-test.json: ground truth data for the test images.
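
To make the role of model_description.py concrete, here is a sketch of what a model class there might look like. The class below is hypothetical (see section 1.7.2 for a real example); the key point is that the class name is what you pass as model-class-name:

import torch.nn as nn

# Hypothetical model class that objectnet_eval.py could load by name.
class MySimpleNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)
        return self.fc(x)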

1.6 Testing the example

Before executing the example for the first time you must download the sample model as shown below:

# Download the model:
$ cd objectnet-template-pytorch
$ mkdir downloads
$ cd downloads
$ wget https://download.pytorch.org/models/ig_resnext101_32x48-3e41cc8a.pth
$ cp ig_resnext101_32x48-3e41cc8a.pth ../model
$ cd ..

Note: The downloads/ directory is used to store downloaded models so they only need to be downloaded once. If you want to use a model from downloads/, make sure to copy it to model/ as shown in the second-to-last line above. This way model/ holds only the one active model, while downloads/ serves as storage for all models.

Use resnext101_32x48d_wsl as the model-class-name argument and model/ig_resnext101_32x48-3e41cc8a.pth as the model-checkpoint argument to the objectnet_eval.py script to test the example model:

# Perform batch inference:
$ python3 objectnet_eval.py input/images output/predictions.csv resnext101_32x48d_wsl model/ig_resnext101_32x48-3e41cc8a.pth

**** params ****
images input/images
output_file output/predictions.csv
model_class_name resnext101_32x48d_wsl
model_checkpoint model/ig_resnext101_32x48-3e41cc8a.pth
workers 16
gpus 2
batch_size 96
softmax True
convert_outputs_mode 1
****************

initializing model ...
loading pretrained weights from disk ...
Done. Number of predictions:  10

Results will be written to the predictions.csv file in the output/ directory. Check that the output conforms to the format expected by the ObjectNet Challenge.

1.7 Modifying the code to use your own PyTorch model

You can plug your own existing PyTorch model into the template. To illustrate the technique, the process of plugging in a pre-trained InceptionV3 model is shown below.

1.7.1 Requirements

The InceptionV3 model uses the SciPy library. As SciPy is not included in the default PyTorch Docker container, it needs to be listed in the requirements.txt file so that it is installed via pip when the docker image is built. Include it as follows:

# This file specifies python dependencies which are to be installed into the Docker image.
# List one library per line (not as a comment)
# e.g.
#numpy
scipy

1.7.2 Template changes

The only code changes necessary when incorporating your PyTorch model should be in the model/ directory.

  1. Before downloading your model checkpoint file, remove the existing checkpoint file from the model/ directory. For example:
    $ rm -rf model/ig_resnext101_32x48-3e41cc8a.pth
  2. Download your model checkpoint file and copy it into model/. For example:
    $ cd downloads
    $ wget https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth
    $ cp inception_v3_google-1a9a5a14.pth ../model
    $ cd ..
    Note: When your docker image is submitted to the challenge for evaluation it will not have internet access, so it cannot download model checkpoints. For this reason it is essential that your model checkpoint is included in the built docker image.

  3. Add your model description as a class to model/model_description.py. The class name will be used as the model-class-name argument to objectnet_eval.py.

    For the inception_v3 model, copy the Inception3 class, and all classes it depends on, from inception.py into model_description.py. In total you need to copy the following classes: Inception3, InceptionA, InceptionB, InceptionC, InceptionD, InceptionE, InceptionAux and BasicConv2d. Additionally, add the following import statements at the beginning of the file:

    from collections import namedtuple
    import warnings
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch import Tensor
    from typing import Callable, Any, Optional, Tuple, List
    
    You will also need to copy a couple of lines from the start of the file:
    InceptionOutputs = namedtuple('InceptionOutputs', ['logits', 'aux_logits'])
    InceptionOutputs.__annotations__ = {'logits': torch.Tensor, 'aux_logits': Optional[torch.Tensor]}
    
  4. Amend the following parameters in data_transformation_description.py to match those that your model was trained with (a sketch of how these parameters are typically used follows this list).

    For the InceptionV3 model we used:

    self.model_pretrain_params['input_size'] = [3, 299, 299]
    self.model_pretrain_params['mean'] = [0.5, 0.5, 0.5]
    self.model_pretrain_params['std'] = [0.5, 0.5, 0.5]
  5. Test your model's inference using the test images and ground-truth data provided in the objectnet-template-pytorch repo:
    $ python3 objectnet_eval.py input/images output/predictions.csv Inception3 model/inception_v3_google-1a9a5a14.pth
    
      **** params ****
      images input/images
      output_file output/predictions.csv
      model_class_name Inception3
      model_checkpoint model/inception_v3_google-1a9a5a14.pth
      workers 16
      gpus 2
      batch_size 96
      softmax True
      convert_outputs_mode 1
      ****************
    
      initializing model ...
      loading pretrained weights from disk ...
      Done. Number of predictions:  10
    
    Note: If you want to run inference again or with another model, you will first have to delete the predictions output file.
    $ rm output/predictions.csv
    
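For orientation, the three parameters amended in step 4 typically translate into a torchvision preprocessing pipeline along these lines. This is a sketch only; the exact resize/crop logic lives in data_transformation_description.py, and the Resize margin below is an assumption:

from torchvision import transforms

input_size = [3, 299, 299]
mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

preprocess = transforms.Compose([
    transforms.Resize(input_size[1] + 32),   # assumed margin before cropping
    transforms.CenterCrop(input_size[1:]),   # crop to [299, 299]
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std),
])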

1.8 Validating the predictions of your model

In order to ensure that the predictions.csv file is structured according to the ObjectNet Challenge specifications, it is important to validate the output using the validate_and_score.py script provided in the objectnet-template-pytorch repo. Once your model has successfully executed, run the following command to validate your output:

$ python3 validate_and_score.py -a input/answers/answers-test.json -f output/predictions.csv

Note the usage of the -a and -f flags as specified in validate_and_score.py --help below.

usage: validate_and_score.py [-h] --answers ANSWERS --filename FILENAME
                              [--no-range-check]
  optional arguments:
    -h, --help            show this help message and exit
    --answers ANSWERS, -a ANSWERS
                          ground truth/answer file
    --filename FILENAME, -f FILENAME
                          users result file
    --no-range-check, -n  allow entries that have out-of-range label indices

Proceed to Section 2 if you receive an output of "prediction_file_status": "VALIDATED".

If you receive an error when running this command, check that you have entered the correct file locations for the answers file and the results file. For clarification on the result file structure, refer to the evaluation criteria on the challenge page.
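
If you script your workflow, the validator's report can also be checked programmatically. The sketch below assumes (as the output shown in section 2.7 suggests) that the script prints its JSON report to stdout:

import json
import subprocess

# Run the validator and fail loudly if the predictions file does not validate.
result = subprocess.run(
    ["python3", "validate_and_score.py",
     "-a", "input/answers/answers-test.json",
     "-f", "output/predictions.csv"],
    capture_output=True, text=True, check=True)
report = json.loads(result.stdout)
assert report["prediction_file_status"] == "VALIDATED", report["prediction_file_errors"]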


Section 2: Building the docker image

2.1 Install the Docker engine

To build and test a docker image locally you will first need to install the docker engine. Follow the instructions on installing docker, along with a quick start guide.

2.2 Install NVIDIA drivers

Prior to uploading the docker image to the competition portal for evaluation you should test your docker image locally. If your local machine has NVIDIA-capable GPUs and you wish to test inference using GPUs then you will first need to install the NVIDIA drivers on your machine. See section 1.2 Install NVIDIA drivers above.

2.3 Add your model & supporting code

Ensure you have successfully tested your model on the local host using the objectnet_eval.py example code - see section 1.8 Validating the predictions of your model for more details.

Note: Your model must have been saved using torch.save(model, "<PATH TO SAVED MODEL FILE>").
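
If your checkpoint was instead saved as a state_dict, you can convert it to the expected format. A minimal sketch, where the class name and paths are placeholders:

import torch
from model.model_description import MyModel  # placeholder: use your own class name

# Load a state_dict checkpoint, then re-save the entire model object,
# which is the format the template expects.
model = MyModel()
model.load_state_dict(torch.load("my_state_dict.pth", map_location="cpu"))
torch.save(model, "model/my_model.pth")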

2.4 Build the docker image

Docker images are built from a series of statements contained in a Dockerfile. A template Dockerfile is provided for models built using the PyTorch deep learning framework and saved using the torch.save api.

The PyTorch docker image template for the ObjectNet Challenge uses one of the official PyTorch docker images as its base image. These PyTorch images come with built-in GPU support and with python 3 pre-loaded.

Note: Docker images submitted to the ObjectNet Challenge must be based on a GPU enabled base image and use GPUs for inferencing.

You can customise the PyTorch and CUDA versions used for the base image by editing the following line of the Dockerfile. Choose a base image from the official list (linked above) which most closely matches the versions used to build your model:

# Here we're using PyTorch 1.7.0 with CUDA 11.0
FROM pytorch/pytorch:1.7.0-cuda11.0-cudnn8-runtime

To improve performance, the example code batches up inference over the ObjectNet images and executes a number of streams (or workers) in parallel.

A bash script, build-docker-submission.sh, has been created to build the Docker image for you; you can further customise the build by passing it the following arguments:

This command builds your model into a Docker Image
Docker Image will be set to IMAGE:TAG

Default
TAG=latest

options:
-h, --help                        show brief help
-n, --model-class-name=NAME       specify a model class name to use
-c, --model-checkpoint=CHECKPOINT specify the path to a model checkpoint to use
-i, --image=IMAGE                 specify your Docker image
-t, --tag=TAG                     specify your Docker image tag
-nc, --no-cache                   bypass cache for docker build

Create your image by running:

./build-docker-submission.sh -i IMAGE -t TAG -n NAME -c CHECKPOINT

For example, suppose you want to build a docker image called my_model with a tag of version1 containing the sample ResNeXt model.
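
Using the model class name and checkpoint from section 1.6, such a build command would look along these lines:

$ ./build-docker-submission.sh -i my_model -t version1 -n resnext101_32x48d_wsl -c model/ig_resnext101_32x48-3e41cc8a.pth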

Note: To save space in the built docker image, keep only the checkpoint your submission actually uses in model/; leave other downloaded models in downloads/ or remove them before building.

Once the build is complete your newly built docker image can be listed using the command:

$ docker images

If the docker image was built without version tagging it is given the default tag of latest.

2.5 Testing the docker image locally

Test the docker image locally before submitting it to the challenge. For example, a docker image called my_model:version1 is run as follows:

# First remove the output file
$ rm output/predictions.csv
# Now run the docker image
$ docker run -ti --rm --gpus=all -v $PWD/input/images:/input/ -v $PWD/output:/output my_model:version1

**** params ****
images /input
output_file /output/predictions.csv
model_class_name resnext101_32x48d_wsl
model_checkpoint /workspace/model/ig_resnext101_32x48-3e41cc8a.pth
workers 16
gpus 2
batch_size 96
softmax True
convert_outputs_mode 1
****************

initializing model ...
loading pretrained weights from disk ...
Done. Number of predictions:  10

The -v $PWD/input/images:/input option mounts a directory of test images from the local path into /input within the docker container. Similarly, -v $PWD/output:/output mounts a local output directory into /output of the container. The --gpus=all parameter lets the container utilise your GPUs.

A successful run will result in a predictions.csv file written to the $PWD/output path.

2.6 Debugging your docker image locally

If there were errors during the previous step you will need to debug your docker container. If you make changes to your code there is no need to rebuild the docker image each time: to quickly test your new code, simply mount the root path of this repo as a volume when you run the container. For example:

$ docker run -ti --rm --gpus=all -v $PWD:/workspace -v $PWD/input/images:/input/ -v $PWD/output:/output --entrypoint /bin/bash my_model:version1

When the docker container is run, the local $PWD is mounted over the /workspace directory within the docker image, which means any code or model changes made since the last docker build are visible within the running container.

2.7 Validating the predictions

In order to ensure that the predictions.csv file is structured according to the ObjectNet Challenge specifications, validate it using the validate_and_score.py script. Run the following command:

$ python3 validate_and_score.py -a input/answers/answers-test.json -f output/predictions.csv

{
  "accuracy": 20.0,
  "images_scored": 10,
  "prediction_file_errors": [],
  "prediction_file_status": "VALIDATED",
  "top5_accuracy": 30.0,
  "total_images": 10
}

A correctly formatted predictions.csv file will result in an output of "prediction_file_status": "VALIDATED". Otherwise, refer back to 1.8 Validating the predictions of your model to handle any errors.

Once the output of your model has been validated, you are ready to submit the docker image to the ObjectNet Challenge.