How to set up this codebase?

This codebase requires Python 3.6 or higher. It uses PyTorch v1.0 and supports CUDA 9 and cuDNN 7 out of the box. The recommended way to set this codebase up is through Anaconda or Miniconda, although virtualenv should work just as well.

Install Dependencies

Follow these steps to install through Anaconda or Miniconda.

  1. Install the Anaconda or Miniconda distribution based on Python 3 from their downloads site.

  2. Clone the repository.

    git clone https://www.github.com/kdexd/probnmn-clevr
    
  3. Create a conda environment and install all the dependencies.

    cd probnmn-clevr
    conda create -n probnmn python=3.6
    conda activate probnmn
    pip install -r requirements.txt
    
  4. Install this codebase as a package in development mode.

    python setup.py develop
    

Now you can import probnmn from anywhere in your filesystem as long as you have this conda environment activated.
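As a quick sanity check after installation, the sketch below reports whether the interpreter meets the Python 3.6 minimum and whether the torch and probnmn packages can be found in the active environment. The helper names python_is_supported and missing_packages are illustrative and not part of this codebase.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in this document


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def missing_packages(names):
    """Return the top-level package names that cannot be found here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    # An empty list means both packages are visible to this interpreter.
    print("Missing packages:", missing_packages(["torch", "probnmn"]))
```

If probnmn is reported as missing, the most likely cause is that the conda environment from step 3 is not activated in the current shell.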

Download and Preprocess Data

  1. This codebase assumes all data lives in the $PROJECT_ROOT/data directory by default, although custom paths can be provided through the config. Download the CLEVR v1.0 dataset and symlink it as follows:

    $PROJECT_ROOT/data
        |—— CLEVR_test_questions.json
        |—— CLEVR_train_questions.json
        |—— CLEVR_val_questions.json
        +—— images
            |—— train
            |   |—— CLEVR_train_000000.png
            |   +—— CLEVR_train_000001.png ...
            |—— val
            |   |—— CLEVR_val_000000.png
            |   +—— CLEVR_val_000001.png ...
            +—— test
                |—— CLEVR_test_000000.png
                +—— CLEVR_test_000001.png ...
    
  2. Build an AllenNLP-compatible vocabulary from CLEVR programs, questions and answers. It is used throughout training and evaluation. This creates a directory with separate text files containing the unique tokens of questions, programs and answers.

    python scripts/preprocess/build_vocabulary.py \
        --clevr-jsonpath data/CLEVR_train_questions.json \
        --output-dirpath data/clevr_vocabulary
    
  3. Tokenize programs, questions and answers of the CLEVR training, validation (and test) splits using this vocabulary mapping.

    python scripts/preprocess/preprocess_questions.py \
        --clevr-jsonpath data/CLEVR_train_questions.json \
        --vocab-dirpath data/clevr_vocabulary \
        --output-h5path data/clevr_train_tokens.h5 \
        --split train
    
  4. Extract image features using a pre-trained ResNet-101 from the torchvision model zoo.

    python scripts/preprocess/extract_features.py \
        --image-dir data/images/train \
        --output-h5path data/clevr_train_features.h5 \
        --split train
    

That’s it! Steps 3 and 4 create the H5 files that are read by probnmn.data.readers and, in turn, used by probnmn.data.datasets.
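The commands in steps 3 and 4 are shown for the training split only. The sketch below is a small driver that first checks the layout from step 1 and then builds the same commands for each split. It assumes that the validation and test splits follow the naming pattern of the training split (for example data/clevr_val_tokens.h5 and --split val). Those paths and split names are extrapolated from the examples above and are not confirmed by this codebase. The function names are illustrative.

```python
import os

SPLITS = ("train", "val", "test")


def missing_inputs(data_root, splits=SPLITS):
    """Return expected question files and image directories absent under data_root."""
    missing = []
    for split in splits:
        questions = os.path.join(data_root, "CLEVR_{}_questions.json".format(split))
        images = os.path.join(data_root, "images", split)
        # isfile/isdir follow symlinks, so a symlinked dataset passes the check.
        if not os.path.isfile(questions):
            missing.append(questions)
        if not os.path.isdir(images):
            missing.append(images)
    return missing


def preprocess_commands(split, data_root="data"):
    """Build argument lists for steps 3 and 4 for one split.

    Assumption: non-train splits mirror the naming pattern of the train split.
    """
    tokens_cmd = [
        "python", "scripts/preprocess/preprocess_questions.py",
        "--clevr-jsonpath", "{}/CLEVR_{}_questions.json".format(data_root, split),
        "--vocab-dirpath", "{}/clevr_vocabulary".format(data_root),
        "--output-h5path", "{}/clevr_{}_tokens.h5".format(data_root, split),
        "--split", split,
    ]
    features_cmd = [
        "python", "scripts/preprocess/extract_features.py",
        "--image-dir", "{}/images/{}".format(data_root, split),
        "--output-h5path", "{}/clevr_{}_features.h5".format(data_root, split),
        "--split", split,
    ]
    return [tokens_cmd, features_cmd]


if __name__ == "__main__":
    import shlex

    problems = missing_inputs("data")
    if problems:
        print("Missing inputs:", *problems, sep="\n  ")
    else:
        # Print the commands for review instead of executing them.
        for split in SPLITS:
            for cmd in preprocess_commands(split):
                print(" ".join(shlex.quote(part) for part in cmd))
```

Run from $PROJECT_ROOT, the script prints the commands rather than executing them, so they can be reviewed and adjusted before use.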