probnmn.models.nmn

class probnmn.models.nmn.NeuralModuleNetwork(vocabulary: allennlp.data.vocabulary.Vocabulary, image_feature_size: Tuple[int, int, int] = (1024, 14, 14), module_channels: int = 128, class_projection_channels: int = 1024, classifier_linear_size: int = 1024)[source]

Bases: torch.nn.modules.module.Module

A NeuralModuleNetwork holds neural modules, a stem network, and a classifier network. It wires these together to answer a question about a given scene, using a program that describes how to arrange the neural modules.

Parameters
vocabulary: allennlp.data.vocabulary.Vocabulary

AllenNLP’s vocabulary. This vocabulary has three namespaces - “questions”, “programs” and “answers” - which contain the respective token-to-integer mappings.

image_feature_size: tuple (K, R, C), optional (default = (1024, 14, 14))

Shape of input image features, in the form (channel, height, width).

module_channels: int, optional (default = 128)

Number of channels for each neural module’s convolutional blocks.

class_projection_channels: int, optional (default = 1024)

Number of channels in projected final feature map (input to classifier).

classifier_linear_size: int, optional (default = 1024)

Size of input to the linear classifier.
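
A minimal construction sketch (not taken from the repository's documentation): the vocabulary directory path below is hypothetical and must contain the “questions”, “programs” and “answers” namespaces described above; all other arguments simply repeat the defaults from the signature.

import torch
from allennlp.data import Vocabulary
from probnmn.models.nmn import NeuralModuleNetwork

# Hypothetical path to serialized AllenNLP vocabulary files.
vocabulary = Vocabulary.from_files("data/vocabulary")

nmn = NeuralModuleNetwork(
    vocabulary=vocabulary,
    image_feature_size=(1024, 14, 14),
    module_channels=128,
    class_projection_channels=1024,
    classifier_linear_size=1024,
)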

classmethod from_config(config: probnmn.config.Config)[source]

Instantiate this class directly from a Config.
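
A brief sketch of instantiating from a config (an assumption, not a documented recipe): the YAML path is hypothetical, and the Config constructor is assumed here to accept a path to a YAML config file as in the repository's training scripts.

from probnmn.config import Config
from probnmn.models.nmn import NeuralModuleNetwork

# Hypothetical config file; Config constructor signature is assumed.
_C = Config("configs/nmn.yml")
nmn = NeuralModuleNetwork.from_config(_C)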

forward(self, features: torch.Tensor, programs: torch.Tensor, answers: Optional[torch.Tensor] = None)[source]

Given image features and program sequences, lay out a modular network and pass the image features through it, then feed the final feature representation from the modular network to the classifier to get the answer distribution.

Parameters
features: torch.Tensor

Input image features of shape (batch, channels, height, width).

programs: torch.Tensor

Program sequences padded up to maximum length, shape (batch_size, max_program_length).

answers: torch.Tensor, optional (default = None)

Target answers for corresponding images and programs, shape (batch_size, ).

Returns
Dict[str, Any]

Model predictions, answer cross-entropy loss and (if training) batch metrics. When answer targets are not provided, the loss holds the negative log-probabilities of the predicted answers. A dict with structure:

{
    "predictions": torch.Tensor (shape: (batch_size, )),
    "loss": torch.Tensor (shape: (batch_size, )),
    "metrics": {
        "answer_accuracy": float,
        "average_invalid": float,
    }
}

Notes

The structure of the modular network differs for each program sequence, so we loop over the programs in a batch and perform a forward pass for each example individually.
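
A forward-pass sketch with dummy inputs (an illustration, not from the repository): it reuses the nmn instance constructed in the sketch above, and the program token indices are placeholders; real programs come from the “programs” vocabulary namespace, and invalid sequences are counted in the average_invalid metric.

import torch

features = torch.randn(2, 1024, 14, 14)      # (batch, channels, height, width)
programs = torch.randint(1, 10, (2, 12))     # (batch_size, max_program_length), placeholder indices
answers = torch.randint(0, 5, (2,))          # (batch_size, ), placeholder answer indices

output = nmn(features, programs, answers)
print(output["predictions"].shape)           # torch.Size([2])
print(output["loss"].mean())                 # per-example loss; reduce before calling backward()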

get_metrics(self, reset: bool = True) → Dict[str, float][source]

Return recorded answer accuracy and average invalid programs per batch.

Parameters
reset: bool, optional (default = True)

Whether to reset the accumulated metrics after retrieving them.

Returns
Dict[str, float]

A dictionary with metrics {"answer_accuracy", "average_invalid"}.
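
A short usage sketch (assumed, not documented): it reuses the nmn instance from the sketches above and assumes forward has already been called with answer targets so that metrics have accumulated.

# Retrieve accumulated metrics and reset the counters for the next epoch.
metrics = nmn.get_metrics(reset=True)
print(metrics["answer_accuracy"], metrics["average_invalid"])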