probnmn.models.nmn

class probnmn.models.nmn.NeuralModuleNetwork(vocabulary: allennlp.data.vocabulary.Vocabulary, image_feature_size: Tuple[int, int, int] = (1024, 14, 14), module_channels: int = 128, class_projection_channels: int = 1024, classifier_linear_size: int = 1024)[source]

Bases: torch.nn.modules.module.Module
A NeuralModuleNetwork holds neural modules, a stem network, and a classifier network. It hooks these together to answer a question, given a scene and a program describing how to arrange the neural modules.

Parameters
- vocabulary: allennlp.data.vocabulary.Vocabulary
AllenNLP's vocabulary. This vocabulary has three namespaces: "questions", "programs", and "answers", each containing token-to-integer mappings.
- image_feature_size: tuple (K, R, C), optional (default = (1024, 14, 14))
Shape of input image features, in the form (channel, height, width).
- module_channels: int, optional (default = 128)
Number of channels for each neural module’s convolutional blocks.
- class_projection_channels: int, optional (default = 1024)
Number of channels in projected final feature map (input to classifier).
- classifier_linear_size: int, optional (default = 1024)
Size of input to the linear classifier.
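As a rough sketch of how these constructor arguments relate (the layer layout here is illustrative, not read from probnmn's source): the stem projects the input image features down to module_channels, the neural modules operate at that channel width, and the class projection expands the final feature map to class_projection_channels before the linear classifier.

```python
def nmn_shapes(image_feature_size=(1024, 14, 14),
               module_channels=128,
               class_projection_channels=1024,
               batch_size=2):
    """Trace tensor shapes through a hypothetical NMN pipeline.

    Illustrative only: the real layer layout lives in probnmn's source.
    """
    c, h, w = image_feature_size
    shapes = {}
    # Stem: project input image features down to module_channels.
    shapes["stem_output"] = (batch_size, module_channels, h, w)
    # Neural modules preserve the (module_channels, h, w) feature map.
    shapes["module_output"] = (batch_size, module_channels, h, w)
    # Class projection: expand channels before the classifier.
    shapes["projected"] = (batch_size, class_projection_channels, h, w)
    return shapes

print(nmn_shapes()["projected"])  # (2, 1024, 14, 14)
```

With the defaults above, the projected feature map feeding the classifier has shape (batch_size, 1024, 14, 14).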
classmethod from_config(config: probnmn.config.Config)[source]

Instantiate this class directly from a Config.
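The classmethod pattern generally reads constructor arguments off a config object. A minimal stand-alone sketch (the Config fields and classes here are assumptions for illustration, not the actual probnmn.config.Config schema):

```python
class Config:
    """Stand-in config with attribute access (the real Config differs)."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


class Model:
    def __init__(self, module_channels=128, classifier_linear_size=1024):
        self.module_channels = module_channels
        self.classifier_linear_size = classifier_linear_size

    @classmethod
    def from_config(cls, config):
        # Pull only the keys the constructor understands off the config.
        return cls(module_channels=config.module_channels,
                   classifier_linear_size=config.classifier_linear_size)


model = Model.from_config(Config(module_channels=64, classifier_linear_size=512))
print(model.module_channels)  # 64
```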
forward(self, features: torch.Tensor, programs: torch.Tensor, answers: Union[torch.Tensor, NoneType] = None)[source]

Given image features and program sequences, lay out a modular network and pass the image features through it; then pass the final feature representation output by the modular network through the classifier to get the answer distribution.
- Parameters
- features: torch.Tensor
Input image features of shape (batch, channels, height, width).
- programs: torch.Tensor
Program sequences padded up to maximum length, shape (batch_size, max_program_length).
- answers: torch.Tensor, optional (default = None)
Target answers for corresponding images and programs, shape (batch_size, ).
- Returns
- Dict[str, Any]
Model predictions, answer cross-entropy loss, and (if training) batch metrics. When answer targets are not provided, negative log-probabilities of predicted answers are returned instead. A dict with structure:

{
    "predictions": torch.Tensor (shape: (batch_size, )),
    "loss": torch.Tensor (shape: (batch_size, )),
    "metrics": {
        "answer_accuracy": float,
        "average_invalid": float,
    }
}
Notes
The structure of the modular network differs for each program sequence, so we loop over the programs in a batch and run a separate forward pass for each example.
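This per-example layout can be sketched with plain Python callables standing in for neural modules (the module names, arities, and stack discipline below are illustrative, not probnmn's actual module set):

```python
# Toy "modules": leaf modules read the scene, others consume earlier outputs.
MODULES = {
    "find_red":  lambda scene: {x for x in scene if x.startswith("red")},
    "find_cube": lambda scene: {x for x in scene if x.endswith("cube")},
    "intersect": lambda a, b: a & b,
}
ARITY = {"find_red": 0, "find_cube": 0, "intersect": 2}


def run_program(program, scene):
    """Execute one program sequence by assembling modules on the fly."""
    stack = []
    for token in program:
        if ARITY[token] == 0:
            stack.append(MODULES[token](scene))
        else:
            args = [stack.pop() for _ in range(ARITY[token])]
            stack.append(MODULES[token](*args))
    return stack.pop()


def forward_batch(programs, scene):
    # Each program defines a different network, so loop per example.
    return [run_program(p, scene) for p in programs]


scene = {"red_cube", "blue_cube", "red_ball"}
out = forward_batch([["find_red", "find_cube", "intersect"]], scene)
print(out)  # [{'red_cube'}]
```

In the real model each "module" is a small convolutional block operating on feature maps rather than sets, but the batching constraint is the same: the network graph is rebuilt per example.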
get_metrics(self, reset: bool = True) → Dict[str, float][source]

Return recorded answer accuracy and average invalid programs per batch.
- Parameters
- reset: bool, optional (default = True)
Whether to reset the accumulated metrics after retrieving them.
- Returns
- Dict[str, float]
A dictionary with metrics {"answer_accuracy", "average_invalid"}.
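The reset semantics can be sketched with a simple running-average accumulator (this illustrates the pattern, not probnmn's implementation):

```python
class AverageMetric:
    """Running average that optionally resets its accumulators on read."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def get(self, reset=True):
        value = self.total / self.count if self.count else 0.0
        if reset:
            self.total, self.count = 0.0, 0
        return value


acc = AverageMetric()
acc.update(0.5)
acc.update(1.0)
print(acc.get(reset=True))   # 0.75
print(acc.get(reset=False))  # 0.0 (accumulators were cleared by the first read)
```

Passing reset=True (the default) is the usual choice at the end of a validation epoch, so the next epoch's metrics start from a clean slate.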