probnmn.evaluators._evaluator

class probnmn.evaluators._evaluator._Evaluator(config: probnmn.config.Config, dataloader: torch.utils.data.dataloader.DataLoader, models: Dict[str, Type[torch.nn.modules.module.Module]], gpu_ids: List[int] = [0])

Bases: object

A base class for generic evaluation of models. This class can hold multiple models interacting with each other, rather than a single model, which suits our use case (for example, the module_training phase has two models: ProgramGenerator and NeuralModuleNetwork). It offers full flexibility, with sensible defaults which may be changed or disabled while extending this class.

Extend this class and override the _do_iteration() method with the core evaluation loop: what happens every iteration, given a batch from the dataloader this class holds. See the Examples section below for a sketch.

Parameters
config: Config

A Config object with all the relevant configuration parameters.

dataloader: torch.utils.data.DataLoader

A DataLoader which provides batches of evaluation examples. It wraps one of probnmn.data.datasets, depending on the evaluation phase.

models: Dict[str, Type[nn.Module]]

All the models which interact with each other during evaluation. These are one or more models from probnmn.models, depending on the evaluation phase.

gpu_ids: List[int], optional (default=[0])

List of GPU IDs to use for evaluation. Set this to [-1] to use the CPU.

Notes

  1. All models are passed by assignment, so they can be shared with an external trainer. Do not set self._models = ... anywhere while extending this class.

  2. An instantiation of this class will always be paired with a _Trainer. Pass the trainer's models while instantiating this class.
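
Examples

A minimal sketch of extending and pairing this class. The subclass name, the "program_generator" key, the batch field, and the trainer's models attribute are illustrative assumptions, not documented API:

    from typing import Any, Dict

    from probnmn.evaluators._evaluator import _Evaluator


    class MyEvaluator(_Evaluator):
        # Only the core per-batch logic is overridden; evaluate() is inherited.
        def _do_iteration(self, batch: Dict[str, Any]) -> Dict[str, Any]:
            # self._models is populated by the base class constructor
            # (Note 1: never reassign it). The key name is illustrative.
            program_generator = self._models["program_generator"]
            return program_generator(batch["question"])


    # Share the trainer's model objects so evaluation sees the current
    # weights (Note 2). `config`, `val_dataloader` and `trainer` are
    # assumed to be constructed elsewhere; the `models` attribute name
    # on the trainer is an assumption.
    evaluator = MyEvaluator(
        config=config,
        dataloader=val_dataloader,
        models=trainer.models,
        gpu_ids=[0],
    )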

evaluate(self, num_batches: Optional[int] = None) → Dict[str, Any]

Perform evaluation using the first num_batches of the dataloader and return all evaluation metrics from the models.

Parameters
num_batches: int, optional (default=None)

Number of batches to use from the dataloader. If None, use all batches.

Returns
Dict[str, Any]

Final evaluation metrics for all the models.
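
For instance (a sketch; evaluator is an instance of a subclass such as the one in the Examples above, and the metric keys shown are assumptions):

    # Evaluate on only the first 50 batches of the dataloader;
    # num_batches=None (the default) would consume all batches.
    eval_metrics = evaluator.evaluate(num_batches=50)

    # A Dict[str, Any] of final metrics from all held models, for
    # example {"program_generator": {...}, "nmn": {...}} (illustrative).
    for model_name, metrics in eval_metrics.items():
        print(model_name, metrics)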

_do_iteration(self, batch: Dict[str, Any]) → Dict[str, Any]

Core evaluation logic for one iteration; operates on a batch. This base class provides a dummy implementation: just a forward pass through some "model".

Parameters
batch: Dict[str, Any]

A batch of evaluation examples sampled from the dataloader. See evaluate() for how this batch is sampled.

Returns
Dict[str, Any]

An output dictionary, typically returned by the models. This may contain predictions from the models, validation loss, etc.
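
An override typically runs each held model on the batch and merges their outputs into one dictionary. A sketch, assuming the module_training phase with models keyed "program_generator" and "nmn" (the key names, batch fields, and call signatures are assumptions):

    def _do_iteration(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        # Predict a program from the question, then execute it on the
        # image features to produce an answer distribution.
        pg_output = self._models["program_generator"](batch["question"])
        nmn_output = self._models["nmn"](batch["image"], pg_output["predictions"])
        return {"program_generator": pg_output, "nmn": nmn_output}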