probnmn.evaluators._evaluator¶

class probnmn.evaluators._evaluator._Evaluator(config: probnmn.config.Config, dataloader: torch.utils.data.dataloader.DataLoader, models: Dict[str, Type[torch.nn.modules.module.Module]], gpu_ids: List[int] = [0])[source]¶

    Bases: object
    A base class for generic evaluation of models. This class can have multiple models interacting with each other, rather than a single model, which suits our use-case (for example, the module_training phase has two models: ProgramGenerator and NeuralModuleNetwork). It offers full flexibility, with sensible defaults which may be changed (or disabled) while extending this class.

    Extend this class and override the _do_iteration() method with the core evaluation loop: what happens every iteration, given a batch from the dataloader this class holds.

    Parameters
        config: Config
            A Config object with all the relevant configuration parameters.
        dataloader: torch.utils.data.DataLoader
            A DataLoader which provides batches of evaluation examples. It wraps one of probnmn.data.datasets depending on the evaluation phase.
        models: Dict[str, Type[nn.Module]]
            All the models which interact with each other for evaluation. These are one or more from probnmn.models depending on the evaluation phase.
        gpu_ids: List[int], optional (default=[0])
            List of GPU IDs to use for evaluation; set [-1] to use the CPU.
    Notes

    All models are passed by assignment, so they can be shared with an external trainer. Do not set self._models = ... anywhere while extending this class.

    An instantiation of this class will always be paired with a _Trainer. Pass the models of the trainer class while instantiating this class.
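The sharing-by-assignment note above can be illustrated with a minimal sketch. The Trainer and Evaluator classes below are plain Python stand-ins (hypothetical, not the real probnmn classes): because the evaluator stores a reference to the trainer's models dict rather than a copy, updates made by the trainer are immediately visible to the evaluator, which is exactly what reassigning self._models would break.

```python
# Hypothetical stand-ins illustrating why models are passed by assignment:
# the evaluator holds the *same* dict object as the trainer, so any model
# swap or weight update done by the trainer is visible here with no sync.

class Trainer:
    def __init__(self):
        self.models = {"program_generator": "weights_v1"}

class Evaluator:
    def __init__(self, models):
        # Shared by assignment; never reassign self._models in a subclass.
        self._models = models

trainer = Trainer()
evaluator = Evaluator(trainer.models)

# Trainer updates a model; the evaluator sees it without any copying.
trainer.models["program_generator"] = "weights_v2"
assert evaluator._models is trainer.models
assert evaluator._models["program_generator"] == "weights_v2"
```

Reassigning `self._models = dict(...)` in a subclass would silently detach the evaluator from the trainer's models, which is why the docstring forbids it.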
    evaluate(self, num_batches: Optional[int] = None) → Dict[str, Any][source]¶

        Perform evaluation using the first num_batches of the dataloader and return all evaluation metrics from the models.

        Parameters
            num_batches: int, optional (default=None)
                Number of batches to use from the dataloader. If None, use all batches.

        Returns
            Dict[str, Any]
                Final evaluation metrics for all the models.
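The num_batches contract above can be sketched with plain Python. The evaluate function and list-of-lists "dataloader" below are illustrative stand-ins, not the real probnmn implementation: itertools.islice naturally handles both cases, since a stop value of None consumes the whole iterable.

```python
# Sketch of the evaluate() contract: run _do_iteration on the first
# num_batches batches, or on every batch when num_batches is None.
# Plain lists stand in for a torch DataLoader (hypothetical example).
from itertools import islice

def evaluate(dataloader, do_iteration, num_batches=None):
    # islice(iterable, None) yields the entire iterable, matching the
    # "If None, use all batches" behaviour documented above.
    outputs = [do_iteration(batch) for batch in islice(dataloader, num_batches)]
    return {"num_iterations": len(outputs), "outputs": outputs}

batches = [[1, 2], [3, 4], [5, 6]]

metrics = evaluate(batches, sum, num_batches=2)
assert metrics["num_iterations"] == 2

# num_batches=None consumes all three batches.
assert evaluate(batches, sum)["num_iterations"] == 3
```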
    _do_iteration(self, batch: Dict[str, Any]) → Dict[str, Any][source]¶

        Core evaluation logic for one iteration; operates on a batch. This base class has a dummy implementation: just a forward pass through some "model".

        Parameters
            batch: Dict[str, Any]
                A batch of evaluation examples sampled from the dataloader. See evaluate() for how this batch is sampled.

        Returns
            Dict[str, Any]
                An output dictionary, typically returned by the models. This may contain predictions from the models, validation loss, etc.
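Putting the pieces together, the extension pattern looks roughly like the sketch below. Everything here is a hypothetical stand-in (the simplified base class, the QuestionCodingEvaluator name, and the lambda "model"); the real class wraps torch models and a DataLoader, but the control flow is the same: the base evaluate() drives the loop, and the subclass supplies _do_iteration, mapping one batch dict to one output dict.

```python
# Hypothetical sketch of extending the evaluator: override _do_iteration
# only; the base class owns the evaluation loop. Not the real probnmn API.

class _Evaluator:
    def __init__(self, dataloader, models):
        self._dataloader = dataloader
        self._models = models  # shared by assignment with a trainer

    def evaluate(self, num_batches=None):
        outputs = []
        for i, batch in enumerate(self._dataloader):
            if num_batches is not None and i >= num_batches:
                break
            outputs.append(self._do_iteration(batch))
        return {"outputs": outputs}

class QuestionCodingEvaluator(_Evaluator):
    def _do_iteration(self, batch):
        # Core per-batch logic: forward pass, return whatever the models
        # produce (predictions, validation loss, ...).
        model = self._models["program_generator"]
        return {"loss": model(batch["question"])}

evaluator = QuestionCodingEvaluator(
    dataloader=[{"question": 3}, {"question": 5}],
    models={"program_generator": lambda q: q - 1},  # toy stand-in "model"
)
metrics = evaluator.evaluate(num_batches=1)
assert metrics == {"outputs": [{"loss": 2}]}
```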