probnmn.modules.elbo
class probnmn.modules.elbo.Reinforce(baseline_decay: float = 0.99)

    Bases: torch.nn.modules.module.Module

    A PyTorch module which applies REINFORCE to its inputs using a specified
    reward, while internally keeping track of a decaying moving-average
    baseline.

    Parameters

    baseline_decay: float, optional (default = 0.99)
        Factor by which the moving-average baseline decays on every call.
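The mechanics of such a module can be sketched as follows. This is a hypothetical illustration (class and attribute names are mine, not the library's actual implementation): the reward is centered by a moving-average baseline to reduce gradient variance, and the baseline is updated toward the batch-mean reward on every call.

```python
import torch
from torch import nn


class MovingAverageBaselineReinforce(nn.Module):
    """Sketch of REINFORCE with a decaying moving-average baseline.

    Hypothetical illustration; the actual ``Reinforce`` module in
    probnmn may differ in signature and details.
    """

    def __init__(self, baseline_decay: float = 0.99):
        super().__init__()
        self._decay = baseline_decay
        # Moving-average baseline, updated on every forward call.
        self.register_buffer("_baseline", torch.tensor(0.0))

    def forward(self, log_probs: torch.Tensor, reward: torch.Tensor) -> torch.Tensor:
        # Center the reward by the baseline; detach so no gradients
        # flow through the reward or the baseline itself.
        centered = (reward - self._baseline).detach()
        # REINFORCE surrogate loss: -(reward - baseline) * log-prob.
        loss = -centered * log_probs
        # Decay the baseline toward the current batch-mean reward.
        self._baseline += (1.0 - self._decay) * (reward.mean().detach() - self._baseline)
        return loss
```

With `baseline_decay = 0.99`, the baseline moves only 1% of the way toward each new batch-mean reward, so it changes slowly and smooths out noisy rewards.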
class probnmn.modules.elbo._ElboWithReinforce(beta: float = 0.1, baseline_decay: float = 0.99)

    Bases: torch.nn.modules.module.Module

    A PyTorch module to compute the fully Monte Carlo form of the Evidence
    Lower Bound, given the inference likelihood, the reconstruction
    likelihood, and a REINFORCE reward. Accepting any scalar as the
    REINFORCE reward keeps the ELBO objective flexible; for example, an
    extra answer log-likelihood term is added during Joint Training.

    This class is not used directly; instead, its subclasses
    QuestionCodingElbo and JointTrainingElbo are used in the corresponding
    training phases.
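The fully Monte Carlo form can be sketched as a free function over precomputed per-sample log-likelihoods. This is a hypothetical helper (the actual module operates on model objects, not raw tensors, and its details may differ): the ELBO itself serves as the detached REINFORCE reward for the discrete inference model.

```python
import torch


def fully_monte_carlo_elbo(
    inference_logprobs: torch.Tensor,       # log q_phi(z|q) for sampled programs z
    reconstruction_logprobs: torch.Tensor,  # log p_theta(q|z)
    prior_logprobs: torch.Tensor,           # log p(z)
    beta: float = 0.1,
    baseline: float = 0.0,
) -> torch.Tensor:
    """Sketch of a fully Monte Carlo ELBO surrogate loss (hypothetical helper)."""
    # Single-sample Monte Carlo ELBO estimate per sampled program.
    elbo = reconstruction_logprobs - beta * (inference_logprobs - prior_logprobs)
    # The ELBO acts as the REINFORCE reward; detach it and center by the
    # baseline so it is treated as a constant w.r.t. the inference model.
    reward = elbo.detach() - baseline
    reinforce_term = reward * inference_logprobs
    # Maximize ELBO plus the surrogate term, so return the negative as a loss.
    return -(elbo + reinforce_term).mean()
```

Because the programs are discrete, the pathwise (reparameterization) gradient is unavailable for the inference model; the `reinforce_term` supplies its gradient instead.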
class probnmn.modules.elbo.QuestionCodingElbo(program_generator: probnmn.models.program_generator.ProgramGenerator, question_reconstructor: probnmn.models.question_reconstructor.QuestionReconstructor, program_prior: probnmn.models.program_prior.ProgramPrior, beta: float = 0.1, baseline_decay: float = 0.99)

    Bases: probnmn.modules.elbo._ElboWithReinforce

    A PyTorch module to compute the Evidence Lower Bound for observed
    questions without ground-truth program supervision. This implementation
    takes the fully Monte Carlo form and uses the Reinforce estimator for
    gradient estimation of the parameters of the inference model
    (ProgramGenerator).

    Parameters

    program_generator: ProgramGenerator
        A ProgramGenerator; serves as the inference model of the posterior
        (programs).
    question_reconstructor: QuestionReconstructor
        A QuestionReconstructor; serves as the reconstruction model of
        observed data (questions).
    program_prior: ProgramPrior
        A ProgramPrior; serves as the prior of the latent distribution
        (programs).
    beta: float, optional (default = 0.1)
        KL coefficient. Refer to BETA in Config.
    baseline_decay: float, optional (default = 0.99)
        Decay coefficient for the moving-average REINFORCE baseline. Refer
        to DELTA in Config.
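In generic latent-variable notation (symbols are mine, not necessarily the paper's), the beta-weighted bound that this module estimates can be written as:

```latex
% Question-coding ELBO (sketch; notation mine):
%   z : latent program,   q : observed question
%   q_\phi : inference model (ProgramGenerator)
%   p_\theta: reconstruction model (QuestionReconstructor)
%   p(z)    : ProgramPrior
\log p(q) \;\ge\;
\underbrace{\mathbb{E}_{z \sim q_\phi(z \mid q)}\!\big[\log p_\theta(q \mid z)\big]}_{\text{reconstruction}}
\;-\; \beta \,
\underbrace{\mathrm{KL}\!\big(q_\phi(z \mid q) \,\|\, p(z)\big)}_{\text{KL to ProgramPrior}}
```

With beta = 1 this is the standard ELBO; beta < 1 (the default 0.1) down-weights the KL term, as in beta-VAE-style objectives.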
class probnmn.modules.elbo.JointTrainingElbo(program_generator: probnmn.models.program_generator.ProgramGenerator, question_reconstructor: probnmn.models.question_reconstructor.QuestionReconstructor, program_prior: probnmn.models.program_prior.ProgramPrior, nmn: probnmn.models.nmn.NeuralModuleNetwork, beta: float = 0.1, gamma: float = 10, baseline_decay: float = 0.99, objective: str = 'ours')

    Bases: probnmn.modules.elbo._ElboWithReinforce

    A PyTorch module to compute the Evidence Lower Bound for observed
    questions without ground-truth program supervision, with the answer
    log-likelihood term from the Joint Training objective added to the
    bound. This implementation takes the fully Monte Carlo form and uses the
    Reinforce estimator for gradient estimation of the parameters of the
    inference model (ProgramGenerator).

    Parameters

    program_generator: ProgramGenerator
        A ProgramGenerator; serves as the inference model of the posterior
        (programs).
    question_reconstructor: QuestionReconstructor
        A QuestionReconstructor; serves as the reconstruction model of
        observed data (questions).
    program_prior: ProgramPrior
        A ProgramPrior; serves as the prior of the latent distribution
        (programs).
    nmn: NeuralModuleNetwork
        A NeuralModuleNetwork, for the answer log-likelihood term in the
        objective.
    beta: float, optional (default = 0.1)
        KL coefficient. Refer to BETA in Config.
    gamma: float, optional (default = 10)
        Answer log-likelihood scaling coefficient. Refer to GAMMA in Config.
    baseline_decay: float, optional (default = 0.99)
        Decay coefficient for the moving-average REINFORCE baseline. Refer
        to DELTA in Config.
    objective: str, optional (default = "ours")
        Training objective. "baseline": the REINFORCE reward contains only
        the answer log-likelihood. "ours": the full Evidence Lower Bound is
        added to the reward.
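The difference between the two objectives can be sketched as follows (a hypothetical helper; the function and parameter names are mine, and the actual module computes these quantities internally from the models):

```python
import torch


def joint_training_reward(
    answer_logprobs: torch.Tensor,  # log-likelihood of answers from the NMN
    elbo: torch.Tensor,             # question-coding ELBO per sampled program
    gamma: float = 10.0,
    objective: str = "ours",
) -> torch.Tensor:
    """Sketch of how the two Joint Training objectives differ in their reward.

    "baseline": reward is only the (gamma-scaled) answer log-likelihood.
    "ours":     the full Evidence Lower Bound is added on top.
    """
    reward = gamma * answer_logprobs
    if objective == "ours":
        reward = reward + elbo
    # The reward is a constant w.r.t. the inference model's parameters.
    return reward.detach()
```

Under "ours", a sampled program is rewarded both for producing the right answer and for explaining the question well; under "baseline", only answer accuracy drives the REINFORCE gradient.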