GaussianNLLLoss¶
- class torch.nn.GaussianNLLLoss(*, full=False, eps=1e-06, reduction='mean')[source]¶
Gaussian negative log likelihood loss.
The targets are treated as samples from Gaussian distributions with expectations and variances predicted by the neural network. For a
target
tensor modelled as having Gaussian distribution with a tensor of expectationsinput
and a tensor of positive variancesvar
the loss is:where
eps
is used for stability. By default, the constant term of the loss function is omitted unlessfull
isTrue
. Ifvar
is not the same size asinput
(due to a homoscedastic assumption), it must either have a final dimension of 1 or have one fewer dimension (with all other sizes being the same) for correct broadcasting.- Parameters:
full (bool, optional) – include the constant term in the loss calculation. Default:
False
.eps (float, optional) – value used to clamp
var
(see note below), for stability. Default: 1e-6.reduction (str, optional) – specifies the reduction to apply to the output:
'none'
|'mean'
|'sum'
.'none'
: no reduction will be applied,'mean'
: the output is the average of all batch member losses,'sum'
: the output is the sum of all batch member losses. Default:'mean'
.
- Shape:
Input: or where means any number of additional dimensions
Target: or , same shape as the input, or same shape as the input but with one dimension equal to 1 (to allow for broadcasting)
Var: or , same shape as the input, or same shape as the input but with one dimension equal to 1, or same shape as the input but with one fewer dimension (to allow for broadcasting)
Output: scalar if
reduction
is'mean'
(default) or'sum'
. Ifreduction
is'none'
, then , same shape as the input
- Examples::
>>> loss = nn.GaussianNLLLoss() >>> input = torch.randn(5, 2, requires_grad=True) >>> target = torch.randn(5, 2) >>> var = torch.ones(5, 2, requires_grad=True) #heteroscedastic >>> output = loss(input, target, var) >>> output.backward()
>>> loss = nn.GaussianNLLLoss() >>> input = torch.randn(5, 2, requires_grad=True) >>> target = torch.randn(5, 2) >>> var = torch.ones(5, 1, requires_grad=True) #homoscedastic >>> output = loss(input, target, var) >>> output.backward()
Note
The clamping of
var
is ignored with respect to autograd, and so the gradients are unaffected by it.- Reference:
Nix, D. A. and Weigend, A. S., “Estimating the mean and variance of the target probability distribution”, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN’94), Orlando, FL, USA, 1994, pp. 55-60 vol.1, doi: 10.1109/ICNN.1994.374138.