Smooth L1 loss
Creates a criterion that uses a squared term if the absolute
element-wise error falls below 1 and an L1 term otherwise.
It is less sensitive to outliers than the MSE loss (nn_mse_loss)
and in some cases prevents exploding gradients (e.g. see the
Fast R-CNN paper by Ross Girshick).
Also known as the Huber loss.
nn_smooth_l1_loss(reduction = "mean")
reduction
(string, optional): Specifies the reduction to apply to the output: 'none', 'mean' or 'sum'. Default: 'mean'.
The loss is computed as:
\mbox{loss}(x, y) = \frac{1}{n} \sum_{i} z_{i}
where z_{i} is given by:
z_{i} = \begin{cases} 0.5 (x_i - y_i)^2, & \mbox{if } |x_i - y_i| < 1 \\ |x_i - y_i| - 0.5, & \mbox{otherwise} \end{cases}
x and y can have arbitrary shapes with a total of n elements each;
the sum operation still operates over all the elements and divides by n.
The division by n can be avoided by setting reduction = 'sum'.
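As a rough illustration of the formula, the piecewise definition can be written out in base R. This is only a sketch: smooth_l1_reference is a hypothetical helper, not part of torch, and it assumes plain numeric vectors or arrays of equal shape rather than tensors.

```r
# Hypothetical reference helper (not part of torch): applies the
# piecewise definition of z_i element-wise, then reduces.
smooth_l1_reference <- function(x, y, reduction = "mean") {
  d <- abs(x - y)
  # Squared term below the threshold of 1, L1 term minus 0.5 otherwise
  z <- ifelse(d < 1, 0.5 * d^2, d - 0.5)
  switch(reduction,
    mean = mean(z),  # sum over all elements, divided by n
    sum  = sum(z),   # no division by n
    none = z,        # element-wise losses, same shape as the input
    stop("unknown reduction: ", reduction)
  )
}

x <- c(0.5, 2.0, -1.0)
y <- c(0.0, 0.0,  1.0)

z     <- smooth_l1_reference(x, y, reduction = "none")  # 0.125, 1.5, 1.5
total <- smooth_l1_reference(x, y, reduction = "sum")   # 3.125
avg   <- smooth_l1_reference(x, y)                      # 3.125 / 3
```

The first element has an absolute error of 0.5, so it takes the squared branch; the other two have an absolute error of 2 and take the L1 branch.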
Input: (N, *) where * means any number of additional dimensions
Target: (N, *), same shape as the input
Output: scalar. If reduction is 'none', then (N, *), same shape as the input
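The shapes above can be checked with a short example. This is a minimal sketch that assumes the torch package is installed; the tensor sizes (3, 5) are chosen arbitrarily for illustration.

```r
library(torch)

# Arbitrary example shapes: N = 3 with one additional dimension of size 5
input  <- torch_randn(3, 5)
target <- torch_randn(3, 5)

loss_mean <- nn_smooth_l1_loss()                    # default reduction = "mean"
loss_none <- nn_smooth_l1_loss(reduction = "none")  # keep element-wise losses

out_mean <- loss_mean(input, target)
out_none <- loss_none(input, target)

out_mean$item()  # a single number, since the output is a scalar tensor
out_none$shape   # 3 5, the same shape as the input
```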