torch.nn.functional.cross_entropy

torch.nn.functional.cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', label_smoothing=0.0)

Compute the cross entropy loss between input logits and target.

See CrossEntropyLoss for details.
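As a rough illustration of what the function computes for class-index targets, the sketch below compares it with a manual calculation: take the log-softmax of the logits, pick out the log-probability of the true class, negate, and average. The tensor names (`logits`, `target`, `manual`, `via_nll`) are illustrative only, and the comparison assumes the default arguments (no `weight`, no `label_smoothing`, `reduction='mean'`).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # (N, C) unnormalized scores
target = torch.tensor([0, 2, 1, 2])   # (N,) class indices in [0, C)

loss = F.cross_entropy(logits, target)

# Manual version: negative log-probability of the true class, averaged over the batch.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), target].mean()

# The same quantity expressed as log_softmax followed by nll_loss.
via_nll = F.nll_loss(log_probs, target)

print(loss.item(), manual.item(), via_nll.item())
```

The three printed values should agree up to floating-point error.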

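Parameters

input (Tensor) – Predicted unnormalized logits; see the Shape section below for supported shapes.
target (Tensor) – Ground truth class indices or class probabilities; see the Shape section below for supported shapes.
weight (Tensor, optional) – A manual rescaling weight given to each class. If given, it must be a Tensor of size C.
size_average (bool, optional) – Deprecated (see reduction).
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. Note that ignore_index is only applicable when the target contains class indices. Default: -100
reduce (bool, optional) – Deprecated (see reduction).
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction is applied, 'mean': the sum of the output is divided by the number of elements in the output, 'sum': the output is summed. Note: size_average and reduce are being deprecated, and in the meantime, specifying either of those two arguments overrides reduction. Default: 'mean'
label_smoothing (float, optional) – A float in [0.0, 1.0]. Specifies the amount of smoothing when computing the loss, where 0.0 means no smoothing. The targets become a mixture of the original ground truth and a uniform distribution, as described in Rethinking the Inception Architecture for Computer Vision. Default: 0.0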
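The following sketch shows how `reduction`, `ignore_index`, `weight`, and `label_smoothing` interact on a small batch. All tensor names and values are made up for illustration. It relies on two behaviours that are not spelled out in the parameter list above and are stated here as assumptions about current PyTorch releases: with class-index targets, 'mean' divides by the number of non-ignored targets (or by the sum of the applied class weights when `weight` is given), and label smoothing with factor `eps` behaves like a probability target of `(1 - eps) * one_hot + eps / C`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)                  # (N, C)
target = torch.tensor([0, 2, -100, 1])      # third sample uses the default ignore_index

# reduction='none' keeps one loss per sample; the ignored sample contributes 0.
per_sample = F.cross_entropy(logits, target, reduction='none')
total = F.cross_entropy(logits, target, reduction='sum')
mean = F.cross_entropy(logits, target)      # reduction='mean' is the default

kept = target != -100
mean_by_hand = per_sample.sum() / kept.sum()    # average over non-ignored targets

# Per-class weights: each sample's loss is scaled by the weight of its true class.
class_weight = torch.tensor([1.0, 2.0, 0.5])    # size C, illustrative values
clean_target = torch.tensor([0, 2, 1, 1])
weighted_none = F.cross_entropy(logits, clean_target, weight=class_weight, reduction='none')
weighted_mean = F.cross_entropy(logits, clean_target, weight=class_weight)
weighted_by_hand = weighted_none.sum() / class_weight[clean_target].sum()

# Label smoothing compared with an explicit soft (probability) target.
eps = 0.1
smoothed = F.cross_entropy(logits, clean_target, label_smoothing=eps)
one_hot = F.one_hot(clean_target, num_classes=3).float()
soft_target = (1 - eps) * one_hot + eps / 3
via_probs = F.cross_entropy(logits, soft_target)
```

Under these assumptions, `mean` matches `mean_by_hand`, `weighted_mean` matches `weighted_by_hand`, and `smoothed` matches `via_probs`, each up to floating-point error.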

Return type

Tensor

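Shape

Input: Shape $(C)$, $(N, C)$ or $(N, C, d_1, d_2, ..., d_K)$ with $K \geq 1$ in the case of K-dimensional loss.
Target: If containing class indices, shape $()$, $(N)$ or $(N, d_1, d_2, ..., d_K)$ with $K \geq 1$ in the case of K-dimensional loss, where each value should lie in $[0, C)$. If containing class probabilities, same shape as the input, and each value should lie in $[0, 1]$.

where: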

\begin{aligned} C ={} & \text{number of classes} \\ N ={} & \text{batch size} \\ \end{aligned}
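To make the shape rules concrete, here is a small sketch of the K-dimensional case (K = 2, as in per-pixel classification) and the unbatched case. The dimension names `H` and `W` and the tensor names are illustrative, not part of the API, and the unbatched call assumes a PyTorch version that accepts input of shape $(C)$ as listed above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 4, 3, 3

# K-dimensional case: the class dimension C sits at index 1 of the input.
logits = torch.randn(N, C, H, W)
target = torch.randint(C, (N, H, W))            # class indices, no class dimension

per_pixel = F.cross_entropy(logits, target, reduction='none')   # shape (N, H, W)
mean = F.cross_entropy(logits, target)                           # scalar

# Same mean after moving C last and flattening everything else into one batch axis.
flat_logits = logits.permute(0, 2, 3, 1).reshape(-1, C)          # (N*H*W, C)
flat_mean = F.cross_entropy(flat_logits, target.reshape(-1))

# Unbatched case: input of shape (C), target of shape ().
single = F.cross_entropy(torch.randn(C), torch.tensor(1))

print(per_pixel.shape, mean.shape, single.shape)
```

With reduction='none' the output has the shape of the class-index target; with 'mean' or 'sum' it is a scalar.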

Examples

>>> import torch
>>> import torch.nn.functional as F
>>>
>>> # Example of target with class indices
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randint(5, (3,), dtype=torch.int64)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()
>>>
>>> # Example of target with class probabilities
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randn(3, 5).softmax(dim=1)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()
