Quantization API Reference - PyTorch 2.8 documentation

torch.ao.quantization#

This module contains Eager mode quantization APIs.

Top level APIs#

`quantize`	Quantize the input float model with post training static quantization.
`quantize_dynamic`	Converts a float model to a dynamically quantized model.
`quantize_qat`	Do quantization aware training and output a quantized model.
`prepare`	Prepares a copy of the model for quantization calibration or quantization-aware training.
`prepare_qat`	Prepares a copy of the model for quantization calibration or quantization-aware training and converts it to quantized version.
`convert`	Converts submodules in input module to a different module according to mapping by calling from_float method on the target module class.
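As a minimal sketch of the one-call entry points in this table, the following applies `quantize_dynamic` to a toy model. The model, its layer sizes, and the variable names are illustrative assumptions, and the example assumes a CPU build of PyTorch with a quantized engine available.

```python
import torch
from torch import nn
from torch.ao.quantization import quantize_dynamic

# Hypothetical float model, used only for illustration.
float_model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4)).eval()

# Swap the nn.Linear submodules for dynamically quantized versions.
# By default a copy is returned and the float model is left untouched.
quantized_model = quantize_dynamic(float_model, {nn.Linear}, dtype=torch.qint8)

# Inputs and outputs of the converted model remain floating point tensors.
out = quantized_model(torch.randn(2, 16))
print(type(quantized_model[0]).__module__, tuple(out.shape))
```

Only the module types listed in the second argument are swapped, so the `nn.ReLU` in this sketch is left as is.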

Preparing model for quantization#

`fuse_modules.fuse_modules`	Fuse a list of modules into a single module.
`QuantStub`	Quantize stub module. Before calibration it behaves the same as an observer; it is swapped for nnq.Quantize in convert.
`DeQuantStub`	Dequantize stub module. Before calibration it behaves the same as identity; it is swapped for nnq.DeQuantize in convert.
`QuantWrapper`	A wrapper class that wraps the input module, adds QuantStub and DeQuantStub, and surrounds the call to the module with calls to the quant and dequant modules.
`add_quant_dequant`	Wrap the leaf child module in QuantWrapper if it has a valid qconfig. Note that this function modifies the children of the module in place, and it can also return a new module that wraps the input module.
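The pieces in this table are typically combined into an eager mode post training static quantization flow: mark the quantized region with stubs, fuse, prepare, calibrate, convert. The sketch below is one possible arrangement under stated assumptions: `SmallNet`, its layer sizes, and the random calibration data are hypothetical, and the qconfig is chosen from whatever quantized engine the local build reports.

```python
import torch
from torch import nn
from torch.ao.quantization import (
    DeQuantStub,
    QuantStub,
    convert,
    get_default_qconfig,
    prepare,
)
from torch.ao.quantization.fuse_modules import fuse_modules


class SmallNet(nn.Module):
    """Hypothetical model; the stubs mark where tensors enter and leave the quantized region."""

    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)


model = SmallNet().eval()
# Assumption: the active engine is one of the backends get_default_qconfig accepts.
model.qconfig = get_default_qconfig(torch.backends.quantized.engine)

fused = fuse_modules(model, [["conv", "relu"]])  # conv + relu -> one fused module
prepared = prepare(fused)                        # attach observers
prepared(torch.randn(4, 3, 16, 16))              # calibration pass with stand-in data
quantized = convert(prepared)                    # swap in quantized modules

out = quantized(torch.randn(1, 3, 16, 16))
print(type(quantized.quant).__name__, type(quantized.conv).__name__, tuple(out.shape))
```

In practice the calibration pass would use representative inputs rather than random tensors.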

Utility functions#

swap_module

Swaps the module if it has a quantized counterpart and it has an observer attached.

propagate_qconfig_

Propagate qconfig through the module hierarchy and assign the qconfig attribute on each leaf module.

default_eval_fn

Define the default evaluation function.

torch.ao.quantization.quantize_fx#

This module contains FX graph mode quantization APIs (prototype).

`prepare_fx`	Prepare a model for post training quantization
`prepare_qat_fx`	Prepare a model for quantization aware training
`convert_fx`	Convert a calibrated or trained model to a quantized model
`fuse_fx`	Fuse modules such as conv+bn and conv+bn+relu; the model must be in eval mode.
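A minimal sketch of the FX graph mode post training flow built from these functions follows. The toy model, tensor shapes, and random calibration data are assumptions made for illustration, and the backend string is taken from the locally active quantized engine.

```python
import torch
from torch import nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

# Hypothetical float model; it must be symbolically traceable.
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU()).eval()
example_inputs = (torch.randn(2, 8),)

# Assumption: the active engine is a backend name the default mapping supports.
qconfig_mapping = get_default_qconfig_mapping(torch.backends.quantized.engine)

prepared = prepare_fx(model, qconfig_mapping, example_inputs)  # trace and insert observers
prepared(*example_inputs)                                      # calibration pass
quantized = convert_fx(prepared)                               # produce the quantized GraphModule

out = quantized(torch.randn(2, 8))
print(type(quantized).__name__, tuple(out.shape))
```

Unlike the eager mode flow, no `QuantStub` or `DeQuantStub` is placed by hand here; the quantize and dequantize steps are inserted in the traced graph.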

torch.ao.quantization.qconfig_mapping#

This module contains QConfigMapping for configuring FX graph mode quantization.

QConfigMapping

Mapping from model ops to torch.ao.quantization.QConfig objects.

get_default_qconfig_mapping

Return the default QConfigMapping for post training quantization.

get_default_qat_qconfig_mapping

Return the default QConfigMapping for quantization aware training.
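Besides the default mappings, a `QConfigMapping` can be assembled by hand. The sketch below assumes a global default, an override for one module type, and a module named `"head"` (a hypothetical name) that is excluded by mapping it to `None`.

```python
from torch import nn
from torch.ao.quantization.qconfig import default_per_channel_qconfig, default_qconfig
from torch.ao.quantization.qconfig_mapping import QConfigMapping

qconfig_mapping = (
    QConfigMapping()
    .set_global(default_qconfig)                              # fallback for every op
    .set_object_type(nn.Linear, default_per_channel_qconfig)  # override by module type
    .set_module_name("head", None)                            # "head" is hypothetical; None skips it
)

print(qconfig_mapping.to_dict().keys())
```

The setters return the mapping itself, which is what makes the chained style possible.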

torch.ao.quantization.backend_config#

This module contains BackendConfig, a config object that defines how quantization is supported in a backend. Currently only used by FX Graph Mode Quantization, but we may extend Eager Mode Quantization to work with this as well.

`BackendConfig`	Config that defines the set of patterns that can be quantized on a given backend, and how reference quantized models can be produced from these patterns.
`BackendPatternConfig`	Config object that specifies quantization behavior for a given operator pattern.
`DTypeConfig`	Config object that specifies the supported data types passed as arguments to quantize ops in the reference model spec, for input and output activations, weights, and biases.
`DTypeWithConstraints`	Config for specifying additional constraints for a given dtype, such as quantization value ranges, scale value ranges, and fixed quantization params, to be used in `DTypeConfig`.
`ObservationType`	An enum that represents different ways of how an operator/operator pattern should be observed

torch.ao.quantization.fx.custom_config#

This module contains a few CustomConfig classes that are used in both eager mode and FX graph mode quantization.

`FuseCustomConfig`	Custom configuration for `fuse_fx()`.
`PrepareCustomConfig`	Custom configuration for `prepare_fx()` and `prepare_qat_fx()`.
`ConvertCustomConfig`	Custom configuration for `convert_fx()`.
`StandaloneModuleConfigEntry`

torch.ao.quantization.quantizer#

torch.ao.quantization.pt2e (quantization in pytorch 2.0 export implementation)#

torch.ao.quantization.pt2e.export_utils#

model_is_exported

Return True if the torch.nn.Module was exported, False otherwise.

torch.ao.quantization.pt2e.lowering#

lower_pt2e_quantized_to_x86

Lower a PT2E-quantized model to x86 backend.

PT2 Export (pt2e) Numeric Debugger#

`generate_numeric_debug_handle`	Attach numeric_debug_handle_id for all nodes in the graph module of the given ExportedProgram, like conv2d, squeeze, conv1d, etc, except for placeholder.
`CUSTOM_KEY`	String constant.
`NUMERIC_DEBUG_HANDLE_KEY`	String constant.
`prepare_for_propagation_comparison`	Add output loggers to nodes that have a numeric_debug_handle.
`extract_results_from_loggers`	For a given model, extract the tensor stats and related information for each debug handle.
`compare_results`	Given two dicts mapping debug_handle_id (int) to a list of tensors, return a map from debug_handle_id to NodeAccuracySummary, which contains comparison information such as SQNR and MSE.

torch (quantization related functions)#

This describes the quantization related functions of the torch namespace.

quantize_per_tensor

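Converts a float tensor to a quantized tensor with the given scale and zero point.

quantize_per_channel

Converts a float tensor to a per-channel quantized tensor with the given scales and zero points.

dequantize

Returns an fp32 Tensor by dequantizing a quantized Tensor.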
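To illustrate the round trip through these functions, here is a small sketch with hand-picked values. The scale of 0.5 and zero point of 10 are arbitrary choices made so that the arithmetic is exact in floating point.

```python
import torch

x = torch.tensor([-1.0, 0.0, 1.0, 2.0])

# Each value is stored as round(x / scale) + zero_point in an unsigned 8-bit integer.
q = torch.quantize_per_tensor(x, scale=0.5, zero_point=10, dtype=torch.quint8)

# Dequantizing computes (stored - zero_point) * scale and returns an fp32 tensor.
x_restored = torch.dequantize(q)

# A value that is not a multiple of the scale snaps to the nearest grid point.
off_grid = torch.dequantize(
    torch.quantize_per_tensor(torch.tensor([0.3]), 0.5, 10, torch.quint8)
)

print(q.int_repr(), x_restored, off_grid)
```

Values that already sit on the quantization grid survive the round trip unchanged; others pick up an error of at most half the scale, provided they are not clamped.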

torch.Tensor (quantization related methods)#

Quantized Tensors support a limited subset of data manipulation methods of the regular full-precision tensor.

`view`	Returns a new tensor with the same data as the `self` tensor but of a different `shape`.
`as_strided`	See `torch.as_strided()`
`expand`	Returns a new view of the `self` tensor with singleton dimensions expanded to a larger size.
`flatten`	See `torch.flatten()`
`select`	See `torch.select()`
`ne`	See `torch.ne()`.
`eq`	See `torch.eq()`
`ge`	See `torch.ge()`.
`le`	See `torch.le()`.
`gt`	See `torch.gt()`.
`lt`	See `torch.lt()`.
`copy_`	Copies the elements from `src` into `self` tensor and returns `self`.
`clone`	See `torch.clone()`
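`dequantize`	Given a quantized Tensor, dequantize it and return the dequantized float Tensor.
`equal`	See `torch.equal()`
`int_repr`	Given a quantized Tensor, `self.int_repr()` returns a CPU Tensor with uint8_t as data type that stores the underlying uint8_t values of the given Tensor.
`max`	See `torch.max()`
`mean`	See `torch.mean()`
`min`	See `torch.min()`
`q_scale`	Given a Tensor quantized by linear (affine) quantization, returns the scale of the underlying quantizer.
`q_zero_point`	Given a Tensor quantized by linear (affine) quantization, returns the zero point of the underlying quantizer.
`q_per_channel_scales`	Given a Tensor quantized by linear (affine) per-channel quantization, returns a Tensor of scales of the underlying quantizer.
`q_per_channel_zero_points`	Given a Tensor quantized by linear (affine) per-channel quantization, returns a Tensor of zero points of the underlying quantizer.
`q_per_channel_axis`	Given a Tensor quantized by linear (affine) per-channel quantization, returns the index of the dimension on which per-channel quantization is applied.
`resize_`	Resizes the `self` tensor to the specified size.
`sort`	See `torch.sort()`
`topk`	See `torch.topk()`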
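The inspection methods in this table can be exercised on small hand-built tensors. In the sketch below, the weights, scales, and zero points are arbitrary illustrative values.

```python
import torch

w = torch.tensor([[-1.0, 0.0], [1.0, 2.0]])

# Per-tensor quantization: one scale and one zero point for the whole tensor.
qt = torch.quantize_per_tensor(w, 0.5, 10, torch.quint8)
print(qt.q_scale(), qt.q_zero_point())

# Per-channel quantization along axis 0: one (scale, zero_point) pair per row.
scales = torch.tensor([0.5, 0.25])
zero_points = torch.tensor([10, 0])
qc = torch.quantize_per_channel(w, scales, zero_points, 0, torch.quint8)

print(qc.q_per_channel_scales())
print(qc.q_per_channel_zero_points())
print(qc.q_per_channel_axis())
print(qc.int_repr())  # underlying uint8 storage
```

Calling `q_scale()` on the per-channel tensor would not be meaningful, since it carries one scale per channel rather than a single value.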

torch.ao.quantization.observer#

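This module contains observers, which are used to collect statistics about the values observed during calibration (PTQ) or training (QAT).

`ObserverBase`	Base observer module.
`MinMaxObserver`	Observer module for computing the quantization parameters based on the running min and max values.
`MovingAverageMinMaxObserver`	Observer module for computing the quantization parameters based on the moving average of the min and max values.
`PerChannelMinMaxObserver`	Observer module for computing the quantization parameters based on the running per-channel min and max values.
`MovingAveragePerChannelMinMaxObserver`	Observer module for computing the quantization parameters based on the running per-channel min and max values.
`HistogramObserver`	The module records the running histogram of tensor values along with the min/max values.
`PlaceholderObserver`	Observer that does nothing and only passes its configuration to the quantized module's `.from_float()`.
`RecordingObserver`	The module is mainly for debugging and records the tensor values during runtime.
`NoopObserver`	Observer that does nothing and only passes its configuration to the quantized module's `.from_float()`.
`get_observer_state_dict`	Returns the state dict corresponding to the observer stats.
`load_observer_state_dict`	Given an input model and a state_dict containing model observer stats, load the stats back into the model.
`default_observer`	Default observer for static quantization, usually used for debugging.
`default_placeholder_observer`	Default placeholder observer, usually used for quantization to torch.float16.
`default_debug_observer`	Default debug-only observer.
`default_weight_observer`	Default weight observer.
`default_histogram_observer`	Default histogram observer, usually used for PTQ.
`default_per_channel_weight_observer`	Default per-channel weight observer, usually used on backends that support per-channel weight quantization, such as fbgemm.
`default_dynamic_quant_observer`	Default observer for dynamic quantization.
`default_float_qparams_observer`	Default observer for a floating point zero point.
`AffineQuantizedObserverBase`	Observer module for affine quantization (pytorch/ao).
`Granularity`	Base class for representing the granularity of quantization.
`MappingType`	How floating point numbers are mapped to integers.
`PerAxis`	Represents per-axis granularity in quantization.
`PerBlock`	Represents per-block granularity in quantization.
`PerGroup`	Represents per-channel-group granularity in quantization.
`PerRow`	Represents row-wise granularity.
`PerTensor`	Represents per-tensor granularity.
`PerToken`	Represents per-token granularity.
`TorchAODType`	Placeholder for dtypes that do not exist in PyTorch core yet.
`ZeroPointDomain`	Enum that indicates whether the zero point is in the integer domain or the floating point domain.
`get_block_size`	Get the block size based on the input shape and the granularity type.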
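As a short sketch of how an observer is used on its own, the example below feeds one tensor to a `MinMaxObserver` and reads back the derived quantization parameters. The input values are arbitrary and chosen so that the observed range is [-1, 3].

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

observer = MinMaxObserver(dtype=torch.quint8, qscheme=torch.per_tensor_affine)

# Calling the observer records the running min/max and returns the input unchanged.
observer(torch.tensor([-1.0, 0.0, 3.0]))

# Scale and zero point are derived from the recorded range and the 0..255 integer range.
scale, zero_point = observer.calculate_qparams()
print(scale, zero_point)
```

With this range the scale comes out as 4/255 and the zero point as 64, which agrees with the asymmetric formulas given later in this page.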

torch.ao.quantization.fake_quantize#

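This module implements modules that are used to perform fake quantization during QAT.

`FakeQuantizeBase`	Base fake quantize module.
`FakeQuantize`	Simulate the quantize and dequantize operations at training time.
`FixedQParamsFakeQuantize`	Simulate quantize and dequantize at training time.
`FusedMovingAvgObsFakeQuantize`	Define a fused module to observe the tensor.
`default_fake_quant`	Default fake_quant for activations.
`default_weight_fake_quant`	Default fake_quant for weights.
`default_per_channel_weight_fake_quant`	Default fake_quant for per-channel weights.
`default_histogram_fake_quant`	Fake_quant for activations using a histogram.
`default_fused_act_fake_quant`	Fused version of default_fake_quant, with improved performance.
`default_fused_wt_fake_quant`	Fused version of default_weight_fake_quant, with improved performance.
`default_fused_per_channel_wt_fake_quant`	Fused version of default_per_channel_weight_fake_quant, with improved performance.
`disable_fake_quant`	Disable fake quantization for the module.
`enable_fake_quant`	Enable fake quantization for the module.
`disable_observer`	Disable observation for this module.
`enable_observer`	Enable observation for this module.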
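The following sketch shows what a fake quantize module does to a tensor, and how `disable_fake_quant` turns it into a pass-through. The observer choice, integer range, and input values are illustrative assumptions rather than recommended settings.

```python
import torch
from torch.ao.quantization.fake_quantize import FakeQuantize, disable_fake_quant
from torch.ao.quantization.observer import MovingAverageMinMaxObserver

fq = FakeQuantize(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=255,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
)

x = torch.linspace(-1.0, 1.0, steps=9)

# The output stays fp32, but each value is snapped to the nearest representable
# quantized level (quantize followed immediately by dequantize).
y = fq(x)

# After disabling fake quantization the module returns its input unchanged;
# the observer inside it keeps collecting statistics.
fq.apply(disable_fake_quant)
z = fq(x)

print(fq.scale, (y - x).abs().max(), torch.equal(z, x))
```

The helpers are written to be passed to `Module.apply`, so the same call on a whole model toggles every fake quantize module in it.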

torch.ao.quantization.qconfig#

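This module defines QConfig objects, which are used to configure quantization settings for individual ops.

`QConfig`	Describes how to quantize a layer or a part of the network by providing settings (observer classes) for activations and weights respectively.
`default_qconfig`	Default qconfig configuration.
`default_debug_qconfig`	Default qconfig configuration for debugging.
`default_per_channel_qconfig`	Default qconfig configuration for per-channel weight quantization.
`default_dynamic_qconfig`	Default dynamic qconfig.
`float16_dynamic_qconfig`	Dynamic qconfig with weights quantized to torch.float16.
`float16_static_qconfig`	Dynamic qconfig with both activations and weights quantized to torch.float16.
`per_channel_dynamic_qconfig`	Dynamic qconfig with weights quantized per channel.
`float_qparams_weight_only_qconfig`	Dynamic qconfig with weights quantized with a floating point zero point.
`default_qat_qconfig`	Default qconfig for QAT.
`default_weight_only_qconfig`	Default qconfig for quantizing weights only.
`default_activation_only_qconfig`	Default qconfig for quantizing activations only.
`default_qat_qconfig_v2`	Fused version of the default qat_config, with performance benefits.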
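A custom `QConfig` pairs an observer factory for activations with one for weights. The sketch below assumes per-tensor affine activations and per-channel symmetric weights; this particular combination is an illustrative choice, not a recommendation from this page.

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver, PerChannelMinMaxObserver
from torch.ao.quantization.qconfig import QConfig

# QConfig expects observer *factories*, not instances; with_args builds
# a callable that constructs the observer with the given arguments.
my_qconfig = QConfig(
    activation=MinMaxObserver.with_args(dtype=torch.quint8),
    weight=PerChannelMinMaxObserver.with_args(
        dtype=torch.qint8, qscheme=torch.per_channel_symmetric
    ),
)

# Each call creates a fresh observer, so every layer gets its own statistics.
act_observer = my_qconfig.activation()
weight_observer = my_qconfig.weight()
print(type(act_observer).__name__, type(weight_observer).__name__)
```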

torch.ao.nn.intrinsic#

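This module implements the combined (fused) modules, such as conv + relu, which can then be quantized.

`ConvReLU1d`	This is a sequential container that calls the Conv1d and ReLU modules.
`ConvReLU2d`	This is a sequential container that calls the Conv2d and ReLU modules.
`ConvReLU3d`	This is a sequential container that calls the Conv3d and ReLU modules.
`LinearReLU`	This is a sequential container that calls the Linear and ReLU modules.
`ConvBn1d`	This is a sequential container that calls the Conv 1d and Batch Norm 1d modules.
`ConvBn2d`	This is a sequential container that calls the Conv 2d and Batch Norm 2d modules.
`ConvBn3d`	This is a sequential container that calls the Conv 3d and Batch Norm 3d modules.
`ConvBnReLU1d`	This is a sequential container that calls the Conv 1d, Batch Norm 1d, and ReLU modules.
`ConvBnReLU2d`	This is a sequential container that calls the Conv 2d, Batch Norm 2d, and ReLU modules.
`ConvBnReLU3d`	This is a sequential container that calls the Conv 3d, Batch Norm 3d, and ReLU modules.
`BNReLU2d`	This is a sequential container that calls the BatchNorm 2d and ReLU modules.
`BNReLU3d`	This is a sequential container that calls the BatchNorm 3d and ReLU modules.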
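These containers are usually produced by fusion rather than constructed by hand. As a sketch, fusing conv + bn + relu on a toy `nn.Sequential` in eval mode yields a `ConvReLU2d` with the batch norm folded into the convolution weights. The model and input shape are illustrative assumptions.

```python
import torch
from torch import nn
from torch.ao.quantization.fuse_modules import fuse_modules

# Hypothetical float model; submodules of nn.Sequential are named "0", "1", "2".
model = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.BatchNorm2d(4), nn.ReLU()).eval()

# The first listed module is replaced by the fused module,
# the remaining ones are replaced by nn.Identity placeholders.
fused = fuse_modules(model, [["0", "1", "2"]])

x = torch.randn(1, 3, 8, 8)
print(type(fused[0]).__name__, type(fused[1]).__name__, type(fused[2]).__name__)
print(torch.allclose(model(x), fused(x), atol=1e-5))
```

Fusion leaves the float output numerically equivalent up to rounding, which is why the comparison uses a small tolerance.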

torch.ao.nn.intrinsic.qat#

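This module implements the versions of those fused operations needed for quantization aware training.

`LinearReLU`	A LinearReLU module fused from Linear and ReLU modules, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvBn1d`	A ConvBn1d module is a module fused from Conv1d and BatchNorm1d, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvBnReLU1d`	A ConvBnReLU1d module is a module fused from Conv1d, BatchNorm1d, and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvBn2d`	A ConvBn2d module is a module fused from Conv2d and BatchNorm2d, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvBnReLU2d`	A ConvBnReLU2d module is a module fused from Conv2d, BatchNorm2d, and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvReLU2d`	A ConvReLU2d module is a fused module of Conv2d and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvBn3d`	A ConvBn3d module is a module fused from Conv3d and BatchNorm3d, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvBnReLU3d`	A ConvBnReLU3d module is a module fused from Conv3d, BatchNorm3d, and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.
`ConvReLU3d`	A ConvReLU3d module is a fused module of Conv3d and ReLU, attached with FakeQuantize modules for weight, used in quantization aware training.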
`update_bn_stats`
`freeze_bn_stats`

torch.ao.nn.intrinsic.quantized#

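This module implements the quantized implementations of fused operations like conv + relu. There are no BatchNorm variants, because BatchNorm is usually folded into the convolution for inference.

`BNReLU2d`	A BNReLU2d module is a fused module of BatchNorm2d and ReLU.
`BNReLU3d`	A BNReLU3d module is a fused module of BatchNorm3d and ReLU.
`ConvReLU1d`	A ConvReLU1d module is a fused module of Conv1d and ReLU.
`ConvReLU2d`	A ConvReLU2d module is a fused module of Conv2d and ReLU.
`ConvReLU3d`	A ConvReLU3d module is a fused module of Conv3d and ReLU.
`LinearReLU`	A LinearReLU module fused from Linear and ReLU modules.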

torch.ao.nn.intrinsic.quantized.dynamic#

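This module implements the quantized dynamic implementations of fused operations like linear + relu.

LinearReLU

A LinearReLU module fused from Linear and ReLU modules that can be used for dynamic quantization.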

torch.ao.nn.qat#

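This module implements versions of the key nn modules **Conv2d()** and **Linear()** that run in FP32 but with rounding applied to simulate the effect of INT8 quantization.

Conv2d

A Conv2d module attached with FakeQuantize modules for weight, used for quantization aware training.

Conv3d

A Conv3d module attached with FakeQuantize modules for weight, used for quantization aware training.

Linear

A Linear module attached with FakeQuantize modules for weight, used for quantization aware training.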

torch.ao.nn.qat.dynamic#

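This module implements versions of key nn modules such as **Linear()** that run in FP32 but with rounding applied to simulate the effect of INT8 quantization, and that are dynamically quantized during inference.

Linear

A Linear module attached with FakeQuantize modules for weight, used for dynamic quantization aware training.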

torch.ao.nn.quantized#

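This module implements the quantized versions of nn layers such as torch.nn.Conv2d and torch.nn.ReLU.

`ReLU6`	Applies the ReLU6 function element-wise.
`Hardswish`	This is the quantized version of `Hardswish`.
`ELU`	This is the quantized equivalent of `ELU`.
`LeakyReLU`	This is the quantized equivalent of `LeakyReLU`.
`Sigmoid`	This is the quantized equivalent of `Sigmoid`.
`BatchNorm2d`	This is the quantized version of `BatchNorm2d`.
`BatchNorm3d`	This is the quantized version of `BatchNorm3d`.
`Conv1d`	Applies a 1D convolution over a quantized input signal composed of several quantized input planes.
`Conv2d`	Applies a 2D convolution over a quantized input signal composed of several quantized input planes.
`Conv3d`	Applies a 3D convolution over a quantized input signal composed of several quantized input planes.
`ConvTranspose1d`	Applies a 1D transposed convolution operator over an input image composed of several input planes.
`ConvTranspose2d`	Applies a 2D transposed convolution operator over an input image composed of several input planes.
`ConvTranspose3d`	Applies a 3D transposed convolution operator over an input image composed of several input planes.
`Embedding`	A quantized Embedding module with quantized packed weights as inputs.
`EmbeddingBag`	A quantized EmbeddingBag module with quantized packed weights as inputs.
`FloatFunctional`	State collector class for float operations.
`FXFloatFunctional`	Module that replaces FloatFunctional before FX graph mode quantization, since activation_post_process is inserted directly into the top-level module.
`QFunctional`	Wrapper class for quantized operations.
`Linear`	A quantized linear module with quantized tensors as inputs and outputs.
`LayerNorm`	This is the quantized version of `LayerNorm`.
`GroupNorm`	This is the quantized version of `GroupNorm`.
`InstanceNorm1d`	This is the quantized version of `InstanceNorm1d`.
`InstanceNorm2d`	This is the quantized version of `InstanceNorm2d`.
`InstanceNorm3d`	This is the quantized version of `InstanceNorm3d`.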

torch.ao.nn.quantized.functional#

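Functional interface (quantized).

This module implements the quantized versions of functional layers such as torch.nn.functional.conv2d and torch.nn.functional.relu. Note: `torch.nn.functional.relu` supports quantized inputs.

`avg_pool2d`	Applies a 2D average-pooling operation in $kH \times kW$ regions with step size $sH \times sW$.
`avg_pool3d`	Applies a 3D average-pooling operation in $kD \times kH \times kW$ regions with step size $sD \times sH \times sW$.
`adaptive_avg_pool2d`	Applies a 2D adaptive average pooling over a quantized input signal composed of several quantized input planes.
`adaptive_avg_pool3d`	Applies a 3D adaptive average pooling over a quantized input signal composed of several quantized input planes.
`conv1d`	Applies a 1D convolution over a quantized 1D input composed of several input planes.
`conv2d`	Applies a 2D convolution over a quantized 2D input composed of several input planes.
`conv3d`	Applies a 3D convolution over a quantized 3D input composed of several input planes.
`interpolate`	Down/up samples the input to either the given `size` or the given `scale_factor`.
`linear`	Applies a linear transformation to the incoming quantized data: $y = xA^T + b$.
`max_pool1d`	Applies a 1D max pooling over a quantized input signal composed of several quantized input planes.
`max_pool2d`	Applies a 2D max pooling over a quantized input signal composed of several quantized input planes.
`celu`	Applies the quantized CELU function element-wise.
`leaky_relu`	Quantized version of `leaky_relu()`.
`hardtanh`	This is the quantized version of `hardtanh()`.
`hardswish`	This is the quantized version of `hardswish()`.
`threshold`	Applies the quantized version of the threshold function element-wise.
`elu`	This is the quantized version of `elu()`.
`hardsigmoid`	This is the quantized version of `hardsigmoid()`.
`clamp`	float(input, min_, max_) -> Tensor
`upsample`	Upsamples the input to either the given `size` or the given `scale_factor`.
`upsample_bilinear`	Upsamples the input using bilinear upsampling.
`upsample_nearest`	Upsamples the input using nearest neighbours' pixel values.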
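As a brief sketch of functional ops applied to quantized tensors, the example below runs `torch.nn.functional.relu` and the quantized `clamp` on a small input. The scale, zero point, and clamp bounds are arbitrary values chosen so that every result lies exactly on the quantization grid.

```python
import torch
import torch.nn.functional as F
import torch.ao.nn.quantized.functional as qF

x = torch.tensor([-1.0, 0.0, 1.0, 2.0])
q = torch.quantize_per_tensor(x, 0.5, 10, torch.quint8)

# The regular functional relu accepts a quantized tensor and returns a quantized tensor.
relu_out = F.relu(q)

# The quantized clamp takes float bounds and applies them to the quantized input.
clamped = qF.clamp(q, -0.5, 1.0)

print(relu_out.is_quantized, relu_out.dequantize())
print(clamped.is_quantized, clamped.dequantize())
```

Both results keep the scale and zero point of the input, so they can be passed on to further quantized ops without requantizing.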

torch.ao.nn.quantizable#

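This module implements the quantizable versions of some of the nn layers. These modules can be used in conjunction with the custom module mechanism, by providing the custom_module_config argument to both prepare and convert.

LSTM

A quantizable long short-term memory (LSTM).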

MultiheadAttention

torch.ao.nn.quantized.dynamic#

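Dynamically quantized Linear, LSTM, LSTMCell, GRUCell, and RNNCell.

`Linear`	A dynamically quantized linear module with floating point tensors as inputs and outputs.
`LSTM`	A dynamically quantized LSTM module with floating point tensors as inputs and outputs.
`GRU`	Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence.
`RNNCell`	An Elman RNN cell with tanh or ReLU non-linearity.
`LSTMCell`	A long short-term memory (LSTM) cell.
`GRUCell`	A gated recurrent unit (GRU) cell.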

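Quantized dtypes and quantization schemes#

Note that operator implementations currently support per-channel quantization only for the weights of the **conv** and **linear** operators. Furthermore, the input data is mapped linearly to the quantized data and vice versa as follows: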

$\begin{aligned} \text{Quantization:}&\\ &Q_\text{out} = \text{clamp}(x_\text{input}/s+z, Q_\text{min}, Q_\text{max})\\ \text{Dequantization:}&\\ &x_\text{out} = (Q_\text{input}-z)*s \end{aligned}$

where $\text{clamp}(.)$ is the same as clamp(), while the scale $s$ and zero point $z$ are computed as described in MinMaxObserver, specifically:

$\begin{aligned} \text{if Symmetric:}&\\ &s = 2 \max(|x_\text{min}|, x_\text{max}) / \left( Q_\text{max} - Q_\text{min} \right) \\ &z = \begin{cases} 0 & \text{if dtype is qint8} \\ 128 & \text{otherwise} \end{cases}\\ \text{Otherwise:}&\\ &s = \left( x_\text{max} - x_\text{min} \right ) / \left( Q_\text{max} - Q_\text{min} \right ) \\ &z = Q_\text{min} - \text{round}(x_\text{min} / s) \end{aligned}$

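where $[x_\text{min}, x_\text{max}]$ denotes the range of the input data, while $Q_\text{min}$ and $Q_\text{max}$ are respectively the minimum and maximum values of the quantized dtype.

Note that the choice of $s$ and $z$ implies that zero is represented with no quantization error whenever zero is within the range of the input data or symmetric quantization is being used.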
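The asymmetric case of these formulas can be worked through in plain Python. This is a sketch under stated assumptions: the helper names are hypothetical, the range [-1, 3] and the 0..255 integer range are arbitrary, and an explicit round to the nearest integer is applied before clamping, which the quantization formula as printed does not spell out.

```python
def affine_qparams(x_min, x_max, q_min=0, q_max=255):
    """Scale and zero point for the asymmetric ("Otherwise") case."""
    scale = (x_max - x_min) / (q_max - q_min)
    zero_point = q_min - round(x_min / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, q_min=0, q_max=255):
    """Q_out = clamp(x / s + z, Q_min, Q_max), with rounding assumed."""
    q = round(x / scale + zero_point)
    return max(q_min, min(q_max, q))


def dequantize(q, scale, zero_point):
    """x_out = (Q_input - z) * s"""
    return (q - zero_point) * scale


scale, zero_point = affine_qparams(-1.0, 3.0)

# Zero lies inside [x_min, x_max], so it maps to the zero point and back exactly.
zero_roundtrip = dequantize(quantize(0.0, scale, zero_point), scale, zero_point)

# Other values come back with an error bounded by the scale.
max_roundtrip = dequantize(quantize(3.0, scale, zero_point), scale, zero_point)

print(scale, zero_point, zero_roundtrip, max_roundtrip)
```

Here the zero point is 64, so a real value of 0.0 is stored as the integer 64 and recovered as exactly 0.0, while 3.0 is clamped to 255 and comes back slightly below 3.0.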

Additional data types and quantization schemes can be implemented through the custom operator mechanism (https://pytorch.ac.cn/tutorials/advanced/torch_script_custom_ops.html).

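torch.qscheme: type that describes the quantization scheme of a tensor. Supported types:
- torch.per_tensor_affine: per tensor, asymmetric
- torch.per_channel_affine: per channel, asymmetric
- torch.per_tensor_symmetric: per tensor, symmetric
- torch.per_channel_symmetric: per channel, symmetric

torch.dtype: type that describes the data. Supported types:
- torch.quint8: 8-bit unsigned integer
- torch.qint8: 8-bit signed integer
- torch.qint32: 32-bit signed integer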
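QAT Modules.

This package is in the process of being deprecated. Please use torch.ao.nn.qat.modules instead.

QAT Dynamic Modules.

This package is in the process of being deprecated. Please use torch.ao.nn.qat.dynamic instead.

This file is in the process of migration to torch/ao/quantization, and is kept here for compatibility while the migration is ongoing. If you are adding a new entry or functionality, please add it to the appropriate file under torch/ao/quantization/fx/ and add an import statement here.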