BackendPatternConfig
- class torch.ao.quantization.backend_config.BackendPatternConfig(pattern=None)
Config object that specifies quantization behavior for a given operator pattern. For a detailed usage example, see BackendConfig.
- add_dtype_config(dtype_config)
Add a set of supported data types passed as arguments to quantize ops in the reference model spec.
- Return type
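A minimal sketch of adding one supported dtype combination to a pattern. The specific dtypes chosen here (quint8 activations, qint8 weights, float bias) are an assumption for illustration, not a requirement of the API.

```python
import torch
from torch.ao.quantization.backend_config import BackendPatternConfig, DTypeConfig

# Assumed dtype combination for a statically quantized linear op.
int8_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

# add_dtype_config appends to the list of supported dtype configs and
# returns the config object itself, so calls can be chained.
linear_config = BackendPatternConfig(torch.nn.Linear).add_dtype_config(int8_dtype_config)
```

The registered configs can then be inspected through `linear_config.dtype_configs`.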
- classmethod from_dict(backend_pattern_config_dict)
Create a BackendPatternConfig from a dictionary with the following items:
"pattern": the pattern being configured
"observation_type": the ObservationType that specifies how observers should be inserted for this pattern
"dtype_configs": a list of dictionaries that represent DTypeConfigs
"root_module": a torch.nn.Module that represents the root for this pattern
"qat_module": a torch.nn.Module that represents the QAT implementation for this pattern
"reference_quantized_module": a torch.nn.Module that represents the reference quantized implementation for this pattern's root module
"fused_module": a torch.nn.Module that represents the fused implementation for this pattern
"fuser_method": a function that specifies how to fuse this pattern
"pattern_complex_format": the pattern specified in the reversed nested tuple format (deprecated)
- Return type
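As a sketch of the dictionary form, the following builds a config for torch.nn.Linear from a subset of the keys listed above. The dtype dictionary is produced with DTypeConfig.to_dict() rather than written by hand, and the dtype values are illustrative assumptions.

```python
import torch
from torch.ao.quantization.backend_config import (
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)

# Build the dtype dictionary from a DTypeConfig instead of spelling out
# its keys manually (the dtypes themselves are assumed for illustration).
dtype_config_dict = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
).to_dict()

config_dict = {
    "pattern": torch.nn.Linear,
    "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT,
    "dtype_configs": [dtype_config_dict],
    "root_module": torch.nn.Linear,
}

linear_config = BackendPatternConfig.from_dict(config_dict)
```

Keys that are left out of the dictionary, such as "fused_module" here, are simply not set on the resulting object.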
- set_dtype_configs(dtype_configs)
Set the supported data types passed as arguments to quantize ops in the reference model spec, overriding all previously registered data types.
- Return type
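The difference from add_dtype_config() is that the list is replaced rather than extended. A small sketch, with both dtype combinations chosen only for illustration:

```python
import torch
from torch.ao.quantization.backend_config import BackendPatternConfig, DTypeConfig

# Two assumed dtype combinations: static int8 and dynamic int8.
static_int8 = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)
dynamic_int8 = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.float,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
    is_dynamic=True,
)

config = BackendPatternConfig(torch.nn.Linear).add_dtype_config(static_int8)

# set_dtype_configs discards static_int8 and keeps only the new list.
config.set_dtype_configs([dynamic_int8])
```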
- set_fused_module(fused_module)
Set the module that represents the fused implementation for this pattern.
- Return type
- set_fuser_method(fuser_method)
Set the function that specifies how to fuse this BackendPatternConfig’s pattern.
The first argument of this function should be is_qat, and the rest of the arguments should be the items in the tuple pattern. The return value of this function should be the resulting fused module.
For example, the fuser method for the pattern (torch.nn.Linear, torch.nn.ReLU) can be:
    def fuse_linear_relu(is_qat, linear, relu):
        return torch.ao.nn.intrinsic.LinearReLU(linear, relu)
For a more complicated example, see https://gist.github.com/jerryzh168/8bea7180a8ba3c279f2c9b050f2a69a6.
- Return type
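A self-contained sketch of registering a fuser method for a linear + ReLU pattern and calling it by hand. In normal use the fuser method is invoked by the quantization workflow; the direct call below, and the layer sizes, are only for illustration.

```python
import torch
import torch.ao.nn.intrinsic as nni
from torch.ao.quantization.backend_config import BackendPatternConfig

def fuse_linear_relu(is_qat, linear, relu):
    # is_qat is ignored in this sketch; a fuser method may use it to
    # return a different fused module for quantization-aware training.
    return nni.LinearReLU(linear, relu)

linear_relu_config = (
    BackendPatternConfig((torch.nn.Linear, torch.nn.ReLU))
    .set_fuser_method(fuse_linear_relu)
    .set_fused_module(nni.LinearReLU)
)

# Call the registered fuser method directly to see what it produces.
fused = linear_relu_config.fuser_method(False, torch.nn.Linear(4, 2), torch.nn.ReLU())
```

The arguments after is_qat follow the order of the items in the tuple pattern: the linear module first, then the ReLU.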
- set_observation_type(observation_type)
Set how observers should be inserted in the graph for this pattern.
Observation type here refers to how observers (or quant-dequant ops) will be placed in the graph. This is used to produce the desired reference patterns understood by the backend. Weighted ops such as linear and conv require different observers (or quantization parameters passed to quantize ops in the reference model) for the input and the output.
There are two observation types:
OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT (default): the output observer instance will be different from the input. This is the most common observation type.
OUTPUT_SHARE_OBSERVER_WITH_INPUT: the output observer instance will be the same as the input. This is useful for operators like cat.
Note: This will be renamed in the near future, since we will soon insert QuantDeQuantStubs with observers (and fake quantizes) attached instead of observers themselves.
- Return type
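The two observation types can be sketched as follows: a linear pattern that keeps the default, and a cat pattern that shares the input observer with the output. The choice of torch.cat follows the description above.

```python
import torch
from torch.ao.quantization.backend_config import BackendPatternConfig, ObservationType

# No explicit call: the default is OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT.
linear_config = BackendPatternConfig(torch.nn.Linear)

# cat does not change the value range of its inputs, so the output can
# reuse the input observer.
cat_config = BackendPatternConfig(torch.cat).set_observation_type(
    ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT
)
```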
- set_pattern(pattern)
Set the pattern to configure.
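The pattern can be a float module, a functional operator, a PyTorch operator, or a tuple combination of the above. Tuple patterns are treated as sequential patterns, and currently only tuples of 2 or 3 elements are supported.
- Return type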
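A short sketch of each kind of pattern described above. The particular modules and operators are arbitrary examples.

```python
import torch
import torch.nn.functional as F
from torch.ao.quantization.backend_config import BackendPatternConfig

# A float module.
module_config = BackendPatternConfig().set_pattern(torch.nn.Conv2d)

# A functional operator.
functional_config = BackendPatternConfig().set_pattern(F.linear)

# A PyTorch operator.
operator_config = BackendPatternConfig().set_pattern(torch.add)

# A 3-element tuple, treated as a sequential pattern: conv, then bn, then relu.
sequential_config = BackendPatternConfig().set_pattern(
    (torch.nn.Conv2d, torch.nn.BatchNorm2d, torch.nn.ReLU)
)
```

Passing the pattern to the constructor, as in `BackendPatternConfig(torch.nn.Conv2d)`, sets the same attribute.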
- set_reference_quantized_module(reference_quantized_module)
Set the module that represents the reference quantized implementation for this pattern's root module.
For more details, see set_root_module().
- Return type
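A minimal sketch pairing a root module with its reference quantized counterpart, assuming torch.ao.nn.quantized.reference.Linear as the reference implementation for torch.nn.Linear:

```python
import torch
import torch.ao.nn.quantized.reference as nnqr
from torch.ao.quantization.backend_config import BackendPatternConfig

linear_config = (
    BackendPatternConfig(torch.nn.Linear)
    # The root module identifies which float module the pattern centers on.
    .set_root_module(torch.nn.Linear)
    # The reference quantized module is the implementation used for that
    # root module in the reference model.
    .set_reference_quantized_module(nnqr.Linear)
)
```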