FakeQuantizedLinear
- class torchao.quantization.qat.FakeQuantizedLinear(in_features: int, out_features: int, bias: bool = False, activation_config: Optional[FakeQuantizeConfigBase] = None, weight_config: Optional[FakeQuantizeConfigBase] = None, *args, **kwargs)[source]
General linear layer with fake quantized weights and activations.
The specific target dtypes, granularities, schemes, and so on are specified through separate configs for the weights and the activations.
Example usage:
import torch
from torchao.quantization.qat import FakeQuantizedLinear, IntxFakeQuantizeConfig

# Asymmetric int8 fake quantization per token for activations
activation_config = IntxFakeQuantizeConfig(
    dtype=torch.int8,
    granularity="per_token",
    is_symmetric=False,
)
# Symmetric int4 fake quantization with groups of 8 for weights
weight_config = IntxFakeQuantizeConfig(
    dtype=torch.int4,
    group_size=8,
    is_symmetric=True,
)
fq_linear = FakeQuantizedLinear(
    16, 32, False, activation_config, weight_config,
)
fq_linear(torch.randn(16))
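Because FakeQuantizedLinear is an ordinary nn.Module, it can be trained like any linear layer while fake quantization is applied in the forward pass. The following is a minimal sketch of one training step that reuses fq_linear from the example above; the batch shapes, loss, and optimizer here are illustrative choices, not part of the torchao API.

# Minimal sketch of a single QAT training step, assuming fq_linear from above.
# Optimizer and loss choices are illustrative only.
import torch
import torch.nn.functional as F

optimizer = torch.optim.SGD(fq_linear.parameters(), lr=1e-3)

x = torch.randn(4, 16)       # batch of 4, matching in_features=16
target = torch.randn(4, 32)  # matching out_features=32

out = fq_linear(x)                 # forward pass applies fake quantization
loss = F.mse_loss(out, target)
loss.backward()                    # gradients update the underlying float weights
optimizer.step()
optimizer.zero_grad()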