FakeQuantize
class torch.quantization.fake_quantize.FakeQuantize(observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>, quant_min=0, quant_max=255, **observer_kwargs)
Simulate the quantize and dequantize operations at training time. The output of this module is given by:
x_out = ( clamp(round(x/scale + zero_point), quant_min, quant_max) - zero_point ) * scale
- scale defines the scale factor used for quantization.
- zero_point specifies the quantized value to which 0 in floating point maps.
- quant_min specifies the minimum allowable quantized value.
- quant_max specifies the maximum allowable quantized value.
- fake_quant_enabled controls the application of fake quantization on tensors. Note that statistics can still be updated.
- observer_enabled controls statistics collection on tensors.
- dtype specifies the quantized dtype that is being emulated with fake quantization. Allowable values are torch.qint8 and torch.quint8. The values of quant_min and quant_max should be chosen to be consistent with the dtype.
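The formula above can be traced by hand on plain Python floats. The sketch below is an illustration only, not the library's implementation: the helper name `fake_quantize_value` and the sample `scale` and `zero_point` values are assumptions chosen for the example.

```python
def fake_quantize_value(x, scale, zero_point, quant_min=0, quant_max=255):
    """Quantize and then dequantize one float, following the formula above."""
    # Map onto the integer grid. Python's round() rounds halves to even.
    q = round(x / scale + zero_point)
    # Clamp to the allowed quantized range.
    q = max(quant_min, min(quant_max, q))
    # Map back to floating point.
    return (q - zero_point) * scale


# Hypothetical quantization parameters, picked so the arithmetic is exact.
scale, zero_point = 0.5, 10

print(fake_quantize_value(1.3, scale, zero_point))     # 1.5 (snapped to the grid)
print(fake_quantize_value(0.0, scale, zero_point))     # 0.0 (0 maps to zero_point)
print(fake_quantize_value(1000.0, scale, zero_point))  # 122.5 (clamped at quant_max)
print(fake_quantize_value(-100.0, scale, zero_point))  # -5.0 (clamped at quant_min)
```

With these assumed parameters every output lies in the range [-5.0, 122.5], which is (quant_min - zero_point) * scale through (quant_max - zero_point) * scale. Inputs outside that range saturate, and inputs inside it are rounded to the nearest multiple of scale.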
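- Parameters
observer (Module) – Module for observing statistics on input tensors and calculating scale and zero-point.
quant_min (int) – The minimum allowable quantized value.
quant_max (int) – The maximum allowable quantized value.
observer_kwargs (optional) – Keyword arguments passed to the observer module.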
- Variables
observer (Module) – User-provided module that collects statistics on the input tensor and provides a method to calculate scale and zero-point.
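To make the observer's role concrete, here is a minimal pure-Python sketch of a moving-average min/max observer that derives scale and zero-point from the statistics it collects. It is a hypothetical stand-in, not the code of `MovingAverageMinMaxObserver`: the class name `TinyMovingAverageObserver`, the `observe` method, the affine mapping used in `calculate_qparams`, and the fallback for an all-zero range are all assumptions made for illustration.

```python
class TinyMovingAverageObserver:
    """Hypothetical observer: tracks a running min/max of the values it sees."""

    def __init__(self, averaging_constant=0.01, quant_min=0, quant_max=255):
        self.averaging_constant = averaging_constant
        self.quant_min = quant_min
        self.quant_max = quant_max
        self.min_val = None
        self.max_val = None

    def observe(self, values):
        """Update the running statistics from one batch of floats."""
        lo, hi = min(values), max(values)
        if self.min_val is None:
            # First batch: take the observed range as-is.
            self.min_val, self.max_val = lo, hi
        else:
            # Later batches: move a fraction of the way toward the new range.
            c = self.averaging_constant
            self.min_val += c * (lo - self.min_val)
            self.max_val += c * (hi - self.max_val)

    def calculate_qparams(self):
        """Return (scale, zero_point) for an assumed affine mapping."""
        if self.min_val is None:
            raise ValueError("observe() must be called before calculate_qparams()")
        # Widen the range to include 0 so that 0.0 is exactly representable.
        lo = min(self.min_val, 0.0)
        hi = max(self.max_val, 0.0)
        scale = (hi - lo) / (self.quant_max - self.quant_min)
        if scale == 0.0:
            # Assumed fallback for a degenerate (all-zero) range.
            return 1.0, self.quant_min
        zero_point = self.quant_min - round(lo / scale)
        zero_point = max(self.quant_min, min(self.quant_max, zero_point))
        return scale, zero_point


observer = TinyMovingAverageObserver(averaging_constant=0.5)
observer.observe([-1.0, 0.5, 2.0])
scale, zero_point = observer.calculate_qparams()
print(scale, zero_point)  # zero_point is 85 for the range [-1.0, 2.0]
```

The resulting scale and zero-point are the values that the output formula of this module consumes. A large `averaging_constant` such as the 0.5 used here makes the running range react quickly to new batches, while a small one smooths out outliers.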