MovingAverageMinMaxObserver
- class torch.ao.quantization.observer.MovingAverageMinMaxObserver(averaging_constant=0.01, dtype=torch.quint8, qscheme=torch.per_tensor_affine, reduce_range=False, quant_min=None, quant_max=None, eps=1.1920928955078125e-07, is_dynamic=False, **kwargs)
Observer module for computing the quantization parameters based on the moving average of the min and max values.
The module keeps a running (exponential moving) average of the minimum and maximum of the incoming tensors and uses these statistics to compute the quantization parameters.
- Parameters
averaging_constant – Averaging constant for min/max, i.e. the weight given to each new tensor's min/max in the moving-average update.
dtype – dtype argument to the quantize node needed to implement the reference model spec.
qscheme – Quantization scheme to be used.
reduce_range – Reduces the range of the quantized data type by 1 bit.
quant_min – Minimum quantization value. If unspecified, it will follow the 8-bit setup.
quant_max – Maximum quantization value. If unspecified, it will follow the 8-bit setup.
eps (Tensor) – Epsilon value for float32. Defaults to torch.finfo(torch.float32).eps.
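As a minimal usage sketch of the parameters above: the observer is constructed, called on a few tensors, and then queried for quantization parameters. The random input data, the batch shape, and the number of batches are illustrative assumptions, not part of the API.

```python
import torch
from torch.ao.quantization.observer import MovingAverageMinMaxObserver

# Construct the observer with its default settings spelled out explicitly.
observer = MovingAverageMinMaxObserver(
    averaging_constant=0.01,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
    reduce_range=False,
)

# Illustrative data only: five random batches stand in for real activations.
torch.manual_seed(0)
for _ in range(5):
    # Calling the observer updates its running min/max and returns the input.
    observer(torch.randn(4, 16))

# The running statistics are stored as buffers on the module.
print(observer.min_val, observer.max_val)

# Quantization parameters derived from the running min/max.
scale, zero_point = observer.calculate_qparams()
print(scale, zero_point)
```

The observer only collects statistics; it does not quantize the tensors it sees.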
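The moving average min/max is computed as follows:

$$
x_\text{min} = \begin{cases}
\min(X) & \text{if } x_\text{min} = \text{None} \\
(1 - c)\, x_\text{min} + c \min(X) & \text{otherwise}
\end{cases}
$$

$$
x_\text{max} = \begin{cases}
\max(X) & \text{if } x_\text{max} = \text{None} \\
(1 - c)\, x_\text{max} + c \max(X) & \text{otherwise}
\end{cases}
$$

where $x_\text{min/max}$ is the running average min/max, $X$ is the incoming tensor, and $c$ is the averaging_constant.
The scale and zero point are then computed as in MinMaxObserver.
Note
Only works with torch.per_tensor_affine quantization scheme.
Note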
If the running minimum equals the running maximum, the scale and zero_point are set to 1.0 and 0, respectively.
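The moving-average update described above can be sketched in plain Python, without PyTorch. This is an illustration of the update rule only, not the library's implementation; the helper names `update_running` and `observe` are hypothetical, and batches are modelled as plain lists of floats.

```python
def update_running(running, current, c):
    """Return the updated running statistic.

    `running` is None before the first batch, in which case the
    current value is taken as-is (hypothetical helper, for illustration).
    """
    if running is None:
        return current
    return (1 - c) * running + c * current


def observe(batches, c=0.01):
    """Track the moving-average min and max over a sequence of batches."""
    x_min = x_max = None
    for batch in batches:
        x_min = update_running(x_min, min(batch), c)
        x_max = update_running(x_max, max(batch), c)
    return x_min, x_max


# Two batches with c = 0.5:
#   min: 0.5 * (-1.0) + 0.5 * (-3.0) = -2.0
#   max: 0.5 * 2.0 + 0.5 * 4.0 = 3.0
x_min, x_max = observe([[-1.0, 2.0], [-3.0, 4.0]], c=0.5)
print(x_min, x_max)
```

A small averaging constant such as the default 0.01 makes the running range change slowly, so a single outlier batch has little effect on it.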