aihwkit.simulator.parameters.quantization module
Quantization configuration parameters
- class aihwkit.simulator.parameters.quantization.ActivationQuantConfig(n_bits=0, symmetric=True, range_estimator=RangeEstimators.running_minmax, range_estimator_params=None)[source]
Bases: BaseQuantConfig
The quantization config that should be used for activation quantization. This class wraps the BaseQuantConfig class by initializing the range_estimator object in the running_minmax mode.
- Parameters:
n_bits (int)
symmetric (bool)
range_estimator (RangeEstimators)
range_estimator_params (Any | None)
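The difference between a running and a current min/max range estimate can be illustrated with a short, library-independent sketch. This is an assumption about what the mode names suggest, not aihwkit's implementation: the RunningMinMax class, its momentum parameter, and the current_min_max helper are hypothetical names introduced only for this illustration.

```python
from typing import Iterable, Optional, Tuple


class RunningMinMax:
    """Illustrative running min/max tracker (not aihwkit's implementation)."""

    def __init__(self, momentum: float = 0.9) -> None:
        # `momentum` is a hypothetical smoothing factor chosen for this sketch.
        self.momentum = momentum
        self.x_min: Optional[float] = None
        self.x_max: Optional[float] = None

    def update(self, batch: Iterable[float]) -> Tuple[float, float]:
        values = list(batch)
        cur_min, cur_max = min(values), max(values)
        if self.x_min is None or self.x_max is None:
            # The first batch initializes the range directly.
            self.x_min, self.x_max = cur_min, cur_max
        else:
            # Later batches are blended in with an exponential moving average.
            m = self.momentum
            self.x_min = m * self.x_min + (1.0 - m) * cur_min
            self.x_max = m * self.x_max + (1.0 - m) * cur_max
        return self.x_min, self.x_max


def current_min_max(values: Iterable[float]) -> Tuple[float, float]:
    """Range taken only from the values at hand, with no history."""
    values = list(values)
    return min(values), max(values)
```

A smoothed range is a natural fit for activations, which change from batch to batch, whereas a range read directly from the tensor suits weights, which are fully known at quantization time.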
- class aihwkit.simulator.parameters.quantization.BaseQuantConfig(_range_estimator, n_bits=0, symmetric=True)[source]
Bases: _PrintableMixin
Base class for quantization parameter configuration that contains the fields necessary to configure the type of quantization for either activations or weights. The user should use ActivationQuantConfig and WeightQuantConfig as interfaces to these parameters, as they include default initializations and an easier API.
- Parameters:
_range_estimator (RangeEstimators)
n_bits (int)
symmetric (bool)
- n_bits: int = 0
The number of bits for the quantization of the operations. If a value <= 0 is selected, no quantization will be applied. By default 0 (no quantization).
- property range_estimator: RangeEstimators
range_estimator property
- property range_estimator_params: Any
range_estimator_params property
- symmetric: bool = True
If True, the quantization will be symmetrical (just scale). If False, asymmetric quantization will be used. This option is valid only if n_bits > 0.
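To make the n_bits and symmetric options concrete, here is a minimal sketch of uniform quantization followed by dequantization in plain Python. It follows a common textbook formulation and is not taken from aihwkit; the function name fake_quantize, the choice of integer grids, and the requirement of at least 2 bits in the symmetric case are assumptions made for this illustration.

```python
from typing import List


def fake_quantize(values: List[float], n_bits: int, symmetric: bool) -> List[float]:
    """Quantize then dequantize `values` on a uniform grid (illustrative only)."""
    if n_bits <= 0:
        # Mirrors the documented convention: n_bits <= 0 means no quantization.
        return list(values)

    v_min, v_max = min(values), max(values)
    if symmetric:
        if n_bits < 2:
            raise ValueError("this sketch needs n_bits >= 2 for the symmetric case")
        # Scale only: signed integer grid centered on zero.
        q_max = 2 ** (n_bits - 1) - 1
        q_min = -q_max
        scale = max(abs(v_min), abs(v_max)) / q_max
        zero_point = 0
    else:
        # Scale plus zero point: unsigned grid covering [v_min, v_max].
        q_min, q_max = 0, 2 ** n_bits - 1
        scale = (v_max - v_min) / (q_max - q_min)
        zero_point = round(-v_min / scale) if scale > 0 else 0

    if scale == 0:
        # Degenerate range (for example, all values equal): nothing to quantize.
        return list(values)

    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(q_min, min(q_max, q))  # clamp to the integer grid
        out.append((q - zero_point) * scale)
    return out
```

With symmetric=True only a scale is needed, which matches the "just scale" wording above; the asymmetric branch additionally shifts the grid by a zero point so that the full range [v_min, v_max] is covered.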
- class aihwkit.simulator.parameters.quantization.QuantizationConfig(activation_quant=<factory>, weight_quant=<factory>)[source]
Bases: _PrintableMixin
Holds the activation and weight quantization configuration objects for a layer.
- Parameters:
activation_quant (ActivationQuantConfig)
weight_quant (WeightQuantConfig)
- activation_quant: ActivationQuantConfig
Configuration for the activation quantization of a layer.
NOTE: The convention is that the activation quantizer of a layer corresponds to its OUTPUT activations and not to its input activations (these are considered already quantized by the previous layer’s quantizer). See QuantizationMap.input_activation_qconfig_map and QuantizedInputModule for ways to define a layer that should also have its inputs quantized.
- weight_quant: WeightQuantConfig
Configuration for the weight quantization of a layer.
- class aihwkit.simulator.parameters.quantization.QuantizationMap(default_qconfig=<factory>, module_qconfig_map=<factory>, instance_qconfig_map=<factory>, input_activation_qconfig_map=<factory>, excluded_modules=<factory>)[source]
Bases: _PrintableMixin
This is the data structure that is consumed by the convert_to_quantized function and the quantized modules. It defines how to replace a module with a quantized counterpart. It allows quantization options to be defined per layer instance for maximum flexibility, but also per module type to reduce the amount of definition code. See below for the available options and how to use each of them.
- Parameters:
default_qconfig (QuantizationConfig)
module_qconfig_map (Dict[Module, QuantizedModuleConfig])
instance_qconfig_map (Dict[str, QuantizedModuleConfig])
input_activation_qconfig_map (Dict[str, ActivationQuantConfig])
excluded_modules (List[str])
- default_qconfig: QuantizationConfig
This is a utility field and it is NOT used during the convert_to_quantized call. It exists to simplify development code in the case where most layers use the same quantization configuration: the user defines it here and then reuses it when defining the instance_qconfig_map and module_qconfig_map fields (see the append_default_conversions function for such a use).
- excluded_modules: List[str]
This field is a list of modules, identified by their state dict string, to be excluded from ANY conversion that is defined for their type or for their instance. This takes precedence over all other conversion steps.
- input_activation_qconfig_map: Dict[str, ActivationQuantConfig]
This field defines a map of the modules that should be wrapped in the QuantizedInputModule, together with the parameters for quantizing their input activations. Since the convention used here is that each quantized layer quantizes its output activations, there are cases where a module could receive unquantized data as inputs (for example, if it’s the first layer, or if it follows a functional call that cannot be quantized with the scheme defined here). For these reasons, an arbitrary Module can be wrapped in the QuantizedInputModule class to allow for full customizability. The modules are identified by their string identifier, as it appears in the state dict of the model, and each value of the dict is an ActivationQuantConfig object.
NOTE: This conversion follows any quantization conversion defined in the instance_qconfig_map and module_qconfig_map. That means that if a quantization for a module is defined in one of the other maps and here, the quantized module will be properly wrapped in the QuantizedInputModule.
Examples
>>> # Quantize the inputs of the first layer of a network
>>> quantization_map.input_activation_qconfig_map['firstblock.firstlayer'] = (
...     ActivationQuantConfig(n_bits=8, symmetric=False)
... )
- instance_qconfig_map: Dict[str, QuantizedModuleConfig]
This field defines a map of how to convert specific instances of modules to quantized counterparts. It is a dictionary where the keys are the string identifier of a layer, as it appears in the state dict of the model (e.g., ‘block1.linear1’) and the value is an instance of QuantizedModuleConfig, which includes a quantized Module to replace the original one (e.g., QuantLinear) and a QuantizationConfig instance with the parameters for the new quantized layer.
NOTE: This takes priority over a possible conversion defined in the module_qconfig_map for the type of the layer identified by the string identifier (e.g., for nn.Linear layers). If a layer is included both in this field and in the excluded_modules list (by user error), the latter has priority.
Examples
>>> # Replace a specific instance of a Linear layer differently than the others
>>> quantization_map.instance_qconfig_map['block1.linear1'] = QuantizedModuleConfig(
...     quantized_module=QuantLinear, module_qconfig=custom_layer1_qconfig
... )
- module_qconfig_map: Dict[Module, QuantizedModuleConfig]
This field defines a map of how to convert various module types to quantized counterparts. It is a dictionary where the keys are a type of a Module (e.g., nn.Linear) and the value is an instance of QuantizedModuleConfig, which includes a quantized Module to replace the original one (e.g., QuantLinear) and a QuantizationConfig instance with the parameters for the new quantized layer.
NOTE: This is the lowest priority for the conversions. If an instance of a module is defined in the instance_qconfig_map, the conversion defined here will be ignored, and the conversion defined in the instance_qconfig_map will happen. The same applies if an instance of a module is included in the excluded_modules list.
Examples
>>> # Replace every instance of nn.Linear with QuantLinear and use the default qconfig
>>> quantization_map.module_qconfig_map[nn.Linear] = QuantizedModuleConfig(
...     quantized_module=QuantLinear, module_qconfig=quantization_map.default_qconfig
... )
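The precedence described in the notes above (excluded_modules first, then instance_qconfig_map, then module_qconfig_map) can be summarized in a small sketch. It does not use aihwkit: resolve_conversion, the Linear and Conv stand-in classes, and the string values standing in for QuantizedModuleConfig objects are all hypothetical, and the separate input-activation wrapping step is left out.

```python
from typing import Dict, List, Optional


class Linear:
    """Stand-in for a module type such as nn.Linear."""


class Conv:
    """Stand-in for a module type with no conversion defined."""


def resolve_conversion(
    name: str,
    module_type: type,
    excluded_modules: List[str],
    instance_map: Dict[str, str],
    type_map: Dict[type, str],
) -> Optional[str]:
    """Return the conversion chosen for one layer, or None if it is left as is."""
    if name in excluded_modules:
        # Exclusion takes precedence over every other rule.
        return None
    if name in instance_map:
        # A per-instance entry overrides the per-type entry.
        return instance_map[name]
    # Lowest priority: the conversion registered for the module type, if any.
    return type_map.get(module_type)


# Strings stand in for QuantizedModuleConfig objects in this sketch.
type_map = {Linear: "type-level config"}
instance_map = {
    "block1.linear1": "instance-level config",
    "block1.linear2": "instance-level config",
}
excluded = ["block1.linear2"]

first = resolve_conversion("block1.linear1", Linear, excluded, instance_map, type_map)
second = resolve_conversion("block1.linear2", Linear, excluded, instance_map, type_map)
third = resolve_conversion("block2.linear1", Linear, excluded, instance_map, type_map)
fourth = resolve_conversion("block2.conv", Conv, excluded, instance_map, type_map)
```

Here block1.linear1 gets its instance-level entry, block1.linear2 is skipped because it is excluded even though an instance entry exists, block2.linear1 falls back to the type-level entry, and block2.conv is left unchanged.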
- class aihwkit.simulator.parameters.quantization.QuantizedModuleConfig(quantized_module, module_qconfig)[source]
Bases: _PrintableMixin
Utility dataclass that pairs a torch Module, intended to be a quantized implementation of a module, with a QuantizationConfig. It’s used to define the module_qconfig_map and instance_qconfig_map fields of the QuantizationMap dataclass.
- Parameters:
quantized_module (Module)
module_qconfig (QuantizationConfig)
- module_qconfig: QuantizationConfig
- quantized_module: Module
- class aihwkit.simulator.parameters.quantization.WeightQuantConfig(n_bits=0, symmetric=True, per_channel=False, range_estimator=RangeEstimators.current_minmax, range_estimator_params=None)[source]
Bases: BaseQuantConfig
The quantization config that should be used for weight quantization. This class wraps the BaseQuantConfig class by initializing the range_estimator object in the current_minmax mode and adding the per_channel option.
- Parameters:
n_bits (int)
symmetric (bool)
per_channel (bool)
range_estimator (RangeEstimators)
range_estimator_params (Any | None)
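As an illustration of what a per_channel option typically changes, the sketch below computes symmetric quantization scales for a 2D weight either once for the whole tensor or once per row. This is a generic formulation rather than aihwkit's code: the symmetric_scales helper is hypothetical, and treating rows as output channels is an assumption of this example.

```python
from typing import List


def symmetric_scales(
    weight: List[List[float]], n_bits: int, per_channel: bool
) -> List[float]:
    """Compute symmetric scales for a 2D weight (rows assumed to be channels)."""
    if n_bits < 2:
        raise ValueError("this sketch needs n_bits >= 2")
    q_max = 2 ** (n_bits - 1) - 1
    # Largest magnitude found in each row of the weight matrix.
    row_max = [max(abs(w) for w in row) for row in weight]
    if per_channel:
        # One scale per row, so small-magnitude rows keep finer resolution.
        return [m / q_max for m in row_max]
    # A single scale shared by the whole tensor.
    return [max(row_max) / q_max]
```

With a per-tensor scale, one row with large weights dictates the step size for every other row; per-channel scales avoid that at the cost of storing one scale per channel.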