aihwkit.nn.modules.base module
Base class for adding functionality to analog layers.
- class aihwkit.nn.modules.base.AnalogLayerBase[source]
Bases:
object
Mixin that adds functionality on the layer level.
In general, the defined methods loop over all analog tile modules of the layer and delegate the function call to each of them.
- IS_CONTAINER: bool = False
Class constant indicating whether sub-layers exist or whether this layer is a leaf node (that is, it only contains tile modules).
- analog_layers()[source]
Generator over analog layers only.
Note
Here analog layers are all sub-modules of the current module that derive from AnalogLayerBase (such as AnalogLinear), except AnalogSequential.
- Return type:
Generator[AnalogLayerBase, None, None]
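As a minimal sketch, one can iterate over the analog layers of a container model (AnalogSequential and AnalogLinear are part of aihwkit.nn):
Example:
from aihwkit.nn import AnalogLinear, AnalogSequential

model = AnalogSequential(AnalogLinear(4, 3), AnalogLinear(3, 1))
for layer in model.analog_layers():
    print(type(layer).__name__)  # prints "AnalogLinear" twice; the container itself is skipped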
- analog_modules()[source]
Generator over analog layers and containers.
Note
Similar to analog_layers(), but also returns all analog containers.
- Return type:
Generator[AnalogLayerBase, None, None]
- analog_tile_count()[source]
Returns the number of tiles.
Caution
This is a static number that is only counted when this method is first called.
- Returns:
Number of AnalogTileModules in this layer.
- Return type:
int
- analog_tiles()[source]
Generator to loop over all registered analog tiles of the module.
- Return type:
Generator[TileModule, None, None]
- apply_to_analog_layers(fn)[source]
Apply a function to all the analog layers.
Note
Here analog layers are all sub-modules of the current module that derive from AnalogLayerBase (such as AnalogLinear), except AnalogSequential.
- Parameters:
fn (Callable) – function to be applied.
- Returns:
This module after the function has been applied.
- Return type:
AnalogLayerBase
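As an illustrative sketch, the following puts every analog layer into evaluation mode (any function taking a layer works the same way):
Example:
model.apply_to_analog_layers(lambda layer: layer.eval())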
- apply_to_analog_tiles(fn)[source]
Apply a function to all the analog tiles of all layers in this module.
Example:
model.apply_to_analog_tiles(lambda tile: tile.reset())
This would reset each analog tile of the whole DNN, looping through all layers and all tiles that might exist in a particular layer.
- Parameters:
fn (Callable) – function to be applied.
- Returns:
This module after the function has been applied.
- Return type:
AnalogLayerBase
- drift_analog_weights(t_inference=0.0)[source]
Program (if necessary) and drift the analog weights.
- Parameters:
t_inference (float) – assumed time of inference (in sec).
- Raises:
ModuleError – if the layer is not in evaluation mode.
- Return type:
None
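A minimal sketch of a typical inference-time use, assuming an inference-configured model (AnalogLinear and InferenceRPUConfig are part of aihwkit; the one-hour inference time is illustrative):
Example:
from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import InferenceRPUConfig

model = AnalogLinear(4, 2, rpu_config=InferenceRPUConfig())
model.eval()  # must be in evaluation mode, otherwise ModuleError is raised
model.drift_analog_weights(t_inference=3600.0)  # drift to one hour after programming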
- extra_repr()[source]
Set the extra representation of the module.
- Returns:
A string with the extra representation.
- Return type:
str
- get_analog_tile_devices()[source]
Return a list of the devices used by the analog tiles.
- Returns:
List of torch devices.
- Return type:
List[str | device | int | None]
- get_weights(**kwargs)[source]
Get the weight (and bias) tensors from the analog crossbar.
- Parameters:
**kwargs (Any) – see tile level, e.g. get_weights().
- Returns:
weight matrix, bias vector
- Return type:
tuple
- Raises:
ModuleError – if not of type TileModule.
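A short sketch (the model is assumed to be any analog layer, e.g. an AnalogLinear instance):
Example:
weight, bias = model.get_weights()  # returns a (weight, bias) tuple; bias may be None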
- load_state_dict(state_dict, strict=True, load_rpu_config=None, strict_rpu_config_check=None)[source]
Specializes torch’s load_state_dict to add a flag controlling whether to load the RPU config from the saved state.
- Parameters:
state_dict (OrderedDict[str, Tensor]) – see torch’s load_state_dict
strict (bool) – see torch’s load_state_dict
load_rpu_config (bool | None) – Whether to load the saved RPU config or use the current RPU config of the model.
Caution
If load_rpu_config=False, the RPU config can differ from that of the stored model. However, the user has to make sure that the changed RPU config makes sense. For instance, changing the device type might change the expected fields in the hidden parameters and result in an error.
strict_rpu_config_check (bool | None) – Whether to check and throw an error if the current rpu_config is not of the same class type when setting load_rpu_config to False. In case of False, the user has to make sure that the rpu_configs are compatible.
- Returns:
see torch’s load_state_dict
- Raises:
ModuleError – in case the rpu_config class mismatches, or the mapping parameters mismatch when load_rpu_config=False
- Return type:
NamedTuple
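A sketch of restoring a checkpoint while keeping the model’s current RPU config (the file name is hypothetical):
Example:
import torch

state_dict = torch.load('analog_model.pth')  # hypothetical checkpoint file
model.load_state_dict(state_dict, load_rpu_config=False)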
- named_analog_layers()[source]
Generator over analog layers only.
Note
Here analog layers are all sub-modules of the current module that derive from AnalogLayerBase (such as AnalogLinear), except those that are containers (IS_CONTAINER=True) such as AnalogSequential.
- Return type:
Generator[Tuple[str, AnalogLayerBase], None, None]
- named_analog_modules()[source]
Generator over analog layers and containers.
Note
Similar to named_analog_layers(), but also returns all analog containers.
- Return type:
Generator[Tuple[str, AnalogLayerBase], None, None]
- named_analog_tiles()[source]
Generator to loop over all registered analog tiles of the module with names.
- Return type:
Generator[Tuple[str, TileModule], None, None]
- prepare_for_ddp()[source]
Adds ignores to avoid broadcasting the analog tile states in case of distributed training.
Note
Call this function before the model is converted with DDP.
Important
Only InferenceTile supports DDP.
- Raises:
ModuleError – in case analog tiles are used that do not support data-parallel training, i.e. all RPUCuda training tiles.
- Return type:
None
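A sketch of the intended call order with torch’s DistributedDataParallel, assuming the distributed process group has already been initialized and the model uses inference tiles:
Example:
from torch.nn.parallel import DistributedDataParallel

model.prepare_for_ddp()  # add ignores before wrapping
ddp_model = DistributedDataParallel(model)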
- program_analog_weights(noise_model=None)[source]
Program the analog weights.
- Parameters:
noise_model (BaseNoiseModel | None) –
Optional noise model to be used. If not given, the noise model defined in the RPUConfig is used.
Caution
If a noise model is given here, it will overwrite the stored rpu_config.noise_model definition in the tiles.
- Raises:
ModuleError – if the layer is not in evaluation mode.
- Return type:
None
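A sketch that programs the weights with an explicitly given noise model (PCMLikeNoiseModel is part of aihwkit.inference; note that passing it here overwrites the noise model stored in the tiles):
Example:
from aihwkit.inference import PCMLikeNoiseModel

model.eval()  # must be in evaluation mode, otherwise ModuleError is raised
model.program_analog_weights(noise_model=PCMLikeNoiseModel(g_max=25.0))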
- remap_analog_weights(weight_scaling_omega=1.0)[source]
Gets and re-sets the weights when weight scaling is used.
This re-sets the weights with applied mapping scales, so that the weight mapping scales are updated.
In case of hardware-aware training, this would update the weight mapping scales so that the absolute max analog weights are set to 1 (as specified in the weight_scaling configuration of MappingParameter).
Note
By default, the weight scaling omega factor is set to 1 here (overriding any setting in the rpu_config). This means that the max weight value is set to 1 internally for the analog weights.
Caution
This should typically not be called for analog. Use program_weights to re-program.
- Parameters:
weight_scaling_omega (float | None) – The weight scaling omega factor (see MappingParameter). If set to None here, it will take the value in the mapping parameters. Default is, however, 1.0.
- Return type:
None
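A minimal call sketch (note the caution above):
Example:
model.remap_analog_weights()  # remap so that the absolute max analog weight becomes 1.0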
- replace_rpu_config(rpu_config)[source]
Modifies the RPUConfig for all underlying analog tiles.
Each tile will be recreated to apply the RPUConfig changes.
Note
Typically, the RPUConfig class needs to be the same; otherwise an error will be raised.
Caution
If analog tiles have different RPUConfigs, these differences will be overwritten.
- Parameters:
rpu_config (RPUConfigBase) – New RPUConfig to apply
- Return type:
None
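A sketch that swaps in a new RPU config of the same class as the tiles’ current one (the default-constructed config is illustrative):
Example:
from aihwkit.simulator.configs import InferenceRPUConfig

model.replace_rpu_config(InferenceRPUConfig())  # tiles are recreated with the new config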
- set_weights(weight, bias=None, **kwargs)[source]
Set the weight (and bias) tensors to the analog crossbar.
- Parameters:
weight (Tensor) – the weight tensor
bias (Tensor | None) – the bias tensor, if available
**kwargs (Any) – see tile level, e.g. set_weights()
- Raises:
ModuleError – if not of type TileModule.
- Return type:
None
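A short sketch for an analog layer with output size 2 and input size 4 (shapes follow the usual torch convention):
Example:
import torch

model.set_weights(torch.randn(2, 4), torch.zeros(2))  # weight [out, in] and bias [out]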
- unregister_parameter(param_name)[source]
Unregister module parameter from parameters.
- Raises:
ModuleError – In case parameter is not found.
- Parameters:
param_name (str) –
- Return type:
None