aihwkit.nn.modules.base module

Base class for adding functionality to analog layers.

class aihwkit.nn.modules.base.AnalogLayerBase[source]

Bases: object

Mixin that adds functionality on the layer level.

In general, the defined methods loop over all analog tile modules of the layer and delegate the call to them.

IS_CONTAINER: bool = False

Class constant indicating whether this layer has sub-layers or is a leaf node (that is, it only contains tile modules).

analog_layers()[source]

Generator over analog layers only.

Note

Here, analog layers are all sub-modules of the current module that derive from AnalogLayerBase (such as AnalogLinear), _except_ AnalogSequential.

Return type:

Generator[AnalogLayerBase, None, None]
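
Example:

A minimal sketch of iterating over the analog layers, assuming a small AnalogSequential model built from AnalogLinear layers:

    from aihwkit.nn import AnalogLinear, AnalogSequential

    model = AnalogSequential(AnalogLinear(4, 3), AnalogLinear(3, 1))

    # Loop over the analog leaf layers; the AnalogSequential container itself is skipped.
    for layer in model.analog_layers():
        print(type(layer).__name__)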

analog_modules()[source]

Generator over analog layers and containers.

Note

Similar to analog_layers(), but also returns all analog containers.

Return type:

Generator[AnalogLayerBase, None, None]

analog_tile_count()[source]

Returns the number of tiles.

Caution

This number is static and is only counted the first time this method is called.

Returns:

Number of AnalogTileModules in this layer.

Return type:

int

analog_tiles()[source]

Generator to loop over all registered analog tiles of the module.

Return type:

Generator[TileModule, None, None]
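
Example:

A minimal sketch of looping over the registered tiles, assuming a single AnalogLinear layer:

    from aihwkit.nn import AnalogLinear

    model = AnalogLinear(4, 3)

    # Each yielded object is a TileModule; see the tile-level API for its methods.
    for tile in model.analog_tiles():
        print(type(tile).__name__)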

apply_to_analog_layers(fn)[source]

Apply a function to all the analog layers.

Note

Here, analog layers are all sub-modules of the current module that derive from AnalogLayerBase (such as AnalogLinear), _except_ AnalogSequential.

Parameters:

fn (Callable) – function to be applied.

Returns:

This module after the function has been applied.

Return type:

AnalogLayerBase
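
Example:

An illustrative sketch that reports the tile count of each analog layer; print_tile_count is a hypothetical helper, and model is assumed to be an analog model as in the examples above:

    def print_tile_count(layer):
        # Hypothetical helper: report the number of tiles per analog layer.
        print(type(layer).__name__, layer.analog_tile_count())

    model.apply_to_analog_layers(print_tile_count)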

apply_to_analog_tiles(fn)[source]

Apply a function to all the analog tiles of all layers in this module.

Example:

model.apply_to_analog_tiles(lambda tile: tile.reset())

This resets each analog tile in the whole DNN, looping through all layers and all tiles that may exist in a particular layer.

Parameters:

fn (Callable) – function to be applied.

Returns:

This module after the function has been applied.

Return type:

AnalogLayerBase

drift_analog_weights(t_inference=0.0)[source]

(Program) and drift the analog weights.

Parameters:

t_inference (float) – assumed time of inference (in sec).

Raises:

ModuleError – if the layer is not in evaluation mode.

Return type:

None
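
Example:

A minimal inference-time sketch; using InferenceRPUConfig here is an assumption, any inference-capable RPU config works:

    from aihwkit.nn import AnalogLinear
    from aihwkit.simulator.configs import InferenceRPUConfig

    model = AnalogLinear(4, 3, rpu_config=InferenceRPUConfig())
    model.eval()  # the layer must be in evaluation mode

    # Program the weights and apply drift as if one hour of inference time had passed.
    model.drift_analog_weights(t_inference=3600.0)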

extra_repr()[source]

Set the extra representation of the module.

Returns:

A string with the extra representation.

Return type:

str

get_analog_tile_devices()[source]

Return a list of the devices used by the analog tiles.

Returns:

List of torch devices.

Return type:

List[str | device | int | None]

get_weights(**kwargs)[source]

Get the weight (and bias) tensors from the analog crossbar.

Parameters:

**kwargs (Any) – see tile level, e.g. get_weights().

Returns:

weight matrix, bias vector

Return type:

tuple

Raises:

ModuleError – if not of type TileModule.
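
Example:

A short sketch of reading the weight and bias tensors back from the analog crossbar; model is assumed to be an analog layer such as AnalogLinear(4, 3):

    weight, bias = model.get_weights()
    print(weight.shape, None if bias is None else bias.shape)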

load_state_dict(state_dict, strict=True, load_rpu_config=None, strict_rpu_config_check=None)[source]

Specializes torch’s load_state_dict by adding a flag that controls whether to load the RPU config from the saved state.

Parameters:
  • state_dict (OrderedDict[str, Tensor]) – see torch’s load_state_dict

  • strict (bool) – see torch’s load_state_dict

  • load_rpu_config (bool | None) –

    Whether to load the saved RPU config or use the current RPU config of the model.

    Caution

    If load_rpu_config=False, the RPU config can differ from the one stored with the model. However, the user has to make sure that the changed RPU config makes sense.

    For instance, changing the device type might change the expected fields in the hidden parameters and result in an error.

  • strict_rpu_config_check (bool | None) – Whether to check and raise an error if the current rpu_config is not of the same class type when load_rpu_config is set to False. If set to False, the user has to make sure that the RPU configs are compatible.

Returns:

see torch’s load_state_dict

Raises:

ModuleError – in case of an rpu_config class mismatch or a mapping parameter mismatch when load_rpu_config=False.

Return type:

NamedTuple
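
Example:

A sketch of restoring a checkpoint while keeping the model’s current RPU config; the file name is illustrative:

    import torch

    torch.save(model.state_dict(), "analog_model.pth")

    state_dict = torch.load("analog_model.pth")
    # Keep the RPU config currently attached to the model instead of the stored one.
    model.load_state_dict(state_dict, load_rpu_config=False)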

named_analog_layers()[source]

Generator over analog layers only.

Note

Here analog layers are all sub-modules of the current module that derive from AnalogLayerBase (such as AnalogLinear) _except_ those that are containers (IS_CONTAINER=True) such as AnalogSequential.

Return type:

Generator[Tuple[str, AnalogLayerBase], None, None]

named_analog_modules()[source]

Generator over named analog layers and containers.

Note

Similar to named_analog_layers(), but also returns all analog containers.

Return type:

Generator[Tuple[str, AnalogLayerBase], None, None]

named_analog_tiles()[source]

Generator to loop over all registered analog tiles of the module with names.

Return type:

Generator[Tuple[str, TileModule], None, None]

prepare_for_ddp()[source]

Adds ignores to avoid broadcasting the analog tile states in case of distributed training.

Note

Call this function before the model is wrapped with DDP.

Important

Only InferenceTile supports DDP.

Raises:

ModuleError – in case analog tiles are used that do not support data-parallel training, i.e. all RPUCuda training tiles.

Return type:

None
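
Example:

A sketch of the intended call order; the torch distributed process-group setup is standard PyTorch code and is only indicated here:

    from torch.nn.parallel import DistributedDataParallel

    # torch.distributed.init_process_group(...) is assumed to have been called already.
    model.prepare_for_ddp()  # mark analog tile states as ignored for broadcasting
    ddp_model = DistributedDataParallel(model)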

program_analog_weights(noise_model=None)[source]

Program the analog weights.

Parameters:

noise_model (BaseNoiseModel | None) –

Optional noise model to be used. If not given, the noise model defined in the RPUConfig of the tiles is used.

Caution

If a noise model is given here, it will overwrite the stored rpu_config.noise_model definition in the tiles.

Raises:

ModuleError – if the layer is not in evaluation mode.

Return type:

None
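
Example:

A sketch using a PCM-like noise model; choosing PCMLikeNoiseModel (from aihwkit.inference) and its g_max value here is an assumption, any BaseNoiseModel can be passed:

    from aihwkit.inference import PCMLikeNoiseModel

    model.eval()  # the layer must be in evaluation mode
    model.program_analog_weights(noise_model=PCMLikeNoiseModel(g_max=25.0))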

remap_analog_weights(weight_scaling_omega=1.0)[source]

Gets and re-sets the weights when weight scaling is used.

This re-sets the weights with applied mapping scales, so that the weight mapping scales are updated.

In case of hardware-aware training, this would update the weight mapping scales so that the absolute max analog weights are set to 1 (as specified in the weight_scaling configuration of MappingParameter).

Note

By default the weight scaling omega factor is set to 1 here (overriding any setting in the rpu_config). This means that the max weight value is set to 1 internally for the analog weights.

Caution

This should typically not be called for analog. Use program_weights to re-program.

Parameters:

weight_scaling_omega (float | None) – The weight scaling omega factor (see MappingParameter). If set to None, the value from the mapping parameters is used. The default here, however, is 1.0.

Return type:

None

replace_rpu_config(rpu_config)[source]

Modifies the RPUConfig for all underlying analog tiles.

Each tile will be recreated to apply the RPUConfig changes.

Note

Typically, the RPUConfig class needs to be the same; otherwise an error will be raised.

Caution

If analog tiles have different RPUConfigs, these differences will be overwritten.

Parameters:

rpu_config (RPUConfigBase) – New RPUConfig to apply

Return type:

None
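
Example:

A minimal sketch of swapping in a fresh config of the same class; it assumes the model was built with an InferenceRPUConfig, since the RPUConfig class must stay the same:

    from aihwkit.simulator.configs import InferenceRPUConfig

    # All tiles are recreated with the new config.
    model.replace_rpu_config(InferenceRPUConfig())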

set_weights(weight, bias=None, **kwargs)[source]

Set the weight (and bias) tensors to the analog crossbar.

Parameters:
  • weight (Tensor) – the weight tensor

  • bias (Tensor | None) – the bias tensor, if available

  • **kwargs (Any) – see tile level, e.g. set_weights()

Raises:

ModuleError – if not of type TileModule.

Return type:

None
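
Example:

A short sketch of writing externally computed tensors to the crossbar; the shapes assume an AnalogLinear(4, 3) layer, where the weight is (out_features, in_features):

    import torch

    weight = torch.randn(3, 4)
    bias = torch.zeros(3)
    model.set_weights(weight, bias)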

unregister_parameter(param_name)[source]

Unregister a module parameter from the module’s parameters.

Raises:

ModuleError – In case parameter is not found.

Parameters:

param_name (str) – name of the parameter to unregister

Return type:

None