aihwkit.nn.modules.base module

Base class for analog Modules.

class aihwkit.nn.modules.base.AnalogModuleBase(in_features, out_features, bias, realistic_read_write=False, mapping=None)[source]

Bases: torch.nn.modules.module.Module

Base class for analog Modules.

Base Module for analog layers that use analog tiles. When subclassing, please note:

  • the _setup_tile() method is expected to be called by the subclass constructor; note that it does more than just create a tile.

  • register_analog_tile() needs to be called for each created analog tile

  • this module does not call torch’s Module init as the child is likely again derived from Module

  • for performance reasons, the weight and bias Parameters are not guaranteed to be in sync with the tile weights and biases during the lifetime of the instance. The canonical way of reading and writing weights is via get_weights() and set_weights(), as opposed to using the attributes directly.

  • the BaseTile subclass that is created is retrieved from the rpu_config.tile_class attribute.
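The registration contract in the notes above can be sketched with plain-Python stand-ins. This is only an illustration: ToyTile, ToyAnalogBase and ToyAnalogLinear are hypothetical names, the default tile name (a running index) is an assumption of the sketch, and all torch-specific behavior is left out.

```python
from typing import Dict, Generator, Optional, Tuple


class ToyTile:
    """Stand-in for a BaseTile subclass (hypothetical, not aihwkit code)."""

    def __init__(self, out_size: int, in_size: int) -> None:
        self.out_size = out_size
        self.in_size = in_size


class ToyAnalogBase:
    """Toy model of the tile registration contract."""

    def __init__(self) -> None:
        self._tiles: Dict[str, ToyTile] = {}

    def register_analog_tile(self, tile: ToyTile, name: Optional[str] = None) -> None:
        # Assumption of this sketch: unnamed tiles are keyed by a running index.
        key = name if name is not None else str(len(self._tiles))
        self._tiles[key] = tile

    def analog_tiles(self) -> Generator[ToyTile, None, None]:
        yield from self._tiles.values()

    def named_analog_tiles(self) -> Generator[Tuple[str, ToyTile], None, None]:
        yield from self._tiles.items()

    def analog_tile_count(self) -> int:
        return len(self._tiles)


class ToyAnalogLinear(ToyAnalogBase):
    """Subclass whose constructor creates and then registers its tile."""

    def __init__(self, in_features: int, out_features: int) -> None:
        super().__init__()
        tile = self._setup_tile(in_features, out_features)
        self.register_analog_tile(tile)  # one call per created tile

    def _setup_tile(self, in_features: int, out_features: int) -> ToyTile:
        # Rows correspond to outputs, columns to inputs.
        return ToyTile(out_features, in_features)


layer = ToyAnalogLinear(in_features=4, out_features=2)
names = [name for name, _ in layer.named_analog_tiles()]
```

A module that creates several tiles would call register_analog_tile() once per tile, so analog_tile_count() and the two generators see all of them.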

Parameters
  • in_features (int) – input vector size (number of columns).

  • out_features (int) – output vector size (number of rows).

  • bias (bool) – whether to use a bias row on the analog tile or not.

  • realistic_read_write (bool) – whether to enable realistic read/write for setting initial weights and during reading of the weights.

  • mapping (Optional[aihwkit.simulator.configs.utils.MappingParameter]) – Configuration of the hardware architecture (e.g. tile size).

Return type

None

ANALOG_CTX_PREFIX: str = 'analog_ctx_'
ANALOG_OUT_SCALING_ALPHA_PREFIX: str = 'analog_out_scaling_alpha_'
ANALOG_SHARED_WEIGHT_PREFIX: str = 'analog_shared_weights_'
ANALOG_STATE_PREFIX: str = 'analog_tile_state_'
analog_tile_count()[source]

Return the number of registered tiles.

Returns

Number of registered tiles

Return type

int

analog_tiles()[source]

Generator to loop over all registered analog tiles of the module.

Return type

Generator[BaseTile, None, None]

drift_analog_weights(t_inference=0.0)[source]

(Program) and drift the analog weights.

Parameters

t_inference (float) – assumed time of inference (in sec)

Raises

ModuleError – if the layer is not in evaluation mode.

Return type

None
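The evaluation-mode requirement implies a call order: switch the layer to evaluation mode first, then drift. A minimal sketch of that guard, using a locally defined ModuleError and a hypothetical ToyInferenceLayer rather than the aihwkit classes:

```python
class ModuleError(Exception):
    """Local stand-in for the ModuleError named above."""


class ToyInferenceLayer:
    """Hypothetical layer that only models the evaluation-mode check."""

    def __init__(self) -> None:
        self.training = True  # mirrors the training flag of a torch Module
        self.last_t_inference = None

    def eval(self) -> "ToyInferenceLayer":
        self.training = False
        return self

    def drift_analog_weights(self, t_inference: float = 0.0) -> None:
        if self.training:
            raise ModuleError("drift is only allowed in evaluation mode")
        # The toy only records the requested time (in seconds).
        self.last_t_inference = t_inference


layer = ToyInferenceLayer()

try:
    layer.drift_analog_weights(t_inference=3600.0)
    raised = False
except ModuleError:
    raised = True  # still in training mode, so the call is rejected

layer.eval()
layer.drift_analog_weights(t_inference=3600.0)
```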

extra_repr()[source]

Set the extra representation of the module.

Returns

A string with the extra representation.

Return type

str

get_analog_tile_devices()[source]

Return a list of the devices used by the analog tiles.

Returns

List of torch devices

Return type

List[Optional[Union[torch.device, str, int]]]

get_weights(force_exact=False, apply_out_scales=True)[source]

Get the weight (and bias) tensors.

This uses a realistic read if the property realistic_read_write of the layer is set, unless it is overridden by force_exact. It scales the analog weights by the digital alpha scale if weight_scaling_omega is positive (see get_weights_scaled()).

Note

This is the recommended way of getting the weight/bias matrix from the analog tile, as it will correctly fetch the weights from the internal memory. Accessing self.weight and self.bias might yield wrong results as they are not always in sync with the analog tile library, for performance reasons.

Parameters
  • force_exact (bool) – Forces an exact read of the analog tiles.

  • apply_out_scales (bool) – Whether to return the weights with the (digital) output scaling factors applied. Note that the “logical” weights of the layer, which the DNN is effectively using, are those with the output scales applied. If apply_out_scales is set to False, then only the weight values that are programmed onto the crossbar array are returned, without applying the digital scales.

Returns

weight matrix, bias vector

Return type

tuple

Raises

ModuleError – in case of multiple defined analog tiles in the module
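The difference between the two settings of apply_out_scales can be illustrated numerically. The sketch below assumes one digital scale per output row, which is an assumption and not a statement about the library's internals; the helper apply_out_scales_to and all values are made up for the example.

```python
from typing import List

Matrix = List[List[float]]


def apply_out_scales_to(crossbar: Matrix, out_scales: List[float]) -> Matrix:
    """Multiply each output row by its digital output scale (assumed per row)."""
    return [[scale * w for w in row] for row, scale in zip(crossbar, out_scales)]


# Values as they would sit on the crossbar (what apply_out_scales=False describes).
crossbar_weights = [[0.5, -1.0], [0.25, 1.0]]
out_scales = [2.0, 4.0]

# "Logical" weights the network effectively uses (apply_out_scales=True).
logical_weights = apply_out_scales_to(crossbar_weights, out_scales)
print(logical_weights)  # [[1.0, -2.0], [1.0, 4.0]]
```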

load_state_dict(state_dict, strict=True, load_rpu_config=True)[source]

Specializes torch’s load_state_dict to add a flag whether to load the RPU config from the saved state.

Parameters
  • state_dict (OrderedDict[str, Tensor]) – see torch’s load_state_dict

  • strict (bool) – see torch’s load_state_dict

  • load_rpu_config (bool) –

    Whether to load the saved RPU config or use the current RPU config of the model.

    Caution

    If load_rpu_config=False the RPU config can be changed from the stored model. However, the user has to make sure that the changed RPU config makes sense.

    For instance, changing the device type might change the expected fields in the hidden parameters and result in an error.

Returns

see torch’s load_state_dict

Return type

NamedTuple

Raises

ModuleError – in case the rpu_config class mismatches for load_rpu_config=False.
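The effect of the load_rpu_config flag can be modelled as a choice between two configuration objects. The sketch is a toy: ToyConfigA, ToyConfigB and choose_rpu_config are hypothetical names, ModuleError is defined locally, and the real method also restores tensors and tile state, which is not shown.

```python
from dataclasses import dataclass


class ModuleError(Exception):
    """Local stand-in for the ModuleError named above."""


@dataclass
class ToyConfigA:
    noise: float


@dataclass
class ToyConfigB:
    noise: float


def choose_rpu_config(current, saved, load_rpu_config=True):
    """Return the configuration that stays in effect after loading."""
    if load_rpu_config:
        return saved  # the stored configuration replaces the current one
    if type(current) is not type(saved):
        # Keeping the current config only makes sense for a matching class.
        raise ModuleError("rpu_config class mismatch")
    return current


saved = ToyConfigA(noise=0.1)
current = ToyConfigA(noise=0.3)

from_checkpoint = choose_rpu_config(current, saved, load_rpu_config=True)
kept_current = choose_rpu_config(current, saved, load_rpu_config=False)

try:
    choose_rpu_config(ToyConfigB(noise=0.3), saved, load_rpu_config=False)
    mismatch_raised = False
except ModuleError:
    mismatch_raised = True
```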

named_analog_tiles()[source]

Generator to loop over all registered analog tiles of the module with names.

Return type

Generator[Tuple[str, BaseTile], None, None]

program_analog_weights()[source]

Program the analog weights.

Raises

ModuleError – if the layer is not in evaluation mode.

Return type

None

register_analog_tile(tile, name=None)[source]

Register the analog context of the tile.

Note

Needs to be called at the end of __init__ to register the tile for the analog optimizers.

Parameters
  • tile (BaseTile) – tile to register

  • name (Optional[str]) – Optional tile name used as the parameter name

Return type

None

set_weights(weight, bias=None, force_exact=False, remap_weights=True, weight_scaling_omega=None)[source]

Set the weight (and bias) values with given tensors.

This uses a realistic write if the property realistic_read_write of the layer is set, unless it is overridden by force_exact.

If weight_scaling_omega is larger than 0, the weights are set in a scaled manner (assuming a digital output scale). See set_weights_scaled() for details.

Note

This is the recommended way for setting the weight/bias matrix of the analog tile, as it will correctly store the weights into the internal memory. Directly writing to self.weight and self.bias might yield wrong results as they are not always in sync with the analog tile Parameters, for performance reasons.

Parameters
  • weight (torch.Tensor) – weight matrix

  • bias (Optional[torch.Tensor]) – bias vector

  • force_exact (bool) – forces an exact write to the analog tiles

  • remap_weights (bool) – Whether to rescale the given weight matrix and populate the digital output scaling factors as specified in the configuration MappingParameter. A new weight_scaling_omega can be given. Note that this will overwrite the existing digital out scaling factors.

  • weight_scaling_omega (Optional[float]) – The weight scaling omega factor (see MappingParameter). If given explicitly here, it will overwrite the value in the mapping field.

Raises

ModuleError – in case of multiple defined analog tiles in the module

Return type

None
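The remapping controlled by remap_weights and weight_scaling_omega can be sketched as follows. The sketch assumes a single digital scale alpha = max|W| / omega, so that the largest stored weight magnitude becomes omega; the actual behavior depends on the MappingParameter settings and may differ (for example, one scale per output). The helper remap_with_omega is hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def remap_with_omega(weight: Matrix, omega: float) -> Tuple[Matrix, float]:
    """Return (weights to store on the tile, digital output scale alpha)."""
    if omega <= 0.0:
        # Scaling is only applied for omega larger than 0.
        return [row[:] for row in weight], 1.0
    max_abs = max(abs(w) for row in weight for w in row)
    alpha = max_abs / omega  # assumed definition of the digital scale
    stored = [[w / alpha for w in row] for row in weight]
    return stored, alpha


weight = [[0.2, -0.8], [0.4, 0.1]]
stored, alpha = remap_with_omega(weight, omega=0.5)

# Applying the digital scale again recovers the logical weights.
recovered = [[alpha * w for w in row] for row in stored]
```

Under these assumptions the stored values span [-omega, omega], while the product of alpha and the stored values reproduces the matrix that was passed in.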

state_dict(destination=None, prefix='', keep_vars=False)[source]

Return a dictionary containing a whole state of the module.

Parameters
  • destination (Optional[Any]) –

  • prefix (str) –

  • keep_vars (bool) –

Return type

Dict

unregister_parameter(param_name)[source]

Unregister module parameter from parameters.

Raises

ModuleError – In case parameter is not found

Parameters

param_name (str) –

Return type

None

aihwkit.nn.modules.base.ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor

Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument size.

Parameters

size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.

Keyword Arguments
  • out (Tensor, optional) – the output tensor.

  • dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).

  • layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.

  • device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

  • requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.

Example:

>>> torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])

>>> torch.ones(5)
tensor([ 1.,  1.,  1.,  1.,  1.])