aihwkit.optim.context module

Parameter context for analog tiles.

class aihwkit.optim.context.AnalogContext(analog_tile, parameter=None)[source]

Bases: Parameter

Context for analog optimizer.

If analog_bias (which is provided by analog_tile) is False, data has the same meaning as in torch.nn.Parameter.

If analog_bias is True, the last column of data is the bias term.
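The bias layout described above can be sketched in plain Python. This is an illustration only and does not call aihwkit: the helper name split_weight_and_bias and the example values are hypothetical, and the combined matrix is assumed to have one row per output.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def split_weight_and_bias(
    data: Matrix, analog_bias: bool
) -> Tuple[Matrix, Optional[List[float]]]:
    """Split a combined matrix into its weight and bias parts.

    Hypothetical helper for illustration only, not part of aihwkit.
    """
    if not analog_bias:
        # No bias column: data is just the weight matrix.
        return [row[:] for row in data], None
    # With analog bias, the last column holds one bias value per output row.
    weight = [row[:-1] for row in data]
    bias = [row[-1] for row in data]
    return weight, bias


# Example: 2 outputs, 3 inputs, plus one bias column (assumed values).
combined = [
    [0.1, 0.2, 0.3, 1.0],
    [0.4, 0.5, 0.6, 2.0],
]
weight, bias = split_weight_and_bias(combined, analog_bias=True)
```

Here weight keeps the first three columns and bias collects the last one, so the two parts can be inspected separately.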

For diagnostic purposes, AnalogContext provides three public data view modes. Consider the code:

---
layer = AnalogLinear(4, 3, bias=False, rpu_config=rpu_config)
analog_tile = layer.analog_module
analog_ctx = analog_tile.analog_ctx
weight = analog_tile.get_weights()[0]
---

where weight is the logical weight view, that is, the physical weights with the scaling already applied (physical weights × scaling).

Data view modes are controlled by analog_ctx.data_view_mode and the corresponding methods:

---
analog_ctx.enable_placeholder()  # PLACEHOLDER mode (default)
analog_ctx.enable_data_view()    # DATA_VIEW mode
analog_ctx.enable_buffer()       # BUFFER mode
---

  • PLACEHOLDER (default): exposes only metadata, such as size() and shape.
    Since the RPU conductance values are not directly accessible in physical hardware, the weight values, as well as value-based operations such as norm(), are blocked by default. Accessing them raises RuntimeError.

    ---
    # Inspect metadata without reading values:
    analog_ctx.size()    # [4, 3]
    analog_ctx.device()  # 'cpu'
    analog_ctx.norm()    # RuntimeError
    ---

  • DATA_VIEW: exposes a read-only logical weight view through the data attribute, which is equivalent to analog_tile.get_weights()[0]. This allows users to inspect the effective weights.
    Since changes to either the weights or the scaling affect the logical weights, we adopt the convention that this logical view is read-only. In-place operations such as add_ and mul_ are therefore blocked.

    ---
    # The following three lines return the same value:
    analog_ctx.size()
    analog_ctx.data.size()
    weight.size()
    # Accessing values is allowed, but they are read-only:
    analog_ctx.norm()                   # Successfully returns the norm
    analog_ctx.norm() == weight.norm()  # True
    analog_ctx.add_(1.0)                # RuntimeError
    ---

  • BUFFER: exposes a zero-initialized tensor with the logical weight shape through the data attribute.
    In this mode, data is an independent buffer that is not connected to the analog tile. It is intended for optimizers with digital auxiliary state, such as mixed-precision training or TT-v2.

    ---
    analog_ctx.norm() == weight.norm()  # Typically False, since the buffer is independent
    analog_ctx.add_(1.0)                # Successfully adds 1.0 to the buffer, but does not
                                        # affect the analog tile weights
    ---
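The behaviour of the three modes described above can be mimicked with a small toy model. This is a sketch under stated assumptions, not the aihwkit implementation: the names ViewMode and ToyContext are hypothetical, a flat list of floats stands in for the analog tile, and the error messages are invented for illustration.

```python
import math
from enum import Enum
from typing import List


class ViewMode(Enum):
    PLACEHOLDER = "placeholder"
    DATA_VIEW = "data_view"
    BUFFER = "buffer"


class ToyContext:
    """Toy stand-in that mimics the three modes; NOT the aihwkit class."""

    def __init__(self, tile_weights: List[float]) -> None:
        self._tile_weights = tile_weights  # stands in for the analog tile
        self._buffer = [0.0] * len(tile_weights)  # zero-initialized buffer
        self.mode = ViewMode.PLACEHOLDER

    def size(self) -> int:
        # Metadata is available in every mode.
        return len(self._tile_weights)

    def _values(self) -> List[float]:
        if self.mode is ViewMode.PLACEHOLDER:
            raise RuntimeError("values are not accessible in PLACEHOLDER mode")
        if self.mode is ViewMode.DATA_VIEW:
            return list(self._tile_weights)  # a copy, so the view is read-only
        return self._buffer

    def norm(self) -> float:
        return math.sqrt(sum(v * v for v in self._values()))

    def add_(self, value: float) -> "ToyContext":
        if self.mode is not ViewMode.BUFFER:
            raise RuntimeError("in-place writes are only allowed in BUFFER mode")
        # Only the buffer changes; the stand-in tile weights stay untouched.
        self._buffer = [v + value for v in self._buffer]
        return self


ctx = ToyContext([3.0, 4.0])
ctx.mode = ViewMode.BUFFER
ctx.add_(1.0)                  # modifies the buffer only
ctx.mode = ViewMode.DATA_VIEW
tile_norm = ctx.norm()         # still computed from the tile weights
```

The point of the toy is the separation of concerns: metadata is always readable, the data view never writes, and the buffer never touches the tile.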

To update the internal analog weights, use the following update methods instead of writing data directly in the analog optimizer:

---
analog_ctx.analog_tile.update(…)
analog_ctx.analog_tile.update_indexed(…)
---

Caution: Even though DATA_VIEW mode allows direct access to the weights, always keep in mind that it is intended only for diagnostic purposes. To simulate a realistic read, call the read_weights method instead. For example, given analog_ctx: AnalogContext:

    estimated_weights, estimated_bias = analog_ctx.analog_tile.read_weights()
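Why an estimated read differs from the exact logical view can be illustrated with a toy model. The sketch below assumes, purely for illustration, that a read consists of noisy forward passes with unit vectors whose outputs are averaged; it does not describe how read_weights is implemented in aihwkit, and the names noisy_forward and estimate_weights are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def noisy_forward(
    weights: Matrix, x: List[float], noise_std: float, rng: random.Random
) -> List[float]:
    """Matrix-vector product with additive Gaussian output noise (toy model)."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + rng.gauss(0.0, noise_std)
        for row in weights
    ]


def estimate_weights(
    weights: Matrix, noise_std: float = 0.01, repeats: int = 20, seed: int = 0
) -> Matrix:
    """Estimate each weight column by probing with unit vectors and averaging."""
    rng = random.Random(seed)
    out_size, in_size = len(weights), len(weights[0])
    estimate = [[0.0] * in_size for _ in range(out_size)]
    for col in range(in_size):
        # A unit vector selects exactly one column of the weight matrix.
        unit = [1.0 if i == col else 0.0 for i in range(in_size)]
        for _ in range(repeats):
            y = noisy_forward(weights, unit, noise_std, rng)
            for row in range(out_size):
                estimate[row][col] += y[row] / repeats
    return estimate


true_weights = [[0.5, -0.2], [0.1, 0.3]]  # assumed example values
estimated = estimate_weights(true_weights)
```

In this toy, estimated is close to true_weights but not identical to it, which is the distinction the caution above draws between a diagnostic view and a simulated read.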

Parameters:
Return type:

AnalogContext

cpu()[source]

Move the context to CPU.

Note

This is a no-op for CPU context.

Returns:

self

Return type:

AnalogContext

cuda(device=None)[source]

Move the context to a cuda device.

Parameters:

device (device | str | int | None) – the desired device of the tile.

Returns:

This context on the specified device.

Return type:

AnalogContext

property data_view_mode: AnalogContextDataViewMode

Return the active public data access mode.

enable_buffer()[source]

Enable an independent zero-initialized digital data buffer.

Return type:

AnalogContext

enable_data_view()[source]

Enable read-only logical weight reads for diagnostics.

Return type:

AnalogContext

enable_placeholder()[source]

Enable metadata-only placeholder mode.

Return type:

AnalogContext

get_data()[source]

Get a detached tensor from the active public data view.

Return type:

Tensor

has_gradient()[source]

Return whether a gradient trace was stored.

Return type:

bool

reset(analog_tile=None)[source]

Reset the gradient trace and optionally set the tile pointer.

Parameters:

analog_tile (SimulatorTileWrapper | None)

Return type:

None

set_indexed(value=True)[source]

Set the context to forward_indexed.

Parameters:

value (bool)

Return type:

None

to(*args, **kwargs)[source]

Move analog tiles of the current context to a device.

Note

Please be aware that moving analog tiles from GPU to CPU is currently not supported.

Caution

Tensor conversions other than moving the device to CUDA, such as changing the data type, are not supported for analog tiles and are simply ignored.

Returns:

This module on the specified device.

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

AnalogContext