aihwkit.simulator.tiles.base module
High level analog tiles (base).
- class aihwkit.simulator.tiles.base.AnalogTileStateNames[source]
Bases: object
Class defining analog tile state name constants.
Caution
Do not edit. Some names are attribute names of the tile.
- ANALOG_STATE_NAME = 'analog_tile_state'
- ANALOG_STATE_PREFIX = 'analog_tile_state_'
- CLASS = 'analog_tile_class'
- CONTEXT = 'analog_ctx'
- EXTRA = 'state_extra'
- HIDDEN_PARAMETERS = 'analog_tile_hidden_parameters'
- HIDDEN_PARAMETER_NAMES = 'analog_tile_hidden_parameter_names'
- LR = 'analog_lr'
- MAPPING_SCALES = 'mapping_scales'
- OUT_SCALING = 'out_scaling_alpha'
- RPU_CONFIG = 'rpu_config'
- SHARED_WEIGHTS = 'shared_weights'
- VERSION = 'aihwkit_version'
- WEIGHTS = 'analog_tile_weights'
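These constants can be used to inspect an analog tile state dictionary without hard-coding key strings. A minimal sketch, assuming `state` is a tile state dict (e.g. obtained from an analog checkpoint) in which these keys are populated:

```python
from aihwkit.simulator.tiles.base import AnalogTileStateNames as SN

def describe_tile_state(state: dict) -> None:
    """Print a few entries of an analog tile state dictionary."""
    print("tile class:   ", state[SN.CLASS])
    print("weights shape:", state[SN.WEIGHTS].shape)
    print("learning rate:", state[SN.LR])
    print("saved with aihwkit version:", state[SN.VERSION])
```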
- class aihwkit.simulator.tiles.base.BaseTile[source]
Bases: object
Base class for tile classes (without torch.Module dependence).
- backward(d_input, ctx=None)[source]
Perform the backward pass.
- Parameters:
d_input (Tensor) – [N, out_size] tensor. If out_trans is set, transposed.
ctx (Any | None) – torch auto-grad context [Optional]
- Returns:
[N, in_size] tensor. If in_trans is set, transposed.
- Return type:
torch.Tensor
- joint_forward(x_input, is_test=False, ctx=None)[source]
Perform the joint forward method.
Calls first the pre_forward, then the tile forward, and finally the post_forward step.
Note
The full forward pass does not use autograd; thus all pre and post functions need to be handled appropriately in the pre/post backward functions.
- Parameters:
x_input (Tensor) – [N, in_size] tensor. If in_trans is set, transposed.
is_test (bool) – whether to assume testing mode.
ctx (Any | None) – torch auto-grad context [Optional]
- Returns:
[N, out_size] tensor. If out_trans is set, transposed.
- Return type:
torch.Tensor
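For illustration, a hedged sketch of calling joint_forward on a concrete tile implementing this interface (here an AnalogTile with a default SingleRPUConfig; the sizes are arbitrary):

```python
import torch
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.tiles import AnalogTile

tile = AnalogTile(out_size=3, in_size=5, rpu_config=SingleRPUConfig())

x = torch.rand(8, 5)                     # [N, in_size]
y = tile.joint_forward(x, is_test=True)  # [N, out_size], inference mode
print(y.shape)                           # torch.Size([8, 3])
```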
- update(x_input, d_input)[source]
Perform the update pass.
Calls the pre_update method to pre-process the inputs.
- Parameters:
x_input (Tensor) – [..., in_size] tensor. If in_trans is set, [in_size, ...].
d_input (Tensor) – [..., out_size] tensor. If out_trans is set, [out_size, ...].
- Returns:
None
- Return type:
None
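The three methods above can be combined into a manual, autograd-free training step. A minimal sketch, assuming an AnalogTile with a default SingleRPUConfig and a made-up error signal:

```python
import torch
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.tiles import AnalogTile

tile = AnalogTile(out_size=3, in_size=5, rpu_config=SingleRPUConfig())
tile.set_learning_rate(0.01)

x = torch.rand(8, 5)        # [N, in_size]
y = tile.joint_forward(x)   # [N, out_size]
d = y - torch.rand(8, 3)    # hypothetical error signal, [N, out_size]
tile.update(x, d)           # in-memory rank-update of the analog weights
```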
- class aihwkit.simulator.tiles.base.SimulatorTile[source]
Bases: object
Minimal class interface for implementing the simulator tile.
Note
This tile is generated by _create_simulator_tile in the SimulatorTileWrapper.
- backward(d_input, bias=False, in_trans=False, out_trans=False, non_blocking=False)[source]
Backward pass.
Only needs to be implemented if torch autograd is not used.
- Parameters:
d_input (Tensor) –
bias (bool) –
in_trans (bool) –
out_trans (bool) –
non_blocking (bool) –
- Return type:
Tensor
- dump_extra()[source]
Dumps any extra states / attributes necessary for checkpointing.
For tiles based on Modules, this should normally be handled by torch automatically.
- Return type:
Dict | None
- forward(x_input, bias=False, in_trans=False, out_trans=False, is_test=False, non_blocking=False)[source]
General simulator tile forward.
- Parameters:
x_input (Tensor) –
bias (bool) –
in_trans (bool) –
out_trans (bool) –
is_test (bool) –
non_blocking (bool) –
- Return type:
Tensor
- get_hidden_parameter_names()[source]
Get the hidden parameter names.
Each name corresponds to a slice of the get_hidden_parameters tensor.
- Returns:
List of names.
- Return type:
List[str]
- get_hidden_parameters()[source]
Get the hidden parameters of the tile.
- Returns:
Hidden parameter tensor.
- Return type:
Tensor
- get_learning_rate()[source]
Get the learning rate of the tile.
- Returns:
learning rate if exists.
- Return type:
float | None
- load_extra(extra, strict=False)[source]
Load any extra states / attributes necessary when loading from a checkpoint.
For tiles based on Modules, this should normally be handled by torch automatically.
Note
Expects the exact same RPUConfig / device etc. for applying the states. Cross-loading of state-dicts is not supported for extra states; they will simply be ignored.
- Parameters:
extra (Dict) – dictionary of states from dump_extra.
strict (bool) – Whether to throw an error if keys are not found.
- Return type:
None
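As an illustration of the dump_extra / load_extra contract, here is a hypothetical SimulatorTile subclass that checkpoints a non-tensor attribute. This is only a sketch: it assumes SimulatorTile needs no constructor arguments, and the `pulse_counter` attribute is made up.

```python
from typing import Dict, Optional
from aihwkit.simulator.tiles.base import SimulatorTile

class CountingSimulatorTile(SimulatorTile):
    """Hypothetical tile that tracks a pulse counter outside of torch."""

    def __init__(self) -> None:
        self.pulse_counter = 0  # extra state torch knows nothing about

    def dump_extra(self) -> Optional[Dict]:
        # Persist the non-tensor attribute for checkpointing.
        return {"pulse_counter": self.pulse_counter}

    def load_extra(self, extra: Dict, strict: bool = False) -> None:
        # Restore the attribute; honor `strict` if the key is missing.
        if "pulse_counter" not in extra:
            if strict:
                raise KeyError("pulse_counter missing from extra state")
            return
        self.pulse_counter = extra["pulse_counter"]
```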
- set_hidden_parameters(params)[source]
Set the hidden parameters of the tile.
- Parameters:
params (Tensor) –
- Return type:
None
- set_learning_rate(learning_rate)[source]
Set the learning rate of the tile.
No-op for tiles that do not need a learning rate.
- Parameters:
learning_rate (float | None) – learning rate to set
- Return type:
None
- set_weights(weight)[source]
Sets the analog weights.
- Parameters:
weight (Tensor) –
- Return type:
None
- set_weights_uniform_random(bmin, bmax)[source]
Sets the weights to uniform random numbers.
- Parameters:
bmin (float) – min value
bmax (float) – max value
- Return type:
None
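A short sketch of both weight setters. It uses an AnalogTile, whose set_weights sits on top of this interface, and assumes the wrapper exposes the underlying simulator tile as `tile.tile` (an implementation detail, so treat it as an assumption):

```python
import torch
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.tiles import AnalogTile

tile = AnalogTile(out_size=3, in_size=5, rpu_config=SingleRPUConfig())

# Explicit target weights, shape [out_size, in_size].
tile.set_weights(torch.randn(3, 5) * 0.1)

# Uniform random weights in [-0.2, 0.2], drawn on the simulator
# tile itself (assumes it is exposed as `tile.tile`).
tile.tile.set_weights_uniform_random(-0.2, 0.2)
```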
- update(x_input, d_input, bias=False, in_trans=False, out_trans=False, non_blocking=False)[source]
Perform the update pass.
Only needs to be implemented if torch autograd update is not used.
- Parameters:
x_input (Tensor) –
d_input (Tensor) –
bias (bool) –
in_trans (bool) –
out_trans (bool) –
non_blocking (bool) –
- Return type:
Tensor
- class aihwkit.simulator.tiles.base.SimulatorTileWrapper(out_size, in_size, rpu_config, bias=True, in_trans=False, out_trans=False, torch_update=False, handle_output_bound=False, ignore_analog_state=False)[source]
Bases: object
Wrapper base class for defining the necessary tile functionality.
Will be overloaded or extended for C++ tiles or for any TorchTile.
- Parameters:
out_size (int) – output size
in_size (int) – input size
rpu_config (InferenceRPUConfig | SingleRPUConfig | UnitCellRPUConfig | TorchInferenceRPUConfig | DigitalRankUpdateRPUConfig) – resistive processing unit configuration.
bias (bool) – whether to add a bias column to the tile.
in_trans (bool) – whether to assume a transposed input (batch first)
out_trans (bool) – whether to assume a transposed output (batch first)
shared_weights – optional shared weights tensor memory that should be used.
handle_output_bound (bool) – whether the bound clamp gradient should be inserted
ignore_analog_state (bool) – whether to ignore the analog state when __getstate__ is called
torch_update (bool) –
- cuda(device=None)[source]
Return a copy of the tile in CUDA memory.
- Parameters:
device (str | device | int | None) – CUDA device
- Returns:
Self with the underlying C++ tile moved to CUDA memory.
- Raises:
CudaError – if the library has not been compiled with CUDA.
- Return type:
SimulatorTileWrapper
- get_analog_ctx()[source]
Return the analog context of the tile to be used in AnalogFunction.
- Return type:
AnalogContext
- get_analog_state()[source]
Get the analog state for the state_dict.
Excludes the non-analog state names that might be added for pickling. Only fields defined in AnalogTileStateNames are returned.
- Return type:
Dict
- get_forward_out_bound()[source]
Helper for getting the output bound to correct the gradients using the AnalogFunction.
- Return type:
float | None
- get_hidden_parameters()[source]
Get the hidden parameters of the tile.
- Returns:
Ordered dictionary of hidden parameter tensors.
- Return type:
OrderedDict
- get_learning_rate()[source]
Return the tile learning rate.
- Returns:
the tile learning rate.
- Return type:
float
- get_tensor_view(ndim, dim=None)[source]
Return the tensor view for ndim vector at dim.
- Parameters:
ndim (int) – number of dimensions
dim (int | None) – the dimension to set to -1
- Returns:
Tuple of ones with the dim index set to -1
- Return type:
tuple
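For instance, with ndim=2 and dim=0 the returned view is (-1, 1), which lets a per-output-column vector broadcast over a [out_size, in_size] weight tensor. A small sketch (sizes arbitrary):

```python
import torch
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.tiles import AnalogTile

tile = AnalogTile(out_size=3, in_size=5, rpu_config=SingleRPUConfig())

view = tile.get_tensor_view(2, dim=0)   # expected: (-1, 1)
scales = torch.tensor([1.0, 0.5, 2.0])  # one scale per output column
scaled = torch.randn(3, 5) * scales.view(*view)  # broadcasts over in_size
```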
- post_update_step()[source]
Operators that need to be called once per mini-batch.
Note
This function is called by the analog optimizer.
Caution
If no analog optimizer is used, the post update steps will not be performed.
- Return type:
None
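A minimal sketch of the standard aihwkit pattern that ensures post_update_step runs: train through an analog optimizer such as AnalogSGD (layer sizes and data here are made up):

```python
import torch
from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD

model = AnalogLinear(4, 2)
opt = AnalogSGD(model.parameters(), lr=0.1)
opt.regroup_param_groups(model)

x, target = torch.rand(10, 4), torch.rand(10, 2)
loss = torch.nn.functional.mse_loss(model(x), target)

opt.zero_grad()
loss.backward()
opt.step()  # the analog optimizer calls post_update_step on each tile
```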
- set_hidden_parameters(ordered_parameters)[source]
Set the hidden parameters of the tile.
Caution
Usually the hidden parameters are drawn according to the parameter definitions (those given in the RPU config). If the hidden parameters are arbitrarily set by the user, this correspondence might be broken. This can cause problems during learning; in particular, the weight granularity (usually dw_min, depending on the device) is needed for the dynamic adjustment of the bit length (update_bl_management, see UpdateParameters).
Currently, an attempt is made to estimate the new dw_min parameter from the average of the hidden parameters if the discrepancy with the dw_min from the definition is too large.
- Parameters:
ordered_parameters (OrderedDict) – Ordered dictionary of hidden parameter tensors.
- Raises:
TileError – in case the ordered dict keys do not conform with the hidden parameter structure of the current RPU config tile.
- Return type:
None
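A hedged sketch of the read-modify-write pattern (the 1% perturbation is made up; per the caution above, arbitrary edits can break the dw_min correspondence):

```python
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.tiles import AnalogTile

tile = AnalogTile(out_size=3, in_size=5, rpu_config=SingleRPUConfig())

params = tile.get_hidden_parameters()  # OrderedDict of hidden parameter tensors
for name, tensor in params.items():
    tensor.mul_(1.01)                  # hypothetical 1% perturbation
tile.set_hidden_parameters(params)
```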
- set_learning_rate(learning_rate)[source]
Set the tile learning rate.
Set the tile learning rate to -learning_rate. Note that the learning rate is always taken to be negative (because of the meaning in gradient descent) and positive learning rates are not supported.
learning_rate (float | None) – the desired learning rate.
- Return type:
None
- set_verbosity_level(verbose)[source]
Set the verbosity level.
- Parameters:
verbose (int) – level of verbosity
- Return type:
None