aihwkit.nn.modules.container module¶
Analog Modules that contain children Modules.
- class aihwkit.nn.modules.container.AnalogSequential(*args: torch.nn.modules.module.Module)[source]¶
- class aihwkit.nn.modules.container.AnalogSequential(arg: OrderedDict[str, Module])
Bases: torch.nn.modules.container.Sequential
An analog-aware sequential container.
Specialization of torch's nn.Sequential with extra functionality for handling analog layers:
- correct handling of .cuda() for children modules.
- applying analog-specific functions to all its children (drift and program weights).
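Example (an illustrative sketch; the layer sizes and the choice of InferenceRPUConfig are placeholders):
import torch
from aihwkit.nn import AnalogLinear, AnalogSequential
from aihwkit.simulator.configs import InferenceRPUConfig

rpu_config = InferenceRPUConfig()  # illustrative config choice
model = AnalogSequential(
    AnalogLinear(4, 3, rpu_config=rpu_config),
    torch.nn.Sigmoid(),
    AnalogLinear(3, 1, rpu_config=rpu_config),
)
y = model(torch.rand(2, 4))  # forward pass works as in nn.Sequential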
Note
This class is recommended over nn.Sequential in order to correctly propagate the actions to all the children analog layers. If using regular containers, please be aware that operations need to be applied manually to the children analog layers when needed.
- analog_modules()[source]¶
Generator over analog modules only.
- Return type
Generator[aihwkit.nn.modules.base.AnalogModuleBase, None, None]
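Example (a sketch reusing the model built in the class example above):
num_analog = sum(1 for _ in model.analog_modules())  # digital children (e.g. Sigmoid) are skipped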
- apply_to_analog_modules(fn)[source]¶
Apply a function to all the analog modules.
- Parameters
fn (Callable) – function to be applied.
- Returns
This module after the function has been applied.
- Return type
AnalogSequential
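Example (a sketch; eval() is a standard nn.Module method, used here purely as an illustration):
model.apply_to_analog_modules(lambda mod: mod.eval())  # applied to every analog child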
- apply_to_analog_tiles(fn)[source]¶
Apply a function to all the analog tiles of all layers in this module.
Example:
model.apply_to_analog_tiles(lambda tile: tile.reset())
This would reset each analog tile in the whole DNN, looping through all layers and all tiles that might exist in a particular layer.
- Parameters
fn (Callable) – function to be applied.
- Returns
This module after the function has been applied.
- Return type
AnalogSequential
- cpu()[source]¶
Moves all model parameters and buffers to the CPU.
Note
This method modifies the module in-place.
- Returns
self
- Return type
Module
- cuda(device=None)[source]¶
Moves all model parameters and buffers to the GPU.
This also makes associated parameters and buffers different objects, so it should be called before constructing the optimizer if the module will live on the GPU while being optimized.
Note
This method modifies the module in-place.
- Parameters
device (int, optional) – if specified, all parameters will be copied to that device
- Returns
self
- Return type
Module
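Example (a sketch using aihwkit's AnalogSGD; assumes a CUDA-enabled installation and the model from the class example):
from aihwkit.optim import AnalogSGD

model.cuda()  # move parameters, buffers and analog tiles first
optimizer = AnalogSGD(model.parameters(), lr=0.1)  # construct the optimizer afterwards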
- drift_analog_weights(t_inference=0.0)[source]¶
(Program) and drift all analog inference layers of a given model.
- Parameters
t_inference (float) – assumed time of inference (in sec)
- Raises
ModuleError – if the layer is not in evaluation mode.
- Return type
None
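Example (a sketch assuming inference tiles and the model from the class example):
model.eval()  # drifting requires evaluation mode
model.drift_analog_weights(t_inference=3600.0)  # program and drift to t = 1 hour
y_drifted = model(torch.rand(2, 4))  # inference with the drifted weights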
- classmethod from_digital(module, *args, **kwargs)[source]¶
Construct AnalogSequential in-place from Sequential.
- Parameters
module (torch.nn.modules.container.Sequential) – the digital Sequential module to convert.
args (Any) –
kwargs (Any) –
- Return type
AnalogSequential
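Example (a minimal sketch; full conversion of digital layers to analog ones is typically done via aihwkit.nn.conversion.convert_to_analog):
import torch
from aihwkit.nn import AnalogSequential

digital = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.ReLU())
analog_container = AnalogSequential.from_digital(digital)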
- get_analog_tile_device()[source]¶
Return the device used by the analog tiles of this module.
- Returns
torch device
- Return type
device
- Raises
TileError – in case the model is on non-unique devices
- load_state_dict(state_dict, strict=True, load_rpu_config=True)[source]¶
Specializes torch's load_state_dict to add a flag whether to load the RPU config from the saved state.
- Parameters
state_dict (OrderedDict[str, Tensor]) – see torch's load_state_dict.
strict (bool) – see torch's load_state_dict.
load_rpu_config (bool) – whether to load the saved RPU config or use the current RPU config of the model.
Caution
If load_rpu_config=False, the RPU config can be changed from the stored model. However, the user has to make sure that the changed RPU config makes sense. For instance, changing the device type might change the expected fields in the hidden parameters and result in an error.
- Returns
see torch's load_state_dict
- Return type
NamedTuple
- Raises
ModuleError – in case the rpu_config class mismatches for load_rpu_config=False.
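Example (a sketch of reloading a checkpoint while keeping the model's current RPU config; the file name is illustrative):
import torch

state = torch.load('checkpoint.pth')  # a state_dict saved earlier with torch.save
model.load_state_dict(state, strict=True, load_rpu_config=False)  # rpu_config classes must match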
- named_analog_modules()[source]¶
Generator over analog modules only, yielding (name, module) pairs.
- Return type
Generator[Tuple[str, aihwkit.nn.modules.base.AnalogModuleBase], None, None]
- prepare_for_ddp()[source]¶
Adds ignores to avoid broadcasting the analog tile states in case of distributed training.
Note
Call this function before the model is converted with DDP.
Important
Only InferenceTile supports DDP.
- Raises
ModuleError – in case analog tiles are used that do not support data-parallel mode, i.e. all analog training tiles.
- Return type
None
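Example (a sketch of the intended ordering; process-group initialization is omitted and inference tiles are assumed):
from torch.nn.parallel import DistributedDataParallel

model.prepare_for_ddp()  # register the ignores before wrapping
ddp_model = DistributedDataParallel(model)  # device_ids etc. omitted for brevity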
- program_analog_weights()[source]¶
Program all analog inference layers of a given model.
- Raises
ModuleError – if the layer is not in evaluation mode.
- Return type
None
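Example (a sketch; evaluation mode is required as noted above):
model.eval()
model.program_analog_weights()  # write the learned weights onto the analog devices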
- remap_analog_weights(weight_scaling_omega=1.0)[source]¶
Remap the analog weights and set the digital out scales.
Caution
This should typically not be called for analog training unless realistic_read_write is set, in which case it would perform a full re-write of the weights. Typically, this method is intended to correct the mapping for hardware-aware trained models before doing the inference with programmed weights.
- Parameters
weight_scaling_omega (Optional[float]) – the optional value to remap the weight max to. If None, it will take the value set initially in RPUConfig.mapping. Defaults to 1.0.
- Return type
None
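Example (a sketch of preparing a hardware-aware trained model for inference):
model.eval()
model.remap_analog_weights()  # remaps the weight max to 1.0, the default omega
model.program_analog_weights()  # then program the remapped weights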
- to(device=None)[source]¶
Move and/or cast the parameters, buffers and analog tiles.
Note
Please be aware that moving analog layers from GPU to CPU is currently not supported.
- Parameters
device (Optional[Union[torch.device, str, int]]) – the desired device of the parameters, buffers and analog tiles in this module.
- Returns
This module on the specified device.
- Return type
AnalogSequential