aihwkit.simulator.parameters.training module
Forward / backward / update related parameters for resistive processing units.
- class aihwkit.simulator.parameters.training.UpdateParameters(desired_bl=31, fixed_bl=True, pulse_type=PulseType.STOCHASTIC_COMPRESSED, res=0, x_res_implicit=0, d_res_implicit=0, d_sparsity=False, sto_round=False, update_bl_management=True, update_management=True, um_grad_scale=1.0)[source]
Bases:
_PrintableMixin
Parameters that modify the update behaviour of a pulsed device.
- Parameters:
desired_bl (int) –
fixed_bl (bool) –
pulse_type (PulseType) –
res (float) –
x_res_implicit (float) –
d_res_implicit (float) –
d_sparsity (bool) –
sto_round (bool) –
update_bl_management (bool) –
update_management (bool) –
um_grad_scale (float) –
- bindings_module: ClassVar[str] = 'devices'
- d_res_implicit: float = 0
Resolution (i.e. bin width) of each quantization step for the error d in case of DeterministicImplicit pulse trains. See PulseTypeMap for details.
- d_sparsity: bool = False
Whether to compute gradient sparsity.
- desired_bl: int = 31
Desired length of the pulse trains.
For update BL management, it is the maximal pulse train length.
- fixed_bl: bool = True
Whether to fix the length of the pulse trains.
See also update_bl_management.

In case of True (where dw_min is the mean minimal weight change step size) it is:

    BL = desired_BL
    A = B = sqrt(learning_rate / (dw_min * BL))

In case of False:

    if dw_min * desired_BL < learning_rate:
        A = B = 1
        BL = ceil(learning_rate / dw_min)
    else:
        # same as for fixed_BL = True
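The two branches above can be sketched as a small standalone helper. This is illustrative only; the function name and argument layout are not part of aihwkit, but the formulas follow the docstring:

```python
import math

def pulse_train_params(learning_rate, dw_min, desired_bl, fixed_bl=True):
    """Illustrative sketch (not aihwkit code) of how BL, A and B follow
    from the fixed_bl setting, per the docstring above."""
    if fixed_bl or dw_min * desired_bl >= learning_rate:
        # fixed_bl=True, or the learning rate is small enough:
        bl = desired_bl
        a = b = math.sqrt(learning_rate / (dw_min * bl))
    else:
        # fixed_bl=False and the learning rate exceeds dw_min * desired_BL:
        a = b = 1.0
        bl = math.ceil(learning_rate / dw_min)
    return bl, a, b
```

With fixed_bl=False and a large learning rate, the pulse train grows until each pulse again corresponds to one dw_min step; otherwise the desired length is kept and the step size is rescaled.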
- pulse_type: PulseType = 'StochasticCompressed'
Switching between different pulse types. See PulseTypeMap for details.

Important: Pulsing can also be turned off, in which case the update is done as if in floating point and all other update-related parameters are ignored.
- res: float = 0
Resolution (i.e. bin width, in 0..1) of the update probability for the stochastic bit line generation. Use -1 to turn discretization off. Can also be given as a number of steps.
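The discretization described above can be sketched as follows. This is an illustrative stand-in, not aihwkit code; in particular, treating any non-positive res as "off" is an assumption made here for the sketch:

```python
def discretize_prob(p, res):
    """Illustrative sketch (not aihwkit code): quantize an update
    probability p in [0, 1] into bins of width res.  A non-positive
    res turns discretization off (assumption for this sketch); a res
    greater than 1 is interpreted as a number of steps."""
    if res <= 0:
        return p           # discretization off
    if res > 1:
        res = 1.0 / res    # res was given as a number of steps
    return round(p / res) * res
```

For example, a bin width of 0.25 and a step count of 4 give the same quantization grid.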
- sto_round: bool = False
Whether to enable stochastic rounding.
- um_grad_scale: float = 1.0
Scales the gradient for the update management.
The factor \(\alpha\) for the
update_management
. If smaller than 1 it means that the gradient will be earlier clipped when learning rate is too large (ie. exceeding the maximal pulse number times the weight granularity). If 1, both d and x inputs are clipped for the same learning rate.
- update_bl_management: bool = True
Whether to enable dynamical adjustment of A, B, and BL:

    BL = ceil(learning_rate * abs(x_j) * abs(d_i) / weight_granularity)
    BL = min(BL, desired_BL)
    A = B = sqrt(learning_rate / (weight_granularity * BL))

The weight_granularity is usually equal to dw_min.
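The adjustment above can be sketched as a standalone helper. This is illustrative only (the name and signature are not aihwkit's), and it applies the formula to the maximal absolute x and d inputs, which is an assumption of this sketch:

```python
import math

def bl_management(x, d, learning_rate, weight_granularity, desired_bl):
    """Illustrative sketch (not aihwkit code) of the update BL
    management formula above, evaluated at the maximal |x| and |d|."""
    max_x = max(abs(v) for v in x)
    max_d = max(abs(v) for v in d)
    bl = math.ceil(learning_rate * max_x * max_d / weight_granularity)
    bl = min(bl, desired_bl)  # never exceed the desired pulse train length
    a = b = math.sqrt(learning_rate / (weight_granularity * bl))
    return bl, a, b
```

Small inputs or a small learning rate thus shorten the pulse train, while desired_bl caps it from above.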
- update_management: bool = True
Whether to apply additional scaling.
In addition to the above setting, an extra scaling (always on when using update_bl_management) is applied to account for the different input strengths. If

\[\gamma \equiv \max_i |x_i| / (\alpha \max_j |d_j|)\]

is the ratio between the two maximal inputs, then A is additionally scaled by \(\gamma\) and B is scaled by \(1/\gamma\).

The gradient scale \(\alpha\) can be set with um_grad_scale.
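The \(\gamma\) rescaling just described can be sketched as follows (illustrative only; the function name and signature are not part of aihwkit):

```python
def update_management_scales(x, d, a, b, um_grad_scale=1.0):
    """Illustrative sketch (not aihwkit code): rescale A and B by the
    input-strength ratio gamma = max|x| / (alpha * max|d|) from above."""
    gamma = max(abs(v) for v in x) / (um_grad_scale * max(abs(v) for v in d))
    return a * gamma, b / gamma  # A scaled by gamma, B by 1/gamma
```

Note that the product A * B is unchanged by this rescaling; only the relative strengths of the x and d pulse probabilities are balanced.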
- x_res_implicit: float = 0
Resolution (i.e. bin width) of each quantization step for the inputs x in case of DeterministicImplicit pulse trains. See PulseTypeMap for details.