aihwkit.simulator.configs.utils module

Utility parameters for resistive processing units.

class aihwkit.simulator.configs.utils.BoundManagementType(value)

Bases: enum.Enum

Bound management type.

In the Iterative case, the MAC is recomputed with the inputs halved each time the output bound is hit.

Caution

Bound management is only available for the forward pass. It will be ignored when used for the backward pass.

ITERATIVE = 'Iterative'

Iteratively recomputes with the input scale set to \(\alpha\leftarrow\alpha/2\).

The MAC is recomputed while the output bound is hit, up to a limit on the number of passes (given by max_bm_factor or max_bm_res).
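
The iterative scheme can be sketched in plain Python as follows. This is an illustration, not aihwkit code: the helper names `analog_mac` and `iterative_bound_management` are hypothetical, the toy MAC only models output clipping, and the way `max_bm_factor` ends the loop is an assumption (the text only states that it limits the number of passes).

```python
def analog_mac(weights, x, out_bound):
    """Toy analog MAC: exact dot products, clipped at +/- out_bound."""
    outputs = []
    for row in weights:
        y = sum(w * v for w, v in zip(row, x))
        outputs.append(max(-out_bound, min(out_bound, y)))
    return outputs


def iterative_bound_management(weights, x, out_bound=12.0, max_bm_factor=1000):
    """Halve the inputs until no output saturates or the factor limit is reached.

    Returns the rescaled outputs and the final input reduction factor.
    """
    factor = 1.0
    while True:
        scaled = [v / factor for v in x]
        y = analog_mac(weights, scaled, out_bound)
        saturated = any(abs(v) >= out_bound for v in y)
        # Assumption: stop once another halving would exceed max_bm_factor.
        if not saturated or factor * 2 > max_bm_factor:
            # Undo the input scaling on the (digital) output side.
            return [v * factor for v in y], factor
        factor *= 2.0  # inputs are halved again for the next pass


weights = [[1.0, 1.0], [0.5, -0.5]]
outputs, factor = iterative_bound_management(weights, [10.0, 10.0])
print(outputs, factor)
```

In this example the first pass saturates (the exact result 20 exceeds the bound of 12), so a second pass with halved inputs is computed and its result is scaled back by 2.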

ITERATIVE_WORST_CASE = 'IterativeWorstCase'

Worst case bound management.

Uses AbsMax noise management for the first pass and, only when the output bound is hit, AbsMaxNPSum for the second. Thus, at most 2 passes are computed.

NONE = 'None'

No bound management.

SHIFT = 'Shift'

Shift bound management.

Shifts the output by adding the difference output_bound - max_output to the analog output value. This is only useful to increase the dynamic range before a softmax, which is unaffected by adding the same constant to all of its inputs.

Note

Shifting needs hardware implementations.

class aihwkit.simulator.configs.utils.DriftParameter(nu=0.0, t_0=1.0, reset_tol=1e-07, nu_dtod=0.0, nu_std=0.0, wg_ratio=1.0, g_offset=0.0, w_offset=0.0, nu_k=0.0, log_g0=0.0, w_noise_std=0.0)

Bases: aihwkit.simulator.configs.utils.SimpleDriftParameter

Parameter for a power law drift.

The drift is based on the model described by Oh et al (2019).

It computes:

\[w_{ij}\left(\frac{t + \Delta t}{t_0}\right)^{-\nu^\text{actual}_{ij}}\]

where the drift coefficient is drawn once at the beginning and may differ from device to device. It can also depend on the actual weight value.

The actual drift coefficient is computed as:

\[\nu_{ij}^\text{actual} = \nu_{ij} - \nu_k \log \frac{(w_{ij} - w_\text{off}) / r_\text{wg} + g_\text{off}}{G_0} + \nu\sigma_\nu\xi\]

Here \(w_{ij}\) is the actual weight and \(\nu_{ij}\) is fixed for each device, given by the mean \(\nu\) and the device-to-device variation: \(\nu_{ij} = \nu + \nu_\text{dtod}\nu\xi\). These values are drawn only once at the beginning (tile instantiation). \(\xi\) is Gaussian noise.

Note

If the weight has changed since the last drift call (as determined by the reset_tol parameter), for instance due to an update, decay or noise, then the drift time \(t\) is reset and starts anew; the drift coefficients \(\nu_{ij}\), however, are not changed. If, on the other hand, the weights have not changed since the last call, \(t\) accumulates the time.

Caution

Note that the drift coefficient does not depend on the initially programmed weight value at \(t=0\) in the current implementation (i.e. \(G_0\) is a constant for all devices), but instead on the actual weight. For some materials (e.g. phase-change materials), that might not be accurate.

g_offset: float = 0.0

g_min to convert to physical units.

log_g0: float = 0.0

Logarithm of the reference conductance \(G_0\).

nu_dtod: float = 0.0

Device-to-device variation of the \(\nu\) values.

nu_k: float = 0.0

Variation of \(\nu\) with \(W\).

i.e. \(\nu(R) = \nu_0 - k \log(G/G_0)\). See Oh et al.

nu_std: float = 0.0

Cycle-to-cycle variation of \(\nu\).

A more realistic way to add noise to the drift might be to use w_noise_std.

w_noise_std: float = 0.0

Additional weight noise (Gaussian diffusion) added to the weights after the drift is applied.

w_offset: float = 0.0

w(g_min), i.e. the value that g_min is mapped to in w-space.

wg_ratio: float = 1.0

(w_max-w_min)/(g_max-g_min) to convert to physical units.
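
The two drift formulas of this class can be evaluated for a single weight as in the sketch below. It is not the aihwkit implementation: the function names are hypothetical, the random sample \(\xi\) is passed in by the caller, and it assumes that `log_g0` holds the natural logarithm of \(G_0\).

```python
import math


def drift_coefficient(w, nu_ij, nu_k=0.0, log_g0=0.0, wg_ratio=1.0,
                      w_offset=0.0, g_offset=0.0, nu=0.0, nu_std=0.0, xi=0.0):
    """Actual drift coefficient of one weight, following the formula above."""
    # Convert the weight to a conductance in physical units.
    g = (w - w_offset) / wg_ratio + g_offset
    nu_actual = nu_ij
    if nu_k != 0.0:
        # log(g / G_0) = log(g) - log(G_0); requires g > 0.
        # Assumption: log_g0 is the natural logarithm of G_0.
        nu_actual -= nu_k * (math.log(g) - log_g0)
    # Cycle-to-cycle term; xi is a standard normal sample supplied by the caller.
    return nu_actual + nu * nu_std * xi


def drifted_weight(w, nu_actual, t, delta_t, t_0=1.0):
    """Apply the power law w * ((t + delta_t) / t_0) ** (-nu_actual)."""
    return w * ((t + delta_t) / t_0) ** (-nu_actual)


nu_actual = drift_coefficient(0.5, nu_ij=0.1)
print(drifted_weight(0.5, nu_actual, t=0.0, delta_t=100.0))
```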

class aihwkit.simulator.configs.utils.IOParameters(bm_test_negative_bound=True, bound_management=<BoundManagementType.ITERATIVE: 'Iterative'>, inp_bound=1.0, inp_noise=0.0, inp_res=0.007936507936507936, inp_sto_round=False, is_perfect=False, max_bm_factor=1000, max_bm_res=0.25, nm_thres=0.0, noise_management=<NoiseManagementType.ABS_MAX: 'AbsMax'>, out_bound=12.0, out_noise=0.06, out_res=0.00196078431372549, out_scale=1.0, out_sto_round=False, w_noise=0.0, w_noise_type=<WeightNoiseType.NONE: 'None'>)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameters that modify the IO behavior.

bm_test_negative_bound: bool = True
bound_management: aihwkit.simulator.configs.utils.BoundManagementType = 'Iterative'

Type of bound management, see BoundManagementType.

Caution

Bound management is only available for the forward pass. It will be ignored when used for the backward pass.

inp_bound: float = 1.0

Input bound and ranges for the digital-to-analog converter (DAC).

inp_noise: float = 0.0

Std deviation of Gaussian input noise (\(\sigma_\text{inp}\)).

i.e. noisiness of the analog input (at the stage after DAC and before the multiplication).

inp_res: float = 0.007936507936507936

Number of discretization steps for DAC (\(\le0\) means infinite steps) or resolution (1/steps).
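
To illustrate the two ways of giving `inp_res`, the sketch below quantizes a single input value. It is not aihwkit code: the helper `quantize_input` is hypothetical, and it assumes that a value greater than 1 is read as a number of steps and that the resolution is relative to the full input range from `-inp_bound` to `inp_bound`.

```python
def quantize_input(value, inp_res=1 / 126, inp_bound=1.0):
    """Clip value to [-inp_bound, inp_bound] and round it to the DAC grid."""
    clipped = max(-inp_bound, min(inp_bound, value))
    if inp_res <= 0:
        return clipped  # infinite resolution: no discretization
    # Assumption: values above 1 give the number of steps, otherwise 1/steps.
    resolution = 1.0 / inp_res if inp_res > 1 else inp_res
    # Assumption: the resolution is relative to the full range 2 * inp_bound.
    step = resolution * 2 * inp_bound
    return round(clipped / step) * step


print(quantize_input(0.3, inp_res=0.25), quantize_input(0.3, inp_res=4))
```

Under these assumptions, `inp_res=0.25` and `inp_res=4` describe the same converter.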

inp_sto_round: bool = False

Whether to enable stochastic rounding of DAC.

is_perfect: bool = False

Short-cut to compute a perfect forward pass.

If True, it assumes an ideal forward pass (e.g. no bound, ADC etc…). Will disregard all other settings in this case.

max_bm_factor: int = 1000

Maximal bound management factor.

If this factor is reached then the iterative process is stopped.

max_bm_res: float = 0.25

Limit the maximal number of iterations of the bound management.

Another way to limit the maximal number of iterations of the bound management: the maximal effective resolution of the inputs, e.g. use \(1/4\) for 2 bits.

nm_thres: float = 0.0

Constant noise management value for type Constant.

In other cases, this is an upper threshold \(\theta\) above which the noise management factor is saturated. E.g. for AbsMax:

\begin{equation*} \alpha=\begin{cases}\max_i|x_i|, & \text{if} \max_i|x_i|<\theta \\ \theta, & \text{otherwise}\end{cases} \end{equation*}

Caution

If nm_thres is set (and type is not Constant), the noise management will clip some large input values, in favor of having a better SNR for smaller input values.
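
The saturation of the AbsMax factor can be sketched as follows. The helper `abs_max_scale` is hypothetical, and treating `nm_thres <= 0` as "no threshold" is an assumption based on the default value of 0.0.

```python
def abs_max_scale(x, nm_thres=0.0):
    """AbsMax noise management factor, saturated at nm_thres if it is > 0."""
    alpha = max(abs(v) for v in x)
    # Assumption: a threshold of 0 (the default) disables the saturation.
    if nm_thres > 0 and alpha >= nm_thres:
        alpha = nm_thres
    return alpha


x = [0.1, -0.2, 4.0]
alpha_free = abs_max_scale(x)                # no threshold
alpha_capped = abs_max_scale(x, nm_thres=1.0)  # saturated at the threshold
scaled = [v / alpha_capped for v in x]
print(alpha_free, alpha_capped, scaled)
```

With the threshold, the scaled input still contains the value 4.0, which exceeds an input bound of 1 and would be clipped, while the two small values are no longer shrunk by a factor of 4.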

noise_management: aihwkit.simulator.configs.utils.NoiseManagementType = 'AbsMax'

Type of noise management, see NoiseManagementType.

out_bound: float = 12.0

Output bound and ranges for analog-to-digital converter (ADC).

out_noise: float = 0.06

Std deviation of Gaussian output noise (\(\sigma_\text{out}\)).

i.e. noisiness of device summation at the output.

out_res: float = 0.00196078431372549

Number of discretization steps for ADC or resolution.

Number of discretization steps for ADC (\(\le0\) means infinite steps) or resolution (1/steps).

out_scale: float = 1.0

Additional fixed scalar factor.

out_sto_round: bool = False

Whether to enable stochastic rounding of ADC.

w_noise: float = 0.0

Scale of output referred weight noise (\(\sigma_w\)) for a given w_noise_type.

w_noise_type: aihwkit.simulator.configs.utils.WeightNoiseType = 'None'

Type as specified in WeightNoiseType.

Note

This noise is applied anew each time, as it is referred to the output. It will not change the conductance values of the weight matrix. For the latter, one can apply diffuse_weights().

class aihwkit.simulator.configs.utils.NoiseManagementType(value)

Bases: enum.Enum

Noise management type.

Noise management determines a factor \(\alpha\) by which the input is scaled down:

\[\mathbf{y} = \alpha\;F_\text{analog-mac}\left(\mathbf{x}/\alpha\right)\]

ABS_MAX = 'AbsMax'

Use \(\alpha\equiv\max{|\mathbf{x}|}\).

ABS_MAX_NP_SUM = 'AbsMaxNPSum'

Assume weight value is constant and given by nm_assumed_wmax.

Takes a worst case scenario of the weight matrix to calculate the input scale to ensure that output is not clipping. Assumed weight value is constant and given by nm_assumed_wmax.

AVERAGE_ABS_MAX = 'AverageAbsMax'

Momentum-based input scale estimation.

Computes the average abs max over the mini-batch and applies nm_decay to update the value with the history.

Note

nm_decay is 1-momentum and always given in mini-batches. However, the CUDA implementation does not discount values within mini-batches, whereas the CPU implementation does.
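
One possible reading of this running average is sketched below. It is an assumption, not the library's implementation: the helper name is hypothetical, and the update rule takes "nm_decay is 1-momentum" literally, so that `nm_decay` weights the new mini-batch value and `1 - nm_decay` weights the history.

```python
def update_average_abs_max(running, batch, nm_decay):
    """One running-average update of the input scale for a mini-batch."""
    # Average of the per-sample abs max values in this mini-batch.
    batch_value = sum(max(abs(v) for v in x) for x in batch) / len(batch)
    if running is None:
        return batch_value  # first mini-batch: no history yet
    momentum = 1.0 - nm_decay
    return momentum * running + nm_decay * batch_value


scale = update_average_abs_max(None, [[1.0, -2.0], [0.5, 4.0]], nm_decay=0.5)
scale = update_average_abs_max(scale, [[1.0]], nm_decay=0.5)
print(scale)
```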

CONSTANT = 'Constant'

A constant value (given by parameter nm_thres).

MAX = 'Max'

Use \(\alpha\equiv\max{\mathbf{x}}\).

NONE = 'None'

No noise management.

class aihwkit.simulator.configs.utils.PulseType(value)

Bases: enum.Enum

Pulse type.

DETERMINISTIC_IMPLICIT = 'DeterministicImplicit'

Coincidences are computed in deterministic manner.

Coincidences are calculated by \(b_l x_q d_q\) where \(b_l\) is the desired bit length (possibly subject to dynamic adjustments using update_bl_management) and \(x_q\) and \(d_q\) are the quantized input and error values, respectively, normalized to the range \(0,\ldots,1\). It can be shown that explicit bit lines exist that generate these coincidences.

MEAN_COUNT = 'MeanCount'

Coincidences based on probability (\(p_a p_b\)).

NONE = 'None'

Floating point update instead of pulses.

NONE_WITH_DEVICE = 'NoneWithDevice'

Floating point like None, but with analog devices (e.g. weight clipping).

STOCHASTIC = 'Stochastic'

Two passes for plus and minus (only CPU).

STOCHASTIC_COMPRESSED = 'StochasticCompressed'

Generates actual stochastic bit lines.

Plus and minus pulses are taken in the same pass.

class aihwkit.simulator.configs.utils.SimpleDriftParameter(nu=0.0, t_0=1.0, reset_tol=1e-07)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameter for a simple power law drift.

The drift is modeled as a simple power law without device-to-device variation or conductance dependence.

It computes:

\[w_{ij}\left(\frac{t + \Delta t}{t_0}\right)^{-\nu}\]

nu: float = 0.0

Average drift \(\nu\) value.

Needs to be non-zero to actually use the drift.

reset_tol: float = 1e-07

Reset tolerance.

This should be a number smaller than the expected weight change, as it is used to detect any changes in the weight since the last drift call. Every change to the weight above this tolerance will reset the drift time.

Caution

Any write noise or diffusion on the weight might thus interfere with the drift.

t_0: float = 1.0

Time between write and first read.

Usually assumed in milliseconds, however, it really determines the time units of time_since_last_call when calling the drift.
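
How reset_tol and the accumulated drift time interact can be sketched for a single weight as below. This is a simplified illustration under assumptions: the class `DriftClock` and its methods are hypothetical, and the sketch assumes that the comparison is made against the weight value left behind by the previous drift call.

```python
class DriftClock:
    """Tracks the accumulated drift time of a single weight (hypothetical helper)."""

    def __init__(self, reset_tol=1e-7):
        self.reset_tol = reset_tol
        self.last_weight = None  # weight as left by the previous drift call
        self.t = 0.0

    def advance(self, weight, time_since_last_call):
        """Return the drift time to use for this call."""
        changed = (self.last_weight is None
                   or abs(weight - self.last_weight) > self.reset_tol)
        if changed:
            self.t = 0.0  # weight was modified in between: drift starts anew
        self.t += time_since_last_call
        return self.t

    def remember(self, weight_after_drift):
        """Store the weight produced by the drift call for the next comparison."""
        self.last_weight = weight_after_drift


clock = DriftClock()
t_first = clock.advance(0.5, 10.0)           # first call
clock.remember(0.45)
t_same = clock.advance(0.45 + 1e-9, 5.0)     # change below tolerance: accumulate
clock.remember(0.44)
t_reset = clock.advance(0.6, 5.0)            # weight was updated: reset
print(t_first, t_same, t_reset)
```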

class aihwkit.simulator.configs.utils.UpdateParameters(desired_bl=31, fixed_bl=True, pulse_type=<PulseType.STOCHASTIC_COMPRESSED: 'StochasticCompressed'>, res=0, x_res_implicit=0, d_res_implicit=0, sto_round=False, update_bl_management=True, update_management=True)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameters that modify the update behaviour of a pulsed device.

d_res_implicit: float = 0

Resolution of each quantization step for the error d.

Resolution (i.e. bin width) of each quantization step for the error d in case of DeterministicImplicit pulse trains. See PulseType for details.

desired_bl: int = 31

Desired length of the pulse trains.

For update BL management, it is the maximal pulse train length.

fixed_bl: bool = True

Whether to fix the length of the pulse trains.

See also update_bl_management.

In case of True (where dw_min is the mean minimal weight change step size) it is:

BL = desired_BL
A = B = sqrt(learning_rate / (dw_min * BL))

In case of False:

if dw_min * desired_BL < learning_rate:
    A = B = 1
    BL = ceil(learning_rate / dw_min)
else:
    # same as for fixed_BL=True
    BL = desired_BL
    A = B = sqrt(learning_rate / (dw_min * BL))

pulse_type: aihwkit.simulator.configs.utils.PulseType = 'StochasticCompressed'

Switching between different pulse types.

See also PulseType for details.

Important

Pulsing can also be turned off, in which case the update is done as if in floating point and all other update-related parameters are ignored.

res: float = 0

Resolution of the update probability for the stochastic bit line generation.

Resolution (i.e. bin width in 0..1) of the update probability for the stochastic bit line generation. Use -1 for turning discretization off. Can be given as number of steps as well.

sto_round: bool = False

Whether to enable stochastic rounding.

update_bl_management: bool = True

Whether to enable dynamical adjustment of A, B, and BL:

BL = ceil(learning_rate * abs(x_j) * abs(d_i) / weight_granularity);
BL  = min(BL,desired_BL);
A = B = sqrt(learning_rate / (weight_granularity * BL));

The weight_granularity is usually equal to dw_min.
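
The pulse train rules for fixed_bl and update_bl_management can be written as runnable Python, as in the sketch below. The function names are hypothetical, and the lower limit of one pulse in the managed case is a guard added for this sketch (to avoid a division by zero for zero inputs), not something stated in the text.

```python
import math


def pulse_train_settings(learning_rate, dw_min, desired_bl=31, fixed_bl=True):
    """Return (BL, A, B) following the fixed_bl rules."""
    if fixed_bl or dw_min * desired_bl >= learning_rate:
        bl = desired_bl
        scale = math.sqrt(learning_rate / (dw_min * bl))
        return bl, scale, scale
    # desired_bl pulses are too few for this learning rate: lengthen the train.
    return math.ceil(learning_rate / dw_min), 1.0, 1.0


def managed_pulse_train(learning_rate, x_j, d_i, weight_granularity, desired_bl=31):
    """Return (BL, A, B) with update_bl_management enabled."""
    bl = math.ceil(learning_rate * abs(x_j) * abs(d_i) / weight_granularity)
    # Guard added for this sketch: use at least one pulse.
    bl = max(1, min(bl, desired_bl))
    scale = math.sqrt(learning_rate / (weight_granularity * bl))
    return bl, scale, scale


print(pulse_train_settings(1.0, 1 / 128, desired_bl=32))
print(pulse_train_settings(1.0, 1 / 128, fixed_bl=False))
print(managed_pulse_train(1.0, 0.5, 0.25, 1 / 128))
```

Whenever A and B are computed from the square root, the product A * B * BL * dw_min equals the learning rate.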

update_management: bool = True

Whether to apply additional scaling.

After the above setting, an additional scaling (always on when using update_bl_management) is applied to account for the different input strengths. If

\[\gamma \equiv \max_i |x_i| / \max_j |d_j|\]

is the ratio between the two maximal inputs, then A is additionally scaled by \(\gamma\) and B is scaled by \(1/\gamma\).

x_res_implicit: float = 0

Resolution of each quantization step for the inputs x.

Resolution (i.e. bin width) of each quantization step for the inputs x in case of DeterministicImplicit pulse trains. See PulseType for details.

class aihwkit.simulator.configs.utils.VectorUnitCellUpdatePolicy(value)

Bases: enum.Enum

Vector unit cell update policy.

ALL = 'All'

All devices updated simultaneously.

SINGLE_FIXED = 'SingleFixed'

Device index is not changed. Can be set initially and/or updated on the fly.

SINGLE_RANDOM = 'SingleRandom'

A single device is selected by random choice each mini-batch.

SINGLE_SEQUENTIAL = 'SingleSequential'

Each device one at a time in sequence.

class aihwkit.simulator.configs.utils.WeightClipParameter(fixed_value=1.0, sigma=2.5, type=<WeightClipType.NONE: 'None'>)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameters that clip the weights during hardware-aware training.

fixed_value: float = 1.0

Clipping value in case of FixedValue type.

sigma: float = 2.5

Sigma value for clipping for the LayerGaussian type.

type: aihwkit.simulator.configs.utils.WeightClipType = 'None'

Type of clipping.

class aihwkit.simulator.configs.utils.WeightClipType(value)

Bases: enum.Enum

Weight clipper type.

AVERAGE_CHANNEL_MAX = 'AverageChannelMax'

Calculates the abs max of each output channel (row of the weight matrix) and takes the average as clipping value for all.

FIXED_VALUE = 'FixedValue'

Clip to the given fixed value, symmetrically around zero.

LAYER_GAUSSIAN = 'LayerGaussian'

Calculates the second moment of the whole weight matrix and clips at sigma times the result symmetrically around zero.

NONE = 'None'

None.
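
The clipping types above can be sketched for a weight matrix given as a list of rows. This is plain Python, not aihwkit code: the function names are hypothetical, and for LayerGaussian the sketch assumes that the bound is sigma times the square root of the second moment about zero.

```python
import math


def clip_bound(weights, clip_type="None", fixed_value=1.0, sigma=2.5):
    """Return the symmetric clipping bound for a weight matrix, or None."""
    if clip_type == "FixedValue":
        return fixed_value
    if clip_type == "LayerGaussian":
        flat = [w for row in weights for w in row]
        # Assumption: sigma times the root of the second moment about zero.
        return sigma * math.sqrt(sum(w * w for w in flat) / len(flat))
    if clip_type == "AverageChannelMax":
        # One abs max per output channel (row), averaged over all rows.
        return sum(max(abs(w) for w in row) for row in weights) / len(weights)
    return None


def clip_weights(weights, bound):
    """Clip every weight to [-bound, bound]; copy unchanged if bound is None."""
    if bound is None:
        return [list(row) for row in weights]
    return [[max(-bound, min(bound, w)) for w in row] for row in weights]


weights = [[1.0, -3.0], [0.5, 2.0]]
bound = clip_bound(weights, "AverageChannelMax")
print(bound, clip_weights(weights, bound))
```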

class aihwkit.simulator.configs.utils.WeightModifierParameter(std_dev=0.0, res=0.0, sto_round=False, dorefa_clip=0.6, pdrop=0.0, enable_during_test=False, rel_to_actual_wmax=True, assumed_wmax=1.0, copy_last_column=False, coeff0=0.0105392, coeff1=0.0768, coeff2=-0.046925, type=<WeightModifierType.COPY: 'Copy'>)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameters that modify the forward/backward weights during hardware-aware training.

assumed_wmax: float = 1.0

Assumed weight value that is mapped to the maximal conductance.

This is typically 1.0. This parameter will be ignored if rel_to_actual_wmax is set.

coeff0: float = 0.0105392
coeff1: float = 0.0768
coeff2: float = -0.046925

Coefficients for the POLY weight modifier type.

See WeightModifierType for details.

copy_last_column: bool = False

Whether to not apply noise to the last column (which usually contains the bias values).

dorefa_clip: float = 0.6

Parameter for DoReFa.

enable_during_test: bool = False

Whether to use the last modified weight matrix during testing.

Caution

This will not remove drop connect or any other noise during evaluation, and thus should only be used with care.

pdrop: float = 0.0

Drop connect probability.

Drop connect sets weights to zero with the given probability.

Important

Drop connect can be used with any other modifier type in combination.

rel_to_actual_wmax: bool = True

Whether to calculate the abs max of the weight and apply noise relative to this number.

If set to False, assumed_wmax is taken as relative units.

res: float = 0.0

Resolution of the discretization.

The inverse of res gives the number of equal sized steps in \(-a_\text{max},\ldots,a_\text{max}\) where \(a_\text{max}\) is either given by the abs max (if rel_to_actual_wmax is set) or by assumed_wmax otherwise.

res is only used in the modifier types DoReFa, Discretize, and DiscretizeAddNormal.

std_dev: float = 0.0

Standard deviation of the added noise to the weight matrix.

This parameter affects the modifier types AddNormal, MultNormal and DiscretizeAddNormal.

Note

If the parameter rel_to_actual_wmax is set, then the std_dev is computed relative to the abs max of the given weight matrix; otherwise it is computed relative to the assumed max, which is set by assumed_wmax.

sto_round: bool = False

Whether the discretization is done with stochastic rounding enabled.

sto_round is only used in the modifier types DoReFa, Discretize, and DiscretizeAddNormal.

type: aihwkit.simulator.configs.utils.WeightModifierType = 'Copy'

Type of the weight modification.

class aihwkit.simulator.configs.utils.WeightModifierType(value)

Bases: enum.Enum

Weight modifier type.

ADD_NORMAL = 'AddNormal'

Additive Gaussian noise.

COPY = 'Copy'

Copies the weights unchanged; drop connect can still be applied.

DISCRETIZE = 'Discretize'

Quantize the weights.

DISCRETIZE_ADD_NORMAL = 'DiscretizeAddNormal'

First discretizes and then adds Gaussian noise.

DOREFA = 'DoReFa'

DoReFa discretization.

MULT_NORMAL = 'MultNormal'

Multiplicative Gaussian noise.

POLY = 'Poly'

2nd order Polynomial noise model (in terms of the weight value).

In detail, for the duration of a mini-batch, a Gaussian random number is added to each weight, with a standard deviation of \(\sigma_\text{wnoise} (c_0 + c_1 w_{ij}/\omega + c_2 w_{ij}^2/\omega^2)\) where \(\omega\) is either the actual max weight (if rel_to_actual_wmax is set) or the value assumed_wmax.
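
The polynomial noise model can be sketched as below. The function names are hypothetical, and two points are assumptions not stated in the text: the weight enters the polynomial with its magnitude (so that the standard deviation stays non-negative for negative weights), and \(\sigma_\text{wnoise}\) is passed in explicitly rather than read from a parameter. The sketch also assumes at least one non-zero weight when rel_to_actual_wmax is used.

```python
import random


def poly_noise_std(w, omega, sigma_wnoise,
                   coeff0=0.0105392, coeff1=0.0768, coeff2=-0.046925):
    """Standard deviation of the noise added to a single weight."""
    ratio = abs(w) / omega  # assumption: the magnitude of the weight is used
    return sigma_wnoise * (coeff0 + coeff1 * ratio + coeff2 * ratio ** 2)


def add_poly_noise(weights, sigma_wnoise, assumed_wmax=1.0,
                   rel_to_actual_wmax=True, rng=None):
    """Return a noisy copy of the weights; the input matrix is not modified."""
    rng = rng or random.Random()
    actual_wmax = max(abs(w) for row in weights for w in row)
    omega = actual_wmax if rel_to_actual_wmax else assumed_wmax
    return [[w + rng.gauss(0.0, poly_noise_std(w, omega, sigma_wnoise))
             for w in row] for row in weights]


weights = [[0.5, -1.0], [0.0, 0.25]]
noisy = add_poly_noise(weights, sigma_wnoise=1.0, rng=random.Random(0))
print(noisy)
```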

class aihwkit.simulator.configs.utils.WeightNoiseType(value)

Bases: enum.Enum

Output weight noise type.

The weight noise is applied for each MAC computation, while not touching the actual weight matrix but referring it to the output.

\[y_i = \sum_j \left(w_{ij}+\xi_{ij}\right) x_j\]

ADDITIVE_CONSTANT = 'AdditiveConstant'

All \(\xi\) are Gaussian distributed, \(\xi\sim{\cal N}(0,\sigma)\).

\(\sigma\) is determined by w_noise.
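
A minimal sketch of this noise type follows, in plain Python with a hypothetical helper name. It interprets the formula above as a fresh perturbation of each weight inside the MAC sum, with inputs \(x_j\); that interpretation is an assumption, and an actual implementation may instead add statistically equivalent noise directly to the output.

```python
import random


def noisy_mac(weights, x, w_noise=0.0, rng=None):
    """MAC with additive Gaussian weight noise drawn anew on every call."""
    rng = rng or random.Random()
    outputs = []
    for row in weights:
        # The perturbation exists only inside this sum; `weights` is not modified.
        outputs.append(sum((w + rng.gauss(0.0, w_noise)) * v
                           for w, v in zip(row, x)))
    return outputs


weights = [[1.0, 2.0]]
rng = random.Random(0)
first = noisy_mac(weights, [3.0, 4.0], w_noise=0.1, rng=rng)
second = noisy_mac(weights, [3.0, 4.0], w_noise=0.1, rng=rng)
print(first, second)
```

The two calls give different results for the same stored weights, which is the behaviour described for output-referred noise.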

NONE = 'None'

No weight noise.

PCM_READ = 'PCMRead'

Output-referred PCM-like read noise.

Output-referred PCM-like read noise that scales with the amount of current generated for each output line and thus scales with both conductance values and input strength.

The same general form is used as for the PCM-like statistical model of the 1/f noise during inference, see aihwkit.simulator.noise_models.PCMLikeNoiseModel.