aihwkit.simulator.configs.devices module

Configuration for Analog (Resistive Device) tiles.

class aihwkit.simulator.configs.devices.ConstantStepDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=0.001, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=0.0, up_down_dtod=0.01, w_max=0.6, w_max_dtod=0.3, w_min=-0.6, w_min_dtod=0.3)

Bases: aihwkit.simulator.configs.devices.PulsedDevice

Pulsed update behavioral model: constant step.

Pulsed update behavioral model, where the update step of the material is constant throughout the resistive range (up to hard bounds).

In more detail, the update behavior implemented for ConstantStep is:

\[ \begin{align}\begin{aligned}w_{ij} &\leftarrow& w_{ij} - \Delta w_{ij}^d(1 + \sigma_\text{c-to-c}\,\xi)\\w_{ij} &\leftarrow& \text{clip}(w_{ij},b^\text{min}_{ij},b^\text{max}_{ij})\end{aligned}\end{align} \]

where \(d\) is the direction of the update (product of signs of input and error). \(\Delta w_{ij}^d\) is the update step size of the cross-point ij in direction \(d\) (up or down). Note that each cross-point has separate update sizes so that device-to-device fluctuations and biases in the directions can be given.

Moreover, the clipping bounds of each cross-point ij (i.e. \(b_{ij}^\text{max/min}\)) are also different in general. The mean and the amount of systematic spread from device-to-device can be given as parameters, see below.
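As a rough illustration of the two update equations above, the following plain-Python sketch applies single pulses to one cross-point. It is not the simulator's implementation: the function name `constant_step_update` and its arguments are hypothetical and chosen for this example only, and `dw_d` is taken to be an already sampled signed step \(\Delta w_{ij}^d\).

```python
import random


def constant_step_update(w, dw_d, sigma_ctoc, b_min, b_max, rng=random):
    """Apply one pulse: w <- clip(w - dw_d * (1 + sigma_ctoc * xi), b_min, b_max)."""
    xi = rng.gauss(0.0, 1.0)  # standard Gaussian cycle-to-cycle noise
    w = w - dw_d * (1.0 + sigma_ctoc * xi)
    return min(max(w, b_min), b_max)  # hard bounds of this cross-point


# Noise-free example (hypothetical values): 1000 pulses with a signed step
# of -0.001 raise the weight until it is clipped at the upper bound.
w = 0.0
for _ in range(1000):
    w = constant_step_update(w, dw_d=-0.001, sigma_ctoc=0.0, b_min=-0.6, b_max=0.6)
```

With the noise switched off, repeated pulses in one direction move the weight in equal steps until it reaches the hard bound, where it stays.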

For parameters regarding the device settings, see e.g. PulsedDevice.

class aihwkit.simulator.configs.devices.DifferenceUnitCell(unit_cell_devices=<factory>)

Bases: aihwkit.simulator.configs.devices.UnitCell

Abstract device model that takes an arbitrary device per cross-point and implements an explicit plus-minus device pair.

A plus-minus pair is implemented by using only one-sided updates of the given devices. Note that reset might need to be called, otherwise the one-sided devices quickly saturate during learning.

The output current is the difference of both devices.

Meta-parameter settings of the two devices in a pair are assumed to be identical (however, device-to-device variation is still present).

Caution

Reset needs to be added manually by calling the reset_columns method of a tile.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.DifferenceResistiveDeviceParameter

class aihwkit.simulator.configs.devices.DigitalRankUpdateCell(device=<factory>)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameters that modify the behavior of the digital rank update cell.

This is the base class for devices that compute the rank update in digital and then (occasionally) transfer the information to the (analog) crossbar array that is used during forward and backward.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.AbstractResistiveDeviceParameter

device: PulsedDevice

(Analog) device that is used for forward and backward.

requires_decay()

Return whether device has decay enabled.

Return type

bool

requires_diffusion()

Return whether device has diffusion enabled.

Return type

bool

class aihwkit.simulator.configs.devices.ExpStepDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=0.001, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=0.0, up_down_dtod=0.01, w_max=0.6, w_max_dtod=0.3, w_min=-0.6, w_min_dtod=0.3, A_up=0.00081, A_down=0.36833, gamma_up=12.44625, gamma_down=12.78785, a=0.244, b=0.2425, write_noise_std=0.0)

Bases: aihwkit.simulator.configs.devices.PulsedDevice

Exponential update step or CMOS-like update behavior.

This model is derived from PulsedDevice and uses all its parameters. ExpStepDevice only implements a new ‘update once’ functionality, where the dependence of the minimal weight step on the weight is fitted by an exponential function as detailed below.

\[w_{ij} \leftarrow w_{ij} - \max(y_{ij},0) \Delta w_{ij}^d (1 + \sigma_\text{c-to-c}\,\xi)\]

and \(y_{ij}\) is given as

\[ \begin{align}\begin{aligned}z_{ij} = 2 a_\text{es} \frac{w_{ij}}{b^\text{max}_{ij} - b^\text{min}_{ij}} + b_\text{es}\\y_{ij} = 1 - A^{(d)} e^{d \gamma^{(d)} z_{ij}}\end{aligned}\end{align} \]

where \(d\) is the direction of the update (+ or -), see also PulsedDevice for details.

All additional parameters (\(a_\text{es}\), \(b_\text{es}\), \(\gamma^{(d)}\), \(A^{(d)}\)) are tile-wise fitting parameters (i.e. there is no device-to-device variation in these parameters). Note that the other parameters involved can still be defined with device-to-device variation and (additional) up-down bias (see PulsedDevice).
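The weight-dependent factor \(\max(y_{ij},0)\) from the formulas above can be sketched as follows. This is an illustrative reading of the equations, not the simulator code: the function name `exp_step_scale` is hypothetical, and the direction \(d\) is assumed to be encoded as +1 or -1.

```python
import math


def exp_step_scale(w, d, b_min, b_max, A, gamma, a_es, b_es):
    """Return max(y_ij, 0) for direction d (+1 or -1, an assumption of this sketch)."""
    z = 2.0 * a_es * w / (b_max - b_min) + b_es
    y = 1.0 - A * math.exp(d * gamma * z)
    return max(y, 0.0)


# Default fitting parameters of ExpStepDevice for the up direction,
# evaluated at the origin and at the (mean) upper bound w_max = 0.6.
up_at_zero = exp_step_scale(0.0, +1, -0.6, 0.6,
                            A=0.00081, gamma=12.44625, a_es=0.244, b_es=0.2425)
up_at_bound = exp_step_scale(0.6, +1, -0.6, 0.6,
                             A=0.00081, gamma=12.44625, a_es=0.244, b_es=0.2425)
```

Under these assumptions the factor is smaller at the upper bound than at the origin, so the up step shrinks as the weight grows.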

A_down: float = 0.36833

Factor A for the down direction

A_up: float = 0.00081

Factor A for the up direction

a: float = 0.244

Global slope parameter

b: float = 0.2425

Global offset parameter

gamma_down: float = 12.78785

Exponent for the down direction.

gamma_up: float = 12.44625

Exponent for the up direction.

write_noise_std: float = 0.0

Whether to use update write noise.

Whether to use update write noise that is added to the updated device's weight, while the update is done on a hidden persistent weight. The update write noise is then sampled anew when the device is touched again.

Thus it is:

\[w^\text{apparent}_{ij} = w_{ij} + \sigma_\text{write-noise}\,\xi\]

and the update is done on \(w_{ij}\) but the forward pass sees \(w^\text{apparent}_{ij}\).

class aihwkit.simulator.configs.devices.FloatingPointDevice(diffusion=0.0, lifetime=0.0, drift=<factory>)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Floating point reference.

Implements ideal devices forward/backward/update behavior.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.FloatingPointTileParameter

diffusion: float = 0.0

Standard deviation of diffusion process.

drift: SimpleDriftParameter

Parameter governing a power-law drift.

lifetime: float = 0.0

One over decay_rate, i.e. \(1/r_\text{decay}\).

requires_decay()

Return whether device has decay enabled.

Return type

bool

requires_diffusion()

Return whether device has diffusion enabled.

Return type

bool

class aihwkit.simulator.configs.devices.IdealDevice(construction_seed=0, diffusion=0.0, lifetime=0.0)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Ideal update behavior (using floating point), but forward/backward might be non-ideal.

Ideal update behavior (using floating point), however, forward/backward might still have a non-ideal ADC or noise added.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.IdealResistiveDeviceParameter

construction_seed: int = 0

If not 0, set a unique seed for hidden parameters during construction.

diffusion: float = 0.0

Standard deviation of diffusion process.

lifetime: float = 0.0

One over decay_rate, i.e. \(1/r_\text{decay}\).

requires_decay()

Return whether device has decay enabled.

Return type

bool

requires_diffusion()

Return whether device has diffusion enabled.

Return type

bool

class aihwkit.simulator.configs.devices.LinearStepDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=0.001, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=0.0, up_down_dtod=0.01, w_max=0.6, w_max_dtod=0.3, w_min=-0.6, w_min_dtod=0.3, gamma_up=0.0, gamma_down=0.0, gamma_up_dtod=0.05, gamma_down_dtod=0.05, allow_increasing=False, mean_bound_reference=True, mult_noise=True, write_noise_std=0.0)

Bases: aihwkit.simulator.configs.devices.PulsedDevice

Pulsed update behavioral model: linear step.

Pulsed update behavioral model, where the update step response size of the material depends linearly on the resistance (up to hard bounds).

This model is based on PulsedDevice and thus shares all parameters and functionality. In addition, it only implements a more general update once function, where the update step size can depend linearly on the weight itself.

For each coincidence the weight is updated once. Here, the positive (negative) update step size decreases linearly in the following manner (compare to the update once for ConstantStepDevice):

\[ \begin{align}\begin{aligned}w_{ij} &\leftarrow& w_{ij} - \Delta w_{ij}^d(\gamma_{ij}^d\;w_{ij} + 1 + \sigma_\text{c-to-c}\,\xi)\\w_{ij} &\leftarrow& \text{clip}(w_{ij},b^\text{min}_{ij},b^\text{max}_{ij})\end{aligned}\end{align} \]

in case of additive noise. Optionally, multiplicative noise can be chosen in which case the first equation becomes:

\[w_{ij} \leftarrow w_{ij} - \Delta w_{ij}^d (\gamma_{ij}^d \;w_{ij} + 1) (1 + \sigma_\text{c-to-c}\,\xi)\]

The cross-point ij dependent slope parameters \(\gamma_{ij}^d\) are given during initialization by

\[ \begin{align}\begin{aligned}\gamma_{ij}^+ &=& - |\gamma^+ + \gamma_\text{d-to-d}^+ \xi|/b^\text{max}_{ij}\\\gamma_{ij}^- &=& - |\gamma^- + \gamma_\text{d-to-d}^- \xi|/b^\text{min}_{ij}\end{aligned}\end{align} \]

where the \(\xi\) are standard Gaussian random variables and \(b^\text{min}_{ij}\) and \(b^\text{max}_{ij}\) the cross-point ij specific minimal and maximal weight bounds, respectively (see description for PulsedDevice).

Note

If \(\gamma=1\) and \(\gamma_\text{d-to-d}=0\) this update implements soft bounds, since the update step becomes equal to \(1/b\).

Note

If \(\gamma=0\) and \(\gamma_\text{d-to-d}=0\) and additive noise is used, this update is identical to the one described in PulsedDevice.
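The additive and multiplicative variants of the update can be compared in a small plain-Python sketch for one cross-point. The function name `linear_step_update` and its arguments are hypothetical, and `dw_d` and `gamma_d` stand for the already initialized \(\Delta w_{ij}^d\) and \(\gamma_{ij}^d\); the simulator's actual implementation may differ.

```python
import random


def linear_step_update(w, dw_d, gamma_d, sigma_ctoc, b_min, b_max,
                       mult_noise=True, rng=random):
    """Apply one pulse of the linear-step rule, then clip to the hard bounds."""
    xi = rng.gauss(0.0, 1.0)  # standard Gaussian cycle-to-cycle noise
    if mult_noise:
        step = dw_d * (gamma_d * w + 1.0) * (1.0 + sigma_ctoc * xi)
    else:
        step = dw_d * (gamma_d * w + 1.0 + sigma_ctoc * xi)
    return min(max(w - step, b_min), b_max)


# Soft-bounds case from the first note: gamma = 1 without spread gives
# gamma_d = -1 / b_max for the up direction (hypothetical bound of 0.5).
b_max = 0.5
at_origin = linear_step_update(0.0, -0.001, -1.0 / b_max, 0.0, -b_max, b_max)
at_bound = linear_step_update(b_max, -0.001, -1.0 / b_max, 0.0, -b_max, b_max)
```

In this noise-free setting the pulse at the origin moves the weight by the full step, while the pulse at the upper bound leaves the weight unchanged, which is the soft-bounds behavior.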

allow_increasing: bool = False

Whether to allow increasing of update sizes.

Whether to allow the situation where update sizes increase towards the bound instead of saturating (and thus becoming smaller).

gamma_down: float = 0.0

The value of \(\gamma^-\).

gamma_down_dtod: float = 0.05

Device-to-device variation for \(\gamma^-\), i.e. the value of \(\gamma_\text{d-to-d}^-\).

gamma_up: float = 0.0

The value of \(\gamma^+\).

Intuitively, a value of 0.1 means that the update step size in the up direction at the weight bounds is decreased by 10% relative to that at the origin \(w=0\).

Note

In principle one could fix \(\gamma=\gamma^-=\gamma^+\) since up/down variation can be given by up_down_dtod, see PulsedDevice.

Note

The hard-bounds are still observed, so that the weight cannot grow beyond its bounds.

gamma_up_dtod: float = 0.05

Device-to-device variation for \(\gamma^+\), i.e. the value of \(\gamma_\text{d-to-d}^+\).

mean_bound_reference: bool = True

Whether to use instead of the above:

\[ \begin{align}\begin{aligned}\gamma_{ij}^+ &=& - |\gamma^+ + \gamma_\text{d-to-d}^+ \xi|/b^\text{max}\\\gamma_{ij}^- &=& - |\gamma^- + \gamma_\text{d-to-d}^- \xi|/b^\text{min}\end{aligned}\end{align} \]

where \(b^\text{max}\) and \(b^\text{min}\) are the values given by w_max and w_min, see PulsedDevice.

mult_noise: bool = True

Whether to use multiplicative noise instead of additive cycle-to-cycle noise.

write_noise_std: float = 0.0

Whether to use update write noise.

Whether to use update write noise that is added to the updated device's weight, while the update is done on a hidden persistent weight. The update write noise is then sampled anew when the device is touched again.

Thus it is:

\[w^\text{apparent}_{ij} = w_{ij} + \sigma_\text{write-noise}\,\Delta w_\text{min}\,\xi\]

and the update is done on \(w_{ij}\) but the forward pass sees \(w^\text{apparent}_{ij}\).

class aihwkit.simulator.configs.devices.MixedPrecisionCompound(device=<factory>, transfer_every=1, n_rows_per_transfer=-1, random_row=False, granularity=0.0, n_x_bins=0, n_d_bins=0)

Bases: aihwkit.simulator.configs.devices.DigitalRankUpdateCell

Abstract device model that takes 1 (analog) device and implements a transfer-based learning rule, where the outer product is computed in digital.

Here, the outer product of the activations and error is done on a full-precision floating-point \(\chi\) matrix. Then, with a threshold given by the granularity, pulses will be applied to transfer the information row-by-row to the analog matrix.

For details, see Nandakumar et al. Front. in Neurosci. (2020).

Note

This version of the update differs from the parallel analog update with stochastic pulsing that other devices implement, as here \({\cal O}(n^2)\) digital computations are needed to compute the outer product (rank update). This need for digital compute in potentially high precision might result in inferior run time and power estimates in real-world applications, although sparse integer products can potentially be employed to improve run time estimates. For details, see the discussion in Nandakumar et al. Front. in Neurosci. (2020).
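The accumulate-then-transfer idea described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the simulator's algorithm: the helper names `accumulate` and `transfer_row` are hypothetical, pulses are treated as ideal steps of size `granularity`, and the sign and learning-rate conventions are chosen for this example only.

```python
def accumulate(chi, x, d, lr):
    """Add the digital rank-one update lr * d x^T to chi (a list of rows)."""
    for i, d_i in enumerate(d):
        for j, x_j in enumerate(x):
            chi[i][j] += lr * d_i * x_j


def transfer_row(chi, w, row, granularity):
    """Move whole multiples of granularity from chi[row] to the analog row w[row]."""
    for j, value in enumerate(chi[row]):
        n_pulses = int(value / granularity)  # truncates toward zero
        w[row][j] += n_pulses * granularity  # idealized pulses (assumption)
        chi[row][j] -= n_pulses * granularity  # the remainder stays in chi


chi = [[0.0, 0.0], [0.0, 0.0]]
w = [[0.0, 0.0], [0.0, 0.0]]
accumulate(chi, x=[1.0, 0.5], d=[0.01, -0.02], lr=1.0)
transfer_row(chi, w, row=0, granularity=0.004)
```

After the transfer, row 0 of `w` holds the part of the accumulated update that is expressible in whole pulses, and the sub-threshold remainder is kept in `chi` for later transfer events.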

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.MixedPrecResistiveDeviceParameter

granularity: float = 0.0

Granularity of the device.

Granularity of the device that is used to calculate the number of pulses transferred from \(\chi\) to analog.

If 0, it will take dw_min from the analog device used.

n_d_bins: int = 0

The number of bins to discretize (symmetrically around zero) the error before computing the outer product.

Dynamic quantization is used by computing the absolute max value of each error vector. Quantization can be turned off by setting this to 0.

n_rows_per_transfer: int = -1

How many consecutive rows to write to the tile from the \(\chi\) matrix.

-1 means that the full matrix is read at each transfer event.

n_x_bins: int = 0

The number of bins to discretize (symmetrically around zero) the activation before computing the outer product.

Dynamic quantization is used by computing the absolute max value of each input. Quantization can be turned off by setting this to 0.

random_row: bool = False

Whether to select a random starting row.

Whether to select a random starting row for each transfer event and not take the next row that was previously not transferred as a starting row (the default).

transfer_every: int = 1

Transfers every \(n\) mat-vec operations.

Transfers every \(n\) mat-vec operations (rounded to multiples/ratios of m_batch).

Standard setting is 1.0 for mixed precision, but it could potentially be reduced to get better run time estimates.

class aihwkit.simulator.configs.devices.PowStepDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=0.001, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=0.0, up_down_dtod=0.01, w_max=0.6, w_max_dtod=0.3, w_min=-0.6, w_min_dtod=0.3, pow_gamma=1.0, pow_gamma_dtod=0.1, pow_up_down=0.0, pow_up_down_dtod=0.0, write_noise_std=0.0)

Bases: aihwkit.simulator.configs.devices.PulsedDevice

Pulsed update behavioral model: power-dependent step.

Pulsed update behavioral model, where the update step response size of the material has a power dependence on the resistance. This device model implements (a shifted form of) the Fusi & Abbott (2007) synapse model (see also Frascaroli et al. (2018)).

The model is based on PulsedDevice and thus shares most parameters and functionality. However, it implements a new update once function, where the update step size depends on the weight in the following way. If we set \(\omega_{ij} = \frac{b_{ij}^\text{max} - w_{ij}}{b_{ij}^\text{max} - b_{ij}^\text{min}}\) to be the relative distance of the current weight to the upper bound, then the update per pulse is for the upwards direction:

\[w_{ij} \leftarrow w_{ij} + \Delta w_{ij}^+\,(\omega_{ij})^{\gamma_{ij}^+} \left(1 + \sigma_\text{c-to-c}\,\xi\right)\]

and in downwards direction:

\[w_{ij} \leftarrow w_{ij} + \Delta w_{ij}^-\,(1 - \omega_{ij})^{\gamma_{ij}^-} \left(1 + \sigma_\text{c-to-c}\,\xi\right)\]

Similar to \(\Delta w_{ij}^d\) the exponent \(\gamma_{ij}\) can be defined with device-to-device variation and bias in up and down direction:

\[\gamma_{ij}^d = d\; \gamma\, \left(1 + d\, \beta_{ij} + \sigma_\text{pow-gamma-d-to-d}\xi\right)\]

where \(\xi\) is again a standard Gaussian. \(\beta_{ij}\) is the directional up versus down bias. At initialization pow_up_down_dtod and pow_up_down define this bias term:

\[\beta_{ij} = \beta_\text{pow-up-down} + \xi\sigma_\text{pow-up-down-dtod}\]

where \(\xi\) is again a standard Gaussian number and \(\beta_\text{pow-up-down}\) corresponds to pow_up_down.

Note

The pow_gamma_dtod and pow_up_down_dtod device-to-device variation parameters are given in relative units to pow_gamma.

Note

\(\Delta w_{ij}^d\) is defined as for the PulsedDevice, however, for this device, the update step size will not be given by \(\Delta w_{ij}\) at \(w_{ij}=0\) as for most other device models.
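The two per-pulse formulas above can be sketched for a single cross-point as follows. The function name `pow_step_update` and its arguments are hypothetical; the exponents are passed in as already initialized positive values, and the final clipping to the hard bounds is an assumption carried over from PulsedDevice rather than part of the formulas.

```python
import random


def pow_step_update(w, up, dw_up, dw_down, gamma_up, gamma_down,
                    b_min, b_max, sigma_ctoc=0.0, rng=random):
    """Apply one pulse in the up (up=True) or down (up=False) direction."""
    omega = (b_max - w) / (b_max - b_min)  # relative distance to the upper bound
    xi = rng.gauss(0.0, 1.0)  # standard Gaussian cycle-to-cycle noise
    if up:
        step = dw_up * omega ** gamma_up
    else:
        step = dw_down * (1.0 - omega) ** gamma_down
    w = w + step * (1.0 + sigma_ctoc * xi)
    return min(max(w, b_min), b_max)  # hard bounds (assumption of this sketch)


# Noise-free pulses with exponent 1 and hypothetical step sizes of +/-0.001.
up_mid = pow_step_update(0.0, True, 0.001, -0.001, 1.0, 1.0, -0.6, 0.6)
down_mid = pow_step_update(0.0, False, 0.001, -0.001, 1.0, 1.0, -0.6, 0.6)
up_top = pow_step_update(0.6, True, 0.001, -0.001, 1.0, 1.0, -0.6, 0.6)
```

At the origin \(\omega_{ij}=0.5\), so the realized step is half of the nominal step, which illustrates the note above that the step at \(w_{ij}=0\) is not \(\Delta w_{ij}\) for this device.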

pow_gamma: float = 1.0

The value of \(\gamma\) as explained above.

Note

Setting \(\gamma = 1\) essentially reduces the model to the SoftBoundsDevice (if no device-to-device variation of gamma is used additionally). However, the SoftBoundsDevice will be much faster, as it does not need to compute the slow pow function.

pow_gamma_dtod: float = 0.1

Device-to-device variation for pow_gamma.

i.e. the value of \(\sigma_\text{pow-gamma-d-to-d}\) given in relative units to pow_gamma.

pow_up_down: float = 0.0

The up versus down bias of the \(\gamma\) as described above.

It is \(\gamma^+ = \gamma (1 + \beta_\text{pow-up-down})\) and \(\gamma^- = \gamma (1 - \beta_\text{pow-up-down})\) .

pow_up_down_dtod: float = 0.0

Device-to-device variation in the up versus down bias of \(\gamma\) as described above.

In units of pow_gamma.

write_noise_std: float = 0.0

Whether to use update write noise.

Whether to use update write noise that is added to the updated device's weight, while the update is done on a hidden persistent weight. The update write noise is then sampled anew when the device is touched again.

Thus it is:

\[w^\text{apparent}_{ij} = w_{ij} + \sigma_\text{write-noise}\,\xi\]

and the update is done on \(w_{ij}\) but the forward pass sees \(w^\text{apparent}_{ij}\).

class aihwkit.simulator.configs.devices.PulsedDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=0.001, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=0.0, up_down_dtod=0.01, w_max=0.6, w_max_dtod=0.3, w_min=-0.6, w_min_dtod=0.3)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Pulsed update resistive devices.

Devices are used as part of an AnalogTile to implement the update once characteristics, i.e. the material response properties when a single update pulse is given (that is, when a coincidence between the row and column pulse trains occurs).

Common properties of all pulsed devices include:

Reset:

Resets the weight in cross points to (around) zero with cycle-to-cycle and systematic spread around a mean.

Important

Reset with given parameters is only activated when reset_weights() is called explicitly by the user.

Decay:

\[w_{ij} \leftarrow w_{ij}\,(1-\alpha_\text{decay}\delta_{ij})\]

Weight decay is only activated by inserting a specific call to decay_weights(), which is done automatically for a tile each mini-batch if decay is present. Note that the device decay_lifetime parameters (1 over decay rates \(\delta_{ij}\)) are analog tile specific and are thus set and fixed during RPU initialization. \(\alpha_\text{decay}\) is a scaling factor that can be given during run-time.

Diffusion:

Similar to the decay, diffusion is only activated by inserting a specific call to diffuse_weights(), which is done automatically for a tile each mini-batch if diffusion is present. The parameters of the diffusion process are set during RPU initialization and are fixed for the remainder.

\[w_{ij} \leftarrow w_{ij} + \rho_{ij} \, \xi\]

where \(\xi\) is a standard Gaussian variable and \(\rho_{ij}\) the diffusion rate for a cross-point ij.

Note

If diffusion happens to move the weight beyond the hard bounds of the weight it is ensured to be clipped appropriately.

Drift:

Optional power-law drift setting, as described in DriftParameter.

Important

Similar to reset, drift is not applied automatically each mini-batch but requires an explicit call to drift_weights() each time the drift should be applied.
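The decay and diffusion rules listed above can be illustrated for a single cross-point with a short plain-Python sketch. The helper names `decay_weight` and `diffuse_weight` are hypothetical, the decay rate is assumed to be one over the lifetime as stated in the text, and the clipping after diffusion follows the note above; the simulator's own implementation may differ in detail.

```python
import random


def decay_weight(w, lifetime, alpha_decay=1.0):
    """w <- w * (1 - alpha_decay * delta), with delta = 1 / lifetime (lifetime > 0)."""
    delta = 1.0 / lifetime
    return w * (1.0 - alpha_decay * delta)


def diffuse_weight(w, rho, b_min, b_max, rng=random):
    """w <- w + rho * xi, clipped to the hard bounds of the cross-point."""
    xi = rng.gauss(0.0, 1.0)  # standard Gaussian variable
    return min(max(w + rho * xi, b_min), b_max)


decayed = decay_weight(0.5, lifetime=100.0)  # one decay step with a hypothetical lifetime
diffused = diffuse_weight(0.5, rho=10.0, b_min=-0.6, b_max=0.6)  # large rho, still bounded
```

A longer lifetime gives a smaller relative change per decay step, and the diffused weight always stays within the hard bounds regardless of the sampled noise.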

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.PulsedResistiveDeviceParameter

construction_seed: int = 0

If not equal 0, will set a unique seed for hidden parameters during construction.

corrupt_devices_prob: float = 0.0

Probability for devices to be corrupt (weights fixed to random value with hard bounds, that is min and max bounds are set to equal).

corrupt_devices_range: int = 1000

Range around zero for establishing corrupt devices.

diffusion: float = 0.0

Standard deviation of diffusion process.

diffusion_dtod: float = 0.0

Device-to-device variation of diffusion rate in relative units.

drift: DriftParameter

Parameter governing a power-law drift.

dw_min: float = 0.001

Mean of the minimal update step sizes across devices and directions.

dw_min_dtod: float = 0.3

Device-to-device std deviation of dw_min (in relative units to dw_min).

dw_min_std: float = 0.3

Cycle-to-cycle variation size of the update step (related to \(\sigma_\text{c-to-c}\) above) in relative units to dw_min.

Note

Many spread (device-to-device variation) parameters are given in relative units. For instance, a setting of dw_min_std of 0.1 would mean 10% spread around the mean and thus a resulting standard deviation (\(\sigma_\text{c-to-c}\)) of dw_min * dw_min_std.

enforce_consistency: bool = True

Whether to enforce weight bounds consistency during initialization.

Whether to enforce that max weight bounds cannot be smaller than min weight bounds, and up direction step size is positive and down negative. Switches the opposite values if encountered during init.

lifetime: float = 0.0

One over decay_rate, i.e. \(1/r_\text{decay}\).

lifetime_dtod: float = 0.0

Device-to-device variation in the decay rate (in relative units).

perfect_bias: bool = False

No up-down differences and device-to-device variability in the bounds for the devices in the bias row.

requires_decay()

Return whether device has decay enabled.

Return type

bool

requires_diffusion()

Return whether device has diffusion enabled.

Return type

bool

reset: float = 0.01

The reset values and spread per cross-point ij when using reset functionality of the device.

reset_dtod: float = 0.0

See reset.

reset_std: float = 0.01

See reset.

up_down: float = 0.0

Up and down direction step sizes can be systematically different and also vary across devices.

\(\Delta w_{ij}^d\) is set during RPU initialization (for each cross-point \(ij\)):

\[\Delta w_{ij}^d = d\; \Delta w_\text{min}\, \left( 1 + d \beta_{ij} + \sigma_\text{d-to-d}\xi\right)\]

where \(\xi\) is again a standard Gaussian. \(\beta_{ij}\) is the directional up versus down bias. At initialization up_down_dtod and up_down define this bias term:

\[\beta_{ij} = \beta_\text{up-down} + \xi \sigma_\text{up-down-dtod}\]

where \(\xi\) is again a standard Gaussian number and \(\beta_\text{up-down}\) corresponds to up_down. Note that up_down_dtod is again given in relative units to dw_min.
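The initialization of the step sizes for one cross-point can be sketched as follows. The function name `sample_step_sizes` is hypothetical, \(\sigma_\text{d-to-d}\) is assumed to correspond to dw_min_dtod, and using the same device-to-device sample for both directions is an assumption of this sketch that the formulas above do not settle.

```python
import random


def sample_step_sizes(dw_min, dw_min_dtod, up_down, up_down_dtod, rng=random):
    """Return (dw for d=+1, dw for d=-1) for one cross-point."""
    beta = up_down + up_down_dtod * rng.gauss(0.0, 1.0)  # directional bias beta_ij
    spread = dw_min_dtod * rng.gauss(0.0, 1.0)  # shared d-to-d term (assumption)
    dw_plus = +1.0 * dw_min * (1.0 + beta + spread)
    dw_minus = -1.0 * dw_min * (1.0 - beta + spread)
    return dw_plus, dw_minus


# Without any spread, a bias of 0.1 makes one direction 10% larger
# and the other 10% smaller than dw_min.
dw_plus, dw_minus = sample_step_sizes(0.001, 0.0, up_down=0.1, up_down_dtod=0.0)
```

Setting both spread parameters to zero isolates the systematic effect of up_down on the two directions.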

up_down_dtod: float = 0.01

See up_down.

w_max: float = 0.6

See w_min.

w_max_dtod: float = 0.3

See w_min_dtod.

w_min: float = -0.6

Mean of hard bounds across device cross-point ij.

The parameters w_min and w_max are used to set the min/max bounds independently.

Note

For this abstract device, we assume that weights can have positive and negative values and are symmetric around zero. In physical circuit terms, this might be implemented as a difference of two resistive elements.

w_min_dtod: float = 0.3

Device-to-device variation of the hard bounds.

Device-to-device variation of the hard bounds, of min and max value, respectively. All are given in relative units to w_min, or w_max, respectively.

class aihwkit.simulator.configs.devices.ReferenceUnitCell(unit_cell_devices=<factory>, update_policy=<VectorUnitCellUpdatePolicy.SINGLE_FIXED: 'SingleFixed'>, first_update_idx=0, gamma_vec=<factory>)

Bases: aihwkit.simulator.configs.devices.UnitCell

Abstract device model that takes two arbitrary devices per cross-point and implements a device with a reference pair.

The update will only be on the 0-th device whereas the other will stay fixed. The resulting effective weight is the difference of the two.

Note

Exactly 2 devices are used. If more are given, they are discarded; if fewer are given, the same device will be used twice.

Note

The reference device weights will all be zero by default. To set the reference device to a particular value, one can select the device update index:

analog_tile.set_hidden_update_index(1)
analog_tile.set_weights(W)
analog_tile.set_hidden_update_index(0) # set back to 0 for the following updates

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.VectorResistiveDeviceParameter

first_update_idx: int = 0

Device that receives the update.

gamma_vec: List[float]

Weighting of the unit cell devices to reduce to final weight.

Note

While user-defined weighting can be given it is suggested to keep it to the default [1, -1] to implement the reference device subtraction.

update_policy: aihwkit.simulator.configs.utils.VectorUnitCellUpdatePolicy = 'SingleFixed'

The update policy that determines which of the devices receives the update of a mini-batch.

Caution

This parameter should be kept to SINGLE_FIXED for this device.

class aihwkit.simulator.configs.devices.SoftBoundsDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=0.001, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=0.0, up_down_dtod=0.01, w_max=0.6, w_max_dtod=0.3, w_min=-0.6, w_min_dtod=0.3, mult_noise=True)

Bases: aihwkit.simulator.configs.devices.PulsedDevice

Pulsed update behavioral model: soft bounds.

Pulsed update behavioral model, where the update step response size of the material is linearly dependent and it goes to zero at the bound.

This model is based on LinearStepDevice with parameters set to model soft bounds.

mult_noise: bool = True

Whether to use multiplicative noise instead of additive cycle-to-cycle noise.

class aihwkit.simulator.configs.devices.SoftBoundsPmaxDevice(construction_seed=0, corrupt_devices_prob=0.0, corrupt_devices_range=1000, diffusion=0.0, diffusion_dtod=0.0, drift=<factory>, dw_min=<factory>, dw_min_dtod=0.3, dw_min_std=0.3, enforce_consistency=True, lifetime=0.0, lifetime_dtod=0.0, perfect_bias=False, reset=0.01, reset_dtod=0.0, reset_std=0.01, up_down=<factory>, up_down_dtod=0.01, w_max=<factory>, w_max_dtod=0.3, w_min=<factory>, w_min_dtod=0.3, mult_noise=True, p_max=1000, alpha=0.0005, range_min=-1.0, range_max=1.0)

Bases: aihwkit.simulator.configs.devices.SoftBoundsDevice

Pulsed update behavioral model: soft bounds, with a different parameterization for easier device fitting to experimental data.

Under the hood, the device behavior is the same as for SoftBoundsDevice. This model is based on LinearStepDevice with parameters set to model soft bounds.

It implements a pulse response function of the form:

\[ \begin{align}\begin{aligned}w(p_\text{up}) = B\left(1 -e^{-\alpha p_\text{up}} \right) + r_\text{min}\\w(p_\text{down}) = - B\left(1 - e^{-\alpha (p_\text{max} - p_\text{down})}\right) + r_\text{max}\end{aligned}\end{align} \]

where \(B=\frac{r_\text{max} - r_\text{min}}{1 - e^{-\alpha p_\text{max}}}\).

Here \(p_\text{max}\) is the number of pulses that were applied to get the device from the minimum conductance (minimum of range, \(r_\text{min}\)) to the maximum (maximum of range, \(r_\text{max}\)).

Internally the following transformation is used to get the original parameters of SoftBoundsDevice:

b_factor = (range_max - range_min)/(1 - exp(-p_max * alpha))
w_min = range_min
w_max = range_min + b_factor
dw_min = b_factor * alpha
up_down = 1 + 2 * range_min / b_factor
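The transformation above can be written out as runnable Python to check it against the pulse response function. The function names `pmax_to_softbounds` and `up_response` are hypothetical helpers for this illustration; only the arithmetic is taken from the listing and the formulas above.

```python
import math


def pmax_to_softbounds(p_max=1000, alpha=0.0005, range_min=-1.0, range_max=1.0):
    """Map the p_max parameterization to the SoftBoundsDevice parameters."""
    b_factor = (range_max - range_min) / (1.0 - math.exp(-p_max * alpha))
    return {
        "w_min": range_min,
        "w_max": range_min + b_factor,
        "dw_min": b_factor * alpha,
        "up_down": 1.0 + 2.0 * range_min / b_factor,
    }


def up_response(p_up, p_max=1000, alpha=0.0005, range_min=-1.0, range_max=1.0):
    """Weight after p_up up pulses, starting from range_min."""
    b_factor = (range_max - range_min) / (1.0 - math.exp(-p_max * alpha))
    return b_factor * (1.0 - math.exp(-alpha * p_up)) + range_min


params = pmax_to_softbounds()  # default values of SoftBoundsPmaxDevice
start = up_response(0)  # equals range_min
end = up_response(1000)  # reaches range_max after p_max pulses
```

Note that the resulting w_max lies above range_max: the exponential response only approaches w_max asymptotically, and range_max is the value reached after p_max pulses.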

Note

Device-to-device and cycle-to-cycle variation are defined as before (see SoftBoundsDevice, see also PulsedDevice). That is, for instance dw_min_dtod will effectively change the slope (in units of dw_min which is b_factor * alpha, see above). Range offset fluctuations can be achieved by using w_min_dtod and w_max_dtod which will vary w_min and w_max across devices, respectively.

alpha: float = 0.0005

The slope of the soft bounds model \(dw \propto \alpha w\) for both up and down direction.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.PulsedResistiveDeviceParameter

p_max: int = 1000

Number of pulses to drive the synapse from range_min to range_max.

range_max: float = 1.0

Value of the weight for \(p_\text{max}\) number of up pulses.

range_min: float = -1.0

Setting of the weight when starting the \(p_\text{max}\) up pulse experiment.

class aihwkit.simulator.configs.devices.TransferCompound(unit_cell_devices=<factory>, gamma=0.0, gamma_vec=<factory>, transfer_every=1.0, no_self_transfer=True, transfer_every_vec=<factory>, units_in_mbatch=True, n_cols_per_transfer=1, with_reset_prob=0.0, random_column=False, transfer_lr=1.0, transfer_lr_vec=<factory>, scale_transfer_lr=True, transfer_forward=<factory>, transfer_update=<factory>)

Bases: aihwkit.simulator.configs.devices.UnitCell

Abstract device model that takes 2 or more devices and implements a transfer-based learning rule.

It uses a (partly) hidden weight (where the SGD update is accumulated), which then is transferred partly and occasionally to the visible weight. This can implement an analog friendly variant of stochastic gradient descent (Tiki-taka), as described in Gokmen & Haensch (2020).

The hidden weight is always the first in the list of unit_cell_devices given, and the transfer is done from left to right. The first of the unit_cell_devices can have different HW specifications from the rest, but the others need to be of identical specs. In detail, when specifying the list of devices only the first two will actually be used and the rest discarded and instead replaced by the second device specification. In this manner, the fast crossbar (receiving the SGD updates) and the slow crossbar (receiving the occasional partial transfers from the fast) can have different specs, but all additional slow crossbars (receiving transfers from the left neighboring crossbar in the list of unit_cell_devices) need to be of the same spec.

The rate of transfer (e.g. learning rate and how often and how many columns per transfer) and the type (i.e. with ADC or without, with noise etc.) can be adjusted.

In each transfer event, which is triggered by counting the update cycles (in units of either mini-batches or single mat-vecs), n_cols_per_transfer columns are read from the left device using the forward pass with transfer vectors as input and transferred to the right device (taking the order of the unit_cell_devices list) using the outer-product update with the read-out vectors and the transfer vectors. Currently, transfer vectors are fixed to be one-hot vectors. The columns are taken in sequential order, wrapping around at the edge of the crossbar. The learning rate and the forward and update specs of the transfer can be user-defined.
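A single transfer event between two matrices can be sketched in plain Python. This is an idealized illustration, not the simulator's code: the function name `transfer_columns` is hypothetical, the forward pass and the outer-product update are assumed to be noise-free, and the sign of the update is chosen for this example only.

```python
def transfer_columns(fast, slow, start_col, n_cols, transfer_lr):
    """Read n_cols columns of fast with one-hot inputs and add them to slow.

    Returns the starting column for the next transfer event.
    """
    n_rows, n_total = len(fast), len(fast[0])
    for k in range(n_cols):
        col = (start_col + k) % n_total  # sequential order, wrapping at the edge
        one_hot = [1.0 if j == col else 0.0 for j in range(n_total)]
        # Ideal forward pass: fast @ one_hot reads out one column.
        y = [sum(f * x for f, x in zip(row, one_hot)) for row in fast]
        # Ideal outer-product update: slow += transfer_lr * y one_hot^T.
        for i in range(n_rows):
            for j in range(n_total):
                slow[i][j] += transfer_lr * y[i] * one_hot[j]
    return (start_col + n_cols) % n_total


fast = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
slow = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
next_col = transfer_columns(fast, slow, start_col=2, n_cols=2, transfer_lr=0.1)
```

Starting at the last column with two columns per transfer, the event reads columns 2 and 0, so column 1 of `slow` stays untouched and the next event starts at column 1.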

The weight that is seen in the forward and backward pass is governed by the \(\gamma\) weighting setting.

Note

Here the devices could be either transferred in analog (essentially within the unit cell) or on separate arrays (using the usual (non-ideal) forward pass and update steps). This can be set with transfer_forward and transfer_update.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.TransferResistiveDeviceParameter

gamma: float = 0.0

Weighting factor to compute the effective SGD weight from the hidden matrices.

The default scheme is:

\[\gamma^{n-1} W_0 + \gamma^{n-2} W_1 + \ldots + \gamma^0 W_{n-1}\]

gamma_vec: List[float]

User-defined weighting.

User-defined weighting can be given as a list of weights, in which case the default weighting scheme with gamma is not used.
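
As a rough illustration of the two weighting options, the sketch below combines one scalar weight per device into an effective weight. The helper effective_weight is hypothetical, and treating an empty gamma_vec as "not given" is an assumption of this example.

```python
def effective_weight(weights, gamma=0.0, gamma_vec=None):
    """Combine per-device weights W_0..W_{n-1} into one effective weight."""
    n = len(weights)
    if gamma_vec:
        # User-defined factors replace the default gamma scheme.
        factors = list(gamma_vec)
    else:
        # Default scheme: gamma^(n-1) W_0 + ... + gamma^0 W_{n-1}.
        factors = [gamma ** (n - 1 - k) for k in range(n)]
    return sum(f * w for f, w in zip(factors, weights))


weights = [0.25, 0.5]  # W_0 (first device), W_1 (last device)
only_last = effective_weight(weights, gamma=0.0)              # 0.5
mixed = effective_weight(weights, gamma=0.5)                  # 0.625
custom = effective_weight(weights, gamma_vec=[0.5, 0.5])      # 0.375
```

With the default gamma of 0.0, only the last device contributes, since every factor except gamma^0 vanishes.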

n_cols_per_transfer: int = 1

Number of consecutive columns to use during transfer events.

How many consecutive columns to read (from one tile) and write (to the next tile) on every transfer event. For the read, the input is a one-hot vector. Once the final column is reached, reading starts again from the first.

no_self_transfer: bool = True

Whether to set the transfer rate of the last device (which is applied to itself) to zero.

random_column: bool = False

Whether to select a random starting column.

Whether to select a random starting column for each transfer event, instead of starting from the next column that has not yet been transferred (the default).
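
The difference between the two column-selection modes can be sketched as follows. The helper columns_for_event and its arguments are hypothetical and only mirror the description above; the real simulator may draw its random numbers differently.

```python
import random


def columns_for_event(next_col, n_cols_per_transfer, x_size,
                      random_column=False, rng=random):
    """Return (columns used in this event, start column of the next event)."""
    # Either draw a fresh starting column or continue where the last event ended.
    start = rng.randrange(x_size) if random_column else next_col
    # Consecutive columns, wrapping around after the final column.
    cols = [(start + k) % x_size for k in range(n_cols_per_transfer)]
    return cols, (start + n_cols_per_transfer) % x_size


# Default (sequential) mode on a crossbar with 4 columns, 3 columns per event.
first_cols, nxt = columns_for_event(0, 3, 4)       # columns 0, 1, 2
second_cols, nxt = columns_for_event(nxt, 3, 4)    # columns 3, 0, 1

# Random mode: the starting column is drawn anew for each event.
random_cols, _ = columns_for_event(0, 3, 4, random_column=True,
                                   rng=random.Random(0))
```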

scale_transfer_lr: bool = True

Whether to give the transfer_lr in relative units.

That is, whether to scale the transfer LR with the current LR of the SGD.

transfer_every: float = 1.0

Transfers every \(n\) mat-vec operations or \(n\) batches.

Transfers every \(n\) mat-vec operations (rounded to multiples/ratios of m_batch for CUDA). If units_in_mbatch is set, then the units are in m_batch instead of mat-vecs, which is equal to the overall weight re-use during a whole mini-batch.

Note

If transfer_every is 0.0 no transfer will be made.

If not given explicitly with transfer_every_vec, the higher transfer cycles are geometrically scaled: the first is set to transfer_every, and each subsequent transfer cycle is multiplied by x_size / n_cols_per_transfer.
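
The geometric scaling of the default transfer cycles can be worked through with a small sketch. The helper default_transfer_cycles and its n_transfers argument are hypothetical; the sketch only reproduces the arithmetic stated in the note and does not model no_self_transfer.

```python
def default_transfer_cycles(n_transfers, transfer_every, x_size,
                            n_cols_per_transfer):
    """Cycle lengths when transfer_every_vec is not given (illustrative only)."""
    # Number of transfer events needed to sweep all columns of the crossbar once.
    factor = x_size / n_cols_per_transfer
    # First cycle is transfer_every; each further cycle is `factor` times longer.
    return [transfer_every * factor ** k for k in range(n_transfers)]


# 8 input columns, 2 columns per transfer event: each cycle is 4x the previous.
cycles = default_transfer_cycles(n_transfers=3, transfer_every=2.0,
                                 x_size=8, n_cols_per_transfer=2)
# cycles is [2.0, 8.0, 32.0]
```

Since x_size / n_cols_per_transfer transfer events are needed to cover every column once, this scaling makes each later stage advance by one event per full sweep of the stage before it.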

transfer_every_vec: List[float]

Transfer cycle lengths.

A list of \(n\) entries to explicitly set the transfer cycle lengths. In this case, the above defaults are ignored.

transfer_forward: IOParameters

Input-output parameters that define the read of a transfer event.

AnalogTileInputOutputParameters that define the read (forward) of a transfer event, for instance the amount of noise or whether the transfer is done using an ADC/DAC.

transfer_lr: float = 1.0

Learning rate (LR) for the update step of the transfer event.

By default, all learning rates are identical. If scale_transfer_lr is set, the transfer LR is scaled by the current learning rate of the SGD.

Note

The LR is always a positive number; the sign is applied correctly internally.

transfer_lr_vec: List[float]

Transfer LR for each individual transfer in the device chain can be given.

transfer_update: UpdateParameters

Update parameters that define the type of update used for each transfer event.

AnalogTileUpdateParameters that define the type of update used for each transfer event.

units_in_mbatch: bool = True

Units for transfer_every.

If set, then the cycle length units of transfer_every are in m_batch instead of mat-vecs, which is equal to the overall weight re-use during a whole mini-batch.

with_reset_prob: float = 0.0

Probability with which a reset is applied to the columns that were transferred.

class aihwkit.simulator.configs.devices.UnitCell(unit_cell_devices=<factory>)

Bases: aihwkit.simulator.configs.helpers._PrintableMixin

Parameters that modify the behaviour of a unit cell.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.VectorResistiveDeviceParameter

requires_decay()

Return whether device has decay enabled.

Return type

bool

requires_diffusion()

Return whether device has diffusion enabled.

Return type

bool

unit_cell_devices: List

Devices that compose this unit cell.

class aihwkit.simulator.configs.devices.VectorUnitCell(unit_cell_devices=<factory>, update_policy=<VectorUnitCellUpdatePolicy.ALL: 'All'>, first_update_idx=0, gamma_vec=<factory>)

Bases: aihwkit.simulator.configs.devices.UnitCell

Abstract resistive device that combines multiple pulsed resistive devices in a single ‘unit cell’.

For instance, a vector device can consist of 2 resistive devices, where the sum of the two resistive values is coded for each weight of a cross point.

as_bindings()

Return a representation of this instance as a simulator bindings object.

Return type

aihwkit.simulator.rpu_base.devices.VectorResistiveDeviceParameter

first_update_idx: int = 0

Device that receives the first mini-batch.

Useful only for VectorUnitCellUpdatePolicy.SINGLE_FIXED.

gamma_vec: List[float]

Weighting of the unit cell devices to reduce to final weight.

User-defined weighting can be given as a list of factors. If not given, each device index of the unit cell is weighted by equal amounts (\(1/n\)).
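
A minimal sketch of this reduction, for one cross point with one scalar weight per device, is shown below. The helper reduce_unit_cell is hypothetical, and treating an empty gamma_vec as "not given" is an assumption of this example.

```python
def reduce_unit_cell(device_weights, gamma_vec=None):
    """Reduce the per-device weights of one cross point to a single weight."""
    n = len(device_weights)
    # Without user-defined factors, every device is weighted by 1/n.
    factors = list(gamma_vec) if gamma_vec else [1.0 / n] * n
    if len(factors) != n:
        raise ValueError("gamma_vec needs one factor per device")
    return sum(f * w for f, w in zip(factors, device_weights))


device_weights = [0.25, 0.75]
mean_weight = reduce_unit_cell(device_weights)                       # 0.5
summed_weight = reduce_unit_cell(device_weights, gamma_vec=[1.0, 1.0])  # 1.0
```

Under these assumptions, setting gamma_vec to all ones yields the plain sum of the device values, while the default gives their mean.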

update_policy: aihwkit.simulator.configs.utils.VectorUnitCellUpdatePolicy = 'All'

The update policy that determines which of the devices receive the update of a mini-batch.