aihwkit.inference.converter.wpo module

Weight Programming Optimization implementation, similar to the framework reported in the following paper:

Mackin, et al., “Optimised weight programming for analogue memory-based
deep neural networks” 2022. https://www.nature.com/articles/s41467-022-31405-1.

class aihwkit.inference.converter.wpo.WeightProgrammingOptimizer(weights, f_lst, rpu_config, t_steps, g_converter_baseline, symmetric=True, **kwargs)[source]

Bases: object

Weight Programming Optimization Class.

Uses differential evolution to minimize the time-averaged normalized mean-squared error (TNMSE) loss between ideal weights and effective weights, which include device non-idealities such as programming errors, read noise, conductance drift, and algorithmic drift compensation. Can also optimize the significance pair scaling factors f, and can be employed to optimize symmetric weight programming (for negative and positive weights), which reduces the dimensionality of the search space. Alternatively, one can optimize weight programming in a ~2x higher-dimensional space for positive and negative weights separately in the event the weight distribution is highly asymmetric.

Params:

weights: ideal unitless weight distribution to program.
f_lst: list of significance pair scaling factors. Passing a list of None values will cause the optimizer to solve for the best hardware f factors. Alternatively, a specified f_lst such as [1.0, 3.0] will constrain the optimization to a specific set of hardware f factors.
rpu_config: resistive processing unit configuration.
t_steps: time steps used in the optimization process. The optimizer will try to minimize weight errors at all of the specified time steps.
g_converter_baseline: a baseline g_converter which the weight programming optimizer will try to outperform. You can pass an instantiated SinglePairConductanceConverter or DualPairConductanceConverter. Alternatively, you can pass a previously optimized CustomPairConductanceConverter if you would like to improve on a previously optimized weight programming strategy.
symmetric: boolean that specifies whether or not to employ a symmetric weight programming strategy for negative and positive weights. Enforcing symmetry reduces the optimization dimensionality / search space and leads to faster optimization times.
kwargs: optional parameters that allow the user to override the baseline parameters passed to the scipy differential_evolution algorithm. The baseline parameters have been heavily optimized and should be sufficient. In some cases, it may be beneficial to adjust these values to improve the speed or quality of results.

Returns:

CustomPairConductanceConverter instantiated with the optimized weight programming strategy
success: boolean flag indicating whether scipy differential_evolution successfully terminated

Return type:

CustomPairConductanceConverter

Parameters:

weights (Tensor)
f_lst (List)
rpu_config (Type[InferenceRPUConfig])
t_steps (List)
g_converter_baseline (SinglePairConductanceConverter | DualPairConductanceConverter | CustomPairConductanceConverter)
symmetric (bool)
kwargs (Any | None)

differential_weight_evolution()[source]

Runs differential evolution for weight programming optimization.

To increase the chances of finding a good minimum, increase popsize along with mutation, but lower recombination.

Returns:

g_converter: CustomPairConductanceConverter instantiated with optimal f_lst, g_lst weight programming specifications
success: whether weight programming optimization was successful or not
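The tuning advice above can be illustrated with a small sketch of how override values might be assembled before being handed to the optimizer as kwargs. The starting values shown are scipy's documented defaults for differential_evolution, not the tuned baseline used by this module (which is not listed here), and the override values are purely illustrative.

```python
# scipy's documented defaults for differential_evolution.
# Assumption: these are NOT the tuned baseline values used inside aihwkit.
scipy_defaults = {"popsize": 15, "mutation": (0.5, 1.0), "recombination": 0.7}

# Illustrative overrides: larger population and mutation, lower recombination.
overrides = {"popsize": 30, "mutation": (0.7, 1.5), "recombination": 0.5}

# Later keys win, so the user-supplied overrides replace the defaults.
de_kwargs = {**scipy_defaults, **overrides}
print(de_kwargs)
```

A dictionary like de_kwargs could then be unpacked into the WeightProgrammingOptimizer constructor as keyword arguments, at the cost of longer optimization times for larger popsize values.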

generate_hop_bounds()[source]

Generates the bounds applied to the differential weight evolution algorithm.

Returns:

tuple of tuples which specify the number and constraints (hypercube) used for differential evolution optimization

Return type:

Tuple[Tuple[float, float], …]
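As a rough illustration of the return format, the sketch below builds one (low, high) pair per optimization parameter. The helper name unit_hypercube_bounds and the choice of the unit interval [0, 1] for every dimension are assumptions made for this example; the actual bounds produced by generate_hop_bounds may differ.

```python
from typing import Tuple


def unit_hypercube_bounds(n_params: int) -> Tuple[Tuple[float, float], ...]:
    """Return one (low, high) pair per parameter (hypothetical helper)."""
    # Assumption: every dimension is normalized to the unit interval.
    return tuple((0.0, 1.0) for _ in range(n_params))


bounds = unit_hypercube_bounds(4)
print(bounds)
```

The length of the outer tuple sets the dimensionality of the search, which is why a symmetric strategy (fewer parameters) leads to a smaller hypercube.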

get_loss_baseline()[source]

Estimates the time-averaged normalized mean square error (TNMSE) value for the weight programming optimizer to beat based on a user-specified standard weight programming procedure.

Returns:

Loss, which is a time-averaged normalized mean square error (TNMSE), which the weight programming optimizer will try to outperform

Return type:

float

run_optimizer()[source]

Runs a series of steps that optimize the weight programming strategy based on programming noise, read noise, conductance-dependent drift models, and drift compensation so as to maintain weight fidelity as well as possible over time and help the network achieve iso-accuracy.

Returns:

optimal_g_converter: optimal weight programming strategy in the form of an instantiated CustomPairConductanceConverter
success: boolean flag specifying whether or not the optimization was a success

aihwkit.inference.converter.wpo.denormalize(x, d)[source]

De-normalizes a hypercube input parameter x to corresponding f_p values and gp_p and gm_p values.

Parameters:

x (Tensor) – hypercube parameters representing a weight programming strategy
d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution

Returns:

f_lst: conductance pair scaling factors for CustomPairConductanceConverter
g_lst: conductance programming spec for CustomPairConductanceConverter

aihwkit.inference.converter.wpo.downsample_weight_distribution(weights, shape)[source]

Downsamples weight distribution via interpolation

Params:

weights: original tensor of weights
shape: torch.Size object containing shape of desired downsampled matrix

Returns:

downsampled N-dimensional weight distribution that is representative of overall network weight distribution. Note: this function will also correctly upsample weights when shape.numel() > weights.numel()

Parameters:

weights (Tensor)
shape (Size)

Return type:

Tensor
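The idea of resampling a weight distribution by interpolation can be sketched in plain Python. This is not the library's implementation: the helper name resample_sorted, the sorting step, and the use of linear interpolation at evenly spaced positions are all assumptions chosen to keep the example self-contained.

```python
from typing import List


def resample_sorted(values: List[float], n_out: int) -> List[float]:
    """Linearly interpolate the sorted values at n_out evenly spaced positions."""
    if not values or n_out < 1:
        raise ValueError("need at least one value and one output sample")
    ordered = sorted(values)  # assumption: preserve the distribution, not the layout
    last = len(ordered) - 1
    out = []
    for i in range(n_out):
        # Position along the sorted values; a single output takes the midpoint.
        pos = i * last / (n_out - 1) if n_out > 1 else last / 2
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(ordered[lo] * (1 - frac) + ordered[hi] * frac)
    return out


down = resample_sorted([0.0, 1.0, 2.0, 3.0, 4.0], 3)  # fewer samples than inputs
up = resample_sorted([0.0, 1.0], 3)  # more samples than inputs
print(down, up)
```

The same routine handles both directions, which mirrors the note above that upsampling also works when the requested shape has more elements than the input.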

aihwkit.inference.converter.wpo.loss_fxn(x, *args)[source]

Computes loss based on hypercube x values being probed by differential evolution.

Parameters:

x (Tensor) – hypercube values corresponding to a programming strategy
args (bytes | bytearray) – any additional arguments necessary for optimization, must be pickled to work with multiple workers in scipy differential evolution algorithm

Returns:

Loss of corresponding programming strategy defined by x

Return type:

float

aihwkit.inference.converter.wpo.loss_rpu_config(d)[source]

Computes Time-averaged Normalized Mean Squared Error (TNMSE) between the target weight distribution and implemented weight distribution according to weight programming strategy and corresponding programming errors, read noise, drift, and drift compensation. This serves as a loss function to be minimized.

Parameters:

d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution

Returns:

Loss, which is a Time-averaged Normalized Mean Squared Error (TNMSE)

Return type:

float

aihwkit.inference.converter.wpo.loss_weights(model, t_steps, test_weights, max_abs_weight_unitless, loss_baseline, loss_margin, get_baseline=False)[source]

Computes Time-Averaged Normalized Mean Squared Error (TNMSE) for weight distribution

Parameters:

model (AnalogLinear) – one AnalogLinear layer used for test evaluation
t_steps (List[float]) – time steps to optimize over
test_weights (Tensor) – test weights (2d) that we want to implement
max_abs_weight_unitless (float) – maximum weight value positive or negative
loss_baseline (float) – baseline loss (tnmse) for baseline weight programming strategy
loss_margin (float) – fraction by which we are trying to beat the baseline loss (e.g. 0.1 = 10%)
get_baseline (bool) – whether to return the true loss (TNMSE) or the normalized version, which becomes less than zero once we have beaten the baseline loss by the loss margin amount

Returns:

tnmse: Time-Averaged Normalized Mean Squared Error (TNMSE), usually the normalized version, which becomes less than zero once we have beaten the baseline loss by the loss margin.
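A minimal sketch of the two quantities described above, written in plain Python. The normalized loss follows the expression given for stop_criterion, (current loss - baseline loss) / baseline loss + loss margin. The TNMSE normalization used here (dividing each mean squared error by the squared maximum absolute weight) is an assumption for illustration, and both function names are hypothetical.

```python
from typing import Sequence


def tnmse_sketch(
    ideal: Sequence[float],
    effective_per_step: Sequence[Sequence[float]],
    max_abs_weight: float,
) -> float:
    """Average the normalized MSE over all time steps."""
    per_step = []
    for effective in effective_per_step:
        mse = sum((w - e) ** 2 for w, e in zip(ideal, effective)) / len(ideal)
        # Assumption: normalize by the squared maximum absolute weight.
        per_step.append(mse / max_abs_weight**2)
    return sum(per_step) / len(per_step)


def normalized_loss(loss: float, loss_baseline: float, loss_margin: float) -> float:
    """Negative once the loss beats the baseline by the requested margin."""
    return (loss - loss_baseline) / loss_baseline + loss_margin


ideal = [1.0, -1.0]
effective = [[1.0, -1.0], [0.0, 0.0]]  # perfect at step 0, fully decayed at step 1
loss = tnmse_sketch(ideal, effective, max_abs_weight=1.0)
print(loss, normalized_loss(loss, loss_baseline=1.0, loss_margin=0.1))
```

With a margin of 0.1, the normalized value only drops below zero when the loss is more than 10% lower than the baseline loss.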

aihwkit.inference.converter.wpo.partition_parameters(x, d)[source]

Separates array of x parameters from differential evolution into parameters corresponding to f_p factors (if they exist) and x_g parameter, which will be used to find corresponding valid conductance combinations.

Parameters:

x (Tensor) – hypercube parameters corresponding to a weight programming strategy
d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution

Returns:

f_lst: list of conductance pair scaling parameters f
k_w: unitless to microSiemens weight rescaling factor [uS/1]
x_g: hypercube parameters corresponding to remaining weight programming strategy parameters

aihwkit.inference.converter.wpo.reformat_x_g(x_g, len_f_lst, symmetric)[source]

Get x values in same format as CustomPairConductanceConverter g_lst

Parameters:

x_g (Tensor) – list of hypercube parameters
len_f_lst (int) – length of conductance pair scaling parameters f in f_lst
symmetric (bool) – whether or not weight programming optimization solution will be symmetric for positive and negative weights (reduces dimensionality of optimization problem)

Returns:

x_g_lst: reformatted x_g to correspond to CustomPairConductanceConverter g_lst formatting
g_len: number of discretized weights specified in CustomPairConductanceConverter

aihwkit.inference.converter.wpo.shuffle_weights(weights)[source]

Sample weights to test programming strategy

Params:

weights: tensor of weights

Returns:

shuffled tensor of weights with equivalent dimensions

Parameters:

weights (Tensor)

Return type:

Tensor
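The shape-preserving shuffle can be pictured with nested lists standing in for a 2D tensor. This is only a sketch: the helper name shuffle_2d and the seed argument are hypothetical, and the library function operates on torch tensors rather than lists.

```python
import random
from typing import List, Optional


def shuffle_2d(rows: List[List[float]], seed: Optional[int] = None) -> List[List[float]]:
    """Randomly permute all entries while keeping the row/column layout."""
    rng = random.Random(seed)
    flat = [w for row in rows for w in row]
    rng.shuffle(flat)
    n_cols = len(rows[0])
    # Rebuild rows of the original width from the shuffled flat list.
    return [flat[i : i + n_cols] for i in range(0, len(flat), n_cols)]


weights = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
shuffled = shuffle_2d(weights, seed=0)
print(shuffled)
```

The shuffled result contains exactly the same values as the input, so the weight distribution being tested is unchanged; only the positions differ.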

aihwkit.inference.converter.wpo.span_of_remaining_pairs(span_of_each_pair, ind)[source]

Computes the span of the remaining conductance pairs. Informs interdependent constraints on conductance programming based on how previous conductance pair was programmed.

Parameters:

span_of_each_pair (List) – range of each conductance pair including f factor
ind (int) – index of remaining conductance pairs

Returns:

Remaining conductance range (i.e. maximum positive/negative conductance value that could be programmed in the remaining conductance pairs)

Return type:

Tensor

aihwkit.inference.converter.wpo.stop_criterion(intermediate_result)[source]

Terminates weight programming optimization once (current loss - baseline loss) / baseline loss + loss margin is less than zero, where baseline loss is determined by g_converter_baseline and current loss is determined by the current weight programming optimization strategy.

Parameters:

intermediate_result (OptimizeResult) – a keyword parameter containing an OptimizeResult with attributes x and fun, the best solution found so far and the objective function, respectively. Note that the name of the parameter must be intermediate_result for the callback to be passed an OptimizeResult.

Raises:

StopIteration – when intermediate_result.fun (loss function) becomes negative, which means optimization has adequately converged as defined by the loss_margin parameter

Return type:

None
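The callback behavior described above can be sketched without running an optimizer. In this example a SimpleNamespace stands in for scipy's OptimizeResult, and the function name stop_when_negative is hypothetical; only the attribute names x and fun and the StopIteration behavior are taken from the description.

```python
from types import SimpleNamespace


def stop_when_negative(intermediate_result) -> None:
    """Raise StopIteration once the normalized loss drops below zero."""
    if intermediate_result.fun < 0:
        raise StopIteration


# Stand-ins for OptimizeResult objects (assumption: only x and fun are needed).
still_running = SimpleNamespace(x=[0.2, 0.8], fun=0.05)
converged = SimpleNamespace(x=[0.3, 0.7], fun=-0.01)

stop_when_negative(still_running)  # positive loss: no exception, keep optimizing

try:
    stop_when_negative(converged)
    stopped = False
except StopIteration:
    stopped = True

print(stopped)
```

Because the loss is normalized against the baseline and the loss margin, a negative value is a natural stopping signal and no absolute loss threshold has to be chosen.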