aihwkit.inference.converter.wpo module

Weight Programming Optimization implementation, similar to the framework reported in the following paper:

  1. Mackin, et al., “Optimised weight programming for analogue memory-based deep neural networks” 2022. https://www.nature.com/articles/s41467-022-31405-1.

class aihwkit.inference.converter.wpo.WeightProgrammingOptimizer(weights, f_lst, rpu_config, t_steps, g_converter_baseline, symmetric=True, **kwargs)[source]

Bases: object

Weight Programming Optimization Class.

Uses differential evolution to minimize the time-averaged normalized mean-squared error (TNMSE) loss between ideal weights and effective weights, where the effective weights include device non-idealities such as programming errors, read noise, conductance drift, and algorithmic drift compensation. The optimizer can also solve for the significance pair scaling factors f. It can be restricted to symmetric weight programming (the same strategy for negative and positive weights), which reduces the dimensionality of the search space. Alternatively, when the weight distribution is highly asymmetric, positive and negative weights can be optimized separately in a roughly 2x higher-dimensional space.

Parameters:
  • weights – ideal unitless weight distribution to program

  • f_lst – list of significance pair scaling factors. Passing a list of None values will cause the optimizer to solve for the best hardware f factors. Alternatively, a specified f_lst such as [1.0, 3.0] will constrain the optimization to a specific set of hardware f factors.

  • rpu_config – resistive processing unit configuration

  • t_steps – time steps used in the optimization process. The optimizer will try to minimize weight errors at all of the specified time steps.

  • g_converter_baseline – a baseline g_converter which the weight programming optimizer will try to outperform. You can pass an instantiated SinglePairConductanceConverter or DualPairConductanceConverter. Alternatively, you can pass a previously optimized CustomPairConductanceConverter if you would like to improve on a previously optimized weight programming strategy.

  • symmetric – boolean that specifies whether or not to employ a symmetric weight programming strategy for negative and positive weights. Enforcing symmetry reduces the optimization dimensionality / search space and leads to faster optimization times.

  • kwargs – optional parameters that allow the user to override the baseline parameters passed to the scipy differential_evolution algorithm. The baseline parameters have been heavily optimized and should be sufficient. In some cases, it may be beneficial to adjust these values to improve the speed or quality of results.

Returns:

  • CustomPairConductanceConverter instantiated with the optimized weight programming strategy

  • success – boolean flag indicating whether scipy differential_evolution terminated successfully
differential_weight_evolution()[source]

Runs differential evolution for weight programming optimization.

To increase the chances of finding a good minimum, increase popsize along with mutation, but lower recombination.
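The tuning advice above refers to keyword arguments of scipy's differential_evolution. As a rough sketch, the overrides might be merged over a baseline as follows. The baseline values shown are scipy's documented defaults, used only as stand-ins because the optimizer's own baseline values are not listed on this page, and merge_de_kwargs is a hypothetical helper, not part of aihwkit.

```python
def merge_de_kwargs(baseline, **overrides):
    """Return a copy of the baseline settings with user overrides applied."""
    merged = dict(baseline)
    merged.update(overrides)
    return merged


# Stand-in baseline: scipy's own defaults, NOT the values used by aihwkit.
scipy_defaults = {"popsize": 15, "mutation": (0.5, 1.0), "recombination": 0.7}

# Larger population and mutation, lower recombination, as suggested above.
explore_more = merge_de_kwargs(
    scipy_defaults, popsize=30, mutation=(0.7, 1.2), recombination=0.5
)
```

A dictionary like explore_more could then be unpacked as **kwargs when constructing WeightProgrammingOptimizer.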

Returns:

  • g_converter – CustomPairConductanceConverter instantiated with optimal f_lst, g_lst weight programming specifications

  • success – whether weight programming optimization was successful or not

generate_hop_bounds()[source]

Generates the bounds applied to the differential weight evolution algorithm.

Returns:

tuple of tuples which specify the number and constraints (hypercube) used for differential evolution optimization

Return type:

Tuple[Tuple[float, float], …]
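To illustrate the shape of such a return value, here is a minimal sketch that builds one (low, high) pair per optimization parameter. It assumes a unit hypercube, i.e. every parameter bounded to [0, 1]; the actual limits and the number of parameters used by generate_hop_bounds are not stated here, and unit_hypercube_bounds is a hypothetical name.

```python
def unit_hypercube_bounds(n_params):
    """One (low, high) pair per parameter; assumes every range is [0, 1]."""
    return tuple((0.0, 1.0) for _ in range(n_params))


# Hypothetical problem with four free parameters.
bounds = unit_hypercube_bounds(4)
```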

get_loss_baseline()[source]

Estimates the time-averaged normalized mean square error (TNMSE) value for the weight programming optimizer to beat, based on a user-specified standard weight programming procedure.

Returns:

Loss, which is a time-averaged normalized mean square error (TNMSE), which the weight programming optimizer will try to outperform

Return type:

float

run_optimizer()[source]

Runs a series of steps that optimizes the weight programming strategy based on programming noise, read noise, conductance-dependent drift models, and drift compensation, so as to maintain weight fidelity as well as possible over time and help the network achieve iso-accuracy.

Returns:

  • optimal_g_converter – optimal weight programming strategy in the form of an instantiated CustomPairConductanceConverter

  • success – boolean flag specifying whether or not the optimization was a success
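A typical call sequence might look like the sketch below. It uses only the constructor signature and the run_optimizer return values documented on this page. The wrapper function, the f_lst value, and the t_steps values (including the assumption that they are in seconds) are illustrative choices, and the caller is expected to supply an already configured rpu_config and baseline converter.

```python
def optimize_weight_programming(weights, rpu_config, g_converter_baseline):
    """Hypothetical wrapper: build the optimizer and run it once."""
    # Imported lazily so this sketch can be loaded without aihwkit installed.
    from aihwkit.inference.converter.wpo import WeightProgrammingOptimizer

    f_lst = [1.0]  # assumption: one conductance pair with a fixed f factor
    t_steps = [60.0, 3600.0, 86400.0]  # assumption: times given in seconds

    optimizer = WeightProgrammingOptimizer(
        weights, f_lst, rpu_config, t_steps, g_converter_baseline, symmetric=True
    )
    # Per the documentation above: the optimized converter and a success flag.
    g_converter, success = optimizer.run_optimizer()
    return g_converter, success
```

If success is False, the caller may prefer to keep using g_converter_baseline.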

aihwkit.inference.converter.wpo.denormalize(x, d)[source]

De-normalizes a hypercube input parameter x to corresponding f_p values and gp_p and gm_p values.

Parameters:
  • x (Tensor) – hypercube parameters representing a weight programming strategy

  • d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution

Returns:

  • f_lst – conductance pair scaling factors for CustomPairConductanceConverter

  • g_lst – conductance programming spec for CustomPairConductanceConverter
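The core idea of de-normalization can be sketched as a linear map from a unit-interval coordinate to a physical range. This is only an illustration: it assumes each hypercube coordinate lies in [0, 1], and the conductance limits below are hypothetical. The real function additionally has to respect the interdependent constraints between conductance pairs.

```python
def denormalize_value(x, low, high):
    """Map a coordinate x in [0, 1] linearly onto the range [low, high]."""
    return low + x * (high - low)


g_min, g_max = 0.0, 25.0  # hypothetical conductance limits in microsiemens
g_mid = denormalize_value(0.5, g_min, g_max)
```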

aihwkit.inference.converter.wpo.downsample_weight_distribution(weights, shape)[source]

Downsamples weight distribution via interpolation.

Parameters:
  • weights (Tensor) – original tensor of weights

  • shape (Size) – torch.Size object containing shape of desired downsampled matrix

Returns:

downsampled N-dimensional weight distribution that is representative of overall network weight distribution. Note: this function will also correctly upsample weights when shape.numel() > weights.numel()

Return type:

Tensor
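One way to picture resampling a distribution by interpolation is sketched below in plain Python. It assumes the weights are sorted and then linearly interpolated onto evenly spaced positions; the library function operates on tensors and may interpolate differently, so resample_sorted is a hypothetical stand-in, not its implementation.

```python
def resample_sorted(values, n_out):
    """Linearly interpolate the sorted values onto n_out evenly spaced points."""
    ordered = sorted(values)
    if n_out == 1 or len(ordered) == 1:
        # Degenerate case: repeat the middle element.
        return [ordered[len(ordered) // 2]] * n_out
    step = (len(ordered) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        frac = pos - lo
        out.append(ordered[lo] * (1 - frac) + ordered[hi] * frac)
    return out


down = resample_sorted([4.0, 0.0, 2.0, 1.0, 3.0], 3)  # fewer points
up = resample_sorted([0.0, 2.0], 3)  # more points than the input
```

The same routine covers both directions, which mirrors the note above that upsampling works when the target has more elements than the input.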

aihwkit.inference.converter.wpo.loss_fxn(x, *args)[source]

Computes loss based on hypercube x values being probed by differential evolution.

Parameters:
  • x (Tensor) – hypercube values corresponding to a programming strategy

  • args (bytes | bytearray) – any additional arguments necessary for optimization, must be pickled to work with multiple workers in scipy differential evolution algorithm

Returns:

Loss of corresponding programming strategy defined by x

Return type:

float
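The pickling requirement can be illustrated with a toy objective. The sketch below is not the library's loss: toy_loss and the "target" key are hypothetical, and it only shows the pattern of passing serialized state through *args and restoring it inside the objective.

```python
import pickle


def toy_loss(x, *args):
    """Squared distance to a target stored in a pickled dictionary."""
    d = pickle.loads(args[0])  # restore the state inside the objective
    return sum((xi - ti) ** 2 for xi, ti in zip(x, d["target"]))


# Serialize once, then hand the bytes to every objective evaluation.
payload = pickle.dumps({"target": [0.25, 0.75]})
loss = toy_loss([0.25, 0.5], payload)
```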

aihwkit.inference.converter.wpo.loss_rpu_config(d)[source]

Computes Time-averaged Normalized Mean Squared Error (TNMSE) between the target weight distribution and implemented weight distribution according to weight programming strategy and corresponding programming errors, read noise, drift, and drift compensation. This serves as a loss function to be minimized.

Parameters:

d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution

Returns:

Loss, which is a Time-averaged Normalized Mean Squared Error (TNMSE)

Return type:

float
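As a rough illustration of a time-averaged normalized MSE, the sketch below averages a per-time-step mean squared error between ideal and effective weights. The normalization by the squared maximum absolute weight is an assumption made for this example; the exact normalization used by the library is not given on this page, and programming errors, read noise, drift, and drift compensation are represented only by the supplied effective weights.

```python
def tnmse(ideal, effective_by_time, max_abs_weight):
    """Average over time steps of the MSE normalized by max_abs_weight**2."""
    per_step = []
    for effective in effective_by_time:
        mse = sum((e - w) ** 2 for e, w in zip(effective, ideal)) / len(ideal)
        per_step.append(mse / max_abs_weight ** 2)
    return sum(per_step) / len(per_step)


ideal = [1.0, -1.0, 0.5, -0.5]
effective_by_time = [
    [1.0, -1.0, 0.5, -0.5],  # first time step: weights match exactly
    [0.5, -0.5, 0.5, -0.5],  # later time step: the large weights have decayed
]
loss = tnmse(ideal, effective_by_time, max_abs_weight=1.0)
```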

aihwkit.inference.converter.wpo.loss_weights(model, t_steps, test_weights, max_abs_weight_unitless, loss_baseline, loss_margin, get_baseline=False)[source]

Computes Time-Averaged Normalized Mean Squared Error (TNMSE) for weight distribution

Parameters:
  • model (AnalogLinear) – one AnalogLinear layer used for test evaluation

  • t_steps (List[float]) – time steps to optimize over

  • test_weights (Tensor) – test weights (2d) that we want to implement

  • max_abs_weight_unitless (float) – maximum weight value positive or negative

  • loss_baseline (float) – baseline loss (tnmse) for baseline weight programming strategy

  • loss_margin (float) – fraction by which we are trying to beat the baseline loss (0.1 = 10%)

  • get_baseline (bool) – whether to return the true loss (TNMSE) or the normalized version, which becomes negative once the baseline loss has been beaten by the loss margin

Returns:

tnmse – Time-Averaged Normalized Mean Squared Error (TNMSE), usually the normalized version, which becomes negative once the baseline loss has been beaten by the loss margin.
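The normalized version can be written out from the expression given for stop_criterion, (current loss - baseline loss) / baseline loss + loss margin. The helper name and the numbers below are illustrative only.

```python
def normalized_loss(tnmse, loss_baseline, loss_margin):
    """Negative once tnmse beats the baseline by more than loss_margin."""
    return (tnmse - loss_baseline) / loss_baseline + loss_margin


baseline, margin = 0.02, 0.1  # hypothetical baseline TNMSE and a 10% margin

at_baseline = normalized_loss(0.02, baseline, margin)  # no improvement yet
improved = normalized_loss(0.017, baseline, margin)  # 15% below the baseline
```

With a 10% margin, a strategy that merely matches the baseline scores the margin itself, while one that is 15% better scores below zero.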

aihwkit.inference.converter.wpo.partition_parameters(x, d)[source]

Separates array of x parameters from differential evolution into parameters corresponding to f_p factors (if they exist) and x_g parameter, which will be used to find corresponding valid conductance combinations.

Parameters:
  • x (Tensor) – hypercube parameters corresponding to a weight programming strategy

  • d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution

Returns:

  • f_lst – list of conductance pair scaling parameters f

  • k_w – unitless to microsiemens weight rescaling factor [uS/1]

  • x_g – hypercube parameters corresponding to the remaining weight programming strategy parameters

aihwkit.inference.converter.wpo.reformat_x_g(x_g, len_f_lst, symmetric)[source]

Get x values in same format as CustomPairConductanceConverter g_lst

Parameters:
  • x_g (Tensor) – list of hypercube parameters

  • len_f_lst (int) – length of conductance pair scaling parameters f in f_lst

  • symmetric (bool) – whether or not weight programming optimization solution will be symmetric for positive and negative weights (reduces dimensionality of optimization problem)

Returns:

  • x_g_lst – reformatted x_g to correspond to CustomPairConductanceConverter g_lst formatting

  • g_len – number of discretized weights specified in CustomPairConductanceConverter

aihwkit.inference.converter.wpo.shuffle_weights(weights)[source]

Sample weights to test programming strategy.

Parameters:

weights (Tensor) – tensor of weights

Returns:

shuffled tensor of weights with equivalent dimensions

Return type:

Tensor
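A shape-preserving shuffle can be sketched in plain Python as below. It assumes all elements are permuted together rather than row by row, which this page does not specify, and it uses nested lists in place of a tensor; shuffle_2d is a hypothetical name.

```python
import random


def shuffle_2d(rows, seed=None):
    """Permute all elements of a 2D list while keeping its row length."""
    rng = random.Random(seed)
    flat = [value for row in rows for value in row]
    rng.shuffle(flat)
    n_cols = len(rows[0])
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


weights = [[0.1, -0.2, 0.3], [0.4, -0.5, 0.6]]
shuffled = shuffle_2d(weights, seed=0)
```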

aihwkit.inference.converter.wpo.span_of_remaining_pairs(span_of_each_pair, ind)[source]

Computes the span of the remaining conductance pairs. Informs interdependent constraints on conductance programming based on how the previous conductance pair was programmed.

Parameters:
  • span_of_each_pair (List) – range of each conductance pair including f factor

  • ind (int) – index of remaining conductance pairs

Returns:

Remaining conductance range (i.e. maximum positive/negative conductance value that could be programmed in the remaining conductance pairs)

Return type:

Tensor
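One plausible reading of this computation is a sum over the spans from index ind onward, sketched below. Whether the pair at ind itself counts as "remaining" is an assumption here, and the span values are hypothetical.

```python
def remaining_span(span_of_each_pair, ind):
    """Total range still available from pair `ind` onward (assumed inclusive)."""
    return sum(span_of_each_pair[ind:])


# Hypothetical spans: a 25 uS range scaled by f factors 3.0 and 1.0.
spans = [75.0, 25.0]
after_first = remaining_span(spans, 1)
total = remaining_span(spans, 0)
```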

aihwkit.inference.converter.wpo.stop_criterion(intermediate_result)[source]

Terminates weight programming optimization once (current loss - baseline loss) / baseline loss + loss margin is less than zero, where baseline loss is determined by g_converter_baseline and current loss is determined by the current weight programming optimization strategy.

Parameters:

intermediate_result (OptimizeResult) – a keyword parameter containing an OptimizeResult with attributes x and fun, the best solution found so far and the objective function, respectively. Note that the name of the parameter must be intermediate_result for the callback to be passed an OptimizeResult.

Raises:

StopIteration – when intermediate_result.fun (loss function) becomes negative, which means optimization has adequately converged as defined by the loss_margin parameter

Return type:

None
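The behavior described above can be sketched as a small callback. This is a standalone illustration, not the library's code: a SimpleNamespace stands in for scipy's OptimizeResult, and stop_when_negative and stops are hypothetical names.

```python
from types import SimpleNamespace


def stop_when_negative(intermediate_result):
    """Raise StopIteration once the normalized loss drops below zero."""
    if intermediate_result.fun < 0:
        raise StopIteration


def stops(fun_value):
    """Report whether the callback would halt for a given loss value."""
    try:
        stop_when_negative(SimpleNamespace(x=[0.5], fun=fun_value))
    except StopIteration:
        return True
    return False


halted = stops(-0.01)  # baseline beaten by more than the margin
running = stops(0.2)  # still above the target
```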