aihwkit.inference.converter.wpo module
Weight Programming Optimization implementation which is similar to the framework reported in the following paper:
- Mackin, et al., “Optimised weight programming for analogue memory-based
deep neural networks” 2022. https://www.nature.com/articles/s41467-022-31405-1.
- class aihwkit.inference.converter.wpo.WeightProgrammingOptimizer(weights, f_lst, rpu_config, t_steps, g_converter_baseline, symmetric=True, **kwargs)[source]
Bases: object
Weight Programming Optimization Class.
Uses differential evolution to minimize the time-averaged normalized mean-squared error (TNMSE) loss between ideal weights and effective weights, which include device non-idealities such as programming errors, read noise, conductance drift, and algorithmic drift compensation. It can also optimize the significance pair scaling factors f. Symmetric weight programming (one shared strategy for negative and positive weights) reduces the dimensionality of the search space. Alternatively, when the weight distribution is highly asymmetric, weight programming can be optimized separately for positive and negative weights in a roughly 2x higher-dimensional space.
- Params:
weights: ideal unitless weight distribution to program.
f_lst: list of significance pair scaling factors. Passing a list of None values will cause the optimizer to solve for the best hardware f factors. Alternatively, a specified f_lst such as [1.0, 3.0] will constrain the optimization to a specific set of hardware f factors.
rpu_config: resistive processing unit configuration.
t_steps: time steps used in the optimization process. The optimizer will try to minimize weight errors at all of the specified time steps.
g_converter_baseline: a baseline g_converter which the weight programming optimizer will try to outperform. You can pass an instantiated SinglePairConductanceConverter or DualPairConductanceConverter. Alternatively, you can pass a previously optimized CustomPairConductanceConverter if you would like to improve on a previously optimized weight programming strategy.
symmetric: boolean that specifies whether or not to employ a symmetric weight programming strategy for negative and positive weights. Enforcing symmetry reduces the optimization dimensionality / search space and leads to faster optimization times.
kwargs: optional parameters that allow the user to override the baseline parameters passed to the scipy differential_evolution algorithm. The baseline parameters have been heavily optimized and should be sufficient. In some cases, it may be beneficial to adjust these values to improve the speed or quality of results.
- Returns:
g_converter: CustomPairConductanceConverter instantiated with optimized weight programming strategy
success: boolean flag indicating whether scipy differential_evolution successfully terminated
- Return type:
- Parameters:
weights (Tensor)
f_lst (List)
rpu_config (Type[InferenceRPUConfig])
t_steps (List)
g_converter_baseline (SinglePairConductanceConverter | DualPairConductanceConverter | CustomPairConductanceConverter)
symmetric (bool)
kwargs (Any | None)
- differential_weight_evolution()[source]
Runs differential evolution for weight programming optimization.
To increase the chances of finding a good minimum, increase popsize along with mutation, but lower recombination.
- Returns:
g_converter: CustomPairConductanceConverter instantiated with optimal f_lst, g_lst weight programming specifications
success: whether weight programming optimization was successful or not
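The roles of popsize, mutation, and recombination can be illustrated with a minimal pure-Python sketch of the classic rand/1/bin differential evolution scheme. This is not the code path used by the class, which passes its settings to scipy's differential_evolution. The function name `toy_differential_evolution`, the toy objective, and all default values are assumptions made for illustration. Note also that in scipy popsize is a multiplier on the number of parameters, whereas here it is the absolute population size.

```python
import random


def toy_differential_evolution(loss, bounds, popsize=20, mutation=0.7,
                               recombination=0.5, generations=200, seed=0):
    """Minimize loss(x) inside box bounds with rand/1/bin differential evolution."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(popsize)]
    scores = [loss(x) for x in pop]
    for _ in range(generations):
        for i in range(popsize):
            # Pick three distinct members other than the current one.
            a, b, c = rng.sample([j for j in range(popsize) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for k, (lo, hi) in enumerate(bounds):
                if k == forced or rng.random() < recombination:
                    # Mutation: larger values take bigger exploratory steps.
                    value = pop[a][k] + mutation * (pop[b][k] - pop[c][k])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    # Lower recombination keeps more of the current member.
                    value = pop[i][k]
                trial.append(value)
            trial_score = loss(trial)
            if trial_score <= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(popsize), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Toy objective with its minimum at x = (0.3, 0.3, 0.3)."""
    return sum((v - 0.3) ** 2 for v in x)


best_x, best_loss = toy_differential_evolution(sphere, [(0.0, 1.0)] * 3)
```

A larger population and mutation factor spread the search more widely, while a lower recombination rate changes fewer coordinates per trial, which matches the tuning advice above at the cost of more loss evaluations.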
- generate_hop_bounds()[source]
Generates the bounds applied to the differential weight evolution algorithm.
- Returns:
tuple of tuples that specify the number of parameters and their constraints (hypercube) used for differential evolution optimization
- Return type:
Tuple[Tuple[float, float], …]
- get_loss_baseline()[source]
Estimates the time-averaged normalized mean square error (TNMSE) value for the weight programming optimizer to beat, based on a user-specified standard weight programming procedure.
- Returns:
Loss, which is a time-averaged normalized mean square error (TNMSE), which the weight programming optimizer will try to outperform
- Return type:
float
- run_optimizer()[source]
Runs a series of steps that optimizes the weight programming strategy based on programming noise, read noise, conductance-dependent drift models, and drift compensation, so as to maintain weight fidelity as well as possible over time and help the network achieve iso-accuracy.
- Returns:
optimal_g_converter: optimal weight programming strategy in the form of an instantiated CustomPairConductanceConverter
success: boolean flag specifying whether or not the optimization was a success
- aihwkit.inference.converter.wpo.denormalize(x, d)[source]
De-normalizes a hypercube input parameter x to corresponding f_p values and gp_p and gm_p values.
- Parameters:
x (Tensor) – hypercube parameters representing a weight programming strategy
d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution
- Returns:
f_lst: conductance pair scaling factors for CustomPairConductanceConverter
g_lst: conductance programming spec for CustomPairConductanceConverter
- aihwkit.inference.converter.wpo.downsample_weight_distribution(weights, shape)[source]
Downsamples weight distribution via interpolation
- Params:
weights: original tensor of weights
shape: torch.Size object containing shape of desired downsampled matrix
- Returns:
downsampled N-dimensional weight distribution that is representative of overall network weight distribution. Note: this function will also correctly upsample weights when shape.numel() > weights.numel()
- Parameters:
weights (Tensor)
shape (Size)
- Return type:
Tensor
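As a rough illustration of resampling by interpolation, the following pure-Python sketch flattens a matrix, linearly interpolates it onto a new number of points, and reshapes the result. It is not the library's implementation: the helper names `resample_linear` and `resample_matrix`, the use of nested lists instead of tensors, and the choice of linear interpolation over the flattened values are all assumptions.

```python
def resample_linear(values, new_len):
    """Linearly interpolate a 1-D list onto new_len evenly spaced points."""
    if new_len < 1 or not values:
        raise ValueError("need at least one input value and one output point")
    if new_len == 1 or len(values) == 1:
        return [values[0]] * new_len
    step = (len(values) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * step
        left = min(int(pos), len(values) - 2)  # index of the left neighbour
        frac = pos - left
        out.append(values[left] * (1 - frac) + values[left + 1] * frac)
    return out


def resample_matrix(matrix, rows, cols):
    """Flatten a 2-D list, resample it to rows * cols values, and reshape."""
    flat = [value for row in matrix for value in row]
    resampled = resample_linear(flat, rows * cols)
    return [resampled[r * cols:(r + 1) * cols] for r in range(rows)]


down = resample_linear([0.0, 1.0, 2.0, 3.0, 4.0], 3)   # fewer points
up = resample_linear([0.0, 2.0], 3)                    # more points
small = resample_matrix([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]], 2, 2)
```

The same routine covers both directions: asking for more points than the input holds upsamples, mirroring the note above about shape.numel() > weights.numel().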
- aihwkit.inference.converter.wpo.loss_fxn(x, *args)[source]
Computes loss based on hypercube x values being probed by differential evolution.
- Parameters:
x (Tensor) – hypercube values corresponding to a programming strategy
args (bytes | bytearray) – any additional arguments necessary for optimization, must be pickled to work with multiple workers in scipy differential evolution algorithm
- Returns:
Loss of corresponding programming strategy defined by x
- Return type:
float
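The pickling requirement can be sketched with the standard library alone. The example below is a simplified stand-in, not the module's loss function: `toy_loss`, the dictionary keys, and the squared-error objective are hypothetical and only show how a loss in the (x, *args) form can rebuild its settings from pickled bytes.

```python
import pickle


def toy_loss(x, *args):
    """Loss callable in the (x, *args) form expected by the optimizer."""
    # The single extra argument is a pickled dict, so it can be shipped
    # unchanged to worker processes and rebuilt there.
    settings = pickle.loads(args[0])
    target = settings["target"]
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target)) / len(target)


# Serialize the settings once, up front (keys are hypothetical).
payload = pickle.dumps({"target": [0.2, 0.8], "loss_margin": 0.1})

loss_at_target = toy_loss([0.2, 0.8], payload)
loss_off_target = toy_loss([0.0, 1.0], payload)
```

Passing plain bytes keeps the extra argument trivially picklable, which matters when the loss is evaluated by multiple worker processes.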
- aihwkit.inference.converter.wpo.loss_rpu_config(d)[source]
Computes Time-averaged Normalized Mean Squared Error (TNMSE) between the target weight distribution and implemented weight distribution according to weight programming strategy and corresponding programming errors, read noise, drift, and drift compensation. This serves as a loss function to be minimized.
- Parameters:
d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution
- Returns:
Loss, which is a Time-averaged Normalized Mean Squared Error (TNMSE)
- Return type:
float
- aihwkit.inference.converter.wpo.loss_weights(model, t_steps, test_weights, max_abs_weight_unitless, loss_baseline, loss_margin, get_baseline=False)[source]
Computes Time-Averaged Normalized Mean Squared Error (TNMSE) for weight distribution
- Parameters:
model (AnalogLinear) – one AnalogLinear layer used for test evaluation
t_steps (List[float]) – time steps to optimize over
test_weights (Tensor) – test weights (2d) that we want to implement
max_abs_weight_unitless (float) – maximum weight value positive or negative
loss_baseline (float) – baseline loss (tnmse) for baseline weight programming strategy
loss_margin (float) – how much we are trying to beat the baseline loss by (e.g. 0.1 = 10%)
get_baseline (bool) – whether to return the true loss (TNMSE) or the normalized version, which drops below zero once the baseline loss has been beaten by the loss margin
- Returns:
tnmse: Time-Averaged Normalized Mean Squared Error (TNMSE), usually the normalized version, which drops below zero once the baseline loss has been beaten by the loss margin.
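The two flavors of the loss can be sketched in plain Python. The normalized form follows the expression given for the stopping criterion, (current loss - baseline loss) / baseline loss + loss margin. The choice to normalize each squared error by the energy of the ideal weights is an assumption of this sketch, as are the helper names and the example numbers; the library's exact normalization may differ.

```python
def nmse(ideal, effective):
    """Squared error between two weight lists, normalized by the ideal energy."""
    err = sum((e - w) ** 2 for w, e in zip(ideal, effective))
    ref = sum(w ** 2 for w in ideal)
    return err / ref


def tnmse(ideal, effective_per_t):
    """Average the NMSE over the effective weights seen at each time step."""
    return sum(nmse(ideal, eff) for eff in effective_per_t) / len(effective_per_t)


def normalized_loss(loss, loss_baseline, loss_margin):
    """Negative once loss beats the baseline by more than loss_margin."""
    return (loss - loss_baseline) / loss_baseline + loss_margin


ideal = [1.0, -2.0]
# Hypothetical effective weights read back at two time steps.
effective_per_t = [[1.0, -2.0], [2.0, -2.0]]
raw = tnmse(ideal, effective_per_t)          # (0/5 + 1/5) / 2
beaten = normalized_loss(0.08, 0.1, 0.1)     # 20% better than the baseline
not_yet = normalized_loss(0.095, 0.1, 0.1)   # only 5% better
```

With a margin of 0.1, a loss that is 20% below the baseline gives a negative value, while a 5% improvement still gives a positive one.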
- aihwkit.inference.converter.wpo.partition_parameters(x, d)[source]
Separates array of x parameters from differential evolution into parameters corresponding to f_p factors (if they exist) and x_g parameter, which will be used to find corresponding valid conductance combinations.
- Parameters:
x (Tensor) – hypercube parameters corresponding to a weight programming strategy
d (Dict) – dictionary with all WeightProgrammingOptimizer attributes that were previously serialized/pickled to be compatible with scipy differential evolution
- Returns:
f_lst: list of conductance pair scaling parameters f
k_w: unitless to microsiemens weight rescaling factor [uS/1]
x_g: hypercube parameters corresponding to remaining weight programming strategy parameters
- aihwkit.inference.converter.wpo.reformat_x_g(x_g, len_f_lst, symmetric)[source]
Get x values in same format as CustomPairConductanceConverter g_lst
- Parameters:
x_g (Tensor) – list of hypercube parameters
len_f_lst (int) – length of conductance pair scaling parameters f in f_lst
symmetric (bool) – whether or not weight programming optimization solution will be symmetric for positive and negative weights (reduces dimensionality of optimization problem)
- Returns:
x_g_lst: reformatted x_g to correspond to CustomPairConductanceConverter g_lst formatting
g_len: number of discretized weights specified in CustomPairConductanceConverter
- aihwkit.inference.converter.wpo.shuffle_weights(weights)[source]
Sample weights to test programming strategy
- Params:
weights: tensor of weights
- Returns:
shuffled tensor of weights with equivalent dimensions
- Parameters:
weights (Tensor)
- Return type:
Tensor
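A shape-preserving shuffle can be sketched without any tensor library. The helper `shuffle_matrix` below is hypothetical and works on nested lists; it only illustrates the idea of permuting all entries while keeping the dimensions.

```python
import random


def shuffle_matrix(matrix, seed=None):
    """Return a new matrix of the same shape with randomly permuted entries."""
    rows, cols = len(matrix), len(matrix[0])
    flat = [value for row in matrix for value in row]
    random.Random(seed).shuffle(flat)  # permute a copy; the input is untouched
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


weights = [[0.1, -0.4, 0.3], [0.0, 0.7, -0.2]]
shuffled = shuffle_matrix(weights, seed=0)
```

The shuffled matrix contains exactly the same values as the input, so the weight distribution is unchanged and only the positions differ.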
- aihwkit.inference.converter.wpo.span_of_remaining_pairs(span_of_each_pair, ind)[source]
Computes the span of the remaining conductance pairs. Informs interdependent constraints on conductance programming based on how previous conductance pair was programmed.
- Parameters:
span_of_each_pair (List) – range of each conductance pair including f factor
ind (int) – index of remaining conductance pairs
- Returns:
Remaining conductance range (i.e. the maximum positive/negative conductance value that could be programmed in the remaining conductance pairs)
- Return type:
Tensor
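One plausible reading of this computation is a sum over the spans from index ind onward. The sketch below follows that reading and is an assumption, not the library's code: `remaining_span`, the 25 uS device range, and the f factors are illustrative values only.

```python
def remaining_span(span_of_each_pair, ind):
    """Total conductance range still available from pair ind onward."""
    return sum(span_of_each_pair[ind:])


# Hypothetical example: two pairs with f = [1.0, 3.0] and a 25 uS device range.
g_range = 25.0
f_lst = [1.0, 3.0]
spans = [f * g_range for f in f_lst]  # each span includes its f factor

all_pairs = remaining_span(spans, 0)
last_pair = remaining_span(spans, 1)
none_left = remaining_span(spans, 2)
```

Under this reading, once the first pair has been fixed, the value still reachable is bounded by what the later pairs can contribute.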
- aihwkit.inference.converter.wpo.stop_criterion(intermediate_result)[source]
Terminates weight programming optimization once (current loss - baseline loss) / baseline loss + loss margin is less than zero, where the baseline loss is determined by g_converter_baseline and the current loss is determined by the current weight programming optimization strategy.
- Parameters:
intermediate_result (OptimizeResult) – a keyword parameter containing an OptimizeResult with attributes x and fun, the best solution found so far and the objective function, respectively. Note that the name of the parameter must be intermediate_result for the callback to be passed an OptimizeResult.
- Raises:
StopIteration – when intermediate_result.fun (loss function) becomes negative, which means optimization has adequately converged as defined by the loss_margin parameter
- Return type:
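The callback pattern described here can be sketched with the standard library. In the example below, types.SimpleNamespace stands in for scipy's OptimizeResult, and `stop_when_negative` and `run_with_callback` are hypothetical names; the toy driver only imitates how an optimizer would react to StopIteration.

```python
from types import SimpleNamespace


def stop_when_negative(intermediate_result):
    """Callback that halts optimization once the normalized loss is below zero."""
    if intermediate_result.fun < 0:
        raise StopIteration


def run_with_callback(results, callback):
    """Toy driver: feed results to callback and stop early on StopIteration."""
    for step, result in enumerate(results):
        try:
            callback(result)
        except StopIteration:
            return step  # index of the result that triggered the stop
    return None  # the callback never asked to stop


# Stand-ins for OptimizeResult objects with attributes x and fun.
history = [SimpleNamespace(x=[0.5], fun=f) for f in (0.4, 0.1, -0.02, -0.3)]
stopped_at = run_with_callback(history, stop_when_negative)
never_stopped = run_with_callback(history[:2], stop_when_negative)
```

Because the loss is normalized against the baseline and margin, a simple sign check is enough to decide when to stop.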