aihwkit.optim.analog_optimizer module

Analog-aware inference optimizer.

class aihwkit.optim.analog_optimizer.AnalogAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False, *, foreach=None, maximize=False, capturable=False, differentiable=False, fused=None)[source]

Bases: AnalogOptimizerMixin, Adam

Implements analog-aware Adam.

Parameters:
  • params (Iterable[Tensor] | Iterable[Dict[str, Any]]) –

  • lr (float | Tensor) –

  • betas (Tuple[float, float]) –

  • eps (float) –

  • weight_decay (float) –

  • amsgrad (bool) –

  • foreach (bool | None) –

  • maximize (bool) –

  • capturable (bool) –

  • differentiable (bool) –

  • fused (bool | None) –
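
A minimal usage sketch, assuming the same pattern as the other analog optimizers in this module; AnalogLinear and InferenceRPUConfig come from aihwkit.nn and aihwkit.simulator.configs, and the layer sizes and learning rate are placeholders:

>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.simulator.configs.configs import InferenceRPUConfig
>>> from aihwkit.optim.analog_optimizer import AnalogAdam
>>> model = AnalogLinear(3, 4, rpu_config=InferenceRPUConfig())
>>> optimizer = AnalogAdam(model.parameters(), lr=0.001)
>>> optimizer.regroup_param_groups(model)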

class aihwkit.optim.analog_optimizer.AnalogOptimizer(optimizer_cls, *_, **__)[source]

Bases: AnalogOptimizerMixin, Optimizer

Generic optimizer that wraps an existing Optimizer for analog inference.

This class wraps an existing Optimizer, customizing the optimization step to trigger the analog update needed for analog tiles. All other (digital) parameters are governed by the given torch optimizer. In the case of hardware-aware training (InferenceTile), the tile weight update is also governed by the given optimizer; otherwise, the internal analog update defined in the rpu_config is used.

The AnalogOptimizer constructor expects the wrapped optimizer class as the first parameter, followed by any arguments required by the wrapped optimizer.

Note

The instances returned are of a new type that is a subclass of:

  • the wrapped Optimizer (allowing access to all of its methods and attributes).

  • this AnalogOptimizer.

Example

The following block illustrates how to create an optimizer that wraps the standard SGD optimizer:

>>> from torch.optim import SGD
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.simulator.configs.configs import InferenceRPUConfig
>>> from aihwkit.optim import AnalogOptimizer
>>> model = AnalogLinear(3, 4, rpu_config=InferenceRPUConfig())
>>> optimizer = AnalogOptimizer(SGD, model.parameters(), lr=0.02)
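
Because the returned instance is a subclass of both the wrapped optimizer and AnalogOptimizer (see the Note above), it can be used wherever either type is expected. Continuing the example above (a minimal sketch):

>>> isinstance(optimizer, SGD)
True
>>> isinstance(optimizer, AnalogOptimizer)
True
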
Parameters:
  • optimizer_cls (Type) –

  • _ (Any) –

  • __ (Any) –

Return type:

AnalogOptimizer

SUBCLASSES: Dict[str, Type] = {}

Registry of the created subclasses.

class aihwkit.optim.analog_optimizer.AnalogOptimizerMixin[source]

Bases: object

Mixin for analog optimizers.

This class contains the methods needed for enabling analog optimization in an existing Optimizer. It is designed to be used as a mixin in conjunction with an AnalogOptimizer or a torch Optimizer.

regroup_param_groups(*_)[source]

Reorganize the parameter groups, isolating analog layers.

Update the param_groups of the optimizer, moving the parameters for each analog layer to a new single group.

Parameters:

_ (Any) –

Return type:

None

set_learning_rate(learning_rate=0.1)[source]

Update the learning rate to a new value.

Update the learning rate of the optimizer, propagating the changes to the analog tiles accordingly.

Parameters:

learning_rate (float) – learning rate for the optimizer.

Return type:

None
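
For example, the learning rate can be adjusted mid-training without rebuilding the optimizer (a minimal sketch; `optimizer` is an already constructed analog optimizer such as AnalogSGD):

>>> optimizer.set_learning_rate(0.05)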

step(closure=None, **kwargs)[source]

Perform a single analog-aware optimization step.

If a group containing analog parameters is detected, the optimization step calls the related RPU controller. For regular parameter groups, the optimization step behaves the same as the wrapped digital optimizer (for example, torch.optim.SGD in the case of AnalogSGD).

Parameters:
  • closure (callable, optional) – A closure that reevaluates the model and returns the loss.

  • kwargs (Any) – additional arguments if any

Returns:

The loss, if closure has been passed as a parameter.

Return type:

float | None
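
A typical training step therefore looks the same as with a plain torch optimizer (a minimal sketch; `model`, `criterion`, `inputs` and `targets` are assumed to be defined elsewhere):

>>> optimizer.zero_grad()
>>> loss = criterion(model(inputs), targets)
>>> loss.backward()
>>> optimizer.step()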

class aihwkit.optim.analog_optimizer.AnalogSGD(params, lr=0.001, momentum=0, dampening=0, weight_decay=0, nesterov=False, *, maximize=False, foreach=None, differentiable=False)[source]

Bases: AnalogOptimizerMixin, SGD

Implements analog-aware stochastic gradient descent.

Parameters:
  • maximize (bool) –

  • foreach (bool | None) –

  • differentiable (bool) –
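
A minimal usage sketch, following the common aihwkit pattern of calling regroup_param_groups after construction (AnalogLinear and InferenceRPUConfig come from aihwkit.nn and aihwkit.simulator.configs; sizes and learning rate are placeholders):

>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.simulator.configs.configs import InferenceRPUConfig
>>> from aihwkit.optim import AnalogSGD
>>> model = AnalogLinear(3, 4, rpu_config=InferenceRPUConfig())
>>> optimizer = AnalogSGD(model.parameters(), lr=0.1)
>>> optimizer.regroup_param_groups(model)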