aihwkit.optim.analog_optimizer module
Analog-aware inference optimizer.
- class aihwkit.optim.analog_optimizer.AnalogAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False, *, foreach=None, maximize=False, capturable=False, differentiable=False, fused=None)[source]
Bases: AnalogOptimizerMixin, Adam
Implements analog-aware Adam.
- Parameters:
params (Iterable[Tensor] | Iterable[Dict[str, Any]]) – iterable of parameters to optimize or dicts defining parameter groups.
lr (float | Tensor) – learning rate.
betas (Tuple[float, float]) – coefficients used for computing running averages of the gradient and its square.
eps (float) – term added to the denominator to improve numerical stability.
weight_decay (float) – weight decay (L2 penalty).
amsgrad (bool) – whether to use the AMSGrad variant of the algorithm.
foreach (bool | None) – whether the foreach implementation is used.
maximize (bool) – maximize the objective with respect to the params, instead of minimizing.
capturable (bool) – whether this instance is safe to capture in a CUDA graph.
differentiable (bool) – whether autograd should occur through the optimizer step.
fused (bool | None) – whether the fused implementation is used.
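Example
A minimal sketch of driving an analog layer with AnalogAdam, as a drop-in replacement for torch.optim.Adam (the layer sizes, data, and hyperparameters below are illustrative):
>>> import torch
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.simulator.configs.configs import InferenceRPUConfig
>>> from aihwkit.optim.analog_optimizer import AnalogAdam
>>> model = AnalogLinear(4, 2, rpu_config=InferenceRPUConfig())
>>> optimizer = AnalogAdam(model.parameters(), lr=0.001)
>>> optimizer.regroup_param_groups(model)
>>> loss = torch.nn.functional.mse_loss(model(torch.randn(8, 4)), torch.randn(8, 2))
>>> loss.backward()
>>> optimizer.step()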
- class aihwkit.optim.analog_optimizer.AnalogOptimizer(optimizer_cls, *_, **__)[source]
Bases: AnalogOptimizerMixin, Optimizer
Generic optimizer that wraps an existing Optimizer for analog inference.
This class wraps an existing Optimizer, customizing the optimization step to trigger the analog update needed for analog tiles. All other (digital) parameters are governed by the given torch optimizer. In the case of hardware-aware training (InferenceTile), the tile weight update is also governed by the given optimizer; otherwise it uses the internal analog update as defined in the rpu_config.
The AnalogOptimizer constructor expects the wrapped optimizer class as the first parameter, followed by any arguments required by the wrapped optimizer.
Note
The instances returned are of a new type that is a subclass of:
- the wrapped Optimizer (allowing access to all its methods and attributes).
- this AnalogOptimizer.
Example
The following block illustrates how to create an optimizer that wraps standard SGD:
>>> from torch.optim import SGD
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.simulator.configs.configs import InferenceRPUConfig
>>> from aihwkit.optim import AnalogOptimizer
>>> model = AnalogLinear(3, 4, rpu_config=InferenceRPUConfig())
>>> optimizer = AnalogOptimizer(SGD, model.parameters(), lr=0.02)
- Parameters:
optimizer_cls (Type) – class of the existing Optimizer to wrap.
_ (Any) – positional arguments passed on to the wrapped optimizer.
__ (Any) – keyword arguments passed on to the wrapped optimizer.
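Because the wrapped class is simply the first constructor argument, any other torch optimizer can be substituted the same way; a sketch wrapping Adam (hyperparameters illustrative):
>>> from torch.optim import Adam
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.optim import AnalogOptimizer
>>> model = AnalogLinear(3, 4)
>>> optimizer = AnalogOptimizer(Adam, model.parameters(), lr=0.001)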
- class aihwkit.optim.analog_optimizer.AnalogOptimizerMixin[source]
Bases: object
Mixin for analog optimizers.
This class contains the methods needed for enabling analog in an existing Optimizer. It is designed to be used as a mixin in conjunction with an AnalogOptimizer or a torch Optimizer.
- regroup_param_groups(*_)[source]
Reorganize the parameter groups, isolating analog layers.
Update the param_groups of the optimizer, moving the parameters for each analog layer to a new single group.
- Parameters:
_ (Any) – ignored positional arguments.
- Return type:
None
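A short sketch of the typical call site, directly after constructing the optimizer (the argument is accepted for compatibility and ignored, per the *_ signature; sizes are illustrative):
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.optim import AnalogSGD
>>> model = AnalogLinear(3, 4)
>>> optimizer = AnalogSGD(model.parameters(), lr=0.1)
>>> optimizer.regroup_param_groups(model)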
- set_learning_rate(learning_rate=0.1)[source]
Update the learning rate to a new value.
Update the learning rate of the optimizer, propagating the changes to the analog tiles accordingly.
- Parameters:
learning_rate (float) – learning rate for the optimizer.
- Return type:
None
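For instance, to decay the rate during training (value illustrative, reusing the AnalogSGD setup from the previous example):
>>> optimizer.set_learning_rate(0.01)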
- step(closure=None, **kwargs)[source]
Perform an analog-aware single optimization step.
If a group containing analog parameters is detected, the optimization step calls the related RPU controller. For regular parameter groups, the optimization step has the same behaviour as torch.optim.SGD.
- Parameters:
closure (callable, optional) – A closure that reevaluates the model and returns the loss.
kwargs (Any) – additional keyword arguments, if any.
- Returns:
The loss, if closure has been passed as a parameter.
- Return type:
float | None
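A sketch of the closure form, shown here on AnalogSGD, which inherits this method (the model, data, and sizes are illustrative):
>>> import torch
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.optim import AnalogSGD
>>> model = AnalogLinear(3, 1)
>>> optimizer = AnalogSGD(model.parameters(), lr=0.1)
>>> optimizer.regroup_param_groups(model)
>>> inputs, targets = torch.randn(8, 3), torch.randn(8, 1)
>>> def closure():
...     optimizer.zero_grad()
...     loss = torch.nn.functional.mse_loss(model(inputs), targets)
...     loss.backward()
...     return loss
>>> loss = optimizer.step(closure)  # the loss is returned because a closure was passed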
- class aihwkit.optim.analog_optimizer.AnalogSGD(params, lr=0.001, momentum=0, dampening=0, weight_decay=0, nesterov=False, *, maximize=False, foreach=None, differentiable=False)[source]
Bases: AnalogOptimizerMixin, SGD
Implements analog-aware stochastic gradient descent.
- Parameters:
maximize (bool) – maximize the objective with respect to the params, instead of minimizing.
foreach (bool | None) – whether the foreach implementation is used.
differentiable (bool) – whether autograd should occur through the optimizer step.
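Example
A minimal training-loop sketch with AnalogSGD (data, layer sizes, iteration count, and learning rate are illustrative):
>>> import torch
>>> from aihwkit.nn import AnalogLinear
>>> from aihwkit.optim import AnalogSGD
>>> model = AnalogLinear(4, 2)
>>> optimizer = AnalogSGD(model.parameters(), lr=0.1)
>>> optimizer.regroup_param_groups(model)
>>> x, y = torch.randn(16, 4), torch.randn(16, 2)
>>> for _ in range(5):
...     optimizer.zero_grad()
...     loss = torch.nn.functional.mse_loss(model(x), y)
...     loss.backward()
...     optimizer.step()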