Opacus

Opacus is a library that adds differential privacy to the training of PyTorch models. It lets you train existing models with privacy guarantees without working through the technical details of differential privacy yourself. Opacus modifies the training process so that the trained model's outputs do not disclose individual data points from the training set.

Adapting a PyTorch model with Opacus takes three steps:

  1. Model and DataLoader Preparation: Set up your standard PyTorch model and DataLoader. Make sure the DataLoader produces batches in a format Opacus can work with, which usually means organising them as tensors.
  2. Attach the Privacy Engine: The Privacy Engine is the core of the Opacus integration. Attaching it changes how the model and optimiser work, so that privacy-preserving mechanisms are applied during training.
  3. Training: Train your model as usual. With the Privacy Engine attached, Opacus adjusts how gradients are computed and applied: each sample's gradient is clipped and noise is added before the update, so that no single data point has an outsized influence on the trained model.
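
The gradient adjustment in step 3 can be illustrated with a small sketch of one DP-SGD gradient computation, the algorithm Opacus implements. This is a simplified pure-Python illustration, not Opacus code: the function names and the toy gradients are assumptions made for the example, and real training operates on tensors.

```python
import math
import random


def clip_gradient(grad, max_grad_norm):
    """Scale one sample's gradient so its L2 norm is at most max_grad_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-6))
    return [g * scale for g in grad]


def dp_sgd_gradient(per_sample_grads, max_grad_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum them, add Gaussian noise, average."""
    clipped = [clip_gradient(g, max_grad_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # The noise scale is tied to the clipping threshold.
    std = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    return [n / len(per_sample_grads) for n in noisy]


# Toy per-sample gradients (hypothetical values) for a 2-parameter model.
per_sample_grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
rng = random.Random(0)

# With no noise, only the clipping is visible.
noiseless = dp_sgd_gradient(per_sample_grads, 1.0, 0.0, rng)
# With noise, the averaged gradient is perturbed as well.
private = dp_sgd_gradient(per_sample_grads, 1.0, 1.0, rng)
print(noiseless)
print(private)
```

Clipping bounds how much any one sample can move the model, and the noise hides whatever influence remains. The max_grad_norm argument used later in this guide is the clipping threshold.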

Now, let us see how to use Opacus with PrivateDataFrames within AG.

Note: Opacus is currently in its beta phase, and only sequential models are supported at this time.

Example usages

The following examples show how to use Opacus in AG, from building a data loader to saving and loading a trained model.

Example usage of PrivateDPDataLoader

In this example, we will use PrivateDPDataLoader to obtain a data loader from a PrivateDataFrame.

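%%ag
import torch
from op_opacus import PrivateDPDataLoader

data = load_dataset("iris")  # load the iris dataset
train, test = train_test_split(data)  # train_test_split comes from op_pandas

train_x = train[["PetalLengthCm", "PetalWidthCm", "SepalLengthCm", "SepalWidthCm"]]
train_y = train["Species"]

# train_y_encoded is not defined in this guide. It is assumed to hold the
# Species labels from train_y encoded as integer class indices, which is
# what dtype torch.long requires.

# Build a data loader over the features and the encoded labels.
data_loader = PrivateDPDataLoader.from_private_dataframe(
    [train_x, train_y_encoded], dtypes=[torch.float, torch.long], batch_size=64
)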
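
The loader above receives train_y_encoded with dtype torch.long, so the string labels in Species have to be mapped to integer class indices first. The guide does not show that step. The sketch below illustrates the idea on a plain Python list with hypothetical label values; it is not the AG or op_pandas API for encoding a private column.

```python
# Hypothetical label values standing in for the Species column.
species = ["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"]

# Give each distinct label a stable integer index (sorted for determinism).
classes = sorted(set(species))
class_to_index = {name: index for index, name in enumerate(classes)}

# Encode every label as its class index, the form an integer dtype expects.
encoded = [class_to_index[name] for name in species]
print(class_to_index)
print(encoded)
```

Sorting the class names keeps the mapping reproducible between runs, which matters when the same encoding has to be applied to the test split.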

Wrapping an optimizer in the Privacy Engine

In this example, we will use PrivacyEngine's make_private_with_epsilon to wrap a PyTorch optimizer (optim.SGD) and obtain its private counterpart.

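%%ag
from torch import optim, nn
from op_opacus import PrivacyEngine  # assumed import path, same package as PrivateDPDataLoader

# A small sequential classifier: 4 iris features in, 3 class probabilities out.
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 3),
    nn.Softmax(dim=1),  # normalise over the class dimension
)

optimizer = optim.SGD(model.parameters(), lr=0.01)
privacy_engine = PrivacyEngine()
privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    target_epsilon=3,    # privacy budget to reach by the end of training
    target_delta=1e-5,   # delta of the (epsilon, delta) guarantee
    epochs=10,           # number of epochs the budget is spread over
    max_grad_norm=1,     # per-sample gradient clipping threshold
)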
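
For reference, target_epsilon and target_delta refer to the standard definition of (ε, δ)-differential privacy. A randomised mechanism M (here, the training procedure) satisfies it if, for every pair of datasets D and D' that differ in a single record and every set of outcomes S:

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
```

Smaller values of ε and δ give a stronger guarantee. The call above asks for training calibrated to ε = 3 at δ = 10^-5 over 10 epochs.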

Training a model

In this example, we will train a model by defining a custom train_callable function.

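%%ag
# Assumptions: the guide does not show where TrainModel is imported from
# or which loss is used. Both lines below are plausible choices only.
from op_opacus import TrainModel
loss_function = nn.CrossEntropyLoss()

def train_callable(model, optimizer, data, loss_function):
    """Run one optimisation step on a single batch."""
    inputs = data[0]
    labels = data[1]
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_function(outputs, labels)
    loss.backward()
    optimizer.step()

train_model = TrainModel(privacy_engine, loss_function)
train_model.train(train_callable, verbose=2)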

Saving and Loading Models

To save a model, export the state_dict of the differentially private module:

%%ag
export(privacy_engine.model.state_dict(), "model_state_dict")

This exports the state dict to the local Jupyter environment as model_state_dict, where it can be saved to a file.

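To load the model again, first convert the state dict into a serialisable form so that it can be sent to the remote Jupyter server. The following code, run locally, turns each tensor into a nested list and imports the result into the session as state_dict.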

from collections import OrderedDict

state_dict_serializable = OrderedDict((key, value.numpy().tolist()) for key, value in model_state_dict.items())
session.private_import(state_dict_serializable, 'state_dict')

On the remote side, convert the lists back into tensors and load them into the model:

%%ag
import numpy as np
import torch

# Convert each nested list back into a tensor, then load the weights.
for k, v in state_dict.items():
    state_dict[k] = torch.from_numpy(np.array(v))
privacy_engine.model.load_state_dict(state_dict)
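
The conversion to lists and back does not lose precision for float32 weights. The sketch below checks this with standard-library stand-ins only: float32 arrays play the role of tensors, JSON plays the role of the transport, and the parameter names are hypothetical. It says nothing about how session.private_import actually serialises data.

```python
import json
from array import array
from collections import OrderedDict

# Stand-in for a state dict: float32 buffers instead of torch tensors.
state = OrderedDict([
    ("0.weight", array("f", [0.1, -0.25, 3.5])),
    ("0.bias", array("f", [0.0, 0.001])),
])

# Local side: convert every buffer to a plain list so it can be serialised.
serializable = OrderedDict((key, value.tolist()) for key, value in state.items())
payload = json.dumps(serializable)  # JSON is only a stand-in transport

# Remote side: parse the payload and rebuild float32 buffers from the lists.
received = json.loads(payload, object_pairs_hook=OrderedDict)
restored = OrderedDict((key, array("f", values)) for key, values in received.items())

# Keys, order and float32 values all survive the round trip.
print(restored == state)
```

Each float32 value widens exactly to a Python float, so converting it back to float32 recovers the original bits.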

Resources

For detailed information about Opacus's function signatures and methods, refer to the official Opacus documentation.