# Guides

## TensorFlow Privacy

TensorFlow Privacy is a Python library developed by Google that enables training of machine learning models with privacy guarantees, in particular through the implementation of differential privacy. It allows developers to apply differential privacy techniques to protect the training data's privacy without significantly compromising the model's accuracy. This is particularly useful in scenarios where sensitive data is used, ensuring that the model does not inadvertently reveal any specific details about the individuals represented in the training data.

We have developed a custom library, `op_tensorflow`, designed to integrate `tensorflow-privacy` with the private data objects available in `op_pandas`. Users familiar with TensorFlow and Keras will find the APIs of our library intuitive and easy to use.

TensorFlow Privacy is currently in its beta phase, and only Sequential models can be created in the Antigranular environment. However, you can load any locally created model using the approach described under "Importing locally trained models".

## Importing the library

To use `op_tensorflow`, import the library as shown in the following code block:

```python
%%ag
import op_tensorflow
```

## Creating DP Models

`op_tensorflow` makes the training of a differentially private model as seamless as possible, with the following features:

- It leverages `PrivateKerasModel`, a wrapper class around `tensorflow.keras` that converts any standard Keras model to adhere to differentially private training.
- Training is done using a differentially private version of stochastic gradient descent (DP-SGD).
- The resulting differentially private model works seamlessly with all the optimizers and loss functions available in `tensorflow.keras`.

```python
from op_tensorflow import PrivateKerasModel
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Normal Keras model
seqM = Sequential(
    [
        Dense(16, activation="relu", input_shape=(2,)),
        Dense(8, activation="relu"),
        Dense(1, activation="sigmoid"),
    ]
)

# Create the DP Keras model
dp_model = PrivateKerasModel(model=seqM, l2_norm_clip=1, noise_multiplier=1.2)

# Use a standard (non-DP) optimizer directly from Keras.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# Use a standard (non-DP) loss function directly from Keras.
loss = tf.keras.losses.MeanSquaredError()

# PrivateKerasModel uses an API similar to standard Keras.
dp_model.compile(
    optimizer=optimizer,
    loss=loss,
    metrics=["accuracy"],
)
```
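To make the roles of `l2_norm_clip` and `noise_multiplier` more concrete, here is a minimal NumPy sketch of a single DP-SGD update as the algorithm is usually described: each example's gradient is clipped, the clipped gradients are summed, Gaussian noise is added, and the result is averaged. It illustrates the general technique only. It is not the code that `op_tensorflow` or `tensorflow-privacy` runs, and the function name `dp_sgd_step` and its arguments are hypothetical.

```python
import numpy as np


def dp_sgd_step(weights, per_example_grads, l2_norm_clip,
                noise_multiplier, learning_rate, rng):
    """One illustrative DP-SGD update (hypothetical helper, not a library API)."""
    # Scale each per-example gradient so its L2 norm is at most l2_norm_clip.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, l2_norm_clip / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Add Gaussian noise with standard deviation noise_multiplier * l2_norm_clip
    # to the sum of the clipped gradients, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * l2_norm_clip, size=weights.shape)
    noisy_mean_grad = (clipped.sum(axis=0) + noise) / len(per_example_grads)

    # Plain gradient descent step on the noisy, clipped gradient.
    return weights - learning_rate * noisy_mean_grad
```

A larger `noise_multiplier` adds more noise per step (stronger privacy, noisier training), while `l2_norm_clip` bounds how much any single example can influence the update.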

## Training DP Models

We use `PrivateDataLoader`, a data generator function that enables private objects from `op_pandas` to be used with `PrivateKerasModel`.

- The data loader internally uses Poisson sampling, which significantly reduces the privacy cost.
- The sampling rate used in `PrivateDataLoader` depends on the input `batch_size` and the `sample_size` of the features and labels.

```python
%%ag
from op_tensorflow import PrivateDataLoader

# pdf is a PrivateDataFrame object.
X = pdf[["feature1", "feature2"]]
y = pdf["target"]

data_loader = PrivateDataLoader(feature_df=X, label_df=y, batch_size=4)
```

Once the `data_loader` is created, we can pass it directly to the DP model we created earlier using the `dp_model.fit` method.

```python
%%ag
dp_model.fit(x=data_loader, epochs=20, target_delta=1e-5)

"""
model.fit ->
    noise_multiplier = 1.2 (from PrivateKerasModel)
    batch_size = 4 (from PrivateDataLoader)
    epochs = 20 (from fit function)
    target_delta = 1e-5 (from fit function)
"""
```
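The Poisson sampling mentioned for `PrivateDataLoader` can be pictured with a short NumPy sketch. It assumes the sampling rate is `batch_size / sample_size`, which is a common convention but is not confirmed by this guide. The helper `poisson_sample_indices` is hypothetical and is not part of `op_tensorflow`.

```python
import numpy as np


def poisson_sample_indices(sample_size, batch_size, rng):
    """Pick each row independently with probability batch_size / sample_size.

    Hypothetical helper for illustration; the rate formula is an assumption.
    """
    sampling_rate = batch_size / sample_size
    # Each row gets its own independent coin flip, so the batch size varies
    # from draw to draw and equals batch_size only on average.
    mask = rng.random(sample_size) < sampling_rate
    return np.flatnonzero(mask)
```

Because every record is included independently, the actual number of rows in a batch fluctuates around `batch_size`. This independence is what privacy accountants for subsampled mechanisms typically rely on.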

## Estimating privacy budgets

Before training, it is advisable to estimate the `epsilon` that the training parameters of your model would require. This avoids the unpleasant surprise of accidentally exhausting your budget.

```python
%%ag
from op_tensorflow import get_privacy_budget

# Same training parameters as the fit call above.
get_privacy_budget(
    sample_size=100000,
    batch_size=4,
    num_epochs=20,
    noise_multiplier=1.2,
    target_delta=1e-5,
)
```

`OUTPUT`

```text
=> EPSILON_REQUIRED = 0.2778802396005823 using TARGET_DELTA = 1e-05
Training parameters used :-
    SAMPLE_SIZE = 100000
    BATCH_SIZE = 4
    NUM_EPOCHS = 20
    NOISE_MULTIPLIER = 1.2
```
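For intuition about how these parameters drive `epsilon`, the sketch below implements a textbook Rényi differential privacy (RDP) accountant for the Poisson-subsampled Gaussian mechanism at integer orders, using only the standard library. It is an assumption that this resembles what `get_privacy_budget` does internally. The library may use a different accountant or a tighter conversion, so the numbers need not match its output. The function names, the sampling rate `batch_size / sample_size`, and the step count are all illustrative choices.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def rdp_sampled_gaussian(q, sigma, alpha):
    """RDP at integer order alpha >= 2 for one subsampled Gaussian step (0 < q < 1)."""
    log_terms = [
        log_comb(alpha, k) + k * math.log(q) + (alpha - k) * math.log1p(-q)
        + (k * k - k) / (2 * sigma ** 2)
        for k in range(alpha + 1)
    ]
    # Log-sum-exp keeps the computation stable for large orders.
    m = max(log_terms)
    log_a = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return log_a / (alpha - 1)


def estimate_epsilon(sample_size, batch_size, num_epochs,
                     noise_multiplier, target_delta):
    """Illustrative epsilon estimate; not the op_tensorflow implementation."""
    q = batch_size / sample_size                              # assumed sampling rate
    steps = num_epochs * math.ceil(sample_size / batch_size)  # assumed step count
    # RDP composes additively over steps; convert to (epsilon, delta)
    # at each order and keep the smallest epsilon.
    return min(
        steps * rdp_sampled_gaussian(q, noise_multiplier, alpha)
        + math.log(1 / target_delta) / (alpha - 1)
        for alpha in range(2, 257)
    )
```

Under this sketch, training for more epochs or with a larger sampling rate increases the estimated `epsilon`, while a larger `noise_multiplier` decreases it.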

## Importing locally trained models

You can save your local model's configuration as a dict or string and send it from your local environment to AG using the `private_import` method.

```python
model_info = local_model.to_json()
weights = local_model.get_weights()

session.private_import(data=model_info, name='model_info')
session.private_import(data=weights, name='weights')
```

Now you can use this privately imported information to create a `PrivateKerasModel` in the AG environment.

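```python
%%ag
import tensorflow
from op_tensorflow import PrivateKerasModel

# Rebuild the architecture from its JSON config, then restore the weights.
model = tensorflow.keras.models.model_from_json(model_info)
model.set_weights(weights)

dp_model = PrivateKerasModel(model=model, l2_norm_clip=2, noise_multiplier=1.5)
```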

## Predicting results from model

You can think of a model as a complex mathematical transformation, a black box that produces an output for a given input.

- The DP-trained weights are used to predict the output for a particular input using `model.predict()`.
- Since the model is a transformation function, private inputs generate private outputs.

```python
# test_pdf is a slice of the feature_df (obtained using op_pandas.test_train_split).
# label_columns sets the column names in result_pdf.
# If not given, the labels are simply numbered (0, 1, 2, ...).
result_pdf = dp_model.predict(x=test_pdf, label_columns=["result"])
```

## Resources

For detailed information about TensorFlow Privacy's function signatures and methods, refer to the official TensorFlow Privacy documentation.