API Reference

TensorFlow Privacy

TensorFlow Privacy is a Python library developed by Google that enables training of machine learning models with privacy guarantees, in particular through the implementation of differential privacy. It allows developers to apply differential privacy techniques to protect the training data's privacy without significantly compromising the model's accuracy. This is particularly useful in scenarios where sensitive data is used, ensuring that the model does not inadvertently reveal any specific details about the individuals represented in the training data.

We have developed a custom library, op_tensorflow, designed to integrate tensorflow-privacy with the private data objects available in op_pandas. Users familiar with TensorFlow and Keras will find the APIs of our library intuitive and easy to use.

note

The sections below contain the in-depth API reference. To understand the usage flow of op_tensorflow, check out the following guide.

PrivateDataLoader

PrivateDataLoader is a data generator that lets private objects from op_pandas be used seamlessly with PrivateKerasModel.

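  • The data loader internally uses Poisson sampling. Because each training step only sees a random subsample of the records, the privacy cost is significantly reduced (privacy amplification by subsampling).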
  • The sampling rate used in PrivateDataLoader depends on the input batch_size and sample_size of the features and labels.
from op_tensorflow import PrivateDataLoader
class PrivateDataLoader(
    feature_df: PrivateDataFrame | PrivateSeries,
    label_df: PrivateDataFrame | PrivateSeries,
    batch_size: int
)
  • feature_df : Private data object containing the feature data.
  • label_df : Private data object containing the label data.
  • batch_size : Batch size used when the data loader is used for training.
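
As a rough illustration of the Poisson sampling mentioned above, the sketch below includes each record independently with a fixed probability, so batch sizes vary around batch_size instead of being fixed. The rate batch_size / sample_size and the helper name poisson_sample_indices are assumptions made for illustration; this page does not state the exact formula PrivateDataLoader uses.

```python
import random


def poisson_sample_indices(sample_size, batch_size, rng):
    """Pick each record independently with probability batch_size / sample_size."""
    # Assumed sampling rate; not confirmed to be what op_tensorflow uses.
    sampling_rate = batch_size / sample_size
    return [i for i in range(sample_size) if rng.random() < sampling_rate]


rng = random.Random(0)

# Draw 200 batches from 1000 records with a nominal batch size of 50.
batch_sizes = [len(poisson_sample_indices(1000, 50, rng)) for _ in range(200)]
mean_batch_size = sum(batch_sizes) / len(batch_sizes)

print("smallest batch:", min(batch_sizes))
print("largest batch:", max(batch_sizes))
print("mean batch size:", round(mean_batch_size, 1))
```

The batch size is only met on average, which is the expected behaviour of Poisson sampling and the property that privacy accounting for DP-SGD usually relies on.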

PrivateKerasModel

PrivateKerasModel is a wrapper class around tensorflow.keras models. It converts any standard Keras model into one that integrates with op_pandas and supports differentially private training.

from op_tensorflow import PrivateKerasModel
class PrivateKerasModel(
    model: keras.Model | keras.Sequential,
    l2_norm_clip: float,
    noise_multiplier: float
)
  • model : Any standard Keras model.
  • l2_norm_clip : Clipping threshold for the L2 norm of the gradients during model training.
  • noise_multiplier : Controls how much noise is added to the gradients during training so that the trained model does not compromise the privacy of the training data. In TensorFlow Privacy, the standard deviation of the Gaussian noise is noise_multiplier times l2_norm_clip, so larger values give stronger privacy at some cost in accuracy.
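
To make the roles of l2_norm_clip and noise_multiplier concrete, here is a minimal pure-Python sketch of a single DP-SGD gradient aggregation step: clip each per-example gradient, sum, add Gaussian noise, and average. The helper names clip_gradient and noisy_mean_gradient are hypothetical, and this is not the op_tensorflow or TensorFlow Privacy implementation, only an illustration of the idea under those assumptions.

```python
import math
import random


def clip_gradient(gradient, l2_norm_clip):
    """Scale a gradient down so that its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
    return [g * scale for g in gradient]


def noisy_mean_gradient(per_example_gradients, l2_norm_clip, noise_multiplier, rng):
    """Clip every per-example gradient, sum them, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, l2_norm_clip) for g in per_example_gradients]
    summed = [sum(column) for column in zip(*clipped)]
    # The noise standard deviation scales with both parameters.
    stddev = l2_norm_clip * noise_multiplier
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [value / len(per_example_gradients) for value in noisy]


# Three toy per-example gradients with L2 norms 5.0, 0.5 and 10.0.
gradients = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
rng = random.Random(0)

update = noisy_mean_gradient(gradients, l2_norm_clip=1.0, noise_multiplier=1.1, rng=rng)
# With noise_multiplier=0.0 only the clipping step has an effect.
noise_free = noisy_mean_gradient(gradients, l2_norm_clip=1.0, noise_multiplier=0.0, rng=rng)

print("noisy update:", update)
print("clipped mean without noise:", noise_free)
```

Clipping bounds how much any single example can move the update, and the noise is calibrated to that bound, which is why the two parameters are usually tuned together.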

fit

Trains the model for a fixed number of epochs (dataset iterations) in a differentially private manner using DP-SGD.

PrivateKerasModel.fit(
    x: PrivateDataLoader | PrivateDataFrame | pandas.DataFrame,
    y: PrivateDataFrame | pandas.DataFrame = None,
    batch_size=32,
    epochs=1,
    target_delta=1e-5,
    *args,
    **kwargs
)
  • data : Data can be supplied in one of two ways:
    • x = PrivateDataLoader and y = None
    • x and y are op_pandas/pandas data objects.
  • batch_size : Number of samples per batch within an epoch.
  • epochs : Number of passes over the training data.
  • target_delta : The target delta to achieve for the entire training process.
  • *args | **kwargs : Refer here for the complete list of arguments.

evaluate

Returns the loss value & metrics values for the model in test mode.

PrivateKerasModel.evaluate(
    x: PrivateDataLoader | PrivateDataFrame | pandas.DataFrame,
    y: PrivateDataFrame | pandas.DataFrame = None,
    *args,
    **kwargs
)
  • data : Data can be supplied in one of two ways:
    • x = PrivateDataLoader and y = None
    • x and y are op_pandas/pandas data objects.
  • *args | **kwargs : Refer here for the complete list of arguments.

predict

Generates output predictions for the input samples.

PrivateKerasModel.predict(
    x: PrivateDataLoader | PrivateDataFrame | pandas.DataFrame,
    y: PrivateDataFrame | pandas.DataFrame = None,
    *args,
    **kwargs
)
  • data : Data can be supplied in one of two ways:
    • x = PrivateDataLoader and y = None
    • x and y are op_pandas/pandas data objects.
  • *args | **kwargs : Refer here for the complete list of arguments.

Blocked methods

The methods below are blocked. Most are restricted because they could cause side effects through file or network access; the others are restricted because they are deprecated in the main repository.

  • Deprecated
    • "train_on_batch"
    • "test_on_batch"
    • "predict_on_batch"
    • "fit_generator"
    • "evaluate_generator"
    • "predict_generator"
    • "train_step"
    • "test_step"
    • "predict_step"
  • File IO
    • "load_weights"
    • "save_weights"
    • "save"
    • "load_own_variables"
    • "save_own_variables"
    • "export"
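
One way a wrapper could enforce such a block list is to intercept attribute access and refuse the restricted names while forwarding everything else to the wrapped model. The sketch below is a hypothetical illustration only: GuardedModel and DummyModel are invented names, only part of the list is included, and nothing here is confirmed to match how PrivateKerasModel is implemented.

```python
class GuardedModel:
    """Hypothetical wrapper that forwards attribute access except for blocked names."""

    # A subset of the blocked methods listed above, for illustration.
    _BLOCKED = frozenset({"save", "save_weights", "load_weights", "export", "train_on_batch"})

    def __init__(self, model):
        self._model = model

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, so it sees
        # every name that is not defined on the wrapper itself.
        if name in self._BLOCKED:
            raise AttributeError(f"'{name}' is blocked on this wrapper")
        return getattr(self._model, name)


class DummyModel:
    """Stand-in for a Keras model so that the sketch runs without TensorFlow."""

    def summary(self):
        return "summary"

    def save(self, path):
        return f"saved to {path}"


guarded = GuardedModel(DummyModel())
summary_text = guarded.summary()  # forwarded to the wrapped model

try:
    guarded.save("model.keras")
    save_blocked = False
except AttributeError:
    save_blocked = True

print(summary_text, save_blocked)
```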

get_privacy_budget

Utility function to estimate the privacy budget (epsilon) that training a model will consume, given the training parameters and target_delta.

from op_tensorflow import get_privacy_budget
get_privacy_budget(
    sample_size: int,
    batch_size: int,
    num_epochs: int,
    noise_multiplier: float,
    target_delta: float
)
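
The parameters above map onto the two quantities that DP-SGD privacy accounting typically works with: the per-step sampling rate and the total number of training steps. The sketch below derives both. The helper name accounting_inputs is hypothetical, and the sketch does not compute epsilon itself; it only shows how the inputs are commonly combined before being handed to a privacy accountant, and it is not confirmed to mirror the internals of get_privacy_budget.

```python
import math


def accounting_inputs(sample_size, batch_size, num_epochs):
    """Derive the sampling rate and step count used by typical DP-SGD accountants."""
    sampling_rate = batch_size / sample_size
    steps = math.ceil(num_epochs * sample_size / batch_size)
    return sampling_rate, steps


sampling_rate, steps = accounting_inputs(sample_size=60000, batch_size=256, num_epochs=15)

print("sampling rate:", sampling_rate)
print("training steps:", steps)
```

For a fixed noise_multiplier and target_delta, a higher sampling rate or more steps generally leads to a larger epsilon.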