Scoring Metrics
To host a competition or set a goal while working with a dataset, it can be helpful to have a clear scoring metric as a north star. Multiple scoring mechanisms can be combined, and in most cases, we aim to use some proxy for accuracy with a loss based on the privacy budget spent to achieve it. Below are the scoring metrics currently supported on the platform:
Metrics of Accuracy
All the metrics of accuracy involve an aggregation, either a sum or a mean. During a competition, these metrics are calculated using a differentially private measurement, ensuring that the leaderboard, which serves as a feedback loop to the user, does not unintentionally leak information. More details are available here.
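To make the idea of a differentially private measurement concrete, here is a minimal sketch of releasing a noisy mean via the Laplace mechanism. This is an illustration of the general technique, not the platform's actual implementation; the function name `dp_mean` and its parameters are hypothetical.

```python
import numpy as np

def dp_mean(values, lower, upper, epsilon, rng=None):
    """Release a differentially private mean of `values`.

    Each value is clipped to [lower, upper], so a single record can
    change the mean by at most (upper - lower) / n; that sensitivity
    scales the Laplace noise added before release.
    """
    rng = rng or np.random.default_rng()
    values = np.clip(np.asarray(values, dtype=float), lower, upper)
    n = len(values)
    sensitivity = (upper - lower) / n
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise
```

The noise scale shrinks as epsilon grows, so a larger privacy budget yields a more faithful leaderboard score.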
Below are the current metrics of accuracy. More metrics will be added as required for competitions.
$L_2$-Loss
Assuming $\hat{y}_i$ represents the predictions you have made and $y_i$ represents the true outputs, both indexed by $i$ in $[1, n]$, the $L_2$-loss is defined as:

$$L_2(\hat{y}, y) = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$
The $L_2$-loss is typically used in regression problems.
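As a sketch, the sum-of-squared-errors form of the $L_2$-loss can be computed as follows (the function name `l2_loss` is illustrative, not a platform API):

```python
def l2_loss(y_pred, y_true):
    # Sum of squared differences between predictions and true outputs.
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
```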
$L_1$-Loss
Similar to the $L_2$-loss, assuming $\hat{y}_i$ represents the predictions you have made and $y_i$ represents the true outputs, both indexed by $i$ in $[1, n]$, the $L_1$-loss is defined as:

$$L_1(\hat{y}, y) = \sum_{i=1}^{n} |\hat{y}_i - y_i|$$
The $L_1$-loss is typically used in regression problems.
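The sum-of-absolute-errors form can be sketched the same way (again, `l1_loss` is an illustrative name, not a platform API):

```python
def l1_loss(y_pred, y_true):
    # Sum of absolute differences between predictions and true outputs.
    return sum(abs(p - t) for p, t in zip(y_pred, y_true))
```

Unlike the squared form, this loss grows linearly with each error, so it is less sensitive to outliers.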
Classification Accuracy
Classification accuracy is simply the number of correctly labelled samples divided by the total number of samples ($n$):

$$\text{Accuracy}(\hat{y}, y) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}[\hat{y}_i = y_i]$$
This is only applicable for classification problems.
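A minimal sketch of this ratio of correct labels to total samples (`classification_accuracy` is an illustrative name):

```python
def classification_accuracy(y_pred, y_true):
    # Fraction of predictions that exactly match the true labels.
    correct = sum(p == t for p, t in zip(y_pred, y_true))
    return correct / len(y_true)
```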
Metrics of Privacy Loss
The performance metrics in Antigranular also take into account the privacy budget used by the user to achieve a particular accuracy metric. As the focus is on differential privacy, both privacy-loss parameters, $\varepsilon$ and $\delta$, are considered. For more on differential privacy, please refer to here.
Linear Epsilon Loss
This is the most straightforward loss, calculated simply as the product of the utilised epsilon and a weight $w$:

$$\text{Loss}_\varepsilon = w \cdot \varepsilon$$
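The weighted product described above is a one-liner; the function name and the weight value here are illustrative, not platform defaults:

```python
def linear_epsilon_loss(epsilon_used, weight):
    # Penalty grows linearly with the privacy budget spent.
    return weight * epsilon_used
```

A competition can tune the weight to trade off accuracy against privacy spend: a larger weight makes each unit of epsilon costlier on the leaderboard.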
Soft Threshold Delta Loss
There are many guidelines for determining a safe value of $\delta$ to be used in differential privacy. Typically, it is suggested to keep it below $1/n$, where $n$ is the number of records in the dataset. The Soft Threshold Delta Loss acts as a smoothed step function, heavily penalising the user when $\delta$ exceeds the threshold but having little effect otherwise:

$$\text{Loss}_\delta = \frac{a}{1 + e^{-b(\delta - 1/n)}}$$

where $a$ and $b$ are scale parameters.
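A logistic function is one common way to realise such a smoothed step. This sketch assumes the penalty takes the form $a / (1 + e^{-b(\delta - \tau)})$ with threshold $\tau$ and scale parameters $a$ and $b$; the function name and parameter choices are assumptions, not the platform's exact formula:

```python
import math

def soft_threshold_delta_loss(delta, threshold, a, b):
    # Logistic (smoothed step) penalty: near zero well below the
    # threshold, rising towards `a` once delta exceeds it.
    # `a` sets the maximum penalty, `b` the steepness of the step.
    return a / (1.0 + math.exp(-b * (delta - threshold)))
```

With a large steepness `b`, the loss stays negligible for compliant deltas and approaches the full penalty `a` shortly after the threshold is crossed.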