# Scoring Metrics

To host a competition or set a goal while working with a dataset, it can be helpful to have a clear scoring metric as a north star. Multiple scoring mechanisms can be combined, and in most cases, we aim to use some proxy for accuracy with a loss based on the privacy budget spent to achieve it. Below are the scoring metrics currently supported on the platform:

## Metrics of Accuracy

All the metrics of accuracy involve an aggregation, either a sum or a mean. During a competition, these metrics are calculated using a differentially private measurement so that the leaderboard, which serves as a feedback loop to the user, does not unintentionally leak information. More details are available here.

Below are the current metrics of accuracy. More metrics will be added as required for competitions.

### $L_2$-Loss

Assuming $y$ represents the predictions you have made and $\hat{y}$ represents the true outputs, both indexed by $i$ in $(1, n)$, the $L_2$-loss is defined as:

$$L_2 = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$

The $L_2$-loss is typically used in regression problems.
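As a minimal sketch, the $L_2$-loss can be computed in plain Python (the function name here is illustrative, not part of the Antigranular API):

```python
def l2_loss(y_pred, y_true):
    """Sum of squared differences between predictions and true outputs."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true))

# Example: errors of 0, -1 and 1 give a squared-error sum of 2.0
print(l2_loss([2.0, 3.0, 5.0], [2.0, 4.0, 4.0]))  # → 2.0
```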

### $L_1$-Loss

Similar to the $L_2$-loss, assuming $y$ represents the predictions you have made and $\hat{y}$ represents the true outputs, both indexed by $i$ in $(1, n)$, the $L_1$-loss is defined as:

$$L_1 = \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

The $L_1$-loss is typically used in regression problems.
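A corresponding sketch for the $L_1$-loss (again, an illustrative function rather than platform API):

```python
def l1_loss(y_pred, y_true):
    """Sum of absolute differences between predictions and true outputs."""
    return sum(abs(p - t) for p, t in zip(y_pred, y_true))

# Example: absolute errors of 0, 1 and 1 sum to 2.0
print(l1_loss([2.0, 3.0, 5.0], [2.0, 4.0, 4.0]))  # → 2.0
```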

### Classification Accuracy

Classification accuracy is simply the number of correctly labelled samples divided by the total number of samples ($n$):

$$\text{Accuracy} = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\left[y_i = \hat{y}_i\right]$$

where $\mathbb{1}[\cdot]$ is the indicator function.

This is only applicable for classification problems.
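A short illustrative implementation of classification accuracy (hypothetical helper, not platform API):

```python
def classification_accuracy(y_pred, y_true):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for p, t in zip(y_pred, y_true) if p == t)
    return correct / len(y_true)

# Example: 3 of 4 labels match
print(classification_accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # → 0.75
```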

## Metrics of Privacy Loss

The performance metrics in Antigranular also take into account the privacy budget used by the user to achieve a particular accuracy metric. As the focus is on $(\epsilon, \delta)$-differential privacy, both of these two parameters of privacy loss can be considered. For more on differential privacy, please refer here.

### Linear Epsilon Loss

This is the most straightforward loss, simply calculated as the product of the utilised epsilon and a weight $w$:

$$\text{Loss}_{\epsilon} = w \cdot \epsilon$$

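A one-line sketch of this weighted penalty (the function name and default weight are assumptions for illustration):

```python
def linear_epsilon_loss(epsilon, weight=1.0):
    """Privacy loss proportional to the epsilon budget spent."""
    return weight * epsilon

# Example: spending epsilon = 0.5 with a weight of 10
print(linear_epsilon_loss(0.5, weight=10.0))  # → 5.0
```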
### Soft Threshold Delta Loss

There are many guidelines for determining a safe amount of $\delta$ to be used in differential privacy. Typically, it is suggested to keep it below $\frac{1}{n}$. The Soft Threshold Delta Loss acts as a smoothed step function, heavily penalising the user when $\delta > \frac{1}{n}$ but having little effect otherwise:

$$\text{Loss}_{\delta} = \frac{\alpha}{1 + e^{-\beta\left(\delta - \frac{1}{n}\right)}}$$

where $\alpha$ and $\beta$ are scale parameters.
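One way to realise such a smoothed step is a logistic (sigmoid) curve centred at $\frac{1}{n}$; the exact form and parameter values used by the platform may differ, and the function name and defaults below are purely illustrative:

```python
import math

def soft_threshold_delta_loss(delta, n, alpha=1.0, beta=1e5):
    """Logistic penalty that ramps up sharply once delta exceeds 1/n.

    alpha sets the maximum penalty; beta controls how sharply the
    smoothed step rises around the 1/n threshold.
    """
    return alpha / (1.0 + math.exp(-beta * (delta - 1.0 / n)))

# Well below the 1/n threshold (n = 1000): penalty is negligible
print(soft_threshold_delta_loss(1e-6, n=1000))
# Well above the threshold: penalty approaches alpha
print(soft_threshold_delta_loss(5e-3, n=1000))
```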