Redaction

Redaction is a privacy control, designed to prevent accidental or deliberate identification of individuals through the results of insight queries.

Redaction sets a minimum size for the population that a query may report on. If a query would report on too few individuals, no results are returned at all.

Each dataset has a configurable redaction threshold, set by the dataset's owner. A query involving that dataset is redacted if it would report on a number of individuals below the threshold.

When a query involves more than one dataset, the highest of those datasets' redaction thresholds applies: the query is redacted if it would report on fewer individuals than that threshold.
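The threshold rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the actual implementation; the function names `effective_threshold` and `is_redacted` are hypothetical.

```python
def effective_threshold(thresholds):
    """Return the threshold that applies to a query: the highest of the
    redaction thresholds of the datasets involved (hypothetical helper)."""
    if not thresholds:
        raise ValueError("a query must involve at least one dataset")
    return max(thresholds)


def is_redacted(population, thresholds):
    """A query is redacted when it would report on fewer individuals
    than the effective threshold."""
    return population < effective_threshold(thresholds)
```

With thresholds of 100 and 200, `is_redacted(150, [100, 200])` is true, because 150 is below the higher threshold of 200. A population of exactly 200 is not below the threshold, so that query would not be redacted.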

For example, suppose datasets A and B each contain millions of records, but only 50 individuals appear in both. Suppose also that one of the datasets has a redaction threshold of 100. A user submits a query requesting aggregated demographic information for the individuals who appear in both A and B. This query will be redacted, because it would report on just 50 individuals, which is below the redaction threshold of one of the datasets.
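The same scenario can be worked through in Python as a rough sketch. The `Dataset` class is hypothetical, the datasets are scaled down from millions of records to about a thousand each, and the threshold of 10 for dataset B is an assumption, since the example only specifies one threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    individuals: frozenset  # identifiers of the individuals in the dataset
    threshold: int          # redaction threshold set by the dataset's owner


# Scaled-down stand-ins for A and B; exactly 50 identifiers (950-999) overlap.
a = Dataset(frozenset(range(0, 1000)), threshold=100)
b = Dataset(frozenset(range(950, 2000)), threshold=10)  # B's threshold is assumed

# The query reports on individuals who appear in both datasets.
overlap = a.individuals & b.individuals

# The highest threshold among the datasets involved applies.
threshold = max(a.threshold, b.threshold)
redacted = len(overlap) < threshold

print(len(overlap), threshold, redacted)  # 50 100 True
```

Although the overlap of 50 would satisfy B's assumed threshold of 10, the query is still redacted because A's threshold of 100 is the one that applies.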