Writing insight queries
An insight query is a query that returns statistical results, founded on aggregated, anonymized data.
The original data remains securely in each Bunker, and is never shared or revealed. In this way, insight queries help guarantee data security and ensure regulatory compliance.
The simplest insight query comprises of two clauses. The SELECT clause defines what you are trying to achieve in terms of information gained, and the FROM clause defines the audience you are intending to learn about. The example below would return statistical information on an attribute category.
SELECT
Aggregate(
Category/"Representation",
Category {'string1', 'string2'},
Category {bin1, bin2, bin3}
)
FROM mydataset
So, to aggregate age and gender, you would write something like this:
SELECT
Aggregate(
Age/"Five Year Bins",
Gender {'Female', 'Male'}
)
FROM
(A Union B) Exclude C
For more information on aggregations, see select query type.
As you can see above, the FROM clause is used to reference the datasets and build relationships between them through operators. For more information, see audience definition.
If you want to, you can identify a subset of these records by using filters. The WHERE can be used to define attribute-based criteria. For example, building on the example above, you may be interested in learning more about those with a certain age and income.
SELECT
Aggregate(
Age/"Five Year Bins",
Gender {'Female', 'Male'}
)
FROM
(A Union B) Exclude C
WHERE
Age < 30 AND
Income >= 20000
For more information on filters, please see filters.
There are two further optional clauses ENRICH WITH and LINK WITH, which can be used to to include categories from other datasets and glue datasets to link keys, respectively.
SELECT
Aggregate(
Age/"Five Year Bins",
Gender, {'Male', 'Female'}
)
FROM
(A Union B) Exclude C
WHERE
Age < 30 AND
Income >= 20000 AND
Address/Region = 'Scotland'
ENRICH WITH D
LINK WITH E
For more information about the ENRICH WITH and LINK WITH clauses, see enrichment datasets and linking datasets.
To build, test and run queries, try using the Query Tool.