IQL functions

InfoSum’s Insight Query Language (IQL) provides functions to analyze connected data.

This reference article describes the IQL functions you can use to query Datasets using the Query Tool. Four IQL functions are available: COUNT, AGGREGATE, TOPN, and SUM

IQL and SQL functions: Although IQL functions are loosely based on SQL functions, it is limited in functionality and only returns aggregated results.

For more information on IQL, see the IQL reference documentation.

Functions:

COUNT

AGGREGATE

Using Bins

Using Representations

TOPN

SUM (opens a new article)

COUNT

Description

COUNT returns the number of records in a Dataset or combination of Datasets.

COUNT returns a number based on the number of matching keys, which tells you the size of that audience.

Syntax

SELECT
  COUNT() 

FROM Dataset

Examples

To return the number of unique records in a Dataset:

SELECT
  COUNT()

FROM ACME

Output:

To return the number of unique records in a combination of Datasets:

SELECT
  COUNT()

FROM ACME INTERSECT LifeWeek

Output:

To segment this result, you can apply any number of filters. When we say apply filters, we mean to set attribute-based rules to only include a type of individual in the result. For example, the query below counts the number of unique individuals in three Datasets who meet the income and location criteria.

SELECT
  COUNT()

FROM ACME INTERSECT LifeWeek INTERSECT AutoSportsR

WHERE AutoSportsR."Personal Income"/Income > 20000 

AND ACME."existing customer"/"existing customer" = 'Yes'

Output:

To return the number of unique records between two values in a Dataset:

SELECT COUNT()
FROM ACME
WHERE ACME.Age BETWEEN 20 AND 50

Output:

To return the number of unique records for multiple values in a Dataset:

SELECT COUNT()
FROM ACME
WHERE ACME.Gender/"Binary and Other" IN ('Female','Other')

Output:

AGGREGATE

Description

AGGREGATE performs a calculation on individual records and returns a summarized representation of those values. AGGREGATE requires you to specify how the statistics are reported on - either as bins or representations.

The aggregate function will return a maximum of 512 different values. For categories with higher cardinality, it is recommended to use TOPN instead.

Syntax

SELECT
  Aggregate(
    Category {bin1, bin2, bin3},
    Category Category/Representation
  )

FROM Dataset

Note: An AGGREGATE query must always contain either a bin or a representation or both, otherwise an error message will be returned. Category is a specific type of information contained in a Dataset.

Using Bins

This section describes how you can use bins with AGGREGATE. A bin is a group of values that are counted together when using an insight query to produce statistical reports. For example:

SELECT
  Aggregate(
    Category {bin1, bin2, bin3},
    Category {'string1', 'string2'}
  )

FROM Dataset

Where:

Each bin or string is separated with a comma and a space (e.g. bin1, bin2). Additionally, each string is enclosed with single quotes.
One or more string or bin can be specified, but you cannot mix them.

A typical query would look like this:

SELECT
  Aggregate(
    ACME.Age {10, 15, 20, 25, 30},
    ACME.Gender {'Female', 'Male'}
  )

FROM ACME

Output:

The way you define the bins for numeric and categoric data is different (see the table below). In the age example above, the specification of (10, 15, 20, 25, 30) will output the bins ([10, 14], [15, 19], [20, 24], [25, 29]), whereas 'Female' and 'Male' will be outputted as two categorical bins, as shown.

Using Representations

You can use representations in two ways:

To create rows named after values in representations.
To create bins based on representations.

Creating rows named after values in representations

For example:

SELECT
  Aggregate( 
    AutoSports."in market for car"/"in market for car", 
    AutoSports."Marital Status"/"Marital Status"
  )

FROM AutoSports

Output:

Creating bins based on representations

The query format is the same as for using bins, for example:

SELECT 
  AGGREGATE(
    ACME.Age/"HMRC Age Range"
  )

FROM ACME

Output:

Using bins and representations in the same query

The example below shows how you can use bins and representations in the same AGGREGATE query.

SELECT
  Aggregate(
    ACME.Age {20,30,40,50},
    AutoSports."sports lover"
  )

FROM ACME INTERSECT AutoSports

Output:

The table below describes each type of bin you can use in an Aggregate query.

Bin Type	Description	Format	Example
Categoric Bins	Returns string values for a category.	`Category {'string1', 'string2'}` The names of categoric bins are surrounded by single quote marks. In IQL (like in SQL), single quote marks identify a string literal.	`Gender {'Female', 'Male'}`
Numeric Bins	Returns continuous values that are numeric.	`Category {bin1, bin2, bin3}` A numeric bin can be any number between a minimum and maximum (or possibly any number at all).	`Age {10, 15, 20, 25, 30}` Outputs the bins: 10-15, 15-20, 20-25, 25-30
Bins based on representations	Returns continuous values grouped into bins contained in the category. Bin representations are only available for some categories.	`Category/Representation` A slash (/) is used to separate the category from the representation. Categories and representations that contain a space must be surrounded by double quote marks, for example: `Gender/"Extended Gender"` `"Personal Income"/Income` `mum/mum`	`Age/”5 Year Bins”` Outputs 5 year bins without having to list them out as in the previous example.
Representations - grouped into ranges	Each of the bins are based on representations. Returns continuous values grouped into ranges.	`Category/Representation` A slash (/) is used to separate the category from the representation. Categories and representations that contain a space must be surrounded by double quote marks, for example: `Gender/"Extended Gender"` `"Personal Income"/Income` `mum/mum`	`Age/"Employment Range"` Employment Range representation of the Age category groups values into 0-18, 19-25, 26-35, and so on. Each of these ranges counts as a bin.
Representations - discrete values from a predefined list	Can only return a value from a predefined list - each of the possible values counts as a bin.	`Category/Representation` A slash (/) is used to separate the category from the representation. Categories and representations that contain a space must be surrounded by double quote marks, for example: `Gender/"Extended Gender"` `"Personal Income"/Income` `mum/mum`	`Address/Region` The Region representation of the Address category returns strings like London or Scotland, and each of those strings counts as a bin.

How privacy controls can affect query results

Privacy controls add a deliberate degree of error to statistical results. This small margin of error prevents scenarios where individuals could be identified by a series of carefully-crafted queries.

Because privacy controls prevent queries reporting on very small numbers of individuals, the following privacy controls can affect your query results:

Rounding defines the number that bin counts will be rounded down by.
Redaction defines the number of results required for a bin to be included in the returned results. This means that no results are returned if a query reports on too few individuals.

The rounding and redaction thresholds for your Dataset are set to 100 by default, so any bins that contain less than 100 records are shown as zero. However, the total records count will not be affected by this and will show all records for a query, including any records in zero counted bins. The following example contains bins with less than 200 items for a Dataset with a redaction threshold of 200:

Query:

SELECT
  Aggregate(
    AutoSportsS.Age/"Population Range",
    AutoSportsS.Industry/Industry
  )

FROM AutoSportsS

Output:

In the above screenshot, the Dataset redaction threshold set to 200 means that any industries with a count of less than 200 show as zero even when records exist for a bin. For example, the AutoSportsS Dataset contains 189 records for Business, Consulting and Management for individuals aged 75 to 80, but the count shows zero.

TOPN

Description

TOPN runs an AGGREGATE query on the entire Dataset and returns the top number of aggregated values, sorted from highest to lowest.

TOPN is ideal for aggregating categories with high cardinality, where you are interested in learning about the frequently appearing values in a category, such as the top 5 job titles or the top 10 postcodes.

Syntax

TOPN(Category, ROWNUM)

Where:

ROWNUM is the number of rows to output by highest fill rate.

Category is a specific type of information contained in a Dataset.

Examples

To return a table showing the five most frequent age groups in the audience:

SELECT 
  TopN(Age/"Advert Range", 5)

FROM AutoSportsS

Output:

SUM

You can use IQL to calculate the Sum and Avg across the intersections of different Datasets. This is particularly useful for insights or measurement.

Please see our SUM and AVG article for more information.

Hi, How can we help?

IQL functions

InfoSum’s Insight Query Language (IQL) provides functions to analyze connected data.

COUNT

Description

Syntax

Examples

AGGREGATE

Description

Syntax

Using Bins

Using Representations

Creating rows named after values in representations

Creating bins based on representations

Using bins and representations in the same query

How privacy controls can affect query results

TOPN

Description

Syntax

Examples

SUM

Was this article helpful?