The WHERE clause
As we explained in IQL overview, the second part of an IQL query is the
WHERE clause. The
WHERE clause filters the records included in the query, identifying a subset of records to be analysed by the aggregation invoked by the
WHEREclause is optional. If you omit it, then all the records in the relevant dataset are passed to the aggregation.
Using comparison operators
WHERE clause filters using the value of a category, like this:
WHERE mydataset.Age > 60
When combined with a suitable
SELECT clause, this will produce statistical analysis only of individuals in your dataset who are aged over 60.
Using IN and NOT IN
IN keyword to find out whether a value is in a list of possibilities you specify. For example:
WHERE mydataset.Address.Region IN ('North East', 'North West', 'Scotland')
WHERE clause matches only the rows where the address is in one of the listed regions. (More formally, it matches rows where the
Region representation of the
Address category matches one of the listed strings.)
You can use
NOT IN to select rows which don't match the listed values:
WHERE mydataset.Address.Region NOT IN ('North East', 'North West', 'Scotland'
Referencing another dataset
The examples so far have worked with a single dataset, using a
WHERE clause to filter the rows according to your specified criteria.
WHERE clause can also refer to a different dataset - either another dataset you own, or a dataset you have been given permission to reference. In this way, you can enrich your data using information from another party, and gain new insights into your existing organizational knowledge.
For example, this complete insight query analyses the ages of individuals in your dataset - but only if they live in Scotland. To determine whether the individuals live in Scotland, the query refers to a second dataset owned by another user.
SELECT HISTOGRAM(mydataset.Age."5 Year Bins") WHERE theirdataset.Address.Region = 'Scotland'
Or this complete identity query returns a list of Twitter handles for individuals in your dataset, but only if they are aged 25. Again, the age is derived from a second dataset owned by another user.
SELECT mydataset.TwitterHandle WHERE theirdataset.Age."Years Old" < 25)
If you're used to writing queries in SQL, note that you don't need to use the
JOINkeyword in IQL. InfoSum automatically applies the correct behaviour when you reference two datasets in a single query.
You now know how to write the
WHERE clause of an IQL query. Finally, we'll see how to use subqueries to filter the records in a dataset depending on whether they're present in another one.