A key is an anonymised version of a direct identifier, used to match up entries in separate datasets which refer to the same person.
In other words:
- a key denotes an individual person who appears in a dataset
- if entries in separate datasets have the same key, then they refer to the same person
- but the person cannot be identified in the real world using the key.
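As a rough illustration of these three properties, the sketch below compares two made-up datasets by key. The dataset names and key strings are hypothetical placeholders invented for this example. They are not real platform output, and the comparison is not how the platform itself performs matching.

```python
# Hypothetical datasets: each one is simply the set of keys it contains.
# The key strings are invented placeholders and carry no personal information.
dataset_a_keys = {"k_7f3a", "k_91c2", "k_22b8"}
dataset_b_keys = {"k_7f3a", "k_d405", "k_22b8"}

# A key that appears in both datasets denotes the same person in each.
shared_keys = sorted(dataset_a_keys & dataset_b_keys)

# The overlap can be counted without knowing who any of these people are.
overlap_count = len(shared_keys)

print(shared_keys, overlap_count)
```

The comparison works on the keys alone, and nothing in a key such as `k_7f3a` says who the person is.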
When you normalise your data, your Bunker converts direct identifiers to keys, then securely deletes the original data. Once this process is complete, the original direct identifiers cannot be recovered, even in the unlikely event that somebody else gains access to your Bunker.
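To give a feel for how a direct identifier can become a key, here is a minimal sketch that uses a keyed one-way hash (HMAC-SHA256) from the Python standard library. This technique is an assumption chosen for illustration. The function `make_key`, the cleaning steps and the example secret are all hypothetical and do not describe the actual conversion performed during normalisation.

```python
import hashlib
import hmac


def make_key(identifier: str, secret: bytes) -> str:
    """Return an opaque key for a direct identifier (illustrative only)."""
    # Trim whitespace and lower-case so trivially different spellings match.
    cleaned = identifier.strip().lower()
    # HMAC-SHA256 is a keyed one-way hash: the same input and secret always
    # give the same output, and the output does not contain the input.
    return hmac.new(secret, cleaned.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret, for this example only.
secret = b"example-secret-not-for-real-use"

key_a = make_key("Alice@Example.com", secret)
key_b = make_key("  alice@example.com ", secret)
key_c = make_key("bob@example.com", secret)

print(key_a == key_b)  # same person, same key
print(key_a == key_c)  # different person, different key
```

In this sketch only the key would be stored. The original email address is not kept anywhere, which mirrors the idea that the direct identifier is deleted once the key exists.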
Because each row in your original dataset may contain more than one direct identifier, your dataset may have more than one key. For example, if your original data contains both a social media handle and an email address, then each of these will become a key.
Sometimes, a key may be made from a combination of data. For example, a postal address on its own is not enough to make a key, because more than one person may live at the same address. But a postal address combined with a date of birth can make a key.
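The combined case can be sketched in the same way. The example below is a hedged illustration, assuming a keyed hash (HMAC-SHA256) over the address and date of birth joined together. The function `make_composite_key`, the field formats and the secret are hypothetical, and the real method used to build combined keys may differ.

```python
import hashlib
import hmac


def make_composite_key(address: str, date_of_birth: str, secret: bytes) -> str:
    """Return one opaque key from an address plus a date of birth (illustrative only)."""
    # Collapse repeated spaces and lower-case the address so spellings match.
    cleaned_address = " ".join(address.lower().split())
    cleaned_dob = date_of_birth.strip()
    # Join with a separator character that does not occur in either field,
    # so the boundary between the two fields stays unambiguous.
    combined = "\x1f".join([cleaned_address, cleaned_dob])
    return hmac.new(secret, combined.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret and records, for this example only.
secret = b"example-secret-not-for-real-use"

resident_1 = make_composite_key("1 High Street, Anytown", "1980-04-12", secret)
resident_2 = make_composite_key("1 High Street, Anytown", "2001-09-30", secret)
resident_1_again = make_composite_key("1  high street, anytown", "1980-04-12", secret)

print(resident_1 == resident_2)        # same address, different people
print(resident_1 == resident_1_again)  # same person, same key
```

Two people who share an address end up with different keys because their dates of birth differ, which is why the combination can work as a key where the address alone cannot.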
Each dataset supports a maximum number of keys, as shown in InfoSum Platform limitations. If more keys can be made from your data than this maximum allows, you will be asked to select your preferred keys when you publish your dataset.