Key fill rate
Key fill rate is one of the most important metrics for judging data quality, and for judging how successfully two datasets can be used together in a query. It is the percentage of rows in a dataset for which a given key could be generated. Your key fill rate will be less than 100% if your original data is ambiguous, inconsistent or incomplete.
For example, your original dataset may contain customers' email addresses. Because an email address is a direct identifier, it can be used to make a key. But suppose that 25% of your customers did not give an email address, and a further 5% typed something obviously invalid. The key fill rate for that key will be 70%.
This means that only 70% of the rows in your dataset can be accessed by a query using that key. The remaining 30% are invisible to such a query, as though they had never been present in the dataset at all.
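The arithmetic in the email example can be sketched in a few lines of Python. This is an illustration only, not how the platform computes fill rates: the function names `is_usable_email` and `key_fill_rate`, the sample data, and the deliberately loose validity pattern are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately loose validity check: something@something.tld.
# Real validation rules are likely to differ.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_usable_email(value):
    """Return True if the value could plausibly be turned into a key."""
    return value is not None and EMAIL_PATTERN.match(value.strip()) is not None


def key_fill_rate(values, is_usable):
    """Fraction of rows whose value is usable for generating the key."""
    values = list(values)
    if not values:
        return 0.0
    filled = sum(1 for v in values if is_usable(v))
    return filled / len(values)


# Sample dataset of 20 rows mirroring the example above:
# 14 valid addresses, 5 missing (25%), 1 obviously invalid (5%).
emails = (
    [f"customer{i}@example.com" for i in range(14)]
    + [None] * 5
    + ["not-an-email"]
)

rate = key_fill_rate(emails, is_usable_email)
print(f"Key fill rate: {rate:.0%}")
```

Rows that fail the check are simply not counted, which mirrors the point above: they contribute nothing to any query that uses this key.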
As we explain in the definition of keys, each dataset may have up to five keys, and each key will have its own fill rate.
See also representation fill rate.