Category fill rate

The category fill rate is a measure of how successfully the original data in your dataset has been normalised.

The higher the category fill rate, the more successful the normalisation, and the better the quality of query results.

Before reading on, ensure you understand the concepts of categories and representations.

When you normalise your data, your Bunker attempts to interpret it according to the categories you have assigned. That process may fail for particular rows, either because the data is simply missing or because it is mis-formatted in a way that prevents your Bunker from interpreting it.

Each category in your dataset has its own category fill rate, which reflects the proportion of rows where that category has successfully been interpreted. For example, if the fill rate for a category is 50%, then half the rows were successfully interpreted and the other half were not.
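The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how your Bunker implements it: the `fill_rate` and `interpret_year` functions, and the sample rows, are hypothetical names invented for this example.

```python
from typing import Callable, Iterable, Optional


def fill_rate(values: Iterable[Optional[str]],
              interpret: Callable[[Optional[str]], Optional[object]]) -> float:
    """Return the proportion of values that `interpret` can make sense of.

    `interpret` is assumed to return None when a value is missing or
    mis-formatted, and any other object when interpretation succeeds.
    """
    values = list(values)
    if not values:
        return 0.0  # assumption: an empty column is reported as 0% filled
    interpreted = sum(1 for value in values if interpret(value) is not None)
    return interpreted / len(values)


def interpret_year(raw: Optional[str]) -> Optional[int]:
    """Hypothetical interpreter: accept only four-digit years."""
    if raw is None:
        return None
    text = raw.strip()
    if len(text) == 4 and text.isdigit():
        return int(text)
    return None


# Four rows: two valid years, one missing value, one mis-formatted value.
rows = ["1984", "", "19 84", "2001"]
rate = fill_rate(rows, interpret_year)
print(f"{rate:.0%}")  # half the rows were interpreted
```

Here the empty string stands in for missing data and "19 84" for mis-formatted data. Both count against the fill rate in the same way.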

Some categories have more than one representation. In this case, the category fill rate is calculated based on the best representation (the one which yields the highest fill rate).
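As a rough sketch of the "best representation" rule, the following Python computes a fill rate per representation and reports the highest one. The representation names (`iso`, `day_first`), the date formats, and the helper functions are assumptions made for this example and do not correspond to any actual Bunker category.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical representations of a date category, keyed by name.
REPRESENTATIONS: Dict[str, str] = {
    "iso": "%Y-%m-%d",
    "day_first": "%d/%m/%Y",
}


def can_interpret(raw: Optional[str], date_format: str) -> bool:
    """Return True if `raw` parses under the given date format."""
    if not isinstance(raw, str):
        return False
    try:
        datetime.strptime(raw.strip(), date_format)
        return True
    except ValueError:
        return False


def category_fill_rate(values: List[Optional[str]]) -> Tuple[str, float]:
    """Return the best representation and its fill rate for a column."""
    if not values:
        raise ValueError("cannot compute a fill rate for an empty column")
    rates = {
        name: sum(can_interpret(v, fmt) for v in values) / len(values)
        for name, fmt in REPRESENTATIONS.items()
    }
    best = max(rates, key=rates.get)
    return best, rates[best]


# Two rows fit the ISO format, one fits day-first, one is unusable.
rows = ["2020-01-05", "05/01/2020", "2021-12-31", "n/a"]
best_name, best_rate = category_fill_rate(rows)
print(best_name, f"{best_rate:.0%}")
```

In this sketch the ISO representation interprets two of the four rows and the day-first representation only one, so the category fill rate is taken from the ISO representation.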

If a category has a low fill rate, and this is not simply because the original data is missing, you may be able to use mappings or transformations to improve the quality of the original data.

See also key fill rate.