Intersection size

The intersection size between two datasets is a measure of how many rows can be matched between them using the selected key.

To put it another way, it shows what proportion of the individuals identified in each dataset are also known to be present in the other one.

Because a dataset may have more than one key, and the key fill rates may differ, the intersection size depends on the key selected. By default, InfoSum Platform selects the key that produces the largest intersection size, but you can override this using query metadata.
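The effect of key choice can be illustrated with a minimal sketch. It is not how InfoSum Platform computes intersections (the platform applies privacy controls and never exposes raw key values); it only shows why different keys give different intersection sizes. The function names, the key names and the sample values are all hypothetical.

```python
# Illustrative only: each dataset is modelled as a mapping from a key name
# to the set of populated (non-empty) values for that key.
# All names and values here are hypothetical.

def intersection_by_key(dataset_a, dataset_b):
    """Return the number of matching values for every key both datasets share."""
    shared_keys = dataset_a.keys() & dataset_b.keys()
    return {key: len(dataset_a[key] & dataset_b[key]) for key in shared_keys}


def best_key(dataset_a, dataset_b):
    """Return the shared key with the largest intersection, or None if no key is shared."""
    sizes = intersection_by_key(dataset_a, dataset_b)
    if not sizes:
        return None
    return max(sizes, key=sizes.get)


# Dataset A has a lower fill rate for "phone" than for "email".
dataset_a = {"email": {"a@x.test", "b@x.test", "c@x.test"}, "phone": {"111", "222"}}
dataset_b = {"email": {"b@x.test", "c@x.test", "d@x.test"}, "phone": {"222"}}

sizes = intersection_by_key(dataset_a, dataset_b)
chosen = best_key(dataset_a, dataset_b)
```

In this sample data the "email" key matches two values and the "phone" key matches one, so the sketch picks "email".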

The intersection size is an approximation because privacy controls are applied. The exact margin of approximation depends on the properties of the data, and particularly on whether any single dataset contains more than one row relating to the same individual. However, a margin of plus or minus 1 to 2% is typical.
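As a rough worked example of what such a margin means in practice, the sketch below treats the margin as a relative band around the reported figure. That interpretation, the function name and the sample numbers are assumptions made for illustration, not a description of how the platform's privacy controls work.

```python
def approximate_range(reported_size, margin_percent):
    """Return the (low, high) band implied by a relative margin around a reported size.

    Assumes the margin is a percentage of the reported figure (illustrative only).
    """
    margin = reported_size * margin_percent / 100
    return (reported_size - margin, reported_size + margin)


# A reported intersection of 100,000 with a 2% margin.
low, high = approximate_range(100_000, 2)
```

Under this assumption, a reported intersection of 100,000 with a 2% margin corresponds to a band from 98,000 to 102,000.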

In some circumstances, the intersection size is expressed as a percentage of the total rows in the dataset. In this case, there are two percentages, one for each dataset involved in the intersection. These percentages differ whenever the original datasets have different numbers of rows, because the same intersection is being divided by different totals.
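The arithmetic behind the two percentages can be sketched as follows. The function name and the row counts are hypothetical and chosen only to make the numbers easy to follow.

```python
def intersection_percentages(intersection_size, rows_a, rows_b):
    """Express one intersection size as a percentage of each dataset's total rows."""
    if rows_a <= 0 or rows_b <= 0:
        raise ValueError("row counts must be positive")
    return (
        100.0 * intersection_size / rows_a,
        100.0 * intersection_size / rows_b,
    )


# The same 50,000 matched rows, seen from a 200,000-row dataset
# and from a 1,000,000-row dataset.
pct_a, pct_b = intersection_percentages(50_000, 200_000, 1_000_000)
```

Here the same intersection is 25% of the smaller dataset but only 5% of the larger one.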