Multi-Value Keys

Supports multiple values within a key

This feature lets clients upload data in a single row that holds multiple values for a single key.

How does this work?

Let’s take a scenario of a customer importing identity data. In some circumstances, a user can be identified by multiple email addresses or multiple cookies.

This is how a customer imports data without the Multi-value key feature:

| Internal ID | First Name | Email | Cookie | Mobile Advertising ID |
|---|---|---|---|---|
| 1 | John | #email1 | cookie1 | maid1 |
| 1 | John | #email2 | cookie2 | maid1 |
| 2 | Dave | #email3 | cookie3 | maid2 |
| 2 | Dave | #email4 | cookie4 | maid2 |
| 3 | Jamie | #email5 | cookie5 | maid3 |
| 3 | Jamie | #email6 | cookie6 | maid3 |
| 4 | Katy | #email7 | cookie7 | maid4 |

Here is how the same data looks with the Multi-value key feature:

| Internal ID | First Name | Email | Cookie | Mobile Advertising ID |
|---|---|---|---|---|
| 1 | John | #email1,#email2 | cookie1,cookie2 | maid1 |
| 2 | Dave | #email3,#email4 | cookie3,cookie4 | maid2 |
| 3 | Jamie | #email5,#email6 | cookie5,cookie6 | maid3 |
| 4 | Katy | #email7 | cookie7 | maid4 |

With this feature, the Platform can accept a file with multiple identifiers per individual in a single row. In the above example, the customer can import a file with 4 rows instead of 7 rows. 

The primary benefit is that your file needs fewer rows when any of your keys has multiple values. Multi-value keys produce the same aggregation output as single-value keys. Every value in a multi-value key is treated independently when used in matching.

Implementation

Upload:

In the first version, InfoSum will only accept data that is already in the form of an array/list type, for CSV files uploaded or transferred via SFTP/S3/GCP. Each key can hold a maximum of 25 identifiers.

That means you will need to merge your rows before uploading them to the InfoSum Platform. In later versions, we can support merging the rows inside the InfoSum Platform.

Merging is achieved using two delimiters in the file: one to split the columns and another to split the entries in a list within a column.
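The pre-upload merge described above could be sketched as follows. This is a minimal illustration, not InfoSum tooling: the helper names `merge_rows` and `to_csv` are hypothetical, and the choice of `,` for columns and `|` for list entries is an assumption (use whichever pair of delimiters you select in the preview settings). The 25-value limit is the one stated above.

```python
import csv
import io

COLUMN_DELIMITER = ","    # splits columns (assumed choice)
VALUE_DELIMITER = "|"     # splits entries inside a multi-value column (assumed choice)
MAX_VALUES_PER_KEY = 25   # limit stated in this article


def merge_rows(rows, id_column, multi_value_columns):
    """Collapse rows that share id_column into one row per individual."""
    merged = {}
    for row in rows:
        record = merged.get(row[id_column])
        if record is None:
            # First row for this individual: keep single-value columns as they
            # are and start an empty list for each multi-value column.
            record = {
                column: [] if column in multi_value_columns else value
                for column, value in row.items()
            }
            merged[row[id_column]] = record
        for column in multi_value_columns:
            value = row[column]
            if value and value not in record[column]:
                record[column].append(value)
    for record in merged.values():
        for column in multi_value_columns:
            if len(record[column]) > MAX_VALUES_PER_KEY:
                raise ValueError(
                    f"{column} has more than {MAX_VALUES_PER_KEY} values "
                    f"for {id_column} {record[id_column]}"
                )
    return list(merged.values())


def to_csv(records, fieldnames, multi_value_columns):
    """Write merged records using one delimiter for columns, one for lists."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=COLUMN_DELIMITER, lineterminator="\n")
    writer.writerow(fieldnames)
    for record in records:
        writer.writerow([
            VALUE_DELIMITER.join(record[column])
            if column in multi_value_columns
            else record[column]
            for column in fieldnames
        ])
    return buffer.getvalue()


FIELDS = ["Internal ID", "First Name", "Email", "Cookie", "Mobile Advertising ID"]
raw = [
    ("1", "John", "#email1", "cookie1", "maid1"),
    ("1", "John", "#email2", "cookie2", "maid1"),
    ("2", "Dave", "#email3", "cookie3", "maid2"),
    ("2", "Dave", "#email4", "cookie4", "maid2"),
    ("3", "Jamie", "#email5", "cookie5", "maid3"),
    ("3", "Jamie", "#email6", "cookie6", "maid3"),
    ("4", "Katy", "#email7", "cookie7", "maid4"),
]
rows = [dict(zip(FIELDS, values)) for values in raw]

records = merge_rows(rows, "Internal ID", {"Email", "Cookie"})
csv_text = to_csv(records, FIELDS, {"Email", "Cookie"})
print(csv_text)
```

With the seven example rows from the start of this article, the sketch produces four data rows, the first being `1,John,#email1|#email2,cookie1|cookie2,maid1`.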

In this example, we are uploading a file containing multi-value key columns (Email & Vehicle Registration Number). The data is already in the form of an array/list type.

In preview settings, the Platform will show the delimiter used in the multi-value column. If it’s not the correct delimiter, select the right delimiter in the dropdown list.

Click the toggle next to the column header to enable it as a multi-value column.

Repeat the same process for all your multi-value columns. You can also perform some other optional minor manipulations to the source data here; please see this article for information.

Click “Accept Preview Config” when you are ready to normalise your data, before publishing it. 

Normalisation:

Normalisation works the same way for multi-value columns and single-value columns.

Matching:

Matching happens in the same way as before. Every value in a multi-value key is treated independently when used in matching: each individual identifier in a multi-value column is compared against the value in a single-value column, or against each individual identifier in another multi-value column.

The Platform shows the matched audience total at each individual (row) level rather than at the identifier level.

Let’s take some scenarios.

Scenario 1:

We are matching two datasets that both contain a multi-value column (Email), and we are using Email as the key to match them.

Dataset A

| Internal ID | First Name | Email |
|---|---|---|
| 1 | John | #email1,#email2 |
| 2 | Dave | #email3,#email4 |
| 3 | Jamie | #email5,#email6 |
| 4 | Katy | #email7 |

Dataset B

| Internal ID | First Name | Email |
|---|---|---|
| 1 | John | #email1,#email2 |
| 2 | Dave | #email3,#email0 |
| 3 | Jamie | #email5,#email4 |
| 4 | Lauren | #email8 |

When the Platform matches Dataset A and Dataset B, it will report the matched audience total as 3 (the total number of rows matched), not the number of identifiers matched. The Platform reports the total number of the combined audience on an individual level rather than the identifier level because an individual can be represented by multiple identifiers.

For example, John can be represented by two emails (#email1,#email2), but the Platform reports this as one match because those two emails belong to one individual.
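The counting rule in Scenario 1 can be illustrated with a short sketch. It is not the Platform's implementation: `split_values` and `matched_rows` are hypothetical helper names, and the comma inside each Email cell simply mirrors the tables above. Each value is compared independently, but a row is counted once no matter how many of its values match.

```python
def split_values(cell, delimiter=","):
    """Turn a multi-value cell into a set of independent identifiers."""
    return {value.strip() for value in cell.split(delimiter) if value.strip()}


def matched_rows(dataset, other, key):
    """Return the rows of dataset that share at least one key value with other."""
    other_values = set()
    for row in other:
        other_values |= split_values(row[key])
    return [row for row in dataset if split_values(row[key]) & other_values]


dataset_a = [
    {"Internal ID": "1", "First Name": "John", "Email": "#email1,#email2"},
    {"Internal ID": "2", "First Name": "Dave", "Email": "#email3,#email4"},
    {"Internal ID": "3", "First Name": "Jamie", "Email": "#email5,#email6"},
    {"Internal ID": "4", "First Name": "Katy", "Email": "#email7"},
]
dataset_b = [
    {"Internal ID": "1", "First Name": "John", "Email": "#email1,#email2"},
    {"Internal ID": "2", "First Name": "Dave", "Email": "#email3,#email0"},
    {"Internal ID": "3", "First Name": "Jamie", "Email": "#email5,#email4"},
    {"Internal ID": "4", "First Name": "Lauren", "Email": "#email8"},
]

# Identifier-level overlap: every email that appears in both datasets.
values_a = set().union(*(split_values(row["Email"]) for row in dataset_a))
values_b = set().union(*(split_values(row["Email"]) for row in dataset_b))
shared_identifiers = values_a & values_b

# Individual-level overlap: rows of Dataset A with at least one shared email.
matches = matched_rows(dataset_a, dataset_b, "Email")

print(len(shared_identifiers))  # 5 identifiers overlap
print(len(matches))             # 3 individuals (rows) match
```

Five identifiers overlap, yet only three rows match, which is the total reported in this scenario.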

Scenario 2:

We are matching two datasets, and only one of them has a multi-value column. Both datasets have Email, but Dataset A has a multi-value column and Dataset B has a single-value column.

Dataset A

| Internal ID | First Name | Email |
|---|---|---|
| 1 | John | #email1,#email2 |
| 2 | Dave | #email3,#email4 |
| 3 | Jamie | #email5,#email6 |
| 4 | Katy | #email7 |

Dataset B

| Internal ID | First Name | Email |
|---|---|---|
| 1 | John | #email1 |
| 1 | John | #email2 |
| 2 | Dave | #email3 |
| 2 | Dave | #email0 |
| 3 | Jamie | #email5 |
| 3 | Jamie | #email4 |
| 4 | Lauren | #email8 |

When the Platform matches Dataset A and Dataset B, it will still report the matched audience total as 3, because the join uses the Email key while the count uses the Internal ID key. The Platform reports the total number of the combined audience on an individual level rather than an identifier level because an individual can be represented by multiple identifiers.
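A similar sketch shows why Scenario 2 also yields 3. Again this only illustrates the join-then-count idea under stated assumptions, not the Platform's actual code: the helper `split_values` is hypothetical, the join is done on Email, and the count is taken over distinct Internal ID values.

```python
def split_values(cell, delimiter=","):
    """Turn a (possibly multi-value) cell into a set of identifiers."""
    return {value.strip() for value in cell.split(delimiter) if value.strip()}


# Dataset A: one row per individual, Email is a multi-value column.
dataset_a = [
    {"Internal ID": "1", "Email": "#email1,#email2"},
    {"Internal ID": "2", "Email": "#email3,#email4"},
    {"Internal ID": "3", "Email": "#email5,#email6"},
    {"Internal ID": "4", "Email": "#email7"},
]
# Dataset B: several rows per individual, Email is a single-value column.
dataset_b = [
    {"Internal ID": "1", "Email": "#email1"},
    {"Internal ID": "1", "Email": "#email2"},
    {"Internal ID": "2", "Email": "#email3"},
    {"Internal ID": "2", "Email": "#email0"},
    {"Internal ID": "3", "Email": "#email5"},
    {"Internal ID": "3", "Email": "#email4"},
    {"Internal ID": "4", "Email": "#email8"},
]

# Join on Email: collect every email present in Dataset A.
emails_in_a = set()
for row in dataset_a:
    emails_in_a |= split_values(row["Email"])

# Rows of Dataset B whose email appears anywhere in Dataset A.
matching_rows = [row for row in dataset_b if row["Email"] in emails_in_a]

# Count on Internal ID: distinct individuals behind those rows.
matched_ids = {row["Internal ID"] for row in matching_rows}

print(len(matching_rows))  # 5 rows of Dataset B join on Email
print(len(matched_ids))    # 3 distinct Internal IDs
```

Five rows of Dataset B join on Email, but they belong to three distinct Internal IDs, so the individual-level total is 3.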

Activation:

Currently, the InfoSum Platform does not support a multi-value key as an output column. You can choose only single-value columns as output, but you can still match using multi-value columns. In the above example, a match can happen using Email (a multi-value column), but you can only activate a single-value column such as Internal ID or any other single-value key.