Formatting customer data
When you import a dataset to InfoSum Platform, you will be provided with a range of tools to normalize the data and map it into our Global Schema.
As best practice, InfoSum recommends preparing data with one row per person, with each associated attribute in its own column on that row, similar to the table below.
| | mobile_phone_number | age | Gender |
|---|---|---|---|
| | | 24 | Female |
| jzfh1xxz0nakHGKkV@pwPd3KB.co.uk | 6196036912 | 20 | Male |
| r28QdQn9uSURegC3t@DxUfajC.io | 8217609272 | 50 | Male |
| nmpT8cXD7ba9VVKnn@QgAcM7T.org | | 71 | Male |
| MdCMKgXQaJr6Y3bYT@kxiKkmh.net | | 28 | Male |
Descriptive column names, particularly for customer identifiers, will make normalization and mapping smoother.
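If your source data holds one record per attribute rather than one row per person, it can be reshaped before import. The following is a minimal sketch using only the Python standard library. The function name `one_row_per_person`, the `email` identifier field and the `(identifier, attribute, value)` record layout are assumptions made for illustration and are not part of the Platform.

```python
from collections import defaultdict


def one_row_per_person(records, id_field="email"):
    """Collapse (identifier, attribute, value) records into one dict per person.

    `id_field` is a hypothetical column name chosen for this example.
    """
    people = defaultdict(dict)
    for identifier, attribute, value in records:
        # If an attribute repeats for the same person, the last value wins.
        people[identifier][attribute] = value
    # One output row per identifier, with each attribute in its own column.
    return [{id_field: identifier, **attrs} for identifier, attrs in people.items()]


# Illustrative input: several records for the same person.
records = [
    ("jzfh1xxz0nakHGKkV@pwPd3KB.co.uk", "age", "20"),
    ("jzfh1xxz0nakHGKkV@pwPd3KB.co.uk", "gender", "Male"),
    ("r28QdQn9uSURegC3t@DxUfajC.io", "age", "50"),
]
rows = one_row_per_person(records)
```

Each dictionary in `rows` can then be written out as a single row, for example with `csv.DictWriter`.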
The table below outlines the identifiers commonly used in the Platform. During the normalization process, these identifiers are converted into keys, which are used to match rows in a query. Keys can be matched either deterministically (e.g. Email) or probabilistically (e.g. Full name and DOB) during a query. The standard keys are defined in the Global Schema. For the accepted value(s) of each attribute within the Global Schema, see Global Schema attributes.
Our Bunker normalization technology has been purposely designed so that users can feel comfortable bunkering raw data, including emails. Our email normalization process begins by converting raw data to SHA256 before it is further encrypted, salted and mathematically sketched. While we do accept SHA256-encoded emails from users who do not have access to raw data, match rates are optimal when users bunker raw emails. By the end of the normalization process, no translatable identifier information is stored within an InfoSum Bunker. We recommend bunkering identifier data in raw format wherever possible.
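For users who cannot supply raw emails, the sketch below shows one way to produce a SHA256 hexadecimal digest with Python's standard `hashlib` module. Trimming whitespace and lowercasing before hashing is an assumption of this example, not a documented Platform requirement, and the function name `sha256_email` is hypothetical. Check the expected pre-hash formatting before relying on it.

```python
import hashlib


def sha256_email(email: str) -> str:
    """Return the SHA256 hex digest of an email address.

    Assumption: whitespace is trimmed and the address is lowercased first,
    so that differently formatted copies of one address hash identically.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Differently formatted copies of the same address give the same digest.
digest = sha256_email("  Duncan@InfoSum.com ")
same_digest = sha256_email("duncan@infosum.com")
```

The result is a 64-character lowercase hexadecimal string. Raw emails remain the recommended input where they are available.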
Using the column names below will help the Platform understand the meaning of your identifiers.
| Data type | Column name | Guidelines | Examples |
|---|---|---|---|
| Email | Email Address | Most international email address formats can be used. Emails can be provided either as raw data or in SHA256 hexadecimal format. | duncan@infosum.com, duncan@infosum.co.uk, duncan@domain.net |
| Phone | Mobile Phone Number, Home Phone Number | Mobile and home phone numbers can both be imported, in separate columns. | 07812345678, +44 1397 123456 |
| Name | Forename, Surname, Name | Names can be provided in a single column or in separate columns. When separated, Forename and Surname should each have their own column. | Duncan, MacLeod, Duncan MacLeod |
| Age | Date of Birth, Age | Age can be provided as a date of birth (DOB), a numerical value, or both. For a DOB, each component should be split into its own column, i.e. one column for YYYY, one for MM and one for DD. | YYYY-MM-DD, YYYY-MM, YYYY, 30 |
| Address | Address, Region, Postcode, UDPRN | A range of address columns can be imported, ideally including UDPRN. Each data point should be split into its own column, e.g. street in one column and town in another. | 1 The Street, Glenfinnan; PH37 1AB; 12345678 |
| Mobile Advertising ID | Mobile Advertising ID (e.g. AAID, IDFA) | Both Android's Advertising ID (AAID) and Apple's Advertising Identifier (IDFA) can be used. | abcd123e-fg4h-56ij-7890k123l456, AB1234CD-E567-89FG-H012-34IJK5L678901 |
| Social Media | Social Media: Twitter Handle, Social Media: Facebook ID | Identifiers from most social media platforms can be used. | @therecanbeonlyone, 1234567890 |
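As an illustration of the guidelines above, the following sketch renames a source file's headers to the recommended column names and splits a date of birth into separate year, month and day columns. It uses only the Python standard library. The source headers (`email`, `mobile`, `fname`, `lname`, `dob`), the output names `DOB Year`, `DOB Month` and `DOB Day`, and the sample date are all hypothetical and chosen for this example only.

```python
import csv
import io

# Hypothetical mapping from a source file's headers to the recommended names.
COLUMN_NAMES = {
    "email": "Email Address",
    "mobile": "Mobile Phone Number",
    "fname": "Forename",
    "lname": "Surname",
}


def split_dob(dob):
    """Split YYYY-MM-DD, YYYY-MM or YYYY into (year, month, day).

    Missing components are returned as empty strings.
    """
    parts = dob.strip().split("-") if dob else []
    parts += [""] * (3 - len(parts))
    return tuple(parts[:3])


def prepare(source_csv):
    """Rename known headers and replace the 'dob' column with three columns."""
    prepared = []
    for row in csv.DictReader(io.StringIO(source_csv)):
        dob = row.pop("dob", "")
        # Headers without a mapping are kept unchanged.
        out = {COLUMN_NAMES.get(name, name): value for name, value in row.items()}
        # Illustrative output names; they are not taken from the Global Schema.
        out["DOB Year"], out["DOB Month"], out["DOB Day"] = split_dob(dob)
        prepared.append(out)
    return prepared


# Made-up sample row for demonstration.
source = (
    "email,mobile,fname,lname,dob\n"
    "duncan@infosum.com,07812345678,Duncan,MacLeod,1990-04-23\n"
)
rows = prepare(source)
```

Partial dates such as `YYYY-MM` or `YYYY` are handled by leaving the missing columns empty, which mirrors the example formats in the table.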