Data structure recommendations
Basic platform best practices
- Our platform works best with customer-level data, with one row per customer
- Include human-readable column names and data to allow for easy partner analysis
Uploading hashed data
The onboarding steps need to be completed in this order: Normalize > Hash
Normalization can happen in the platform directly if you are onboarding raw data.
Data can also be onboarded pre-hashed with SHA256, but it must be normalized before hashing so that the hashed values still match.
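The Normalize > Hash order can be illustrated with a minimal Python sketch. It assumes the normalization described on this page for email addresses (lowercase, leading and trailing whitespace removed); the Normalization rules page may define further rules per key type. The function names and sample addresses are hypothetical and only serve the illustration.

```python
import hashlib


def normalize_email(raw: str) -> str:
    """Assumed normalization: trim surrounding whitespace, then lowercase."""
    return raw.strip().lower()


def sha256_hex(value: str) -> str:
    """Return the SHA256 digest of a string as a hexadecimal string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def normalize_then_hash(raw: str) -> str:
    """Apply the steps in the required order: Normalize > Hash."""
    return sha256_hex(normalize_email(raw))


# Two spellings of the same (made-up) address match after normalization.
a = normalize_then_hash("  Jane.Doe@Example.com ")
b = normalize_then_hash("jane.doe@example.com")
print(a == b)  # True

# Hashing the raw value without normalizing first produces a different hash.
print(sha256_hex("  Jane.Doe@Example.com ") == b)  # False
```

Because a hash cannot be normalized after the fact, any difference in casing or whitespace before hashing yields values that no longer match a partner's data.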
Uploading salted data
We do not recommend uploading salted data as this is not necessary due to the decentralized nature of our platform.
If salt is required, the onboarding steps need to be completed in this order: Salt > Normalize (> Hash [optional])
- Data can be onboarded with a salt that all parties agree on, so that the salted values match.
- Please consult with your collaboration partners before using any salt. Few companies in our ecosystem salt their datasets, and every party would need to use exactly the same salt.
Tips to make the platform more usable
- Using multi-value attributes takes up fewer columns and creates more interesting insights
- For data providers/publishers: We recommend that your always-on focus be an off-the-shelf data schema that you can make available to everyone and productize with ease. Focus on making the data usable and driving your customers to value quickly. Custom projects that require very specific slices of data or more granularity can be generated on the side for customers who want more.
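The multi-value tip can be illustrated with a small Python sketch that collapses several one-per-interest columns into a single multi-value column. The column names, labels, and sample row are hypothetical, and representing the multi-value field as a list is an assumption; the format the platform expects for multi-value attributes is not specified here.

```python
def collapse_to_multi_value(row: dict, columns: dict) -> list:
    """Collect the labels of all flag columns that are set for this row.

    `columns` maps a source column name to the label it contributes.
    """
    return [label for column, label in columns.items() if row.get(column)]


# Hypothetical wide layout: one boolean column per sport.
SPORT_COLUMNS = {
    "likes_football": "football",
    "likes_tennis": "tennis",
    "likes_golf": "golf",
}

row = {
    "customer_id": "c-001",
    "likes_football": True,
    "likes_tennis": False,
    "likes_golf": True,
}

# Compact layout: a single 'favorite_sports' multi-value column.
compact = {
    "customer_id": row["customer_id"],
    "favorite_sports": collapse_to_multi_value(row, SPORT_COLUMNS),
}
print(compact)  # {'customer_id': 'c-001', 'favorite_sports': ['football', 'golf']}
```

Three columns become one, and the row stays at one record per customer.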
Specific data type recommendations
KEYS:
For more information on data formatting rules for keys please read our Normalization rules page.
| ID | Recommendations |
|---|---|
| Email | Can be provided in either SHA256 hexadecimal format or raw data. Must be in a single column. If you use SHA256 format, ensure all email addresses are normalized to lowercase with leading and trailing whitespace removed before you convert them to SHA256 format. |
| First name & last name | Provided in raw format. All words lowercase. Forename and surname should be separated into individual columns. |
| Postcode / Zipcode | Always provided in raw format. For US addresses, please provide ZIP5 or ZIP9. Example: PH37 1AB. |
| Phone number | Always provided in raw format. Must be a valid phone number in E.164 format. When activating to Meta, you need a second column that includes the country, so please ensure your dataset has that information if you plan to use the Meta destination. |
| IP address | There is no format enforcement, but most clients provide this in IPv4, e.g. 116.61.80.61, 30.161.132.202, 137.143.254.196, 62.158.243.253. |
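Before uploading, it can help to check key columns against the formats listed above. The sketch below is one possible pre-upload check, not the platform's own validation: it tests the general E.164 shape (a leading "+", a non-zero first digit, and at most 15 digits in total) and whether a value parses as IPv4. The function names and sample values are hypothetical, and the check does not confirm that a number is actually in service.

```python
import ipaddress
import re

# General E.164 shape: "+", then 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(phone: str) -> bool:
    """Return True if the whole string has the E.164 shape."""
    return E164_PATTERN.fullmatch(phone) is not None


def is_ipv4(value: str) -> bool:
    """Return True if the string parses as an IPv4 address."""
    try:
        return ipaddress.ip_address(value).version == 4
    except ValueError:
        return False


print(is_e164("+14155552671"))       # True
print(is_e164("0044 20 7946 0958"))  # False
print(is_ipv4("116.61.80.61"))       # True
print(is_ipv4("2001:db8::1"))        # False
```

Rows that fail a check can be reviewed or reformatted before onboarding instead of silently failing to match.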
ATTRIBUTES:
| Attribute | Recommendations |
|---|---|
| Age | Leverage one of the most useful existing groupings, such as 5- or 10-year bins. The platform will automatically give you the option to view this representation if you input age in a different format, such as DOB. DOB must always be provided in raw format and must be in three columns, each with a separate input value for "yyyy", "mm", and "dd". If your DOB is in a single column (for example, YYYY-MM-DD or DD-MM-YYYY), you can use transformations to convert it into three columns. |
| Cost / Price / Revenue | The platform does not recognize currency symbols or currency as a data format. Due to our focus on privacy, you cannot perform the calculations that you would usually perform on currency figures. We recommend: If your analysis is going to focus heavily on cost efficiencies, please speak to our team to get a more tailored recommendation. |
| Ratios or percentages (e.g. share of wallet) | The platform does not recognize % signs or percentages as a data format. We recommend: |
| Dates / Timestamps | The platform does not recognize dates as a format. Due to our focus on privacy, we do not allow dates to be used in the usual way, as this would aid in re-identifying individuals. We recommend that you keep only the necessary information. For example, it might be enough to have just months, or months and years, to create actionable insights, or you might care about the date that something |
| Interests / Characteristics / Content consumption / Contextual | Using multi-value attributes takes up fewer columns and creates more interesting insights by grouping similar interests or interest fields into one column. For example, instead of one column per type of sport, create a 'favorite sports' column where you can list the sport or sports. It can be challenging to provide relevant insights without going to the highest level of granularity, which makes analysis more cumbersome. We suggest you include just two levels of granularity, for example: Favorite sports > type of sport (multi-value) > sports club; TV channel > type of program > program title; Music > music genre > artist. You may also wish to upload your content organized following a common media language such as IAB categories, if you are using these already, as it can reduce the data preparation needed on your side. Following these categories (names and format) can create more transferable insights for the brands (e.g. they can activate the insight programmatically or as part of a media plan). |
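The Age and Dates recommendations above can be sketched in Python for teams that prefer to prepare these columns before upload instead of using the platform's transformations. This is an illustrative sketch only: the function names are hypothetical, and the zero-padded "mm" and "dd" values and the "lower-upper" bin labels are assumptions, not confirmed platform formats.

```python
from datetime import datetime


def split_dob(dob: str, fmt: str = "%Y-%m-%d") -> tuple:
    """Split a single-column DOB into ("yyyy", "mm", "dd") strings."""
    parsed = datetime.strptime(dob, fmt)
    return f"{parsed.year:04d}", f"{parsed.month:02d}", f"{parsed.day:02d}"


def age_bin(age: int, width: int = 10) -> str:
    """Map an age to a fixed-width bin label, e.g. 37 -> "30-39"."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def year_month(timestamp: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Keep only the year and month of a timestamp."""
    return datetime.strptime(timestamp, fmt).strftime("%Y-%m")


print(split_dob("1990-07-04"))              # ('1990', '07', '04')
print(split_dob("04-07-1990", "%d-%m-%Y"))  # ('1990', '07', '04')
print(age_bin(37))                          # 30-39
print(age_bin(37, width=5))                 # 35-39
print(year_month("2023-11-05 14:30:00"))    # 2023-11
```

Reducing dates to the coarsest level that still supports the analysis follows the privacy reasoning given in the Dates row.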