Onboarding Your Data
Please ensure you’ve read the Preparing your data article before this one.
Table of Contents
IP Allowlist and Firewall Settings
Create a Cloud Vault and a Connector to Import
Normalize and Publish your Data to a Bunker
Important note
The InfoSum platform offers self-serve data import. Please do not send customer data to your InfoSum representative. Data should flow directly from your tech stack into the data clean room. The InfoSum platform does NOT offer self-registration. To create your company and initial user accounts, please fill in this form. Clients wishing to add or remove users will need to contact their InfoSum representative.
To use the InfoSum platform you will need the following:
- (Provided by the client) An import method, such as local file upload or server import. We recommend using server imports as these support the automation of imports.
- (Provided by InfoSum) A Cloud Vault in your business region. A Cloud Vault is a virtual folder on the InfoSum platform that you use to manage data importing tasks. We currently support the EU (Germany), UK, US, and AU regions.
- (Provided by InfoSum) An Insight Bunker and (optionally) an Activation Bunker in your business region.
This article explains how to set up your data and connect it to your Bunker.
IP Allowlist and Firewall Settings
If you are using a VPN, or your firewall restricts upload/download of data, you may need to authorize relevant IP addresses.
Import Process Overview
To publish your data to a Bunker you will need to go through two distinct processes: connecting your files to a Cloud Vault, and normalizing and publishing your data.
Each of these steps is self-contained, meaning that you don’t need to complete them all in one session.
If you are using Local File Import instead of Server Import, you won’t have to complete the full process outlined in Connect your files.
Create a Cloud Vault and a Connector to Import
A Cloud Vault is an InfoSum environment where your data is hosted and prepared before being published to a Bunker. You create it within the platform: go to the Dashboard and click ‘Start a new Import’.
Then, you will need to create an import connector configuration (“ICC”). The ICC stores the access credentials for connecting to your desired environment.
We currently support:
- Local file upload (please note that importing cannot be automated using this option)
- SFTP
- S3 cross-account
- S3 access key (please note that AWS is working to deprecate access key authentication, so we recommend using S3 cross-account instead)
- Google Cloud Storage (GCS)
Once you have created an ICC, create the importer, and then run that importer to connect your data to your Cloud Vault. This step is required only once for each of your upstream data sources. For example, if your data comes solely from S3, this is a one-time step.
You can now import your dataset to Cloud Vault.
Level of Complexity: Low
Estimated timing: 5-10 minutes processing time, <5 minutes user time
Role(s) involved: Technical or non-technical users with access credentials. Please ensure that your account admin has allocated the rights to perform this task.
Normalize and Publish your Data to a Bunker
After your dataset is imported to your Cloud Vault, you are ready to normalize your data and publish it to a Bunker.
Important note
Importing manually is required once, but you have the option to use import automation for all future datasets that follow the same file structure (i.e. number of columns, column header names, and column sequence). You cannot modify a normalization configuration once it is saved. If your file changes structure or contents, you will need to normalize it manually again and save a new configuration.
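Before relying on automation, it can help to check locally that a new file still has the same structure as the one you normalized manually. The following is a minimal sketch, not part of the InfoSum platform: the function names are hypothetical, it assumes comma-separated files with a header row, and it assumes header names must match exactly (including case), which the platform may treat differently.

```python
import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the column header names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def structure_differences(saved_header: list[str], new_header: list[str]) -> list[str]:
    """Describe how new_header departs from saved_header.

    An empty list means the number of columns, the header names, and the
    column sequence are all unchanged (exact string match is an assumption).
    """
    problems = []
    if len(saved_header) != len(new_header):
        problems.append(
            f"column count changed: {len(saved_header)} -> {len(new_header)}"
        )
    if sorted(saved_header) != sorted(new_header):
        missing = [c for c in saved_header if c not in new_header]
        added = [c for c in new_header if c not in saved_header]
        problems.append(f"column names changed: missing={missing}, added={added}")
    elif saved_header != new_header:
        # Same names, different order.
        problems.append("column sequence changed")
    return problems


if __name__ == "__main__":
    # Illustrative data only: the header of the manually normalized file
    # versus the header of a new file with two columns swapped.
    saved = read_header("email,phone,age\na@example.com,5550100,34\n")
    new = read_header("phone,email,age\n5550100,a@example.com,34\n")
    for problem in structure_differences(saved, new):
        print(problem)
```

If the function reports any difference, the new file would need to be normalized manually and saved under a new configuration, as described above.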
Normalization
While importing, you also have the option of leveraging the Global Schema to help automatically detect and normalize common identifiers and attributes. This Global Schema is designed to make collaboration easier, especially when working with several collaboration partners and for common keys such as email or phone number, but it is not required. In the event that you cannot find individual categories or keys in the Global Schema, you also have the option to create your own custom categories.
(UK clients) For clients wishing to use postal addresses to match with partners, you can either follow our guide for address mapping or match on a UDPRN/UPRN and map it as a custom key.
Important! Insight or Activation Bunker?
It is during the Normalization process that you have to choose whether you are normalizing data for an Insight Bunker or an Activation Bunker. If you are normalizing for an Activation Bunker, you have to activate the toggle labeled ‘Output ID’ (this defines the keys that can be pushed out in an export). If you are normalizing for an Insight Bunker, you need to select the columns by checking the box on the left-hand side.
Publish to a Bunker
Once you have finished mapping which columns should use the Global Schema, you can prepare and publish your dataset. To publish it, choose the Bunker (dataset) that you want to use for your collaboration. You can choose an existing Bunker or create a new one at this stage. The Cloud Vault and the Bunker must be in the same region.
While creating a Bunker, be prepared to provide basic information about your dataset, for example: the number of rows, how many types of identifiers are present, a Bunker name, which cloud region will host the data, which allowances to use, and whether it is an Insight or Activation Bunker. There are also several optional fields, such as a Bunker description. When you update or run new file imports, all previous files are replaced by default.
Level of Complexity: Mid-Low
Estimated timing: 0-3 hours processing time, <5 minutes user time
Role(s) involved: Both technical and non-technical users. Please ensure that your account admin has allocated the rights to perform this task.
Automation of Imports
Once an initial manual upload has been completed, imports can be automated, provided that all details, including the source location, the file format, and the normalization configuration, remain the same.
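As a rough illustration of that condition, the sketch below records the three details named above for the initial manual import and reports which of them differ for a later import. It is an assumption-laden example, not an InfoSum API: the `ImportSettings` class, its field names, and the sample values are all hypothetical, and the platform may consider further details beyond these three.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ImportSettings:
    """Hypothetical record of details that must stay the same between imports."""

    source_location: str       # e.g. a server path or bucket prefix (illustrative)
    file_format: str           # e.g. "csv"
    normalization_config: str  # a label for the saved normalization configuration


def changed_details(initial: ImportSettings, candidate: ImportSettings) -> list[str]:
    """Return the names of the details that differ between two imports."""
    return [
        f.name
        for f in fields(ImportSettings)
        if getattr(initial, f.name) != getattr(candidate, f.name)
    ]


if __name__ == "__main__":
    # Sample values are made up for illustration.
    first = ImportSettings("sftp://example.invalid/exports/", "csv", "customers-v1")
    later = ImportSettings("sftp://example.invalid/exports/", "tsv", "customers-v1")
    differences = changed_details(first, later)
    if differences:
        print("Not eligible for automation as-is; changed:", differences)
    else:
        print("Details unchanged.")
```

An empty result suggests the later import matches the initial one on these details; any listed name points to something that would likely need a fresh manual import first.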
Here's more information about how to automate your data onboarding.
Read Next
Continue by Collaborating with Partners