Onboarding Your Data
Please ensure you’ve read the Preparing your data article before this one.
Table of Contents
IP Allowlist and Firewall Settings
Import Process Overview (InfoSum Provided Deployment)
Import Process Overview (Beacon Cloud Deployment)
Create a Cloud Vault and a Connector to Import
Normalize and Publish your Data
| Important note |
| The InfoSum platform offers self-serve data import. Please do not send customer data to your InfoSum representative. Data should flow directly from your tech stack into the data clean room. The InfoSum platform does NOT offer self-registration. To create your company and initial user accounts, please fill in this form. Clients wishing to add or remove additional users will need to contact their InfoSum representative. |
To use the InfoSum platform, you will need to create:
| For Beacon deployments into your cloud/warehouse | For traditional InfoSum provided deployments |
|
|
This article explains how to set up your data to publish to a Dataset.
IP Allowlist and Firewall Settings
If you are using a VPN, or your firewall restricts upload/download of data, you may need to authorize relevant IP addresses.
Import Process Overview (InfoSum Provided Deployment)
To publish your data to a Dataset you will need to go through two distinct processes: connecting your files to a Cloud Vault, and normalizing and publishing your data.
Each of these steps is self-contained, meaning that you don’t need to complete them all in one session.
If you are using Local File Import instead of Server Import, you won’t have to complete the full process outlined in Connect your files.
Import Process Overview (Beacon Cloud Deployment)
For native deployments using Beacons, connecting your files is done via app install. You will only need to create a virtual cloud vault that matches your deployment technology, install the app in your cloud or warehouse and point it to the data tables you wish to use.
You will then follow the steps outlined in normalize and publish above.
Create a Cloud Vault and a Connector to Import
A Cloud Vault is an InfoSum environment where your data is hosted and prepared before it is published to a Dataset. To create one within the platform, go to the Dashboard and click ‘Start a new Import’.
Then, you will need to create an import connector configuration (“ICC”). The ICC stores the access credentials for connecting to your desired environment.
We currently support:
- Local file upload (please note that importing cannot be automated using this option)
- SFTP
- S3 cross-account
- S3 access key (please note that AWS is working to deprecate access key authentication, so we recommend using the S3 cross-account instead)
- Google Cloud Storage (GCS)
Once you have created an ICC, create the importer, and then run that importer to Connect your data to your Cloud Vault. This step is required only once for each of your upstream data sources. For example, if your data is solely coming from S3, this is a one-time step.
You can now import your data to Cloud Vault.
| Level of Complexity | Low |
| Estimated timing | 5-10 minutes processing time, <5 minutes user time |
| Role(s) involved | Technical or non-technical users with access credentials. Please ensure that your account admin has allocated the rights to perform this task |
| Other Relevant Article(s) | How to Import data to a Cloud Vault |
Normalize and Publish your Data to a Dataset
After your data is imported to Cloud Vault, you are ready to normalize and publish your data.
| Important note |
| Importing manually is required once, but you have the option to leverage import automation for all future data that follow the same file structure (i.e., number of columns, column header names, and column sequence). You cannot modify a normalization configuration once saved, and if your file changes structure or contents you will need to manually normalize it again and save a new configuration. |
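The file-structure requirement in the note above (same number of columns, same column header names, same column sequence) can be checked before a file is handed to an automated import. The following is a minimal sketch, not part of the InfoSum platform: the function names `read_header` and `structure_differences` are hypothetical, and it assumes your files are CSV with a header row.

```python
import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the header row of a CSV document as a list of column names."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def structure_differences(expected: list[str], actual: list[str]) -> list[str]:
    """Describe how `actual` deviates from `expected`; an empty list means a match."""
    problems = []
    if len(expected) != len(actual):
        problems.append(f"column count changed: {len(expected)} -> {len(actual)}")
    missing = [name for name in expected if name not in actual]
    added = [name for name in actual if name not in expected]
    if missing:
        problems.append(f"missing columns: {missing}")
    if added:
        problems.append(f"unexpected columns: {added}")
    # Same names and same count, but a different order.
    if not problems and expected != actual:
        problems.append("column sequence differs")
    return problems


# Hypothetical sample data: the header saved at first import vs. a new file.
saved = read_header("email,phone,postcode\na@example.com,555,AB1\n")
new = read_header("phone,email,postcode\n555,b@example.com,CD2\n")
print(structure_differences(saved, new))
```

If the returned list is not empty, the new file would likely need to be normalized manually and saved under a new configuration, as described in the note.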
Normalization
After importing, you will have to standardize your data using our Normalization wizard. You can also find data formatting guidelines and recommendations in this article.
You also have the option of leveraging the Global Schema to help automatically detect and normalize common identifiers and attributes. This Global Schema is designed to make collaboration easier, especially when working with several collaboration partners and for common keys such as email or phone number, but it is not required.
UK and US clients have the option to normalize physical addresses.
Important! Are you planning to Activate data from the Dataset?
If you are normalizing for an activation, you will need to normalize at least one ‘Export column’ (these will define the keys that can be pushed out in an export).
Publish a Dataset
Once you have finished mapping which columns should use the Global Schema, you can prepare and publish your data as a Dataset. To publish your data, you will need to choose the Dataset that you want to use for your collaboration. You can choose an existing Dataset or create a new one at this stage.
You can prepare your data either to replace the data hosted in a Dataset or to update it incrementally.
| Level of Complexity | Mid-Low |
| Estimated timing | 0-3 hours processing time, <5 minutes user time |
| Role(s) involved | Both technical and non-technical users. Please ensure that your account admin has allocated the rights to perform this task |
Automation of Imports
Once an initial manual upload has been completed, imports can be automated provided that all details including the source location, the file format, and normalization configuration remain the same.
Here's more information about how to automate your data onboarding.
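As an illustration of the condition above, a pipeline that feeds automated imports could record the three details that must stay the same (source location, file format, and normalization configuration) and compare them before each run. This is only a sketch under assumptions: the `ImportSettings` class, its field names, and the example values are hypothetical and do not correspond to an InfoSum API.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ImportSettings:
    """Hypothetical record of the details that must stay constant for automation."""
    source_location: str
    file_format: str
    normalization_config: str


def changed_settings(baseline: ImportSettings, candidate: ImportSettings) -> list[str]:
    """Return the names of the settings that differ from the baseline."""
    return [
        f.name
        for f in fields(baseline)
        if getattr(baseline, f.name) != getattr(candidate, f.name)
    ]


# Placeholder values for illustration only.
first_import = ImportSettings("s3://example-bucket/daily/", "csv", "config-v1")
next_import = ImportSettings("s3://example-bucket/daily/", "json", "config-v1")
print(changed_settings(first_import, next_import))
```

An empty result suggests the next file can go through the automated path; any listed field points to a change that would call for a new manual import.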
Read Next
Continue by Collaborating with Partners