Overview of Data Importing
To publish your data to a Bunker you will need to go through two distinct processes: connecting your files to a Cloud Vault, and normalizing and publishing your data.
Each of these steps is self-contained, meaning that you don’t need to complete them all in one session.
Table of Contents
Onboard your files to a Cloud Vault
Normalize and publish your data to a Bunker
User rights for imports
To access the File Management section and to import and publish data for collaboration, you need user rights that include all ‘Bunker Operations’ rights.
Please contact your account admin or support@infosum.com if you cannot see the File Management section on the left-hand side menu.
Onboard your files to a Cloud Vault
A Cloud Vault is a data staging environment where your collaboration data can be prepared before it's published to a Bunker. Cloud Vaults are hosted in AWS, and they must be in the same region as your data storage. You will also need to select the same region for your Bunker.
There are two ways of importing files to your Cloud Vault:
- Using the end-to-end Import Flow, designed to import data in one session
- Completing each import set-up task independently, which might be more suitable if multiple teams are involved or you don’t have all the necessary information readily available
Step 1: Create a Cloud Vault
You will only need to define the geographic region (which should be the same one as the original location of your files) and give your Cloud Vault a name, so this should take less than 5 minutes. If all your data is in the same geographic location, you will likely only need to create one Cloud Vault for all your imports.
Follow the instructions in this article to create a Cloud Vault.
Step 2: Select an Import method
You can import data via Local File upload or using a Server.
If you are using Local File Import, you simply need to upload a CSV file containing your data from your local computer.
If you are using a Server to import you will need to:
- Add the details of your server to an Import Connector Config. If all your data is coming from the same server, you only need to complete this step once.
- Run an Importer
Once the files are imported into your Cloud Vault, you will need to create a recordset, which helps our platform understand your file by confirming its column delimiters and multi-value delimiters.
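As a rough illustration of what the two delimiter settings describe, the sketch below parses a small file with Python's standard csv module. The sample rows, the column names, and the choice of "," and "|" as delimiters are assumptions made for this example, not values the platform requires, and the function name is hypothetical.

```python
import csv
import io

# Hypothetical sample file: "," separates columns and "|" separates
# multiple values inside one cell. Both choices are assumptions.
SAMPLE = "email,interests\nana@example.com,golf|tennis\nbo@example.com,chess\n"


def parse_rows(text, column_delimiter=",", multi_value_delimiter="|"):
    """Split text into rows, then split every cell into a list of values."""
    reader = csv.DictReader(io.StringIO(text), delimiter=column_delimiter)
    rows = []
    for row in reader:
        rows.append(
            {name: value.split(multi_value_delimiter) for name, value in row.items()}
        )
    return rows


rows = parse_rows(SAMPLE)
print(rows[0]["interests"])  # ['golf', 'tennis']
```

If the wrong column delimiter is chosen, each line is read as a single column, which is why it helps to confirm these settings before moving on.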
Normalize and publish your data to a Bunker
After your data is in the staging environment (Cloud Vault), it needs to be normalized. When normalizing your data, the names of the keys used for joining to your collaboration partners' datasets need to be consistent across the different datasets. The purpose of the Global Schema is to help standardize names and make joining simpler. However, if you're uploading a custom key, you will need to ensure its name is exactly the same across all datasets.
You can complete these tasks using your Bunker's web-based UI:
- Assign columns to keys and categories
- Assign keys and categories to the Global Schema
- Define the data type
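To see why consistent key names matter for joining, consider the minimal sketch below. It maps local column names to shared key names and then lists the keys that two datasets have in common. The alias table, column names, and function names are hypothetical and are not the platform's Global Schema. Names without an alias are compared exactly, as they would be for a custom key.

```python
# Hypothetical mapping from local column names to shared key names.
# Illustrative only; this is not the platform's Global Schema.
KEY_ALIASES = {
    "E-mail": "email",
    "email_address": "email",
}


def normalize_key(name):
    """Return the shared name for a column, or the name unchanged if unmapped."""
    cleaned = name.strip()
    return KEY_ALIASES.get(cleaned, cleaned)


def shared_keys(columns_a, columns_b):
    """Keys two datasets could join on once their names are normalized."""
    keys_a = {normalize_key(c) for c in columns_a}
    keys_b = {normalize_key(c) for c in columns_b}
    return sorted(keys_a & keys_b)


# Standard names line up after mapping.
print(shared_keys(["E-mail", "age"], ["email_address", "postcode"]))  # ['email']

# Custom key names that differ only in case do not match.
print(shared_keys(["loyalty_id"], ["Loyalty_ID"]))  # []
```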
After the data is normalized it can be published to a new or existing Bunker.
You can create a Bunker as part of the publishing workflow, or you can create or manage your existing Bunkers at any time. A published Bunker creates a new Dataset on the platform.
Automation of Imports
Once an initial manual upload has been completed, imports can be automated provided that all details including the source location, the file format, and normalization configuration remain the same.
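The condition above can be pictured as a comparison between the settings of the initial manual upload and those of a later import. The sketch below is only an illustration: the class, its three fields, and the sample values are hypothetical, and the platform may take other details into account as well.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ImportSettings:
    # Hypothetical fields mirroring the three details named above.
    source_location: str
    file_format: str
    normalization_config: str


def changed_fields(initial, candidate):
    """List the settings that differ from the initial manual upload."""
    return [
        f.name
        for f in fields(initial)
        if getattr(initial, f.name) != getattr(candidate, f.name)
    ]


initial = ImportSettings("server-a/exports", "csv", "config-v1")
later = ImportSettings("server-a/exports", "tsv", "config-v1")

print(changed_fields(initial, later))  # ['file_format']
```

An empty list would mean that none of the three settings changed between the two imports.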
Please reach out to your InfoSum representative to learn more about the Automation of imports.
Importing via API
You can onboard the data that you want to use for collaboration using our UI onboarding flow or using our API.
You can find more information and a link to our API documentation in this article.