Creating and running an Importer
Before you can import data from your cloud storage into InfoSum, you will need to create an Importer to pull your data files or folders containing multiple files into the Cloud Vault.
Previous section: Creating an Import Connector Config
Note: you also have the option to create an Import Connector Config (ICC) when creating an Importer.
Table of Contents
Where to find and manage Importers
Import data into a Cloud Vault
Where to find and manage Importers
Under File Management, select the Importing tab. This page lists the Importers you can use to stream your data to the Platform. Importers connect ICCs and Cloud Vaults and let you select which files or folders on your server you wish to import into the Platform.
Select an Importer to view more information, such as when it was last used, its created and updated dates, and the associated Cloud Vault and ICC. Use the search bar to find the Importers you are interested in.
Create an Importer
Select Create New Importer to open a form where you define your Importer. Click < in the top left to return to the Importers main page.
On this form you will need to:
- Select an Import Connector Config from the drop-down list or click Create New ICC to take you to the Create New Import Connector Config wizard. The Import Connector Config is used to save the connection details and credentials of the files you want to import.
- Add a Name to identify the Importer
- Select the name of the Cloud Vault where your data will be streamed to.
Click Next and enter the files and/or folders to import. You can only filter on files allowed by the Import Connector Config.
Importing files or folders of files
You can choose to import files or folders from your ICC configuration.
Folders come in handy when you want to import multiple files of the same format (e.g. parquet files) into a single Bunker.
In the screenshot above, the Importer points to a folder named ‘JanuaryData’ that contains only CSV files. Running it creates a ‘JanuaryData’ folder in the Cloud Vault and imports the CSV files into it. If you want to import only specific files, list each of them individually, separated by commas, e.g. ‘File1.csv’, ‘File2.csv’. You can also use wildcards to pick up all CSV, GZ or parquet files, e.g. *.csv, *csv.gz, *csv.gpg, *.parquet.
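To get a feel for which file names a set of wildcard patterns would pick up, the matching can be sketched locally. This is only an illustration: it assumes the patterns behave like shell-style wildcards, which the text above does not confirm, and the file names and the `select_files` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical file names in a cloud storage folder (illustration only).
FILES = [
    "JanuaryData/customers.csv",
    "JanuaryData/orders.csv.gz",
    "JanuaryData/secure.csv.gpg",
    "JanuaryData/events.parquet",
    "JanuaryData/readme.txt",
]

# Patterns copied from the text above.
PATTERNS = ["*.csv", "*csv.gz", "*csv.gpg", "*.parquet"]


def select_files(names, patterns):
    """Return the names that match at least one shell-style pattern.

    Assumption: the Platform's wildcards behave like Python's fnmatch,
    where '*' matches any run of characters.
    """
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


selected = select_files(FILES, PATTERNS)
for name in selected:
    print(name)
```

Under this assumption, every file except readme.txt is selected. Checking patterns against a sample listing like this before saving the Importer can help catch a pattern that is broader or narrower than intended.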
Folder limitations:
To convert an entire folder into a single recordset, the folder must contain only one type of file and no sub-folders. All files within the folder must have the same format, e.g. all .csv files or all .parquet files. Files with any other extension are treated as if they are .csv files. If the folder contains a mix of file types, recordset creation from the folder will fail.
You can still import folders that contain multiple file formats and then create a recordset from specific files of the same format that you select from within the folder.
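The folder rules above can be checked on a local copy of the data before it is uploaded to cloud storage. The following is a minimal sketch, not part of the Platform: `classify` and `check_folder` are hypothetical helpers, and the sketch assumes that only the final file extension matters, so compressed or encrypted variants such as .csv.gz are not distinguished from plain CSV.

```python
import tempfile
from pathlib import Path


def classify(path: Path) -> str:
    """Map a file to a format label using the rule described above."""
    # Assumption: only .parquet is distinguished; every other extension
    # is treated as CSV, as the text states.
    return "parquet" if path.suffix.lower() == ".parquet" else "csv"


def check_folder(folder: str) -> list[str]:
    """Return the problems that would block a single-recordset import."""
    problems = []
    entries = sorted(Path(folder).iterdir())

    # Rule 1: the folder must not contain sub-folders.
    subfolders = [e.name for e in entries if e.is_dir()]
    if subfolders:
        problems.append(f"contains sub-folders: {subfolders}")

    # Rule 2: all files must share one format.
    formats = {classify(e) for e in entries if e.is_file()}
    if len(formats) > 1:
        problems.append(f"mixed file formats: {sorted(formats)}")

    return problems


# Demo on a throwaway folder that breaks both rules.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.csv").write_text("id\n1\n")
    (root / "b.parquet").write_bytes(b"")
    (root / "nested").mkdir()
    demo_problems = check_folder(tmp)

print(demo_problems)
```

An empty list means the folder satisfies both rules as modelled here; it does not guarantee that the files themselves are valid.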
Refreshing or adding more files and folders
If you want to update a file or add a new file to a folder in a Cloud Vault, you can amend the Importer’s config or create a new config targeting the same location, so that new files are added and existing files are updated.
Automation restrictions:
Please avoid amending the Importer if you are planning to automate your data onboarding tasks, as this will change the automation parameters and might cause unexpected files to be imported or cause the import to fail.
Click Submit and Run Later to save the Importer. The new Importer is added to the Importers list. Please note that the Importer does not import any data at this point. You will need to click ‘Run Importer’ to start the import task.
To edit or delete an Importer, click the 3 dots in the details panel and select Edit or Delete. You cannot change the ICC an Importer uses.
Importing data in a Cloud Vault
Select an Importer from the list and click Run Importer.
Click Run Import. This imports the specified files or folders from your cloud storage into the Cloud Vault. Once the import starts, its status is shown. Click the Refresh button to update the task status. The status changes to 'Complete' when your files have been imported. To cancel a running task, click the 3 dots in the details panel and select Cancel Task. If the import fails, the Errors field tells you why it failed.
To see your imported files or folders, select Cloud Vault under File Management and select the Cloud Vault you imported your files to.
The Cloud Vault Files page shows data files imported into a Cloud Vault and transformed files. Data files are listed by Cloud Vault.
Click on a file to see file details.
Deleting files, folders or Cloud Vaults
Click on the 3 dots and select Delete to remove a file or folder from the Cloud Vault. You cannot delete a Cloud Vault that contains files being used to create a recordset, or that holds a recordset that is currently published.
Troubleshooting your importer
If your Importer fails, its status will change to 'failed' and it will be shown in red, as shown below.
(Please note that in-platform troubleshooting is only available for S3 cross-account ICCs at the moment. The error will look different depending on what the issue is.)
Click through the 'troubleshooting and details' button to see some helpful tips to resolve the error:
Next steps
Now that your data is imported to a Cloud Vault, you can Create a Recordset.
Please give your folders enough time to import all files, and check that the status of the Importer task is 'Complete', before proceeding to the recordset task. This avoids failed tasks or incomplete recordsets.
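If you script your onboarding, the "wait for 'Complete' before creating a recordset" rule can be expressed as a small polling loop. The sketch below is generic and makes no claim about any InfoSum API: `get_status` is a hypothetical callable that you would supply yourself, and the polling interval and timeout are arbitrary example values. Only the 'Complete' and 'failed' status names come from the text above.

```python
import time
from typing import Callable


def wait_until_complete(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until get_status() reports 'Complete'.

    get_status is a hypothetical callable provided by the caller; it
    should return the current status of the import task as a string.
    Raises RuntimeError if the task fails and TimeoutError if it does
    not complete within timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status().lower()
        if status == "complete":
            return
        if status == "failed":
            raise RuntimeError("import task failed; check the Errors field")
        if time.monotonic() >= deadline:
            raise TimeoutError("import task did not complete in time")
        # Any other status is treated as "still running".
        sleep(poll_seconds)
```

Only start the recordset step after this function returns without raising. Treating every status other than 'Complete' and 'failed' as still running is a design choice of this sketch, made so that it does not depend on status names the text does not list.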