Create a Recordset
A Recordset is a collection of files with a common schema. It defines how the InfoSum platform understands and interprets the underlying data. This page guides you through the step-by-step process of creating a Recordset.
Limitations
All files in a Recordset must have the same number of columns, the same header row, the same column order, and the same column delimiter and multi-value delimiter. The column delimiter, multi-value delimiter and header row fields are ignored if the Parquet data type is used. Each Parquet file used for recordset creation must be no larger than 5GB, although the combined size of all files can exceed 5GB.
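As an illustration, the limitations above can be checked locally before files are uploaded. The following is a minimal sketch and not part of the InfoSum platform: the helper names `read_header`, `find_header_mismatches` and `oversized_parquet` are hypothetical, the files are assumed to be UTF-8 encoded, and "5GB" is assumed to mean 5,000,000,000 bytes, which the source does not specify.

```python
import csv
from pathlib import Path

# Assumption: "5GB" is read as 5 * 10**9 bytes (the stricter interpretation).
MAX_PARQUET_BYTES = 5 * 10**9


def read_header(path, delimiter=","):
    """Return the first row of a delimited text file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle, delimiter=delimiter), [])


def find_header_mismatches(paths, delimiter=","):
    """Return (path, header) pairs whose header differs from the first file's.

    Comparing full header lists catches differences in column count,
    column names and column order in one step.
    """
    paths = list(paths)
    if not paths:
        return []
    expected = read_header(paths[0], delimiter)
    mismatches = []
    for path in paths[1:]:
        header = read_header(path, delimiter)
        if header != expected:
            mismatches.append((path, header))
    return mismatches


def oversized_parquet(paths, limit=MAX_PARQUET_BYTES):
    """Return the Parquet files whose size on disk exceeds the per-file limit."""
    return [
        path
        for path in paths
        if Path(path).suffix.lower() == ".parquet"
        and Path(path).stat().st_size > limit
    ]
```

An empty list from both functions suggests the files are consistent with each other; it does not guarantee that recordset creation will succeed.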
Recordset from folder limitations
Only one type of file and no subfolders
To create a single recordset from a folder, the folder can contain only one type of file and no subfolders. All files within the folder must have the same format, for example all .csv files or all .parquet files. Files with any other extension will be treated as if they are .csv files. If a folder contains a mix of file types, the recordset creation will fail. You can still create a recordset from multiple individual files with the same format.
Importing files completed before recordset creation
Please allow sufficient time for your files to finish importing before starting the next task. You can check that your folder has finished importing files in the 'Importing' tab: the status of the task should be 'completed' before you proceed. Creating a recordset from a folder that is still importing files will fail or result in an incomplete recordset.
Use one importer to automate onboarding
If you are selecting a folder to create your recordset and you intend to automate the steps, please make sure the folder only includes files from one Importer. Automatic detection of automation steps is not possible when a folder includes files from multiple importers.
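The folder rules can also be checked on a local copy of the folder before it is imported. This is a rough sketch under stated assumptions, not an InfoSum tool: `check_folder` is a hypothetical helper, and it assumes that a "mix of file types" means Parquet files alongside files that would be treated as .csv (that is, every other extension).

```python
from pathlib import Path


def check_folder(folder):
    """Return a list of problems found in the folder; an empty list means none."""
    entries = list(Path(folder).iterdir())
    problems = []

    # Rule: the folder must not contain subfolders.
    subfolders = sorted(entry.name for entry in entries if entry.is_dir())
    if subfolders:
        problems.append(f"subfolders present: {subfolders}")

    # Assumption: .parquet is one type; any other extension counts as CSV,
    # because such files are treated as if they are .csv files.
    kinds = {
        "parquet" if entry.suffix.lower() == ".parquet" else "csv"
        for entry in entries
        if entry.is_file()
    }
    if len(kinds) > 1:
        problems.append("mixed file types: parquet and csv-like files")

    return problems
```

The check only covers the first of the three folder limitations. Whether the import has completed and whether all files come from one Importer can only be confirmed in the platform itself.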
Creating a recordset from a file or folder
You can create a recordset from either an individual file or a folder of files.
From the main Cloud Vault screen, select the file(s) or folder that you want to work with and click the Create Recordset button in the information panel.
Your saved recordsets are stored in a folder called Recordsets. You cannot create recordsets from that folder, as it already contains recordsets.
The Create Recordset screen will now open.
On this page you can choose either Use Existing Config or Create new Recordset Config.
Use an Existing Config
1) Enter a name for your new Recordset.
An existing Recordset config can be selected from the dropdown list.
The files included in the existing config are listed at the top of the page. If more than five files are selected, the list is hidden, but you can display it by clicking on the number of files.
If you are creating a recordset from a folder, the number of files will be displayed instead.
2) Once a Recordset Config is selected, a toggle becomes available that lets you modify the config.
Create a new Recordset Config
1) The Create Recordset screen lists the following options.
- Create a new Name for your Recordset
- Select the required column Delimiter
Note: the multi-value delimiter and the column delimiter cannot be the same.
Multi-value attribute example: the delimiter is PIPE.
Multi-value key example: the delimiter is COMMA.
Multi-value keys and attributes reduce the file size and increase processing speed. Multi-value attributes also help to create richer in-platform visualizations.
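To make the distinction between the two delimiters concrete, here is a small sketch of how a multi-value attribute might be read from a delimited file. The sample data, the `interests` column and the `parse_rows` helper are invented for illustration, and the code shows the general idea only, not how the InfoSum platform parses files.

```python
import csv
import io

COLUMN_DELIMITER = ","
MULTI_VALUE_DELIMITER = "|"  # PIPE, as in the multi-value attribute example

# The two delimiters must differ, otherwise a row cannot be split unambiguously.
assert COLUMN_DELIMITER != MULTI_VALUE_DELIMITER

# Hypothetical sample file: one key column and one multi-value attribute column.
SAMPLE = (
    "email,interests\n"
    "ann@example.com,golf|tennis|chess\n"
    "bob@example.com,cooking\n"
)


def parse_rows(text):
    """Split rows on the column delimiter, then split the attribute on PIPE."""
    reader = csv.DictReader(io.StringIO(text), delimiter=COLUMN_DELIMITER)
    rows = []
    for record in reader:
        record["interests"] = record["interests"].split(MULTI_VALUE_DELIMITER)
        rows.append(record)
    return rows


rows = parse_rows(SAMPLE)
```

Here a single row holds all three of Ann's interests, where a single-value layout would need three rows for the same person.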
2) If your file(s) have column headers, select the File has column headers toggle. If they do not, deselect the toggle and manually enter a header for each column, in the same order as the columns in your file(s).
Note: if the toggle is turned off, you must enter every column header manually.
3) Click Create Recordset.
4) The next information panel lets you save your configuration under a name so that it can be reused next time.
Managing Recordsets
The Platform Tasks screen will show the status of your Recordset creation and any other tasks in the running state.
When the task is completed, its status will change to a green Completed.
Once the task has finished, a new file is created in the Recordsets folder on the Cloud Vault page.
A new button called ‘Go to Output’ will appear in the task and will take you to the Recordset file.
Alternatively, you can click on the Recordsets folder, where your Recordset will now be visible.
Read next
The next step is to Normalize your data.