Assign columns to categories
The first step in normalizing your data is to assign each column to a Global Schema category. This step tells the Platform what each column in your dataset means.
Before starting, please make sure you have read the definitions of columns, categories, representations and properties.
Some or all of your columns may have been assigned to a category during the import process, in particular if the columns have commonly used and descriptive names such as Age or Postcode. You can use the Category Wizard to bulk assign columns to categories during the Normalize phase.
Scroll right to find any columns that haven't been categorized. They'll be shaded light blue. If all your columns have been assigned and you're happy with the categories and mappings, you can move on to testing with a dry run.
Any unassigned columns will look like the image below.
Assigning a column to a category
To assign a column to a category, either:
- Click the Settings button next to the column name, then click Assign Category.
- Under Normalise, select the List View tab and click the Add Category button.
From here, you can select additional columns to assign to a category, then click on the NEXT button.
Next, search for a relevant category or scroll through the list, then click Save. You have now told the Platform the meaning of that column, which will no longer be shaded blue.
Assigning more than one column to a category
Several columns in your original data can map onto a single category. For example, you might have individual columns for Street, Town and Postcode (or zip code). All of these together would map onto a single category, Address.
To assign more than one column to a category, open up the Assign Category dialog as before, then select multiple columns.
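One way to picture this is as a mapping from each category to the list of original columns that feed it. The Python sketch below is illustrative only: the `assignments` dictionary, the `unassigned_columns` helper and the `LoyaltyFlag` column are hypothetical names and are not part of the Platform.

```python
# Illustrative only: a plain-Python model of column-to-category assignment.
# None of these names come from the Platform; they are assumptions for this sketch.

columns = ["Street", "Town", "Postcode", "Age", "LoyaltyFlag"]

# One category can be fed by several original columns.
assignments = {
    "Address": ["Street", "Town", "Postcode"],
    "Age": ["Age"],
}


def unassigned_columns(columns, assignments):
    """Return the columns that no category refers to yet, in original order."""
    assigned = {col for cols in assignments.values() for col in cols}
    return [col for col in columns if col not in assigned]


print(unassigned_columns(columns, assignments))  # ['LoyaltyFlag']
```

In this model, the columns returned by `unassigned_columns` correspond to the ones that would still be shaded light blue.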
Categories with properties
For a few categories, you need to configure properties to help your Bunker understand your original schema. For example, the Address category comes with a Postcode property, which tells your Bunker which of your original data columns contains the postal code.
When you select the Address category in the Assign Category dialog, an additional option appears. Select the appropriate column from the drop-down, then click Save.
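Conceptually, a property is an extra pointer from the category to one of your original columns. The sketch below models that idea in Python. The `CategoryAssignment` class is hypothetical, and the check that a property must point at one of the assigned columns is an assumption made for the example, not documented Platform behavior.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryAssignment:
    """Hypothetical record of one category and the columns assigned to it."""
    category: str
    columns: list
    # Maps a property name to the original column that holds it.
    properties: dict = field(default_factory=dict)

    def validate(self):
        """Assumed rule: every property points at one of the assigned columns."""
        for prop, col in self.properties.items():
            if col not in self.columns:
                raise ValueError(
                    f"{prop} property refers to unassigned column {col!r}"
                )
        return True


# Address is built from three columns; the Postcode property names
# the column that contains the postal code.
address = CategoryAssignment(
    category="Address",
    columns=["Street", "Town", "Postcode"],
    properties={"Postcode": "Postcode"},
)
address.validate()
```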
Removing columns from a category
If you assign a category to the wrong column, you can edit or remove the assignment.
To do so, open the Edit Category dialog, then delete the incorrect column and select the correct column to assign to the category. In the image below, the age column has been incorrectly assigned to the gender category. Click delete to remove it.
Assigning a column to a custom category
If there isn't a relevant category available, you can create a custom category. This gives you the flexibility to use categories beyond what's included in the Global Schema, for example an internal ID or flag that you want to use.
To create a custom category, open up the Assign Category dialog as before and click NEXT. Then select the Custom Category option and an additional settings area will appear. You will now need to give the custom category a name and specify the type of data. Two custom categories in different datasets can only be matched if they have the same name, so this stage may require some coordination with other users.
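To see why the naming needs coordinating, the sketch below compares the custom category names of two datasets and reports which ones could be matched. It is a rough illustration, not the Platform's matching logic: the function name, the dataset contents and the data type labels are all hypothetical, and it assumes names are compared exactly, including case and spacing.

```python
# Illustrative only: which custom categories in two datasets share a name?
# Assumes an exact, case-sensitive name comparison (an assumption for this sketch).

def matchable_custom_categories(dataset_a, dataset_b):
    """Return the custom category names present in both datasets, sorted."""
    return sorted(set(dataset_a) & set(dataset_b))


# Each dictionary maps a custom category name to its data type (hypothetical labels).
marketing = {"Customer ID": "string", "Loyalty Flag": "boolean"}
sales = {"Customer ID": "string", "loyalty_flag": "boolean"}

print(matchable_custom_categories(marketing, sales))  # ['Customer ID']
```

Here the two loyalty columns hold the same kind of data but would not be matched under this assumption, because one dataset's owner named the category "Loyalty Flag" and the other "loyalty_flag".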
If the column used for the custom category is an identifier, such as a Customer ID, you will need to select ‘Is Key?’ so that it can be used later on to match keys across datasets. A custom category that contains PII (Personally Identifiable Information) must always be normalized as a key, which prevents the raw data from being exposed during analysis. You can do this in the Category Wizard by activating the ‘KEY’ toggle or in the Assign Category dialog by activating the ‘Is Key?’ toggle.
Note that the System automatically prefixes the word “Custom” to all custom keys in the published dataset.
Next up
Now that all your columns have been assigned to a Global Schema category, you may need to set up category mappings or use the transformation tools for any column that is still red.