Using the US address mapper
As part of the normalization process, the InfoSum Platform lets you map US postal addresses in a source data file to the USA Address category in the Global Schema. Mapping US postal addresses to the Global Schema has a number of advantages:
- Three keys will be generated after normalization (USA address, Zip9 and Zip5).
- Combination keys are created for US postal addresses (for example, address + name).
- US addresses can be entered in various formats.
Region restrictions
US address mapping is only available for US East Cloud Vaults and Bunkers.
Validating
InfoSum verifies that the US address is valid and complete.
Normalizing
When normalizing US addresses, address line 1, address line 2, town and state are hashed to generate three keys: USA address, Zip9 and Zip5.
Mapping US addresses to the Global Schema
Supported formats
You must always provide US addresses in raw format. Values are required in either:
- street name, city and state fields, OR
- street name and zip code fields.
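The field requirement above can be checked in your source file before import. The following is a minimal sketch only: the column names (`street`, `city`, `state`, `zip`) and the function name are hypothetical and are not part of the InfoSum Platform.

```python
def has_required_address_fields(row: dict) -> bool:
    """Return True if the row has street + city + state, or street + zip.

    The keys "street", "city", "state" and "zip" are hypothetical column
    names chosen for this sketch; use the names from your own file.
    """
    def present(key: str) -> bool:
        # A field counts as present only if it is non-empty after trimming.
        value = row.get(key)
        return value is not None and str(value).strip() != ""

    # Street name is needed in both accepted combinations.
    if not present("street"):
        return False
    return (present("city") and present("state")) or present("zip")
```

Rows for which this returns False could be reviewed or corrected before the file is imported.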
The InfoSum Platform supports the following Zip formats and rules:
- Zip9 is accepted in two formats: 01740-1329 or 017401329.
- Street name is required along with Zip code. The Platform cannot map or normalize a Zip code on its own without a street. Both street name and Zip code are required to validate an address.
- City and State are not mandatory values.
- If an address has a Zip5, the InfoSum Platform creates a Zip9 key.
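To spot Zip values that do not match the formats listed above before you import a file, a local check along the following lines may help. This is a sketch under assumptions: the function name and the regular expressions are illustrative, and the Platform's own validation may apply further rules.

```python
import re

# Zip5: exactly five digits, e.g. 01740
ZIP5_RE = re.compile(r"\d{5}")
# Zip9: five digits plus four digits, with or without a hyphen,
# e.g. 01740-1329 or 017401329
ZIP9_RE = re.compile(r"\d{5}-?\d{4}")


def classify_zip(value: str):
    """Return "zip5", "zip9", or None if the value matches neither format."""
    text = value.strip()
    if ZIP5_RE.fullmatch(text):
        return "zip5"
    if ZIP9_RE.fullmatch(text):
        return "zip9"
    return None


print(classify_zip("01740-1329"))
print(classify_zip("017401329"))
print(classify_zip("1740"))
```

Zip codes such as 01740 start with a zero, so it is generally safer to keep the column as text in the source file. A spreadsheet or parser that treats the value as a number would drop the leading zero, and the value would no longer match either format.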
Steps
Prior to normalizing your data you must have imported a file to your Cloud Vault. You can do this via the Import flow. Once you complete your import you’ll have to create a recordset before being able to normalize your data.
For general information about the normalization step please read our Data Normalization article.
- Go to the Normalizing page, select the recordset and hit “Normalise”.
We will create a new config here. If this is a repeated action, you can reuse a previous normalization config, as long as the original file and the new file have exactly the same schema.
- Here we will select the columns we want to include in the new normalization config. You can select the columns in three ways:
- Click the toggle (on top, in a light blue oval) to select all columns.
- Select the checkbox next to NEXT to select every checkbox in the dataset.
- Manually select which columns you’d like to normalize in the InfoSum system.
- We will now map your dataset to the Global Schema. For US addresses, the Global Schema has a category called “USA Address”. Please use this for any address field.
As you can see below, each address column in our dataset is mapped to “USA Address”, and an additional value is then given to signify which part of a USA address the column contains. This is crucial, as it denotes what part of the address you have in each column of data. This step is shown in the light blue box with the two headers, MAPPING TO GLOBAL SCHEMA and ADDITIONAL MAPPING/CONFIG.
- Next, select whether the column will be an output column. Only do this if you are normalizing the data to publish to an Activation Bunker.
- Lastly, the checkboxes next to arrow “A” let you select multiple columns and perform changes on all selected fields. At the orange arrow labeled “B” you can pick the data type of the field. As addresses are keys in the Global Schema, the type is predetermined and cannot be changed.
- Ensure the rest of the keys and attributes in your dataset are also normalized to the correct configuration. Then click CONTINUE TO DETAILS.
- Provide a name for the config you’ve just created and create a name for the output normalized file. Click “CREATE NEW CONFIG AND NORMALISE” to move on.
Next steps
After normalizing the rest of your dataset you can prepare and publish to a new or existing Bunker.
In the prepare step you will be able to confirm that InfoSum’s Global Schema has combined all the USA Address fields into one key. It also creates separate keys for Zip5 and Zip9, should those be useful.