Rewriting a column using string functions
How to write a simple DTL script to change the value of a column.
The scenario
In the United Kingdom, telephone numbers are conventionally written starting with the digit 0. For example, 020 7946 0643 is a typical phone number.
But when dialling this number internationally, the first 0 is omitted, and the international dialling code +44 is used instead. So for international callers, the same number is written +44 20 7946 0643.
What's more, some software records UK phone numbers without either the leading 0 or the international dialling code. Using this notation, the same number is written simply as 20 7946 0643.
So there are three different ways the same phone number can be written in the UK. Let's suppose your dataset has a mix of these formats, and you would like to convert all phone numbers to a common format beginning with 0.
To summarize, here are examples of how you'd like to rewrite your data.
Original |
Rewritten |
Note |
020 7946 0643 |
020 7946 0643 |
Unchanged |
+44 20 7946 0643 |
020 7946 0643 |
+44 removed, 0 added |
20 7946 0643 |
020 7946 0643 |
0 added |
Before you begin
Before you can start writing a script to rewrite these numbers, you'll need to have gone through the steps described in the Bunker and the Script Editor. To summarize, this means you have:
- identified a column in your original dataset which contains the phone number
- assigned this original data column to a category
- begun applying a transformation to that category, and switched to the script editor.
In this example, we'll assume your original column was called Phone Number. Because you've assigned that column to this category, it's now available to use in your DTL script.
One more thing to remember: your DTL script runs against a single row at a time. It will run repeatedly - each time with a different row and, therefore, a different value in the Phone Number column.
Note: You don't need to do anything to loop through the rows. Your Bunker handles that for you, calling your script once for each row.
Step 1: Read the original value
So, let's get started on the script.
The very first thing we need to do is to read the contents of the Phone Number column. In this example, we'll assign it to a variable. Here's how:
$original_number = `Phone Number`
There are a couple of things going on here, both of which we covered in DTL language essentials.
- Variable names in DTL always start with the $ character. Notice that you don't need to declare the variable in any way; you just start using it.
- When you put the name of a column inside backticks, like we did with `Phone Number`, you retrieve the value of that column for the current row.
So, here, we're taking the value of the Phone Number column and storing it in the variable $original_number.
Step 2: Remove the "+44", if it's there
Now, we want to remove the prefix +44, if it's there (but only if it's there). As it happens, that's a very simple task - DTL has an inbuilt function that's perfect for the job.
You can find a complete list of DTL's inbuilt functions in the functions specification. The one we want is called trim_prefix, and here's how it's described:
Note: Given an existing string, return a new string, with a specified substring removed if it is present at the beginning of the string.
So we can call the trim_prefix function to trim +44. If it's there, it will be removed. If it isn't, nothing will happen.
The specification also tells us that trim_prefix takes two parameters. The first is the string you want to process, and the second is the prefix to trim. Putting it all together, here's how to do it:
$trimmed_number = trim_prefix ($original_number, "+44")
Note that the $original_number isn't changed - a function in DTL never changes the value of its parameters. Instead, it returns a result, and we've stored that result in a new variable called $trimmed_number.
Step 3: Add the "0", if it's missing
The next step is a little trickier. We'd now like to add the customary digit 0 onto the start of the number, but only if it isn't already there.
It's time to write our first if statement. Take another look at the functions specifications, and you'll see there's a function called has_prefix, which returns true or false depending on whether a string starts with a particular prefix. So we can do this:
if has_prefix ($trimmed_number, "0")
{
$final_number = $trimmed_number
}
else
{
$final_number = "0" + $trimmed_number
}
As well as the structure of the if statement, notice that DTL's string concatenation operator is +. So when we say $final_number = "0" + $trimmed_number, we're not doing any kind of arithmetic - we're just concatenating the string "0" to the start of the string $trimmed_number.
Step 4: Store the result back in the column
We've finished tweaking the phone number, but so far the result is only in a variable inside DTL. Once our script ends, that variable will be destroyed, and its value will be lost. So to finish, we need to store the value back into the original column (which, remember, was called Phone Number).
Here's how we do that.
set ("Phone Number", $final_number)
This is our first experience of a DTL transformation, which we mentioned briefly in DTL language essentials. As you can see, a transformation looks very much like a function call. But it doesn't return a result - instead it changes, or "transforms", something about your data.
The set transformation is probably the simplest of all. It just changes the value of a column to the new value you specify. So in this line of DTL, we are storing the $final_number we've generated back into the column called Phone Number, overwriting the original data which was there before.
A word of warning
There is a trap you could fall into here. Remember how we did this, back at the start of this tutorial?
$original_number = `Phone Number`
When you want to read the value of a column, you write its name in `backticks`. You'll get very used to doing that. So when you come to write your set transformation, you might think you should do it there, too:
// This is wrong!
set (`Phone Number`, $final_number)
Why's that wrong? Because set takes, as its first parameter, the name of the column you want to set. And writing the name in backticks returns its value. So DTL would try to do something like this:
// This is wrong!
set ("+44 20 7946 0643, $final_number)
...which isn't going to work.
So remember: when you're using set (or any transformation that takes the name of a column), write the name of the column as a string inside "double quote marks".
The full script
Here's the full script we've written.
// Read the phone number from the original data column
$original_number = `Phone Number`
// Remove the +44 prefix, it it's there
$trimmed_number = trim_prefix ($original_number, "+44")
// Add the 0 prefix, only if it ISN'T there
if has_prefix ($trimmed_number, "0")
{
$final_number = $trimmed_number
}
else
{
$final_number = "0" + $trimmed_number
}
// Write the result back to the data column
set ("Phone Number", $final_number)
Just in case you're wondering... because this was a step-by-step tutorial, we've ended up writing the script in a fairly long-winded way. In the real world, you'd probably do something more like this.
$number = trim_prefix (`Phone Number`, "+44")
if not has_prefix ($number, "0")
{
$number = "0" + $number
}
set ("Phone Number", $number)
Try it out!
You can run the script yourself using our tutorial data set, which you can download as a CSV file (coming soon). Here's how to try it out:
- Create a dataset and upload the tutorial data to a Bunker.
- When you're prompted, accept the recommended categories. Notice that Phone Number hasn't automatically been assigned to a category, so you'll need to assign it yourself (you'll do that in the next step).
- Using the drop-down above the Phone Number column, select Assign Category. Under Assign columns to a category, select Home Phone Number, then click Assign.
- Notice how some of the rows have red warnings in the Home Phone Number column. This is because the numbers are not in the usual UK format... and that's what our script is there to fix.
- Now, using the drop-down again, select Transformations. Switch to the Editor tab and paste the sample code in.
- Press Save and you'll see the rewritten results in the Phone Number column. Notice that when a number has been changed by the script, it's shaded light blue; you can hover your mouse over to see the original value.
- See how the red warnings have disappeared. Congratulations, you've successfully applied your first transformation script!
Next step
In the next tutorial, we'll see how to combine several columns to produce a single result. Read tutorial two now.