Reduce Boilerplate Code with New Text Functions in SQL Server 2017

By Daniel Stead - BI Team

Last updated: 20.06.2018

SQL SQL Server

Introduction

Every release of SQL Server has many new features that get us all excited. Whilst the addition of new functions to the T-SQL language never represents the most ground-breaking of changes, it’s always nice if they present new ways to eliminate boilerplate code. This blog article looks at the new functions introduced in SQL Server 2017 and how they make working with text easier.

STRING_AGG

Anyone with any amount of database development experience at some point will have encountered a requirement to produce a comma-separated list of items for one or more columns in a row.

A typical solution to this would be to use a correlated subquery and transform each dataset returned into a comma-separated list using XML.

As an example, let’s suppose that for each system database we want a comma-separated list of the associated filenames.

Using the system views and the discussed pattern, we write the below query to retrieve the desired data.

Who enjoys writing this? Wouldn’t it be nice if the SQL Server team created a native function to eliminate this kind of boilerplate?

Luckily, the STRING_AGG aggregate function has been introduced in SQL Server 2017. Our query can now be reduced to the below.

This provides the exact same dataset, not to mention being much easier on the eye and the memory. I know many of us are sick of looking up the syntax for the XML hack.

TRIM

In many ETL solutions you’ll find landing tables containing text data with leading and trailing white space that needs to be removed.

To achieve this, you’ll currently have to use a combination of the RTRIM and LTRIM functions. This is shown in the below example.

In SQL Server 2017, the TRIM function has been introduced so we only need to make one function call.

It’s frustrating that it has taken this long for this function to be introduced. Considering that Excel has long had a full TRIM function available. Better late than never.

CONCAT_WS

Imagine an ETL with a requirement to export data to a pipe delimited file. One approach to this would be to concatenate the required fields together using a query similar to that shown in the below example.

Note the need to keep repeating the pipe delimiter. SQL Server 2017 now introduces the CONCAT_WS function to eliminate the redundant delimiters.

The delimiter is now only required as the first argument, the function takes care of the boilerplate.

TRANSLATE

A common data scrubbing task involves replacing unwanted characters in a string. The typical solution involves nesting multiple calls to the REPLACE function. As an example, let’s consider the following dataset which contains some unwanted characters that we wish to replace with a comma.

We can use nested calls to the REPLACE function to make the character replacement.

However, this is cumbersome to write and difficult to read. Luckily, SQL Server 2017 introduces the TRANSLATE function to reduce the effort down to a single function call. The second parameter contains the characters to be replaced and the third parameter contains the corresponding characters to be replaced with.

Summary

This blog article looked at the new text functions introduced in SQL Server 2017. In the scenarios examined, they helped to reduce the amount of boilerplate that would have been written previously. Some of these functions should have been introduced a long time ago, but at least Microsoft are continuing to enhance the T-SQL language.

cap hpi