REGEXP_ Functions in SQL Server 2025 – REGEXP_REPLACE

Part 8 of Learn RegEx With Louis

Okay, we have gone through as much of the RegEx filtering as I think is a a part of the SQL Server 2025 implementation. Now it is time to focus on the functions that are not REGEXP_LIKE. We have already talked about REGEXP_MATCHES, which will come in handy for the rest of the series.

I will start with REGEXP_REPLACE, which is like the typical SQL REPLACE function. But instead of replacing based on a static delimiter, it can be used to replace multiple (or a specific) value that matches the RegEx expression. All of my examples for this entry will simply use a variable with a value we are working on, so no need to create or load any objects.

As an example, consider you want to get rid of square brackets in some text. Using replace, you probably have used something like these nested calls to REPLACE to get rid of multiple values.:

DECLARE @stringValue nvarchar(100)
SET @stringValue = 'This is the [Schema].[Table] table'

SELECT  REPLACE(REPLACE(@stringValue,'[',''),']','');

This statement returns:

-------------------------------
This is the Schema.Table table

Using REGEXP_REPLACE, you can do this in a single function call, using a list of values to match. You do need to be careful to escape some characters, especially the [ and ] which have meaning in the RegEx expression:

DECLARE @stringValue nvarchar(100)
SET @stringValue = 'This is the [Schema].[Table] table'

SELECT REGEXP_REPLACE(@stringValue,'[[]]','');

This found any character in the character set (in this case [ and ]), then replaced it with an empty string. Note that you can also leave off the third parameter, and the default replace value is the empty string.

This is a simple example, but it is often the case that you are trying to clean up some text where you are wanting to remove all characters that aren’t in a certain set. For example, say you have some data that is messy like this: 'Thi2s i3s cra333%$#%^zy, le@@t 423435me clean this'. We can replace all the values that are not string, comma, or a space really easily.

DECLARE @stringValue nvarchar(100)

SET @stringValue = 'Thi2s i3s cra333%$#%^zy, le@@t 423435me clean this'
SELECT REGEXP_REPLACE(@stringValue,'[^a-z, ]')

This returns:

--------------------------------
THis is crazy, let me clean this

And if you are wanting to find the data that isn’t quite right, just change it to replace everything that is not in the characters set you wish:

SELECT REGEXP_REPLACE(@stringValue,'[a-z, ]');

Now you get:

------------------
23333%$#%^@@423435

I have written this code using a numbers table and a substring many times, but this is a lot more convienient.

In a later entry on REGEXP_SPLIT_TO_TABLE I will show another way that will help if you have characters that cannot be easily viewed in your string.

It doesn’t need to be complicated

You can still do a simple replace, and just look for something like the value of 'Schema' and change it to 'SchemaName':

DECLARE @stringValue nvarchar(100)
SET @stringValue = 'This is the [Schema].[Table] table';

SELECT REGEXP_REPLACE(@stringValue,'Schema','SchemaName');

Now the value has changed:

--------------------------------------
This is the [SchemaName].[Table] table

So simple replaces can be done with this function, and just like with REPLACE, you don’t need to be concerned if you replace a value with the exact same value. It does not check again:

DECLARE @stringValue nvarchar(100);
SET @stringValue = 'This is the [Schema].[Table] table';

SELECT REGEXP_REPLACE(@stringValue,'Schema','Schema');

Obviously the string looks the same still, and will be equal to the string, but it will actually have been modified.

Positional replaces

One thing to note is the REPLACE function does not have positional parameters to control the text you are replacing, but REGEXP_REPLACE does. The full syntax of the command is:

REGEXP_REPLACE
(
input value,
pattern [, string_replacement [, start [, occurrence [, flags ] ] ] ]
)

In these next sections, I will show how you can use these parameters to give you fine grained control over your replace operations. This makes manipulating strings a lot eaiser to do (when that is necessary!)

start – Adjusting the starting point

This can be useful if you want to start altering the value at a certain point. This is something that can be coupled with REGEXP_INSTR, which will find the position of a match, and you could start replacing after the 10th match, for example. In my example, I am just going to specify a hardcoded 20 characters:

DECLARE @stringValue nvarchar(100);
SET @stringValue = 'duck duck duck duck duck';

SELECT REGEXP_REPLACE(@stringValue,'duck', 'goose', 20);

This returns:

-------------------------
duck duck duck duck goose

You can see where the 20th character is in the string using:

declare @stringValue nvarchar(100)
set @stringValue = 'duck duck duck duck duck'

SELECT SUBSTRING(@stringValue,20,80);

Which returns:

--------
duck

And that duck will be replaced by goose. The default for this parameter is 1, as you can see here:

DECLARE @stringValue nvarchar(100);

SET @stringValue = 'duck duck duck duck duck';
SELECT REGEXP_REPLACE(@stringValue,'duck', 'goose'),
       REGEXP_REPLACE(@stringValue,'duck', 'goose', 1);

Same output:

-------------------------------  --------------------------------
goose goose goose goose goose    goose goose goose goose goose

occurrence – Replacing one specific match

If you want to return only a certain position in your string, you can use the occurrence parameter. In this next example, I will replace the third instance:

DECLARE @stringValue nvarchar(100);
SET @stringValue = 'duck duck duck duck duck';

SELECT REGEXP_REPLACE(@stringValue,'duck', 'goose', 1,3);

This returns:

----------------------------
duck duck goose duck duck

If you put in 6 for the occurance, all five ducks will be returned. It just doesn’t replace anything.

Note that you cannot replace multiple occurrences using REGEX_REPLACE, so in this case you would need to nest calls, and look for items from the end of the string first, then work your way to the front. For example, to replace the 2nd and 4th entries:

DECLARE @stringValue nvarchar(100);
SET @stringValue = 'duck duck duck duck duck';

SELECT REGEXP_REPLACE(REGEXP_REPLACE(
@stringValue,'duck', 'goose', 1,4),'duck','goose',1,2);

Which returns:

----------------------------
duck goose duck goose duck

As noted, the string will not be changed if the value isn’t there, so if you look for duck in the 2nd and 6th location, it will work as expected and replace just the second duck.

Alternative method for nesting calls

You can replace the items from front to back, but it is a bit more cumbersome. You need to do the 1st item, then for consequtive items, you will need to decrement the position, since the first items will have been removed. So to remove 2 and 4, you have to remove 2, then remove 4-1, or 3:

DECLARE @stringValue nvarchar(100)
SET @stringValue = 'duck duck duck duck duck'

SELECT REGEXP_REPLACE(REGEXP_REPLACE(
             @stringValue,'duck', 'goose', 1,2),'duck','goose',1,3);

I don’t know if this is necessary to know, but it was how I started out trying to do this before I realized starting at the end is more straightforward. I include it because it was interesting!

If you are needing to do an unknown number of occurrences, you could use a loop, which I won’t write here, but the algorithm would be the same, unless you wanted to do a more complex process that removed different words.

Note: When nesting calls, be aware that your replacement values could overlap. Make sure to test with a variety of values if there is any overlap of values.

Parameterized string replacement

When you want to do things like formatting some data, things may get a bit trickier to reassemble the data when you have to add formatting. Luckily REGEXP_REPLACE supports a way to add formatting or even move data around in the output.

This is something I learned just writing this blog, from books online (I really miss that name for our SQL Server documentation!) I am going to use a similar example of a formatting a number like a phone number or US social security number, but this could be used in a lot of ways. All this to say, go peruse the documentation occasionally. Good stuff in there!

Say you have a group of phone numbers, with no formatting. So you have 1234567890 as an example, and you want to format that as 123-345-7890. Using CONCAT and SUBSTRING, this would be doable. But you can do this with a single REGEXP_REPLACE.

You do this by putting one or more patterns in parenthesis, and then when there is a match to a group, you can reference then as \1, \2, etc, up to \9.

As an example, for our string of 10 numbers, we will use (\d{3})(\d{3})(\d{4}). The pattern in the first parenthesis matches three character, the second 3 characters, the four 4. Then we are going to take each of the matches for those pattens, and seperated the output with a dash - character.

DECLARE @stringValue nvarchar(100);
SET @stringValue = '1234567890';

SELECT REGEXP_REPLACE(@stringValue,'(\d{3})(\d{3})(\d{4})', '\1-\2-\3');

This outputs:

-------------
123-456-7890

Now if there are extra characters that are not fully consumed by the pattern, you might see them hanging from the end:

DECLARE @stringValue nvarchar(100);
SET @stringValue = '12345678901';

SELECT REGEXP_REPLACE(
         @stringValue,'(\d{3})(\d{3})(\d{4})', '\1-\2-\3 and ');

You can see the dangling 1 here, placed after the replace pattern which ends in ' and ':

-------------------
123-456-7890 and 1

To clean those up, you can add another parametered match pattern, which in my next code, I use .*, to get everything (other then end-of line which would not be typical of such data):

DECLARE @stringValue nvarchar(100);

SET  @stringValue = '12345678901234567';
SELECT REGEXP_REPLACE(@stringValue,'(\d{3})(\d{3})(\d{4})(.*)', '\1-\2-\3');

These values are now ignored because while they were matched by the 4th parenthesis set, we didn’t use them in the output, so no trailing data.

Formatting data example

Simple formatting isn’t such a big deal, but you can implement a variety of formatting in your REGEXP_REPLACE calls. Lets take some data that hase multiple formats. Note that I only looked for a simple 10 digit number in the first example. Say you had phone numbers in multiple formats, and some that didn’t work at all:

CREATE TABLE #PhoneNumber (phoneNumber varchar(50))
INSERT INTO #PhoneNumber
VALUES ('111-111-1111'),('2222222222'),
       ('(333) 333-3333'),('111-NaN-NaNa'),('Nope');

I would obviously suggest you valiedate that your data is in a reasonable format first, because a phone number like '111-NaN-NaNa' would fail (and 'Nope' is a definite nope).

So you could do something like:

SELECT REGEXP_REPLACE(PhoneNumber,'[^\d]')
FROM #PhoneNumber
     --only return if we end up the 9 or
     --greater count of numeric characters.
WHERE LEN(REGEXP_REPLACE(PhoneNumber,'[^\d]')) >= 9 ;

Now this returns:

-----------
1111111111
2222222222
3333333333

Now, we will do two steps, one to remove all the formatting, the next to add it right back in a standard format:

SELECT REGEXP_REPLACE(
REGEXP_REPLACE(PhoneNumber,'[^\d]'),
                   '(\d{3})(\d{3})(\d{4})(.*)', '\1-\2-\3')
FROM #PhoneNumber
WHERE LEN(REGEXP_REPLACE(PhoneNumber,'[^\d]')) >= 9;

Now we have properly formatted phone numbers (at least for what the user wanted):

-------------
111-111-1111
222-222-2222
333-333-3333

And you probably want to do a bit more format checking in this case, using a REGEXP_LIKE call as well, like here to make sure it is one of the 3 “reasonable” formats:

SELECT REGEXP_REPLACE(REGEXP_REPLACE(
             PhoneNumber,'[^\d]'),'(\d{3})(\d{3})(\d{4})(.*)', '\1-\2-\3')
FROM #PhoneNumber
WHERE LEN(REGEXP_REPLACE(PhoneNumber,'[^\d]')) >= 9
  AND  REGEXP_LIKE (PhoneNumber,
         '^([\d]{10}|[\d]{3}-[\d]{3}-[\d]{4}|([\d]{3}) [\d]{3}-[\d]{4})');

This checks to make sure the data is in one of the three formats included that would be valid. Yout could add another statement with NOT REGEXP_LIKE(PhoneNumber, etc. to save off the data that doesn’t match for checking later.

The LEN and the REGEXP_LIKE calls may not both necessary, so your mileage my vary in how you need to implement your own system.

One last silly trick with positions

They are not required to be output in order:

DECLARE @stringValue nvarchar(100);
SET @stringValue = '12345678901';

SELECT REGEXP_REPLACE(@stringValue,'(\d{3})(\d{3})(\d{4})(.*)', '\3-\2-\1');

This returns:

-------------
7890-456-123

Or you can even repeat the parts:

DECLARE @stringValue nvarchar(100);
SET @stringValue = '12345678901';

SELECT REGEXP_REPLACE(@stringValue,
'(\d{3})(\d{3})(\d{4})(.*)', '\3-\3-\3-\3-\1-\1');

Now the output is 4 instances of 3, and 2 of 1.

----------------------------
7890-7890-7890-7890-123-123

Cool stuff that I am very glad I learned existed!

Summary

This is what this whole series was leading up to. All of the stuff we have done so far has been interesting. I can see a few uses for REGEXP_LIKE, and REGEXP_MATCHES will be a very useful tool for diagnosing issues.

But this is a function that I can very much see being put into use on a semi-regular basis, especially when doing any loading of data. So often the need arises to check data, and/or it

3 responses to “REGEXP_ Functions in SQL Server 2025 – REGEXP_REPLACE”

  1. […] Louis Davidson continues a series on regular expressions in SQL Server 2025: […]

    Like

  2. REGEXP_ Functions in SQL Server 2025 – REGEXP_SPLIT_TO_TABLE – Drsql's Blog Avatar

    […] character in your data that you are trying to ingest in a tool. We used a similar technique in part 8 of this series with REGEXP_REPLACE (and I implemented it back in 2019 in this article on Simple Talk), but this technique is a lot […]

    Like

  3. T-SQL Tuesday #190 – Mastering A New or Existing Technical Skill – Drsql's Blog Avatar

    […] was one feature (groups) of RegEx that I probably wouldn’t have even looked at had I not been writing about it. It was a […]

    Like

Leave a reply to Replacing Text in SQL Server 2025 via Regular Expression – Curated SQL Cancel reply

I’m Louis

I have been at this database thing for a long long time, with no plans to stop.

This is my blog site, companion to drsql.org

Recents