Overview
The REGEXP_REPLACE()
function replaces all occurrences of a regular expression pattern in a string with a specified replacement string.
Syntax
The syntax for REGEXP_REPLACE()
function is as follows:
REGEXP_REPLACE(source_string, pattern, replacement, [flags])
Parameters
source_string
: string that we want to perform the replacement on
pattern
: POSIX regular expression pattern to match
replacement
: replacement string
flags
: (optional) string that changes the matching behavior of REGEXP_REPLACE()
function
The flags
parameter is an optional string that controls how the function operates. Here is a list of flags supported in Oxla:
g
: global replacement. This flag ensures that all occurrences of the pattern are replaced
i
: use this flag for case-insensitive matching
c
: REGEXP_REPLACE()
function is case sensitive by default, using the c
flag has the same effect as using no flags
Examples
Basic function usage
In this example, we will focus on using REGEXP_REPLACE()
function with a basic POSIX regular expression pattern:
SELECT REGEXP_REPLACE('The OXLA supports various data types', 'T[^ ]*', 'We') AS "Replaced_String";
By executing the query above, we will get the following output:
Replaced_String
-----------------------------------------
We OXLA supports various data types
The pattern used was “T[^ ]*”, which matches any substring that starts with a ‘T’ character, followed by any number of non-space characters. The function replaces the matched substring with the specified replacement string “We”.
Replacing special characters
This example demonstrates how to replace a non-alphanumeric character in a string with a tilde (~):
SELECT REGEXP_REPLACE('Hello World!', '[^A-Za-z0-9 ]', '~') AS "Replaced_String";
In the above query, the second parameter is a regular expression “[^A-Za-z0-9 ]” that matches any characters that are not uppercase / lowercase letters, digits or spaces. The output for executing the query above will be as follows:
Replaced String
-------------------
Hello World~
Flags usage
Replacing certain substrings with a single flag defined
This example will focus on using the REGEXP_REPLACE()
function with a defined flag and replacing certain substrings in a string. For the needs of this section, we will create a sample quotes
table:
CREATE TABLE quotes (quotes_text text);
INSERT INTO quotes (quotes_text)
VALUES ('Work hard and stay hungry. Lazy people get nowhere in life.'),
('An excuse is a way for a LAZY person to feel better.'),
('The word LUCKY is how a lazy person describes someone who works hard.');
SELECT quotes_text FROM quotes;
By executing the code above, we will get the following output:
quotes_text
-----------------------------------------------------------------------
Work hard and stay hungry. Lazy people get nowhere in life.
An excuse is a way for a LAZY person to feel better.
The word LUCKY is how a lazy person describes someone who works hard.
(3 rows)
Now, we will use the REGEXP_REPLACE()
function with the i
flag specifiec to replace all occurrences of the word lazy
with active
regardless of the case sensitivity:
SELECT quotes_text, REGEXP_REPLACE(quotes_text, 'lazy', 'active', 'i') AS "New quotes" FROM quotes;
In this case, all occurrences of the word lazy
have been replaced with active
:
quotes_text | New quotes
-----------------------------------------------------------------------+-------------------------------------------------------------------------
Work hard and stay hungry. Lazy people get nowhere in life. | Work hard and stay hungry. active people get nowhere in life.
An excuse is a way for a LAZY person to feel better. | An excuse is a way for a active person to feel better.
The word LUCKY is how a lazy person describes someone who works hard. | The word LUCKY is how a active person describes someone who works hard.
(3 rows)
Specifying one or more flags
Without specifying the g
flag, REGEXP_REPLACE()
function replaces only the first occurrence of a substring:
SELECT REGEXP_REPLACE('ab12c', '[0-9]', 'X');
regexp_replace
----------------
abX2c
In this case, as you can see only the first digit (1
) was replaced with X
. By adding the g
flag, all occurrences are replaced with X
:
SELECT REGEXP_REPLACE('ab12c', '[0-9]', 'X', 'g');
regexp_replace
----------------
abXXc
If you use multiple flags, the last one takes precedence. If you use the ci
flags, the regex will be case-insensitive, while using the ic
flags it will be case-sensitive
Restrictions
- The function returns
NULL
if there are no input rows or NULL
values
- If the regular expression pattern isn’t found in the string, the
REGEXP_REPLACE()
function returns the original string
i
and c
flags shouldn’t be used with each other