RegexpExtract

The RegexpExtract function returns the substring that matches a regular expression within a string.

Syntax

RegexpExtract(string, substring, [position])

Function arguments:

  • string (required): The string to search
  • substring (required): The substring to extract with.
  • position (optional): The index of the match to return.
📘

When the regular expression you want to use contains a slash, quotation or other special character, you will need to use a backslash () to escape the special character.

Notes

  • When any of the arguments are Null, the function returns Null.
  • Regular expressions vary by database. Check the documentation of your data platform for syntax.
  • If the regular expression contains a capture group ((...)) and there are one or more matches for that capture group, Sigma returns the first capture group across all matches

Examples

RegexpExtract([Address], "[0-9]+")

Extracts the first match of numeric characters in the string. No position is specified so position defaults to 1.

In a table with columns Address and RegexpExtract, the street address is extracted from the address. 345 South Court 10016 becomes 345
RegexpExtract([Address], "[0-9]+", 2)

Extracts the second match of numeric characters in the string.

In a table with columns Address and RegexpExtract, the zip code is extracted. 345 South Court 10016 becomes 10016
RegexpExtract([Address], "\\s*([a-zA-Z]+)", 2)

Extracts the second match of alphabetical characters in the string.

In a table with columns Address and RegexpExtract, the zip code is extracted. 345 South Court becomes Court
RegexpExtract([Date], "(\\d{2})", 2)

Extracts the second match, day of date, of the 2-digit character group in the date.

In a table with columns Date and RegexpExtract, the day is extracted. 01.05.2022 becomes 05