Summary: in this tutorial, you’ll learn about JavaScript regex backreferences and how to apply them effectively.
Introduction to JavaScript regex backreferences
Backreferences let you refer to capturing groups from within the same regular expression. A backreference works like a variable: it matches the same text that its capturing group matched earlier in the pattern.
Here’s the syntax of a backreference:
\N
In this syntax, N is an integer such as 1, 2, or 3 that represents the corresponding capturing group number.
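As a minimal sketch of how the numbering works: capturing groups are numbered by the position of their opening parenthesis, from left to right. The pattern name `mirrored` below is only an illustrative choice. It uses two groups and two backreferences to match four-digit strings that read the same forwards and backwards.

```javascript
// Groups are numbered by their opening parenthesis, left to right.
// \2 repeats the second captured digit, then \1 repeats the first.
const mirrored = /^(\d)(\d)\2\1$/;

console.log(mirrored.test('1221')); // true
console.log(mirrored.test('1234')); // false
```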
Suppose you have a string with the duplicate word JavaScript like this:
const s = 'JavaScript JavaScript is awesome';
And you want to remove the duplicate word (JavaScript) so that the resulting string is:
'JavaScript is awesome'
To do so, you can use a backreference in the regular expression.
First, match a word followed by one or more whitespace characters:
/\w+\s+/
Second, create a capturing group that captures the word:
/(\w+)\s+/
Third, use a backreference to reference the first capturing group:
/(\w+)\s+\1/
In this pattern, \1 is a backreference that references the (\w+) capturing group. It matches only the exact text that the group captured, not any word.
Finally, replace the entire match with the first capturing group using the String.replace() method:
const s = 'JavaScript JavaScript is awesome';
const pattern = /(\w+)\s+\1/;
const result = s.replace(pattern, '$1');
console.log(result);
Output:
JavaScript is awesome
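The pattern above has no word boundaries and no global flag, so it replaces only the first repetition and can match inside longer words. For example, it would turn 'this island' into 'thisland'. The following is a hedged sketch of one possible refinement, not part of the original example. The helper name `removeRepeatedWords` is hypothetical, and the comparison stays case-sensitive.

```javascript
// Hypothetical helper: collapses runs of a repeated word into one occurrence.
function removeRepeatedWords(text) {
  // \b anchors the match to whole words; (?:\s+\1\b)+ consumes every repeat.
  // The g flag applies the replacement to every run in the string.
  return text.replace(/\b(\w+)(?:\s+\1\b)+/g, '$1');
}

console.log(removeRepeatedWords('JavaScript JavaScript is is awesome'));
// JavaScript is awesome
console.log(removeRepeatedWords('this island'));
// this island
```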
JavaScript regex backreference examples
The following examples show some practical applications of backreferences.
1) Using backreferences to get text inside quotes
Suppose you want to get the text inside double quotes like this:
"JavaScript Regex Backreferences"
Or inside single quotes:
'JavaScript Regex Backreferences'
But not text enclosed in a mix of single and double quotes:
'not match"
To do so, you might come up with the following regular expression:
/[\'"](.*?)[\'"]/
However, this regular expression also matches text that starts with a single quote (') and ends with a double quote ("), or vice versa. For example:
const message = `"JavaScript's cool". They said`;
const pattern = /[\'"].*?[\'"]/;
const match = message.match(pattern);
console.log(match[0]);
It returns "JavaScript' instead of "JavaScript's cool" because the lazy .*? stops at the first quote character of either kind.
To resolve it, you can use a backreference in the regular expression:
/([\'"]).*?\1/
The backreference \1 references the first capturing group. If the group captures a single quote, \1 matches only a single quote. If the group captures a double quote, \1 matches only a double quote.
For example:
const message = `"JavaScript's cool". They said`;
const pattern = /([\'"]).*?\1/;
const match = message.match(pattern);
console.log(match[0]);
Output:
"JavaScript's cool"
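Note that match[0] in the example above still includes the surrounding quotes. To get only the text inside them, one option is to add a second capturing group around the lazy part of the pattern. The sketch below assumes a longer sample string than the original, and the variable name `quoted` is illustrative. It also uses the g flag with matchAll() to collect every quoted span.

```javascript
const message = `"JavaScript's cool". They said 'hello'`;

// Group 1 captures the opening quote; group 2 captures the text between
// the quotes; \1 requires the same quote character to close the match.
const pattern = /(['"])(.*?)\1/g;

// matchAll() requires the g flag and yields one match object per quoted span.
const quoted = [...message.matchAll(pattern)].map((match) => match[2]);

console.log(quoted);
// [ "JavaScript's cool", 'hello' ]
```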
2) Using backreferences to find words that have at least one consecutive repeated character
The following example shows how to use a backreference to find words that have at least one consecutive repeated character, e.g., apple (the letter p is repeated):
const words = ['apple', 'orange', 'strawberry'];
const pattern = /\b\w*(\w)\1\w*\b/;
for (const word of words) {
  const result = word.match(pattern);
  if (result) {
    console.log(result[0], '->', result[1]);
  }
}
Output:
apple -> p
strawberry -> r
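If you only need to know which words contain a doubled character, and not which character it is, a shorter pattern with test() may be enough. This is a minimal sketch reusing the same word list. The names `doubled` and `matches` are illustrative.

```javascript
const words = ['apple', 'orange', 'strawberry'];

// (\w)\1 matches any word character immediately followed by itself.
const doubled = /(\w)\1/;

// test() returns only true or false, which is enough for filtering.
const matches = words.filter((word) => doubled.test(word));

console.log(matches);
// [ 'apple', 'strawberry' ]
```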
Summary
- Use backreferences to reference the capturing groups in a regular expression.