RegExp pattern - references (Definition)

Groups of characters in a pattern can be referred to symbolically later in the expression.

Availability:

ECMAScript edition - 3

We can use parentheses to capture a match so that the same text must be matched again at a later place in the expression. It is not the pattern that is repeated but the matched value. We do this by referring to the sub-expression according to its indexed location within the pattern. The index is found by counting left parentheses from left to right. The first sub-expression is referred to as \1, the second as \2 and so on. This is useful for balancing matching quote symbols around a text string that might have either single or double quote marks around it. Unless we can relate the matches at each end, we might find a quote symbol at each end without the two forming a balanced pair.

This matches a single or double quote character:

/['"]/

This matches a single or double quote character at either end of any other character sequence:

/['"].*['"]/

However, the .* between the quotes can itself match quote characters, so we'll replace it with a character class that matches any non-quote character. Like this:

/['"][^'"]*['"]/

To use a reference to a previous sub-expression, we need to mark the sub-expression with parentheses:

/(['"])[^'"]*['"]/

Now we can refer to the first sub-expression at a later stage and require that the same characters be repeated:

/(['"])[^'"]*\1/

This ensures that the character sequences 'AAA' and "AAA" both match, but the unbalanced sequence 'AAA" does not.
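
As an illustrative sketch, the balanced and unbalanced patterns above can be compared with test() and exec(). The variable names are assumptions made for this example only.

```javascript
// Pattern from the text: an opening quote, a run of non-quote
// characters, then the same quote character again via \1.
var balanced = /(['"])[^'"]*\1/;

// The earlier pattern without the back-reference, for comparison.
var unbalanced = /(['"])[^'"]*['"]/;

var sameSingle = balanced.test("'AAA'");     // true
var sameDouble = balanced.test('"AAA"');     // true
var mixed      = balanced.test("'AAA\"");    // false, the quotes differ
var mixedLoose = unbalanced.test("'AAA\"");  // true, any closing quote will do

// exec() returns the captured quote as element 1 of the result array.
var result     = balanced.exec('say "AAA" now');
var wholeMatch = result[0];  // the quoted text, including both quotes
var quoteUsed  = result[1];  // the double quote character
```

The value held in element 1 is the same text that \1 required at the closing end of the match.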

A grouped sub-expression can be prevented from being indexed by placing a question mark and colon immediately inside the opening parenthesis, as in (?:...). Such a group cannot be used as a reference, and it is skipped when the remaining sub-expressions are numbered. This works in JavaScript version 1.3 onwards.
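
The effect on numbering might be sketched as follows, assuming an implementation that supports the (?:...) form. The leading a|b alternative and the variable names are hypothetical, added only to give the quote group something to follow.

```javascript
// The first group is indexed, so the quote is sub-expression 2.
var capturing = /(a|b)(['"])[^'"]*\2/;

// The first group is not indexed, so the quote becomes sub-expression 1.
var nonCapturing = /(?:a|b)(['"])[^'"]*\1/;

var r1 = capturing.exec("a'xyz'");
var r2 = nonCapturing.exec("a'xyz'");

// r1 holds the whole match plus two indexed sub-expressions.
// r2 holds the whole match plus one indexed sub-expression.
var r1Count = r1.length;  // 3
var r2Count = r2.length;  // 2
```

Both patterns match the same text, but only the indexed groups appear in the result array and can be referred to with \n.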

Warnings:

See also: RegExp pattern, RegExp pattern - extension syntax, RegExp pattern - grouping, RegExp pattern - sub-patterns, RegExp.$n, RegExp.lastParen

Cross-references:

ECMA 262 edition 3 - section - 15.10.1