RegExp pattern - character class (Definition)

RegExp pattern components for describing character classes.

Availability:

ECMAScript edition - 3

A character class is a group of RegExp character literals enclosed in square brackets. The class can be placed into a RegExp pattern to match any single character that is a member of the class.

You can use character classes to include or exclude groups of characters. An exclusion is called a negated class and is signified by placing a circumflex (^) as the first character inside the square brackets. For example, [abcd] matches if the character being tested at that position in the pattern is the letter "a", "b", "c" or "d".

The character class [^abcd] will match any character as long as it is not one of those four.
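As a minimal sketch of the difference between a class and its negated form, the following tests both against a few sample strings. The variable names and sample strings are illustrative assumptions, not part of the specification.

```javascript
// Illustrative names only; the patterns are the two discussed above.
var inClass = /[abcd]/;     // one character that is "a", "b", "c" or "d"
var notInClass = /[^abcd]/; // one character that is none of those four

console.log(inClass.test("c"));      // true
console.log(inClass.test("x"));      // false
console.log(notInClass.test("x"));   // true
console.log(notInClass.test("abc")); // false, every character is in the set

// Replace every character outside the class with a hyphen.
var masked = "cabbage".replace(/[^abcd]/g, "-");
console.log(masked);                 // "cabba--"
```

Note that a negated class still has to consume one character, so it does not match an empty string.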

You can indicate a range of characters by using a hyphen between the first and last character of the range. For example, [a-d] is the same as [abcd]. Other examples are summarized in the following table.
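The hyphen shorthand can be checked with a short sketch such as the one below, which compares the two forms over a handful of strings. The helper rangeFormsAgree and the sample list are hypothetical names chosen for this illustration.

```javascript
// Both patterns require the whole string to consist of class members.
var explicitSet = /^[abcd]+$/;
var hyphenRange = /^[a-d]+$/;

// Hypothetical sample data for the comparison.
var rangeSamples = ["abcd", "dcba", "bad", "abe", "ABCD", ""];

// Returns true when both patterns give the same answer for every sample.
function rangeFormsAgree(samples) {
  for (var i = 0; i < samples.length; i++) {
    if (explicitSet.test(samples[i]) !== hyphenRange.test(samples[i])) {
      return false;
    }
  }
  return true;
}

console.log(rangeFormsAgree(rangeSamples)); // true
console.log(hyphenRange.test("bad"));       // true
console.log(hyphenRange.test("abe"));       // false, "e" is outside a-d
```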

Some commonly used character classes are already built in so that you don't have to create your own. They are written as escaped characters, in much the same way as the escapes in the literal character set. They can also be used inside the square brackets to construct other classes.
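A brief sketch of a built-in class used inside square brackets follows. The variable names and input strings are assumptions made for the example.

```javascript
// \d inside a class: decimal digits plus the upper case letters A to F.
var hexDigits = /^[\dA-F]+$/;
console.log(hexDigits.test("1F4A")); // true
console.log(hexDigits.test("1f4a")); // false, lower case is not in the class

// \w inside a class, extended with the dollar sign.
var identifierRun = /[\w\$]+/g;
var found = "id: x_1, $y".match(identifierRun);
console.log(found.join("|"));        // "id|x_1|$y"
```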

Pattern           Description
[ ... ]           Any single character that is one of the set enclosed in the square brackets.
[^ ... ]          Any single character that is not one of the set enclosed in the square brackets.
[^abcd]           Any character that is not one of the letters "a", "b", "c" or "d".
[abcd]            Any one of the letters "a", "b", "c" or "d".
[a-z]             Any single lower case letter.
[A-Z]             Any single upper case letter.
[a-zA-Z]          Any single alphabetic character.
[0-7]             Any octal numeric digit.
.                 Any character apart from a line terminator such as newline.
\d                Any decimal digit character.
\s                Any whitespace character.
\w                Any word character (a letter, a digit or an underscore). Whitespace is not a word character.
\D                Any non-digit character.
\S                Any non-whitespace character. This is not necessarily a valid word character.
\W                Any non-word character.
[\b]              A literal backspace, not to be confused with a word boundary match (using \b outside of the square brackets).
[0-1]             Any binary numeric digit.
[0-9A-F]          Any hexadecimal numeric digit (upper case letters only).
[\dA-F]           Any hexadecimal numeric digit (alternative form).
[a-zA-Z0-9]       Any single alphanumeric character.
[a-zA-Z\d]        Any single alphanumeric character (alternative form).
[^a-zA-Z0-9_\$]   Any character that is not valid within an identifier name.
[a-zA-Z0-9_\$]    Any character that is valid within an identifier name.
[0-9]             Any decimal numeric digit (alternative version).
[^0-9]            Any character that is not a digit (alternative version).
[\t\n\r\f\v]      Any whitespace control character (an approximation of \s, which also matches the space character).
[^\t\n\r\f\v]     Any character that is not a whitespace control character (an approximation of \S).
[^\n]             Any character apart from newline (an approximation of the . pattern).
[^a-zA-Z0-9_]     Any non-word character (alternative version).
[a-zA-Z0-9_]      Any word character (alternative version).
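Two rows of the table lend themselves to a quick sketch: the difference between [\b] and \b, and the equivalence of the built-in classes with their bracketed alternative forms. The helper formsAgree and the sample characters are hypothetical and exist only for this illustration.

```javascript
// [\b] is a literal backspace (U+0008); \b outside brackets is a word boundary.
var backspaceClass = /[\b]/;
var wordBoundary = /\bcat\b/;

console.log(backspaceClass.test("ab\bc"));     // true, the string contains a backspace
console.log(backspaceClass.test("abc"));       // false
console.log(wordBoundary.test("a cat sat"));   // true
console.log(wordBoundary.test("concatenate")); // false, "cat" is inside a word

// Each built-in class paired with its alternative form from the table.
var classPairs = [
  [/\d/, /[0-9]/],
  [/\D/, /[^0-9]/],
  [/\w/, /[a-zA-Z0-9_]/],
  [/\W/, /[^a-zA-Z0-9_]/]
];

// Hypothetical single-character samples.
var sampleChars = ["7", "x", "Q", "_", "-", "$"];

// Returns true when every pair gives the same result for every sample.
function formsAgree(pairs, samples) {
  for (var i = 0; i < pairs.length; i++) {
    for (var j = 0; j < samples.length; j++) {
      if (pairs[i][0].test(samples[j]) !== pairs[i][1].test(samples[j])) {
        return false;
      }
    }
  }
  return true;
}

console.log(formsAgree(classPairs, sampleChars)); // true
```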

Warnings:

See also: RegExp pattern, RegExp pattern - alternation, RegExp pattern - character literal

Cross-references:

ECMA 262 edition 3 - section - 15.10.1

ECMA 262 edition 3 - section - 15.10.2.6

ECMA 262 edition 3 - section - 15.10.2.12