A character class is a group of RegExp character literals enclosed in square brackets. Placed in a RegExp pattern, the class matches any single character that is a member of the set.
You can use character classes to include or exclude groups of characters. An exclusion is called a negated class and is signified by placing a circumflex (^) as the first character inside the square brackets. Thus, [abcd] matches if the character being tested at that position in the pattern is the letter "a", "b", "c" or "d".
The character class [^abcd] will match any character as long as it is not one of those four.
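As a minimal sketch of the two forms, the snippet below tests a few strings against a positive and a negated class. The variable names are illustrative only. Note that `test()` is unanchored here, so it succeeds if any character in the string satisfies the class.

```javascript
// Positive class: matches one character that is a, b, c or d.
const inClass = /[abcd]/;
// Negated class: matches one character that is anything except a, b, c or d.
const notInClass = /[^abcd]/;

console.log(inClass.test("c"));      // true
console.log(inClass.test("x"));      // false
console.log(notInClass.test("x"));   // true
console.log(notInClass.test("abc")); // false: every character is in the set
console.log(notInClass.test("abx")); // true: "x" is outside the set
```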
You can indicate a range of characters by using a hyphen between the first and last characters of the range. Therefore [a-d] is the same as [abcd]. Other examples are summarized in the table below.
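The equivalence of the listed and ranged forms can be checked with a short sketch. The anchors `^` and `$` are added here (an assumption for the example, not part of the class syntax) so that each pattern must match the whole one-character string.

```javascript
const listed = /^[abcd]$/; // every member written out
const ranged = /^[a-d]$/;  // the same set written as a range

// Both patterns should give the same answer for every sample character.
const samples = ["a", "b", "c", "d", "e", "A", "-"];
const agree = samples.every((ch) => listed.test(ch) === ranged.test(ch));

console.log(agree);            // true
console.log(ranged.test("c")); // true
console.log(ranged.test("e")); // false: "e" lies outside the range
```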
There are special character classes already built in so that you don't have to create your own. These are written as escape sequences, very similar to those in the literal character set. They can also be used inside square brackets to construct other classes.
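A brief sketch of a built-in class used inside square brackets follows. The pattern names are hypothetical, and the `^` and `$` anchors are added only so that each test covers exactly one character.

```javascript
// \d inside the brackets contributes the digits 0-9 to the set.
const upperHexDigit = /^[\dA-F]$/;
// \w inside the brackets contributes letters, digits and underscore.
const wordOrDollar = /^[\w$]$/;

console.log(upperHexDigit.test("7")); // true
console.log(upperHexDigit.test("B")); // true
console.log(upperHexDigit.test("b")); // false: lower case is not in the set
console.log(wordOrDollar.test("$"));  // true
console.log(wordOrDollar.test(" "));  // false
```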
Pattern | Description |
---|---|
[ ... ] | Any single character that is one of the set enclosed in the square brackets. |
[^ ... ] | Any single character that is not one of the set enclosed in the square brackets. |
[^abcd] | Any character that is not one of the letters "a", "b", "c" or "d". |
[abcd] | Any one of the letters "a", "b", "c" or "d". |
[a-z] | Any single lower case letter. |
[A-Z] | Any single upper case letter. |
[a-zA-Z] | Any single alphabetic character. |
[0-7] | Any octal numeric digit. |
. | Any character apart from a line terminator (newline, carriage return, \u2028 or \u2029). |
\d | Any decimal digit character. |
\s | Any whitespace character. |
\w | Any word character (any letter, digit or underscore). Whitespace is not a word character. |
\D | Any non-digit character. |
\S | Any non-whitespace character. This is not necessarily a valid word character. |
\W | Any non-word character. |
[\b] | A literal backspace, not to be confused with a word boundary match (\b used outside the square brackets). |
[0-1] | Any binary numeric digit. |
[0-9A-F] | Any hexadecimal numeric digit (upper case letters only). |
[\dA-F] | Any hexadecimal numeric digit, upper case letters only (alternative form). |
[a-zA-Z0-9] | Any single alphanumeric character. |
[a-zA-Z\d] | Any single alphanumeric character (alternative form). |
[^a-zA-Z0-9_\$] | Any character that is not valid within an identifier name. |
[a-zA-Z0-9_\$] | Any character that is valid within an identifier name. |
[0-9] | Any decimal numeric digit (alternative version). |
[^0-9] | Any character that is not a digit (alternative version). |
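[ \t\n\r\f\v] | Any whitespace character (approximate alternative version; note the space character in the set, and that \s also matches some other Unicode spaces). |
[^ \t\n\r\f\v] | Any non-whitespace character (approximate alternative version). |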
[^\n] | Any character apart from newline (approximate alternative version; unlike the dot, this still matches other line terminators such as carriage return). |
[^a-zA-Z0-9_] | Any non-word character (alternative version). |
[a-zA-Z0-9_] | Any word character (alternative version). |
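The "alternative version" rows in the table can be spot-checked with a small sketch. The helper `agreeOnAscii` is hypothetical, written for this example; it compares two patterns on each of the 128 ASCII characters only, so it says nothing about characters outside that range.

```javascript
// Hypothetical helper: do two patterns accept exactly the same ASCII characters?
function agreeOnAscii(a, b) {
  for (let code = 0; code < 128; code++) {
    const ch = String.fromCharCode(code);
    if (a.test(ch) !== b.test(ch)) return false;
  }
  return true;
}

console.log(agreeOnAscii(/\d/, /[0-9]/));          // true
console.log(agreeOnAscii(/\w/, /[a-zA-Z0-9_]/));   // true
console.log(agreeOnAscii(/\W/, /[^a-zA-Z0-9_]/));  // true
console.log(agreeOnAscii(/\s/, /[ \t\n\r\f\v]/));  // true: the set includes a space
console.log(agreeOnAscii(/\s/, /[\t\n\r\f\v]/));   // false: without the space they differ
console.log(agreeOnAscii(/./, /[^\n]/));           // false: "." also rejects "\r"
```

The last two results show why the bracketed forms for whitespace and for the dot are best treated as approximations rather than exact equivalents.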
Beware of the backspace escape \b. To match a backspace character, it must be enclosed in square brackets to create a character class. If it is used outside of the brackets, it is interpreted as a word boundary.
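The difference between the two meanings of \b can be illustrated with a short sketch. The pattern names and sample strings are illustrative only; in a JavaScript string literal, "\b" is itself the backspace character.

```javascript
const backspace = /[\b]/;   // inside brackets: a literal backspace (U+0008)
const boundary = /\bcat\b/; // outside brackets: word boundaries around "cat"

console.log(backspace.test("ab\bcd"));     // true: the string contains a backspace
console.log(backspace.test("abcd"));       // false
console.log(boundary.test("a cat sat"));   // true: "cat" stands alone as a word
console.log(boundary.test("concatenate")); // false: "cat" is inside a longer word
console.log(/\b/.test("\b"));              // false: a lone backspace has no word boundary
```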
ECMA 262 edition 3 - section - 15.10.1
ECMA 262 edition 3 - section - 15.10.2.6
ECMA 262 edition 3 - section - 15.10.2.12
JavaScript Programmer's Reference, Cliff Wootton, Wrox Press (www.wrox.com). © 2001 Wrox Press. All Rights Reserved.