A.1 Metacharacters and POSIX character classes
`\w` matches any word character (a letter, a digit, or an underscore), equivalent to `[A-Za-z0-9_]`. `\W` is the opposite of `\w` and matches any non-word character, equivalent to `[^A-Za-z0-9_]`.
`\d` matches any single digit, equivalent to `[0-9]`.
`.` matches any character except line breaks, equivalent to `[^\r\n]` (Windows) or `[^\n]` (Mac).
`\s` matches any whitespace character, including spaces, tabs, vertical tabs, carriage returns, and line breaks, equivalent to `[:space:]` in the table below. `\S` is the opposite of `\s` and matches any non-whitespace character. `[\s\S]` is a common shorthand for matching any character at all, since `.` does not match line breaks.
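
The shorthands above can be tried out with a short sketch. Python's `re` module is used here purely as an assumption, since the text does not name a regex engine, and the sample string is made up. Note that Python's `\w`, `\d`, and `\s` also match non-ASCII characters unless `re.ASCII` is passed, and its `.` excludes only `\n`.

```python
import re

# Made-up sample text containing a line break.
text = "id_42: ok\nnext line"

# re.ASCII restricts \w, \d and \s to the ASCII ranges described above;
# without it, Python str patterns also match Unicode letters and digits.
words = re.findall(r"\w+", text, flags=re.ASCII)
non_words = re.findall(r"\W+", text, flags=re.ASCII)
digits = re.findall(r"\d", text, flags=re.ASCII)

# "." stops at the line break, while [\s\S] runs across it.
dot_match = re.match(r".+", text).group()
any_match = re.match(r"[\s\S]+", text).group()

print(words)            # ['id_42', 'ok', 'next', 'line']
print(digits)           # ['4', '2']
print(repr(dot_match))  # 'id_42: ok'
print(repr(any_match))  # 'id_42: ok\nnext line'
```

Other engines may differ in detail, for example in which characters count as a line break for `.`.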
POSIX character classes are also available. They are written inside a bracket expression, for example `[[:alnum:]]`.
class | description |
---|---|
`[:alnum:]` | letters or digits, equivalent to `[A-Za-z0-9]` |
`[:alpha:]` | letters, equivalent to `[A-Za-z]` |
`[:punct:]` | punctuation |
`[:blank:]` | space or tab, equivalent to `[\t ]` |
`[:space:]` | any whitespace character including space, equivalent to `[\f\n\r\t\v ]` |
`[:print:]` | any printable character, including space; the similar class `[:graph:]` excludes space |
`[:xdigit:]` | any hexadecimal digit, equivalent to `[A-Fa-f0-9]` |
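
Not every engine accepts the POSIX class syntax. Python's `re` module, for instance, does not, so the sketch below spells out the ASCII equivalents from the table as ordinary sets. The names `POSIX_CLASSES` and `posix_class` are hypothetical helpers introduced only for illustration, and mapping `[:punct:]` to `string.punctuation` is an assumption based on the C locale.

```python
import re
import string

# Hypothetical lookup table: POSIX class name -> body of an equivalent set.
# The bodies follow the ASCII equivalences listed in the table above.
POSIX_CLASSES = {
    "alnum": "A-Za-z0-9",
    "alpha": "A-Za-z",
    "punct": re.escape(string.punctuation),  # assumption: C-locale punctuation
    "blank": r"\t ",
    "space": r"\f\n\r\t\v ",
    "xdigit": "A-Fa-f0-9",
}

def posix_class(name: str) -> re.Pattern:
    """Compile a pattern that matches one character of the named class."""
    return re.compile("[" + POSIX_CLASSES[name] + "]")

hex_digits = posix_class("xdigit").findall("0x1F, g7")
blanks = posix_class("blank").findall("a \tb\n")

print(hex_digits)  # ['0', '1', 'F', '7']
print(blanks)      # [' ', '\t']
```

In an engine that does support POSIX classes, the same match would be written directly as `[[:xdigit:]]` or `[[:blank:]]`.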