dev-notes/RegularExpressions.md
2021-02-06 21:37:33 +01:00

Common Regex Syntax

Character Types

\d any digit (0-9)
\D any non-digit character
\s any whitespace character (space, tab, newline)
\S any non-whitespace character
\w any word character (a-z, A-Z, 0-9, _)
\W any non-word character
\b word boundary: zero-width position between a word character and a non-word character (or string start/end)
\B not a word boundary: position inside a word or between two non-word characters
\A match only at string start
\Z match only at string end
. any character except newline (which of CR and LF count as a newline depends on the engine)
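
A minimal sketch of the character types above, assuming Python's `re` module (other engines may differ, for example in how `\Z` and `.` treat line breaks). The sample strings are made up for illustration.

```python
import re

text = "Order 66 shipped on 2021-02-06"

# \d+ collects runs of digits
digits = re.findall(r"\d+", text)

# \w+ collects runs of word characters (letters, digits, underscore)
words = re.findall(r"\w+", text)

# \b is a zero-width word boundary: only the standalone "cat" matches
whole_word = re.findall(r"\bcat\b", "concatenate the cat")

# \B requires the opposite: only the "cat" inside "concatenate" matches
inside_word = re.search(r"\Bcat\B", "concatenate the cat")

# \A and \Z pin the match to the very start / very end of the string
starts = re.search(r"\AOrder", text) is not None
ends = re.search(r"06\Z", text) is not None

# . skips the newline unless re.DOTALL is passed
dots = re.findall(r".", "a\nb")

print(digits, words, whole_word, inside_word.span(), starts, ends, dots)
```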

Quantifiers

+ one or more repetitions
* zero or more repetitions
? zero or one repetition
{m} exactly m times
{m,n} at least m times, at most n times

The *, +, and ? quantifiers are all greedy; they match as much text as possible.
Adding ? after the quantifier (*?, +?, ??) makes it non-greedy (minimal); as few characters as possible will be matched.
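
To make the greedy versus non-greedy difference concrete, here is a small sketch assuming Python's `re` module; the HTML-like sample string is hypothetical.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" it can reach, so the whole string is one match
greedy = re.findall(r"<.+>", html)

# Non-greedy: .+? stops at the first ">" it meets, so each tag is its own match
lazy = re.findall(r"<.+?>", html)

# {m,n} is greedy too: it takes up to 3 "a" characters when available
runs = re.findall(r"a{2,3}", "a aa aaa aaaa")

print(greedy)
print(lazy)
print(runs)
```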

Special Characters

\a, \b, \f, \n, \r, \t, \u, \U, \v, \x, \\, \?, \*, \+, \., \^, \$ special characters
\(, \), \[, \] brackets escaping
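
A short sketch of why escaping matters, assuming Python's `re` module. `re.escape` is a standard-library helper that escapes metacharacters in a literal string; the price strings are invented for the example.

```python
import re

price = "Total: $3.50 (approx.)"

# Escaped: \$ and \. match the literal characters "$" and "."
literal = re.search(r"\$3\.50", price)

# Unescaped "." matches any character, so "3x50" is accepted by mistake
loose = re.search(r"3.50", "3x50")
strict = re.search(r"3\.50", "3x50")

# re.escape builds the escaped pattern from a plain string
escaped = re.escape("$3.50")
via_escape = re.search(escaped, price)

print(literal.group(), loose is not None, strict, via_escape.group())
```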

Delimiters

^ match must be at start of string/line
$ match must be at end of string/line
^__$ match must be whole string
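
Whether ^ and $ apply to the whole string or to each line depends on the engine and its flags. A minimal sketch assuming Python's `re` module, where `re.MULTILINE` switches to per-line behaviour; the log text is hypothetical.

```python
import re

log = "error: disk full\nwarning: low memory\nerror: timeout"

# Default: ^ matches only at the start of the string
first_only = re.findall(r"^error", log)

# re.MULTILINE: ^ and $ also match at the start and end of every line
per_line = re.findall(r"^error", log, re.MULTILINE)
line_ends = re.findall(r"\w+$", log, re.MULTILINE)

# ^...$ forces the pattern to cover the whole string
is_year = re.search(r"^\d{4}$", "2021") is not None
not_year = re.search(r"^\d{4}$", "2021-02") is not None

print(first_only, per_line, line_ends, is_year, not_year)
```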

Character classes

[__] one of the characters in the class ([ab] --> a or b)
[__]{m,n} consecutive characters in the class ([aeiou]{2} --> ae, ao, ...)
[a-z] one character in the lowercase range a-z
[A-Z] one character in the uppercase range A-Z
[a-zA-Z] one lowercase or uppercase character
[a-z][A-Z] one lowercase character followed by one uppercase character
[^__] anything but the elements of the class (include \n to avoid matching line endings)

^, \, - and ] must be escaped to be used in classes: [ \]\[\^\- ]
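
The class forms above can be tried out as follows. This is a sketch assuming Python's `re` module, with sample words chosen only for illustration. Note that a class always matches a single character; a quantifier is what turns it into a run.

```python
import re

# [aeiou]{2}: two consecutive vowels
vowel_pairs = re.findall(r"[aeiou]{2}", "beautiful queue")

# [a-z][A-Z]: one lowercase letter immediately followed by one uppercase letter
humps = re.findall(r"[a-z][A-Z]", "camelCaseName")

# [^...]: negated class, here anything that is neither a digit nor whitespace
letters = re.findall(r"[^\d\s]+", "a1 b2\nc3")

# Escaped metacharacters inside a class are matched literally
symbols = re.findall(r"[\]\[\^\-]", "a[b]c^d-e")

print(vowel_pairs, humps, letters, symbols)
```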

Groups

(__) REGEX subgroup (capturing group)
(REGEX_1|REGEX_2) alternation: match either regex (R1 OR R2)
(?=__) match only if __ is next substring
(?!__) match only if __ is not next substring
(?<=__) match only if __ is previous substring
(?<!__) match only if __ is not previous substring
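
Lookarounds check the surrounding text without consuming it. A minimal sketch assuming Python's `re` module; the currency strings are hypothetical. The extra `\b` and `\d` in the negative cases are there because the engine would otherwise backtrack to a shorter number that satisfies the assertion.

```python
import re

amounts = "100USD 200EUR 30USD"
prices = "$40 and 50 and $60"

# (?=USD): digits only when "USD" follows; "USD" itself is not part of the match
usd = re.findall(r"\d+(?=USD)", amounts)

# (?!...): digits NOT followed by another digit or by "USD"
not_usd = re.findall(r"\b\d+(?!\d|USD)", amounts)

# (?<=\$): digits only when preceded by "$"
with_dollar = re.findall(r"(?<=\$)\d+", prices)

# (?<!\$): whole numbers NOT preceded by "$"
without_dollar = re.findall(r"(?<!\$)\b\d+", prices)

print(usd, not_usd, with_dollar, without_dollar)
```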

\<number> backreference: matches the same text captured by the n-th group
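
A sketch of group references in use, assuming Python's `re` module, where `\1` works both inside the pattern and in the replacement string of `re.sub`. The sample sentences are invented.

```python
import re

sentence = "this is is a test"

# (\w+) \1 matches a word followed by a space and the same word again
dup = re.search(r"\b(\w+) \1\b", sentence)

# In the replacement, \1 re-inserts the captured word once
deduped = re.sub(r"\b(\w+) \1\b", r"\1", sentence)

# Several groups can be reordered: YYYY-MM-DD becomes DD/MM/YYYY
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2021-02-06")

print(dup.group(), dup.group(1))
print(deduped)
print(swapped)
```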

Special Cases

(.*) match anything
(.*?) match anything, non-greedy match