| RE2 regular expression syntax reference |
| ------------------------------------- |
| |
| Single characters: |
| . any character, possibly including newline («s»=true) |
| [xyz] character class |
| [^xyz] negated character class |
| \d Perl character class |
| \D negated Perl character class |
| [[:alpha:]] ASCII character class |
| [[:^alpha:]] negated ASCII character class |
| \pN Unicode character class (one-letter name) |
| \p{Greek} Unicode character class |
| \PN negated Unicode character class (one-letter name) |
| \P{Greek} negated Unicode character class |
| |
| Composites: |
| xy «x» followed by «y» |
| x|y «x» or «y» (prefer «x») |
| |
| Repetitions: |
| x* zero or more «x», prefer more |
| x+ one or more «x», prefer more |
| x? zero or one «x», prefer one |
| x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more |
| x{n,} «n» or more «x», prefer more |
| x{n} exactly «n» «x» |
| x*? zero or more «x», prefer fewer |
| x+? one or more «x», prefer fewer |
| x?? zero or one «x», prefer zero |
| x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer |
| x{n,}? «n» or more «x», prefer fewer |
| x{n}? exactly «n» «x» |
| x{} (== x*) NOT SUPPORTED vim |
| x{-} (== x*?) NOT SUPPORTED vim |
| x{-n} (== x{n}?) NOT SUPPORTED vim |
| x= (== x?) NOT SUPPORTED vim |
| |
| Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}» |
| are rejected if the minimum or maximum repetition count exceeds 1000. |
| Unlimited repetitions are not subject to this restriction. |
| |
| Possessive repetitions: |
| x*+ zero or more «x», possessive NOT SUPPORTED |
| x++ one or more «x», possessive NOT SUPPORTED |
| x?+ zero or one «x», possessive NOT SUPPORTED |
| x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED |
| x{n,}+ «n» or more «x», possessive NOT SUPPORTED |
| x{n}+ exactly «n» «x», possessive NOT SUPPORTED |
| |
| Grouping: |
| (re) numbered capturing group (submatch) |
| (?P<name>re) named & numbered capturing group (submatch) |
| (?<name>re) named & numbered capturing group (submatch) NOT SUPPORTED |
| (?'name're) named & numbered capturing group (submatch) NOT SUPPORTED |
| (?:re) non-capturing group |
| (?flags) set flags within current group; non-capturing |
| (?flags:re) set flags during re; non-capturing |
| (?#text) comment NOT SUPPORTED |
| (?|x|y|z) branch numbering reset NOT SUPPORTED |
| (?>re) possessive match of «re» NOT SUPPORTED |
| re@> possessive match of «re» NOT SUPPORTED vim |
| %(re) non-capturing group NOT SUPPORTED vim |
| |
| Flags: |
| i case-insensitive (default false) |
| m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false) |
| s let «.» match «\n» (default false) |
| U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false) |
| Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»). |
| |
| Empty strings: |
| ^ at beginning of text or line («m»=true) |
| $ at end of text (like «\z» not «\Z») or line («m»=true) |
| \A at beginning of text |
| \b at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other) |
| \B not at ASCII word boundary |
| \G at beginning of subtext being searched NOT SUPPORTED pcre |
| \G at end of last match NOT SUPPORTED perl |
| \Z at end of text, or before newline at end of text NOT SUPPORTED |
| \z at end of text |
| (?=re) before text matching «re» NOT SUPPORTED |
| (?!re) before text not matching «re» NOT SUPPORTED |
| (?<=re) after text matching «re» NOT SUPPORTED |
| (?<!re) after text not matching «re» NOT SUPPORTED |
| re& before text matching «re» NOT SUPPORTED vim |
| re@= before text matching «re» NOT SUPPORTED vim |
| re@! before text not matching «re» NOT SUPPORTED vim |
| re@<= after text matching «re» NOT SUPPORTED vim |
| re@<! after text not matching «re» NOT SUPPORTED vim |
| \zs sets start of match (= \K) NOT SUPPORTED vim |
| \ze sets end of match NOT SUPPORTED vim |
| \%^ beginning of file NOT SUPPORTED vim |
| \%$ end of file NOT SUPPORTED vim |
| \%V on screen NOT SUPPORTED vim |
| \%# cursor position NOT SUPPORTED vim |
| \%'m mark «m» position NOT SUPPORTED vim |
| \%23l in line 23 NOT SUPPORTED vim |
| \%23c in column 23 NOT SUPPORTED vim |
| \%23v in virtual column 23 NOT SUPPORTED vim |
| |
| Escape sequences: |
| \a bell (== \007) |
| \f form feed (== \014) |
| \t horizontal tab (== \011) |
| \n newline (== \012) |
| \r carriage return (== \015) |
| \v vertical tab character (== \013) |
| \* literal «*», for any punctuation character «*» |
| \123 octal character code (up to three digits) |
| \x7F hex character code (exactly two digits) |
| \x{10FFFF} hex character code |
| \C match a single byte even in UTF-8 mode |
| \Q...\E literal text «...» even if «...» has punctuation |
| |
| \1 backreference NOT SUPPORTED |
| \b backspace NOT SUPPORTED (use «\010») |
| \cK control char ^K NOT SUPPORTED (use an octal escape, e.g. «\013» for ^K) |
| \e escape NOT SUPPORTED (use «\033») |
| \g1 backreference NOT SUPPORTED |
| \g{1} backreference NOT SUPPORTED |
| \g{+1} backreference NOT SUPPORTED |
| \g{-1} backreference NOT SUPPORTED |
| \g{name} named backreference NOT SUPPORTED |
| \g<name> subroutine call NOT SUPPORTED |
| \g'name' subroutine call NOT SUPPORTED |
| \k<name> named backreference NOT SUPPORTED |
| \k'name' named backreference NOT SUPPORTED |
| \lX lowercase «X» NOT SUPPORTED |
| \ux uppercase «x» NOT SUPPORTED |
| \L...\E lowercase text «...» NOT SUPPORTED |
| \K reset beginning of «$0» NOT SUPPORTED |
| \N{name} named Unicode character NOT SUPPORTED |
| \R line break NOT SUPPORTED |
| \U...\E upper case text «...» NOT SUPPORTED |
| \X extended Unicode sequence NOT SUPPORTED |
| |
| \%d123 decimal character 123 NOT SUPPORTED vim |
| \%xFF hex character FF NOT SUPPORTED vim |
| \%o123 octal character 123 NOT SUPPORTED vim |
| \%u1234 Unicode character 0x1234 NOT SUPPORTED vim |
| \%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim |
| |
| Character class elements: |
| x single character |
| A-Z character range (inclusive) |
| \d Perl character class |
| [:foo:] ASCII character class «foo» |
| \p{Foo} Unicode character class «Foo» |
| \pF Unicode character class «F» (one-letter name) |
| |
| Named character classes as character class elements: |
| [\d] digits (== \d) |
| [^\d] not digits (== \D) |
| [\D] not digits (== \D) |
| [^\D] not not digits (== \d) |
| [[:name:]] named ASCII class inside character class (== [:name:]) |
| [^[:name:]] named ASCII class inside negated character class (== [:^name:]) |
| [\p{Name}] named Unicode property inside character class (== \p{Name}) |
| [^\p{Name}] named Unicode property inside negated character class (== \P{Name}) |
| |
| Perl character classes (all ASCII-only): |
| \d digits (== [0-9]) |
| \D not digits (== [^0-9]) |
| \s whitespace (== [\t\n\f\r ]) |
| \S not whitespace (== [^\t\n\f\r ]) |
| \w word characters (== [0-9A-Za-z_]) |
| \W not word characters (== [^0-9A-Za-z_]) |
| |
| \h horizontal space NOT SUPPORTED |
| \H not horizontal space NOT SUPPORTED |
| \v vertical space NOT SUPPORTED |
| \V not vertical space NOT SUPPORTED |
| |
| ASCII character classes: |
| [[:alnum:]] alphanumeric (== [0-9A-Za-z]) |
| [[:alpha:]] alphabetic (== [A-Za-z]) |
| [[:ascii:]] ASCII (== [\x00-\x7F]) |
| [[:blank:]] blank (== [\t ]) |
| [[:cntrl:]] control (== [\x00-\x1F\x7F]) |
| [[:digit:]] digits (== [0-9]) |
| [[:graph:]] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]) |
| [[:lower:]] lower case (== [a-z]) |
| [[:print:]] printable (== [ -~] == [ [:graph:]]) |
| [[:punct:]] punctuation (== [!-/:-@[-`{-~]) |
| [[:space:]] whitespace (== [\t\n\v\f\r ]) |
| [[:upper:]] upper case (== [A-Z]) |
| [[:word:]] word characters (== [0-9A-Za-z_]) |
| [[:xdigit:]] hex digit (== [0-9A-Fa-f]) |
| |
| Unicode character class names--general category: |
| C other |
| Cc control |
| Cf format |
| Cn unassigned code points NOT SUPPORTED |
| Co private use |
| Cs surrogate |
| L letter |
| LC cased letter NOT SUPPORTED |
| L& cased letter NOT SUPPORTED |
| Ll lowercase letter |
| Lm modifier letter |
| Lo other letter |
| Lt titlecase letter |
| Lu uppercase letter |
| M mark |
| Mc spacing mark |
| Me enclosing mark |
| Mn non-spacing mark |
| N number |
| Nd decimal number |
| Nl letter number |
| No other number |
| P punctuation |
| Pc connector punctuation |
| Pd dash punctuation |
| Pe close punctuation |
| Pf final punctuation |
| Pi initial punctuation |
| Po other punctuation |
| Ps open punctuation |
| S symbol |
| Sc currency symbol |
| Sk modifier symbol |
| Sm math symbol |
| So other symbol |
| Z separator |
| Zl line separator |
| Zp paragraph separator |
| Zs space separator |
| |
| Unicode character class names--scripts: |
| Adlam |
| Ahom |
| Anatolian_Hieroglyphs |
| Arabic |
| Armenian |
| Avestan |
| Balinese |
| Bamum |
| Bassa_Vah |
| Batak |
| Bengali |
| Bhaiksuki |
| Bopomofo |
| Brahmi |
| Braille |
| Buginese |
| Buhid |
| Canadian_Aboriginal |
| Carian |
| Caucasian_Albanian |
| Chakma |
| Cham |
| Cherokee |
| Common |
| Coptic |
| Cuneiform |
| Cypriot |
| Cyrillic |
| Deseret |
| Devanagari |
| Dogra |
| Duployan |
| Egyptian_Hieroglyphs |
| Elbasan |
| Ethiopic |
| Georgian |
| Glagolitic |
| Gothic |
| Grantha |
| Greek |
| Gujarati |
| Gunjala_Gondi |
| Gurmukhi |
| Han |
| Hangul |
| Hanifi_Rohingya |
| Hanunoo |
| Hatran |
| Hebrew |
| Hiragana |
| Imperial_Aramaic |
| Inherited |
| Inscriptional_Pahlavi |
| Inscriptional_Parthian |
| Javanese |
| Kaithi |
| Kannada |
| Katakana |
| Kayah_Li |
| Kharoshthi |
| Khmer |
| Khojki |
| Khudawadi |
| Lao |
| Latin |
| Lepcha |
| Limbu |
| Linear_A |
| Linear_B |
| Lisu |
| Lycian |
| Lydian |
| Mahajani |
| Makasar |
| Malayalam |
| Mandaic |
| Manichaean |
| Marchen |
| Masaram_Gondi |
| Medefaidrin |
| Meetei_Mayek |
| Mende_Kikakui |
| Meroitic_Cursive |
| Meroitic_Hieroglyphs |
| Miao |
| Modi |
| Mongolian |
| Mro |
| Multani |
| Myanmar |
| Nabataean |
| New_Tai_Lue |
| Newa |
| Nko |
| Nushu |
| Ogham |
| Ol_Chiki |
| Old_Hungarian |
| Old_Italic |
| Old_North_Arabian |
| Old_Permic |
| Old_Persian |
| Old_Sogdian |
| Old_South_Arabian |
| Old_Turkic |
| Oriya |
| Osage |
| Osmanya |
| Pahawh_Hmong |
| Palmyrene |
| Pau_Cin_Hau |
| Phags_Pa |
| Phoenician |
| Psalter_Pahlavi |
| Rejang |
| Runic |
| Samaritan |
| Saurashtra |
| Sharada |
| Shavian |
| Siddham |
| SignWriting |
| Sinhala |
| Sogdian |
| Sora_Sompeng |
| Soyombo |
| Sundanese |
| Syloti_Nagri |
| Syriac |
| Tagalog |
| Tagbanwa |
| Tai_Le |
| Tai_Tham |
| Tai_Viet |
| Takri |
| Tamil |
| Tangut |
| Telugu |
| Thaana |
| Thai |
| Tibetan |
| Tifinagh |
| Tirhuta |
| Ugaritic |
| Vai |
| Warang_Citi |
| Yi |
| Zanabazar_Square |
| |
| Vim character classes: |
| \i identifier character NOT SUPPORTED vim |
| \I «\i» except digits NOT SUPPORTED vim |
| \k keyword character NOT SUPPORTED vim |
| \K «\k» except digits NOT SUPPORTED vim |
| \f file name character NOT SUPPORTED vim |
| \F «\f» except digits NOT SUPPORTED vim |
| \p printable character NOT SUPPORTED vim |
| \P «\p» except digits NOT SUPPORTED vim |
| \s whitespace character (== [ \t]) NOT SUPPORTED vim |
| \S non-white space character (== [^ \t]) NOT SUPPORTED vim |
| \d digits (== [0-9]) vim |
| \D not «\d» vim |
| \x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim |
| \X not «\x» NOT SUPPORTED vim |
| \o octal digits (== [0-7]) NOT SUPPORTED vim |
| \O not «\o» NOT SUPPORTED vim |
| \w word character vim |
| \W not «\w» vim |
| \h head of word character NOT SUPPORTED vim |
| \H not «\h» NOT SUPPORTED vim |
| \a alphabetic NOT SUPPORTED vim |
| \A not «\a» NOT SUPPORTED vim |
| \l lowercase NOT SUPPORTED vim |
| \L not lowercase NOT SUPPORTED vim |
| \u uppercase NOT SUPPORTED vim |
| \U not uppercase NOT SUPPORTED vim |
| \_x «\x» plus newline, for any «x» NOT SUPPORTED vim |
| |
| Vim flags: |
| \c ignore case NOT SUPPORTED vim |
| \C match case NOT SUPPORTED vim |
| \m magic NOT SUPPORTED vim |
| \M nomagic NOT SUPPORTED vim |
| \v verymagic NOT SUPPORTED vim |
| \V verynomagic NOT SUPPORTED vim |
| \Z ignore differences in Unicode combining characters NOT SUPPORTED vim |
| |
| Magic: |
| (?{code}) arbitrary Perl code NOT SUPPORTED perl |
| (??{code}) postponed arbitrary Perl code NOT SUPPORTED perl |
| (?n) recursive call to regexp capturing group «n» NOT SUPPORTED |
| (?+n) recursive call to relative group «+n» NOT SUPPORTED |
| (?-n) recursive call to relative group «-n» NOT SUPPORTED |
| (?C) PCRE callout NOT SUPPORTED pcre |
| (?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED |
| (?&name) recursive call to named group NOT SUPPORTED |
| (?P=name) named backreference NOT SUPPORTED |
| (?P>name) recursive call to named group NOT SUPPORTED |
| (?(cond)true|false) conditional branch NOT SUPPORTED |
| (?(cond)true) conditional branch NOT SUPPORTED |
| (*ACCEPT) make regexps more like Prolog NOT SUPPORTED |
| (*COMMIT) NOT SUPPORTED |
| (*F) NOT SUPPORTED |
| (*FAIL) NOT SUPPORTED |
| (*MARK) NOT SUPPORTED |
| (*PRUNE) NOT SUPPORTED |
| (*SKIP) NOT SUPPORTED |
| (*THEN) NOT SUPPORTED |
| (*ANY) set newline convention NOT SUPPORTED |
| (*ANYCRLF) NOT SUPPORTED |
| (*CR) NOT SUPPORTED |
| (*CRLF) NOT SUPPORTED |
| (*LF) NOT SUPPORTED |
| (*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre |
| (*BSR_UNICODE) NOT SUPPORTED pcre |
| |
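
The table above can be exercised with a short program. The following is a minimal sketch, assuming Go's standard `regexp` package as the engine: its documentation states that it accepts the RE2 syntax except for «\C», so the C++ RE2 library may differ in details. The helper names `compiles`, `firstMatch`, and `namedGroups` are hypothetical and exist only for this illustration. It shows greedy versus non-greedy repetition, the «i», «m», «s», and «U» flags, named groups, a Unicode script class, the repetition limit of 1000, and a few forms marked NOT SUPPORTED.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether the engine accepts the pattern.
// (Hypothetical helper, not part of any library.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// firstMatch returns the leftmost match of pattern in text, or "" if none.
// It panics on an invalid pattern, so only pass patterns known to compile.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// namedGroups maps each named capturing group to the text it captured.
func namedGroups(pattern, text string) map[string]string {
	re := regexp.MustCompile(pattern)
	out := map[string]string{}
	m := re.FindStringSubmatch(text)
	if m == nil {
		return out
	}
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	// Repetitions: «x+» prefers more, «x+?» prefers fewer.
	fmt.Println(firstMatch(`a+`, "aaa"))  // aaa
	fmt.Println(firstMatch(`a+?`, "aaa")) // a

	// Flags: «U» swaps greedy and non-greedy, «i» ignores case,
	// «s» lets «.» match a newline, «m» makes «^» and «$» match at line ends.
	fmt.Println(firstMatch(`(?U)a+`, "aaa"))
	fmt.Println(firstMatch(`(?i)re2`, "uses RE2"))
	fmt.Printf("%q\n", firstMatch(`(?s)a.b`, "a\nb"))
	fmt.Println(firstMatch(`(?m)^\w+$`, "one two\nthree"))

	// Grouping: (?P<name>re) is both named and numbered.
	fmt.Println(namedGroups(`(?P<key>\w+)=(?P<val>\d+)`, "size=42"))

	// Unicode character class by script name.
	fmt.Println(firstMatch(`\p{Greek}+`, "abc αβγ"))

	// Implementation restriction: counts above 1000 are rejected.
	fmt.Println(compiles(`x{1000}`), compiles(`x{1001}`))

	// Forms marked NOT SUPPORTED fail to compile:
	// backreference, lookahead, possessive repetition.
	fmt.Println(compiles(`(a)\1`), compiles(`a(?=b)`), compiles(`a*+`))
}
```

Checking `compiles` before relying on a pattern is a cheap way to find out whether a construct ported from Perl, PCRE, or Vim falls under one of the NOT SUPPORTED rows.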