Regular Expressions (Frank, from https://regex101.com/)

Symbol    Description

\n    Matches a newline character.
\r    Matches a carriage return character.
\t    Matches a tab character (the Tab key).
\0    Matches a null character.
[abc]    Matches any one of the characters a, b or c. Example: /[abc]+/ on "a bb ccc".
[^abc]    Matches any character except a, b or c. Example: /[^abc]+/ on "Anything but abc."
[a-z]    Matches any character between a and z, including a and z. Example: /[a-z]+/ on "Only a-z".
[^a-z]    Matches any character outside the range a-z. Example: /[^a-z]+/ on "Anything but a-z."
[a-zA-Z]    Matches any character in a-z or A-Z (any letter). Ranges can be combined freely. Example: /[a-zA-Z]+/ on "abc123def".
[[:alnum:]]    An alternate way to match any letter or digit. Example: /[[:alnum:]]/ on "1st, 2nd, and 3rd."
[[:alpha:]]    An alternate way to match alphabetic characters (letters). Example: /[[:alpha:]]+/ on "hello, there!".
[[:ascii:]]    Matches any ASCII character. Equivalent to [\x00-\x7F].
[[:blank:]]    Matches spaces and tabs (but not newlines). Example: /[[:blank:]]/.
[[:cntrl:]]    Matches characters that are often used to control text presentation, including newlines, null characters, tabs and the escape character. Equivalent to [\x00-\x1f\x7f].
[[:digit:]]    Matches decimal digits. Equivalent to [0-9]. Example: /[[:digit:]]/ on "one: 1, two: 2".
[[:graph:]]    Matches printable, non-whitespace characters only.
[[:lower:]]    Matches lowercase letters. Equivalent to [a-z]. Example: /[[:lower:]]+/ on "abcdefghi".
[[:print:]]    Matches printable characters, such as letters and spaces, without including control characters.
[[:punct:]]    Matches characters that are not whitespace, letters or numbers.
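The bracket classes above can be tried directly on the table's own sample strings. This is a minimal sketch in Python, assuming the standard `re` module; that module does not implement the POSIX-style `[[:alpha:]]` classes, so only explicit sets and ranges are used here. The variable names are illustrative only.

```python
import re

# Explicit sets and ranges, applied to the sample strings from the table.
any_of_abc = re.findall(r"[abc]+", "a bb ccc")
not_abc = re.findall(r"[^abc]+", "Anything but abc.")
lower_runs = re.findall(r"[a-z]+", "Only a-z")
letter_runs = re.findall(r"[a-zA-Z]+", "abc123def")

print(any_of_abc)   # ['a', 'bb', 'ccc']
print(not_abc)      # ['Anything ', 'ut ', '.']
print(lower_runs)   # ['nly', 'a', 'z']
print(letter_runs)  # ['abc', 'def']
```

Note that a negated set such as `[^abc]` still matches spaces and punctuation, which is why the second result contains "Anything " with its trailing space.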

[[:punct:]]    Example: /[[:punct:]]/ on "hello, regex user!".
[[:space:]]    Matches whitespace characters. Equivalent to \s. Example: /[[:space:]]+/ on "any whitespace character".
[[:upper:]]    Matches uppercase letters. Equivalent to [A-Z]. Example: /[[:upper:]]+/ on "ABCabcDEF".
[[:word:]]    Matches letters, numbers and underscores. Equivalent to \w. Example: /[[:word:]]+/ on "any word character".
[[:xdigit:]]    Matches hexadecimal digits. Equivalent to [0-9a-fA-F]. Example: /[[:xdigit:]]+/ on "hex123!".
.    Matches any character other than newline (or including newline with the /s flag). Example: /.+/.
\s    Matches any whitespace character (space, tab or newline). Example: /\s/ on "any whitespace character".
\S    Matches anything other than a space, tab or newline. Example: /\S+/ on "any non-whitespace".
\d    Matches any decimal digit. Equivalent to [0-9]. Example: /\d/ on "one: 1, two: 2".
\D    Matches anything other than a decimal digit. Example: /\D+/ on "one: 1, two: 2".
\w    Matches any letter, number or underscore. Example: /\w+/ on "any word character".
\W    Matches anything other than a letter, number or underscore. Example: /\W+/ on "any word character".
\X    Matches any valid Unicode sequence.
\C    Matches exactly one data unit of input.
\R    Matches any Unicode newline character.
\v    Matches newlines and vertical tabs. Works with Unicode.
\V    Matches anything not matched by \v.
\h    Matches spaces and horizontal tabs. Works with Unicode. Example: /\h/.
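The shorthand classes above behave as follows on the table's sample strings. This is a minimal sketch in Python, assuming the standard `re` module; the two-line sample text for the dot example and all variable names are made up for illustration. `re.DOTALL` is Python's counterpart of the /s flag.

```python
import re

digits = re.findall(r"\d", "one: 1, two: 2")
non_digits = re.findall(r"\D+", "one: 1, two: 2")
words = re.findall(r"\w+", "any word character")
non_space = re.findall(r"\S+", "any non-whitespace")
spaces = re.findall(r"\s", "any whitespace character")

# The dot stops at a newline unless the "dot matches newline" flag is set.
two_lines = "line one\nline two"  # hypothetical sample text
dot_default = re.findall(r".+", two_lines)
dot_all = re.findall(r".+", two_lines, re.DOTALL)

print(digits)       # ['1', '2']
print(non_digits)   # ['one: ', ', two: ']
print(words)        # ['any', 'word', 'character']
print(non_space)    # ['any', 'non-whitespace']
print(dot_default)  # ['line one', 'line two']
print(dot_all)      # ['line one\nline two']
```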

\H    Matches anything not matched by \h. Example: /\H/.
\K    Sets the given position in the regex as the new "start" of the match. Nothing preceding the \K will be included in the overall match.
\n (n = group number)    Usually referred to as a backreference, this matches a repeat of the text captured in a previous set of parentheses.
\pX    Matches a Unicode character with the given property. Example: /\pL+/.
\p{...}    Matches a Unicode character with the given group of properties. Example: /\p{L}+/.
\PX    Matches a Unicode character without the given property. Example: /\PL/.
\P{...}    Matches a Unicode character that doesn't have any of the given properties. Example: /\P{L}/.
\Q...\E    Any characters between \Q and \E, including metacharacters, are treated as literals. Example: /\Qeverything \w is ^ literal\E/ on "everything \w is ^ literal".
\k<name>    Matches the text matched by a previously named capture group.
\k'name'    An alternate syntax for \k<name>.
\k{name}    An alternate syntax for \k<name>.
\gn    Matches the text captured in the nth group. n can contain more than one digit, if necessary. This may be useful to avoid ambiguity with octal characters.
\g{n}    An alternate syntax for \gn. It is useful when a literal number needs to be matched immediately after a \gn in the regex.
\g{-n}    Matches the text captured in the nth group before the current position in the regex.
\g'name'    Recursively matches the given named subpattern.
\g<n>    Recursively matches the given subpattern.
\g'n'    Alternate syntax for \g<n>.
\g<+n>    Recursively matches the nth pattern ahead of the current position in the regex.
\g'+n'    Alternate syntax for \g<+n>.
\xYY    Matches the 8-bit character with the given hex value. Example: /\x20/ on "match all spaces".
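Backreferences and literal quoting from the entries above can be sketched in Python, assuming the standard `re` module. That module supports numeric backreferences (`\1`) and its own named form `(?P=name)`, but not `\Q...\E`; `re.escape` is used below as a rough stand-in for quoting. The sample sentences and variable names are made up for illustration.

```python
import re

# \1 repeats whatever the first group captured: this finds doubled words.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

# Named backreference: the closing quote must equal the opening quote.
quoted = re.search(r"(?P<q>['\"]).*?(?P=q)", 'say "hi" now').group(0)

# re.escape backslash-escapes metacharacters so the text matches literally.
literal_text = r"everything \w is ^ literal"
literal_match = re.fullmatch(re.escape(literal_text), literal_text)

# \x20 is the hex escape for a space character.
hex_spaces = re.findall(r"\x20", "match all spaces")

print(doubled)                    # ['is', 'test']
print(quoted)                     # "hi"
print(literal_match is not None)  # True
print(len(hex_spaces))            # 2
```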

\x{YYYY}    Matches the 16-bit character with the given hex value.
\ddd    Matches the 8-bit character with the given octal value. Example: /\041/ on "octal escape!".
\cY    Matches ASCII characters typically associated with Control+A through Control+Z: \x01 through \x1a.
[\b]    Matches the backspace control character.
\    Escape character: used to obtain the literal value of any metacharacter. Example: /\\w/ on "match \w literally".

(...)    Capturing group: parts of the regex enclosed in parentheses may be referred to later in the expression or extracted from the results of a successful match. Example: /(he)+/ on "heheh he heh".
(a|b)    Matches the a or the b part of the subexpression.
(?:...)    Similar to (...), but does not create a capture group. Example: /(?:he)+/ on "heheh he heh".
(?>...)    Matches the longest possible substring in the group and doesn't allow later backtracking to reevaluate the group.
(?|...)    Any subpatterns in (...) in such a group share the same number.
(?#...)    Comment: any text appearing in this group is ignored in the regex.
(?'name'...)    Named capturing group: can be referred to using the given name instead of a number.
(?<name>...)    Named capturing group: can be referred to using the given name instead of a number.
(?P<name>...)    Named capturing group: can be referred to using the given name instead of a number.
(?imsxXU)    Sets regex flags within the expression itself. Example: /a(?i)a/ on "aa Aa aA AA".
(?(...)|...)    If the given pattern matches, matches the pattern before the vertical bar. Otherwise, matches the pattern after the vertical bar.
(?R)    Recursively matches the entire expression.
(?1)    Recursively matches the first subpattern.
(?+1)    Recursively matches the first pattern following the given position in the expression.
(?&name)    Recursively matches the given named subpattern.
(?P=name)    Matches the text matched by a previously named capture group.
(?P>name)    Recursively matches the given named subpattern.
(?=...)    Matches the given subpattern without consuming characters. Example: /foo(?=bar)/ on "foobar foobaz".
(?!...)    Starting at the current position in the expression, ensures that the given pattern will not match. Does not consume characters. Example: /foo(?!bar)/ on "foobar foobaz".
(?<=...)    Ensures that the given pattern will match, ending at the current position in the expression. Does not consume any characters. Example: /(?<=foo)bar/ on "foobar foobaz".
(?<!...)    Ensures that the given pattern would not match and end at the current position in the expression. Does not consume characters. Example: /(?<!not )foo/ on "not foo but foo".
(*UTF16)    Verbs allow for advanced control of the regex engine. Full specs can be found in pcre.txt.

a?    Matches an `a` character or nothing (zero or one occurrence). Example: /ba?/ on "ba b a".
a*    Matches zero or more consecutive `a` characters. Example: /ba*/ on "a ba baa aaa ba b".
a+    Matches one or more consecutive `a` characters. Example: /a+/ on "a aa aaa aaaa bab baab".
a{3}    Matches exactly 3 consecutive `a` characters. Example: /a{3}/ on "a aa aaa aaaa".
a{3,}    Matches at least 3 consecutive `a` characters. Example: /a{3,}/ on "a aa aaa aaaa aaaaaa".
a{3,6}    Matches between 3 and 6 (inclusive) consecutive `a` characters. Example: /a{3,6}/ on "a aa aaa aaaa aaaaaaaaaa".
a.*    Greedy: matches as many characters as possible (. is any character, so .* is a run of any length). Example: /a.*a/ on "greedy can be dangerous at times".
a*?    Lazy: matches as few characters as possible. Example: /r\w*?/ on "r re regex".
a*+    Possessive: matches as many characters as possible; backtracking can't reduce the number of characters matched.

\G    Matches at the position where the previous successful match ended. Useful with the /g flag.
^    Matches the start of a string without consuming any characters. If multiline mode is used, this will also match immediately after a newline character. Example: /^\w+/ on "start of string".
$    Matches the end of a string without consuming any characters. If multiline mode is used, this will also match immediately before a newline character. Example: /\w+$/ on "end of string".
\A    Matches the start of a string only. Unlike ^, this is not affected by multiline mode. Example: /\A\w+/ on "start of string".
\Z    Matches the end of a string only. Unlike $, this is not affected by multiline mode. Example: /\w+\Z/ on "end of string".
\z    Matches the end of a string only. Unlike $, this is not affected by multiline mode, and, in contrast to \Z, will not match before a trailing newline at the end of a string. Example: /\w+\z/ on "absolute end of string".
\b    Matches, without consuming any characters, immediately between a character matched by \w and a character not matched by \w (in either order). Example: /d\b/ on "word boundaries are odd".
\B    Matches, without consuming any characters, at the position between two characters matched by \w (more generally, at any position where \b does not match). Example: /r\B/ on "regex is really cool".

Flags:
g    Tells the engine not to stop after the first match has been found, but rather to continue until no more matches can be found.
m    The ^ and $ anchors now match at the beginning/end of each line respectively, instead of the beginning/end of the entire string.
i    A case-insensitive match is performed, meaning capital letters will be matched by non-capital letters and vice versa.
x    Tells the engine to ignore all whitespace and allow for comments in the regex. Comments are introduced by a "#" character.
s    With this flag enabled, the dot (.) metacharacter also matches newlines.
u    Pattern strings will be treated as UTF-16.
X    Any character following a \ that is not a valid meta sequence will be faulted and raise an error.
U    The engine does lazy matching by default, instead of greedy. A ? following a quantifier then makes it greedy instead.
A    The pattern is forced to become anchored, equal to ^.

Substitution:
\0    Inserts the complete match result from the regex.
\1    Inserts the contents of the first capture group. The number, in this case 1, can be any number as long as it corresponds to a valid capture group.
$1    Inserts the contents of the first capture group. The number, in this case 1, can be any number as long as it corresponds to a valid capture group.
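The lookaround, greedy/lazy, anchor and flag entries above can be checked with a short experiment. This is a minimal sketch in Python, assuming the standard `re` module, which covers only part of the PCRE syntax listed here. Two differences are worth noting as assumptions of this sketch: Python's `\Z` matches only at the very end of the string (like the `\z` entry above), and inline flags in the middle of a pattern are written in the scoped form `(?i:...)`. The two-line sample text and the variable names are made up for illustration.

```python
import re

# Lookahead and lookbehind assert context without consuming it.
foo_before_bar = [m.start() for m in re.finditer(r"foo(?=bar)", "foobar foobaz")]
foo_not_before_bar = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]
bar_after_foo = re.findall(r"(?<=foo)bar", "foobar foobaz")
foo_not_after_not = [m.start() for m in re.finditer(r"(?<!not )foo", "not foo but foo")]

# Greedy takes the longest match, lazy takes the shortest.
sentence = "greedy can be dangerous at times"
greedy = re.search(r"a.*a", sentence).group(0)
lazy = re.search(r"a.*?a", sentence).group(0)

# ^ is affected by multiline mode, \A is not.
text = "first line\nsecond line"  # hypothetical sample text
caret_default = re.findall(r"^\w+", text)
caret_multiline = re.findall(r"^\w+", text, re.MULTILINE)
start_only = re.findall(r"\A\w+", text, re.MULTILINE)

# $ may match before a trailing newline; Python's \Z may not.
dollar = re.search(r"\w+$", "end of string\n")
very_end = re.search(r"\w+\Z", "end of string\n")

# Scoped inline flag: only the second "a" is case-insensitive.
mixed_case = re.findall(r"a(?i:a)", "aa Aa aA AA")

print(foo_before_bar, foo_not_before_bar)  # [0] [7]
print(bar_after_foo, foo_not_after_not)    # ['bar'] [12]
print(greedy)                              # an be dangerous a
print(lazy)                                # an be da
print(caret_default, caret_multiline)      # ['first'] ['first', 'second']
print(mixed_case)                          # ['aa', 'aA']
```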

${foo}    Inserts the contents of the capture group named `foo`. Any name can be used as long as it is defined in the regex. This syntax is made up and specific to Regex101. If the J flag is specified, content is taken from the first capture group with the same name.
\{foo}    Same as ${foo}; this syntax is also made up and specific to Regex101.
\g<foo>    Inserts the contents of the capture group named `foo`. Any name can be used as long as it is defined in the regex. If the J flag is specified, content is taken from the first capture group with the same name.
\g<1>    Inserts the contents of the first capture group. The number, in this case 1, can be any number as long as it corresponds to a valid capture group.
\x20    Hexadecimal escapes can be used to insert any character into the replacement string using the standard syntax.
\x{06fa}    Hexadecimal escapes can be used to insert any character into the replacement string using the standard syntax.
\t    Inserts a tab character.
\r    Inserts a carriage return character.
\n    Inserts a newline character.
\f    Inserts a form-feed character.
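The replacement tokens above can be sketched with Python's `re.sub`, assuming the standard `re` module. Python accepts `\1`, `\g<1>`, `\g<name>` and `\g<0>` (the whole match) in replacement strings; the `$1`, `${foo}` and `\{foo}` forms are not recognised there. The sample inputs and group names are made up for illustration.

```python
import re

# Numbered groups: swap two words.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# Named groups, referenced with \g<name>.
swapped_named = re.sub(
    r"(?P<first>\w+) (?P<second>\w+)",
    r"\g<second> \g<first>",
    "hello world",
)

# \g<1> avoids ambiguity when a literal digit follows the group reference.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

# \g<0> stands for the complete match.
bracketed = re.sub(r"\w+", r"[\g<0>]", "a bc")

# Escapes such as \t in the replacement are turned into the real character.
tabbed = re.sub(r", ", r"\t", "a, b")

print(swapped)        # world hello
print(swapped_named)  # world hello
print(padded)         # a10b20
print(bracketed)      # [a] [bc]
print(repr(tabbed))   # 'a\tb'
```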

Regular Expressions, Frank, from https://regex101.com/