Regular Expressions. Frank, from https://regex101.com/


Each entry gives the symbol, its description and, where the source has one, an example pattern with its test string.

\n
    Matches a newline character.
\r
    Matches a carriage return character.
\t
    Matches a tab character.
\0
    Matches a null character.
[abc]
    Matches either an a, b or c character.
    Example: /[abc]+/ on "a bb ccc"
[^abc]
    Matches any character except for an a, b or c.
    Example: /[^abc]+/ on "Anything but abc."
[a-z]
    Matches any character between a and z, including a and z.
    Example: /[a-z]+/ on "Only a-z"
[^a-z]
    Matches any character except one in the range a-z.
    Example: /[^a-z]+/ on "Anything but a-z."
[a-zA-Z]
    Matches any character in a-z or A-Z. Ranges can be combined as much as you please.
    Example: /[a-zA-Z]+/ on "abc123def"
[[:alnum:]]
    An alternate way to match any letter or digit.
    Example: /[[:alnum:]]/ on "1st, 2nd, and 3rd."
[[:alpha:]]
    An alternate way to match alphabetic characters.
    Example: /[[:alpha:]]+/ on "hello, there!"
[[:ascii:]]
    Matches ASCII characters. Equivalent to [\x00-\x7F].
[[:blank:]]
    Matches spaces and tabs (but not newlines).
    Example: /[[:blank:]]/
[[:cntrl:]]
    Matches characters that are often used to control text presentation, including newlines, null characters, tabs and the escape character. Equivalent to [\x00-\x1f\x7f].
[[:digit:]]
    Matches decimal digits. Equivalent to [0-9].
    Example: /[[:digit:]]/ on "one: 1, two: 2"
[[:graph:]]
    Matches printable, non-whitespace characters only.
[[:lower:]]
    Matches lowercase letters. Equivalent to [a-z].
    Example: /[[:lower:]]+/ on "abcdefghi"
[[:print:]]
    Matches printable characters, such as letters and spaces, without including control characters.
[[:punct:]]
    Matches characters that are not whitespace, letters or numbers.
    Example: /[[:punct:]]/ on "hello, regex user!"
[[:space:]]
    Matches whitespace characters. Equivalent to \s.
    Example: /[[:space:]]+/ on "any whitespace character"
[[:upper:]]
    Matches uppercase letters. Equivalent to [A-Z].
    Example: /[[:upper:]]+/ on "ABCabcDEF"
[[:word:]]
    Matches letters, numbers and underscores. Equivalent to \w.
    Example: /[[:word:]]+/ on "any word character"
[[:xdigit:]]
    Matches hexadecimal digits. Equivalent to [0-9a-fA-F].
    Example: /[[:xdigit:]]+/ on "hex123!"
.
    Matches any character other than newline (or including newline with the /s flag).
    Example: /.+/
\s
    Matches any space, tab or newline character.
    Example: /\s/ on "any whitespace character"
\S
    Matches anything other than a space, tab or newline.
    Example: /\S+/ on "any non-whitespace"
\d
    Matches any decimal digit. Equivalent to [0-9].
    Example: /\d/ on "one: 1, two: 2"
\D
    Matches anything other than a decimal digit.
    Example: /\D+/ on "one: 1, two: 2"
\w
    Matches any letter, number or underscore.
    Example: /\w+/ on "any word character"
\W
    Matches anything other than a letter, number or underscore.
    Example: /\W+/ on "any word character"
\X
    Matches any valid unicode sequence.
\C
    Matches exactly one data unit of input.
\R
    Matches any unicode newline character.
\v
    Matches newlines and vertical tabs. Works with unicode.
\V
    Matches anything not matched by \v.
\h
    Matches spaces and horizontal tabs. Works with unicode.
    Example: /\h/
\H
    Matches anything not matched by \h.
    Example: /\H/
\K
    Sets the given position in the regex as the new "start" of the match. Nothing preceding the \K will be included in the overall match.
\n (where n is a group number)
    Usually referred to as a backreference: matches a repeat of the text captured in a previous set of parentheses.
\pX
    Matches a unicode character with the given property.
    Example: /\pL+/
\p{...}
    Matches a unicode character with the given group of properties.
    Example: /\p{L}+/
\PX
    Matches a unicode character without the given property.
    Example: /\PL/
\P{...}
    Matches a unicode character that doesn't have any of the given properties.
    Example: /\P{L}/
\Q...\E
    Any characters between \Q and \E, including metacharacters, are treated as literals.
    Example: /\Qeverything \w is ^ literal\E/ on "everything \w is ^ literal"
\k<name>
    Matches the text matched by a previously named capture group.
\k'name'
    Alternate syntax for \k<name>.
\k{name}
    Alternate syntax for \k<name>.
\gn
    Matches the text captured in the nth group. n can contain more than one digit, if necessary. This may be useful to avoid ambiguity with octal characters.
\g{n}
    Alternate syntax for \gn. Useful where a literal number needs to be matched immediately after a \gn in the regex.
\g{-n}
    Matches the text captured in the nth group before the current position in the regex.
\g'name'
    Recursively matches the given named subpattern.
\g<n>
    Recursively matches the given subpattern.
\g'n'
    Alternate syntax for \g<n>.
\g<+n>
    Recursively matches the nth pattern ahead of the current position in the regex.
\g'+n'
    Alternate syntax for \g<+n>.
\xYY
    Matches the 8-bit character with the given hex value.
    Example: /\x20/ on "match all spaces"
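A few of the entries above can be tried out directly. The following is a minimal sketch in Python, chosen here only for illustration: Python's `re` module is a different engine from the PCRE flavour this reference describes, and it does not accept POSIX classes such as [[:alpha:]], nor \K, \h, \R or \Q...\E. The variable names and most test strings are made up for the example.

```python
import re

# Character sets and negated sets.
letters = re.findall(r"[a-zA-Z]+", "abc123def")      # ['abc', 'def']
non_lower = re.findall(r"[^a-z]+", "abc123def")      # ['123']

# Shorthand classes \d and \w.
digits = re.findall(r"\d", "one: 1, two: 2")         # ['1', '2']
words = re.findall(r"\w+", "hello, regex user!")     # ['hello', 'regex', 'user']

# Numbered backreference: \1 must repeat whatever group 1 captured.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")

# Named backreference; Python spells it (?P=name) rather than \k<name>.
quoted = re.search(r'''(?P<q>['"]).*?(?P=q)''', 'say "hi" now')

# Hex escape: \x20 is the space character.
spaces = re.findall(r"\x20", "a b c")

print(letters, non_lower, digits, words)
print(doubled.group(0), quoted.group(0), len(spaces))
```

Here `doubled.group(0)` is the repeated word pair "is is", and `quoted.group(0)` is the quoted text including its quotes.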

\x{YYYY}
    Matches the 16-bit character with the given hex value.
\ddd
    Matches the 8-bit character with the given octal value.
    Example: /\041/ on "octal escape!"
\cY
    Matches ASCII characters typically associated with Control+A through Control+Z: \x01 through \x1a.
[\b]
    Matches the backspace control character.
\
    Escape character. May be used to obtain the literal value of any metacharacter.
    Example: /\\w/ on "match \w literally"
(...)
    Capturing group. Parts of the regex enclosed in parentheses may be referred to later in the expression or extracted from the results of a successful match.
    Example: /(he)+/ on "heheh he heh"
(a|b)
    Matches the a or the b part of the subexpression.
(?:...)
    Similar to (...), but won't create a capture group.
    Example: /(?:he)+/ on "heheh he heh"
(?>...)
    Matches the longest possible substring in the group and doesn't allow later backtracking to reevaluate the group.
(?|...)
    Any subpatterns in (...) in such a group share the same number.
(?#...)
    Any text appearing in this group is ignored in the regex.
(?'name'...)
    This capturing group can be referred to using the given name instead of a number.
(?<name>...)
    This capturing group can be referred to using the given name instead of a number.
(?P<name>...)
    This capturing group can be referred to using the given name instead of a number.
(?imsxXU)
    These enable setting regex flags within the expression itself.
    Example: /a(?i)a/ on "aa Aa aA AA"
(?(...)|...)
    If the given pattern matches, matches the pattern before the vertical bar. Otherwise, matches the pattern after the vertical bar.
(?R)
    Recursively matches the entire expression.
(?1)
    Recursively matches the first subpattern.
(?+1)
    Recursively matches the first pattern following the given position in the expression.
(?&name)
    Recursively matches the given named subpattern.
(?P=name)
    Matches the text matched by a previously named capture group.
(?P>name)
    Recursively matches the given named subpattern.
(?=...)
    Matches the given subpattern without consuming characters.
    Example: /foo(?=bar)/ on "foobar foobaz"
(?!...)
    Starting at the current position in the expression, ensures that the given pattern will not match. Does not consume characters.
    Example: /foo(?!bar)/ on "foobar foobaz"
(?<=...)
    Ensures that the given pattern will match, ending at the current position in the expression. Does not consume any characters.
    Example: /(?<=foo)bar/ on "foobar foobaz"
(?<!...)
    Ensures that the given pattern would not match and end at the current position in the expression. Does not consume characters.
    Example: /(?<!not )foo/ on "not foo but foo"
(*UTF16)
    Verbs allow for advanced control of the regex engine. Full specs can be found in pcre.txt.
a?
    Matches an `a` character or nothing (zero or one occurrence).
    Example: /ba?/ on "ba b a"
a*
    Matches zero or more consecutive `a` characters.
    Example: /ba*/ on "a ba baa aaa ba b"
a+
    Matches one or more consecutive `a` characters.
    Example: /a+/ on "a aa aaa aaaa bab baab"
a{3}
    Matches exactly 3 consecutive `a` characters.
    Example: /a{3}/ on "a aa aaa aaaa"
a{3,}
    Matches at least 3 consecutive `a` characters.
    Example: /a{3,}/ on "a aa aaa aaaa aaaaaa"
a{3,6}
    Matches between 3 and 6 (inclusive) consecutive `a` characters.
    Example: /a{3,6}/ on "a aa aaa aaaa aaaaaaaaaa"
a.*
    Greedy: matches as many characters as possible. The dot matches any character, so .* matches a run of any length.
    Example: /a.*a/ on "greedy can be dangerous at times"
a*?
    Lazy: matches as few characters as possible.
    Example: /r\w*?/ on "r re regex"
a*+
    Possessive: matches as many characters as possible; backtracking can't reduce the number of characters matched.
\G
    Matches at the position where the previous successful match ended. Useful with the /g flag.
^
    Matches the start of a string without consuming any characters. If multiline mode is used, this will also match immediately after a newline character.
    Example: /^\w+/ on "start of string"
$
    Matches the end of a string without consuming any characters. If multiline mode is used, this will also match immediately before a newline character.
    Example: /\w+$/ on "end of string"
\A
    Matches the start of a string only. Unlike ^, this is not affected by multiline mode.
    Example: /\A\w+/ on "start of string"
\Z
    Matches the end of a string only. Unlike $, this is not affected by multiline mode.
    Example: /\w+\Z/ on "end of string"
\z
    Matches the end of a string only. Unlike $, this is not affected by multiline mode, and, in contrast to \Z, will not match before a trailing newline at the end of a string.
    Example: /\w+\z/ on "absolute end of string"
\b
    Matches, without consuming any characters, immediately between a character matched by \w and a character not matched by \w (in either order).
    Example: /d\b/ on "word boundaries are odd"
\B
    Matches, without consuming any characters, at any position that is not a word boundary, for example between two characters matched by \w.
    Example: /r\B/ on "regex is really cool"
g (flag)
    Tells the engine not to stop after the first match has been found, but rather to continue until no more matches can be found.
m (flag)
    The ^ and $ anchors now match at the beginning and end of each line respectively, instead of the beginning and end of the entire string.
i (flag)
    A case-insensitive match is performed, meaning capital letters will be matched by non-capital letters and vice versa.
x (flag)
    Tells the engine to ignore all whitespace and allow for comments in the regex. Comments start with a "#" character.
s (flag)
    With this flag enabled, the dot (.) metacharacter also matches newlines.
u (flag)
    Pattern strings will be treated as UTF-16.
X (flag)
    Any character following a \ that is not a valid meta sequence will be faulted and raise an error.
U (flag)
    The engine does lazy matching by default, instead of greedy. A ? following a quantifier then makes it greedy.
A (flag)
    The pattern is forced to become anchored, equal to ^.
\0 (in a replacement string)
    Returns a string with the complete match result from the regex.
\1, $1 (in a replacement string)
    Returns a string with the contents of the first capture group. The number, in this case 1, can be any number as long as it corresponds to a valid capture group.
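The lookaround, quantifier, grouping, anchor and flag entries in this reference can be checked with a short script. This is a sketch in Python's `re` module, used here as an assumption for illustration only; it is not the PCRE engine the reference describes, and flags are passed as `re.M`, `re.I` and `re.S` rather than as /m, /i and /s. The `lines` string and the variable names are invented for the example; the other test strings come from the reference.

```python
import re

text = "foobar foobaz"

# Lookarounds assert context without consuming it, so only "foo"/"bar" is matched.
ahead = [m.start() for m in re.finditer(r"foo(?=bar)", text)]       # foo followed by bar
not_ahead = [m.start() for m in re.finditer(r"foo(?!bar)", text)]   # foo not followed by bar
behind = re.findall(r"(?<=foo)bar", text)                           # bar preceded by foo
not_behind = [m.start() for m in re.finditer(r"(?<!not )foo", "not foo but foo")]

# Greedy .* runs to the last possible "a"; lazy .*? stops at the first.
sample = "greedy can be dangerous at times"
greedy = re.search(r"a.*a", sample).group(0)
lazy = re.search(r"a.*?a", sample).group(0)

# findall returns the group contents for (...) but the whole match for (?:...).
captured = re.findall(r"(he)+", "heheh he heh")
uncaptured = re.findall(r"(?:he)+", "heheh he heh")

# Anchors with and without multiline mode.
lines = "one two\nthree four"
first_only = re.findall(r"^\w+", lines)
each_line = re.findall(r"^\w+", lines, re.M)
line_ends = re.findall(r"\w+$", lines, re.M)

# Case-insensitive and dot-matches-newline flags.
any_case = re.findall(r"aa", "aa Aa aA AA", re.I)
dot_all = re.search(r"a.b", "a\nb", re.S) is not None

print(ahead, not_ahead, behind, not_behind)
print(repr(greedy), repr(lazy))
print(captured, uncaptured, first_only, each_line, line_ends, any_case, dot_all)
```

The greedy match spans "an be dangerous a", while the lazy one stops at "an be da", which is the difference the quantifier entries describe.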

${foo} (in a replacement string)
    Returns a string with the contents of the capture group named `foo`. Any name can be used as long as it is defined in the regex. This syntax is not standard and is specific to Regex101. If the J flag is specified, content is taken from the first capture group with the same name.
\{foo} (in a replacement string)
    Returns a string with the contents of the capture group named `foo`. Any name can be used as long as it is defined in the regex. This syntax is not standard and is specific to Regex101. If the J flag is specified, content is taken from the first capture group with the same name.
\g<foo> (in a replacement string)
    Returns a string with the contents of the capture group named `foo`. Any name can be used as long as it is defined in the regex. If the J flag is specified, content is taken from the first capture group with the same name.
\g<1> (in a replacement string)
    Returns a string with the contents of the first capture group. The number, in this case 1, can be any number as long as it corresponds to a valid capture group.
\x20, \x{06fa} (in a replacement string)
    Hexadecimal escapes can be used to insert any character into the replacement string, using the standard syntax.
\t (in a replacement string)
    Inserts a tab character.
\r (in a replacement string)
    Inserts a carriage return character.
\n (in a replacement string)
    Inserts a newline character.
\f (in a replacement string)
    Inserts a form-feed character.
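The replacement references can be illustrated with a short substitution script. This is a sketch using Python's `re.sub`, an assumption made for illustration: Python accepts \1, \g<1>, \g<name> and \g<0> (the whole match) in replacement strings, but not $1, ${foo} or \x20, so only the first group of forms is shown. The sample strings and group names are invented.

```python
import re

# Numbered groups: reorder year-month-day into day/month/year.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2012-07-16")

# Named groups referenced with \g<name>.
named = re.sub(r"(?P<year>\d{4})-(?P<month>\d{2})", r"\g<month>/\g<year>", "2012-07")

# \g<0> stands for the complete match.
bracketed = re.sub(r"\d+", r"[\g<0>]", "a1b22")

# \g<1> avoids ambiguity when a literal digit follows the reference:
# r"\10" would be read as group 10, r"\g<1>0" is group 1 followed by "0".
times_ten = re.sub(r"(\d)", r"\g<1>0", "7")

# Escapes such as \t are expanded inside the replacement string.
tabbed = re.sub(r", ", r"\t", "a, b")

print(swapped, named, bracketed, times_ten, repr(tabbed))
```

The `times_ten` case mirrors the note about a literal number that must follow a group reference immediately.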