NotePad++ Regular Expression Replacement: Advanced Usage

心靈地圖sxh 2019-01-27


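When working with files we often need find and replace. What if we want to replace one part of a file with another part of the same file? The regular expressions described below give us a way to do it.
Regular expressions provide complex and flexible find and replace.
Note: multi-line expressions (involving \n, \r, etc.) are not supported.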

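1 Basic Expressions

Symbol Meaning
. Matches any character except a new line (\n). This means "." also matches \r, which can cause confusion when a file contains both \r and \n. To match every character, including new lines, use [\s\S].
(…) Marks a tagged region. The tag is accessed with \1 for the first region, \2 for the second, and likewise \3 \4 … \9. Tags can be used inside the current regular expression or in the replacement string of a search and replace.
\1, \2, etc Refers to tagged regions 1 through 9 (\1 to \9) when replacing. For example, searching for Fred([1-9])XXX and replacing with Sam\1YYY turns Fred2XXX into Sam2YYY. Note: only 9 regions are available, so \10\2 safely means region 1, the literal text "0", then region 2.
[…] A set of characters. For example, [abc] means any of the characters a, b or c. Ranges also work: [a-z] means any lower case letter.
[^…] The complement of a set. For example, [^A-Za-z] means any character except a letter.
^ Matches the start of a line (unless used inside a set, see above).
$ Matches the end of a line.
* Matches 0 or more times. For example, Sa*m matches Sm, Sam, Saam, Saaam and so on.
+ Matches 1 or more times. For example, Sa+m matches Sam, Saam, Saaam and so on.
? Matches 0 or 1 times. For example, Sa?m matches Sm, Sam.
{n} Matches exactly n times. For example, 'Sa{2}m' matches Saam.
{m,n} Matches at least m times and at most n times (if n is omitted, any number of times). For example, 'Sa{2,3}m' matches Saam or Saaam. 'Sa{2,}m' is the same as 'Saa+m'.
*?, +?, ??, {n,m}? Non-greedy matches: the first valid match is taken. Normally '<.*>' matches the whole of a string such as '<b>content</b>', but '<.*?>' matches only '<b>' and '</b>'.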
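
The note about \10\2 depends on the engine having only nine tagged regions. Other engines may read \10 as "group ten". The sketch below uses Python's re module, which is not the engine this page documents, to show an unambiguous spelling; the function name and the sample pattern are made up for illustration.

```python
import re

# Two tagged regions, as in a search such as (\w+)-(\w+).
pattern = re.compile(r"(\w+)-(\w+)")

def join_with_zero(text: str) -> str:
    # \g<1>0 means "group 1, then a literal 0". Python would read a bare
    # \10 as a reference to group 10, which this pattern does not have.
    return pattern.sub(r"\g<1>0\g<2>", text)

print(join_with_zero("ab-cd"))
```

In an engine limited to \1 through \9 the shorter \10\2 form from the table would do the same job.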

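2 Tagging and Groups

Symbol Meaning
(…) A capture group. \1 accesses the first group, \2 the second.
(?:…) A non-capture group.
(?=…) Non-capture group: look ahead assertion. For example, '(.*)(?=ton)' given 'Appleton' matches 'Apple'.
(?<=…) Non-capture group: look behind assertion. For example, '(?<=sir) (.*)' given 'sir William' matches ' William'.
(?!…) Non-capture group: negative look ahead assertion. For example, '.(?!e)' given 'Apple' finds every letter except 'l', because it is followed by 'e'.
(?<!…) Non-capture group: negative look behind assertion. For example, '(?<!sir) (.*)' given 'mr William' finds ' William', but given 'sir William' finds nothing because the space is preceded by 'sir'.
(?P<name>…) Named capture group. Assigns a name to a group for later use. For example, '(?P<first>A[^\s]+)\s(?P=first)' finds 'Apple Apple'. It is similar to '(A[^\s]+)\s\1' but uses a group name rather than a number.
(?P=name) Matches the group called name. See (?P<name>…).
(?#comment) Comment: the contents of the parentheses are ignored during matching.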

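3 Special Symbols

Symbol Meaning
\s Matches whitespace. Note: this also matches the end of line marker. Use [[:blank:]] to avoid matching a new line.
\S Matches non-whitespace.
\w Matches a word character.
\W Matches a non-word character.
\d Matches a numeric digit.
\D Matches a non-digit character.
\b Matches a word boundary. '\bW\w+' finds words that begin with 'W'.
\B Matches a non-word boundary. '\Be\B' finds the letter 'e' only when it is in the middle of a word.
\< Matches the start of a word using Scintilla's definition of words.
\> Matches the end of a word using Scintilla's definition of words.
\x Lets you use a character x that would otherwise have a special meaning. For example, \[ is read as a literal [ and not as the start of a character set.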

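4 Character Classes

Symbol Meaning
[[:alpha:]] Matches a letter: [A-Za-z]
[[:digit:]] Matches a digit: [0-9]
[[:xdigit:]] Matches a hexadecimal digit: [0-9A-Fa-f]
[[:alnum:]] Matches an alphanumeric character: [0-9A-Za-z]
[[:lower:]] Matches a lower case character: [a-z]
[[:upper:]] Matches an upper case character: [A-Z]
[[:blank:]] Matches a blank (space or tab): [ \t]
[[:space:]] Matches a whitespace character: [ \t\r\n\v\f]
[[:punct:]] Matches a punctuation character: [-!"#$%&'()*+,./:;<=>?@[\]_`{|}~]
[[:graph:]] Matches a graphical character: [\x21-\x7E]
[[:print:]] Matches a printable character (graphical characters and spaces)
[[:cntrl:]] Matches a control character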

5 Replacing

Regular expressions support tagged expressions: surround the text you want to tag with ( and ), then use \1 in the replacement string to substitute the first matched text.
For example:
Text body	Search string	Replace string	Result
Hi my name is Fred	my name is (.+)	my name is not \1	Hi my name is not Fred
The quick brown fox jumped over the fat lazy dog	brown (.+) jumped over the (.+)	brown \2 jumped over the \1	The quick brown fat lazy dog jumped over the fox

6 Restrictions

Support for regular expressions in PN2 is currently limited: the supported patterns and syntax are a very small subset of the powerful expressions supported by Perl. The biggest restriction is that regular expressions match only within a single line; multi-line expressions cannot be used. Backslash Expressions can be used instead.
The plan is to use the PCRE library (used elsewhere in PN2) to support document searching.
from http://www./docs/search/regular_expressions/
Author: Evan_Gu Source: CSDN
Original: https://blog.csdn.net/gdp12315_gu/article/details/51730584

Regular Expressions

Search Patterns

_NOTE: For older versions of PN the tagged expressions start \( and end \) and there are no non-capture groups nor the backslash groups._

Regular Expressions allow complicated and flexible search/replace using a specific syntax.

Note: Multi-line expressions (involving \n, \r, etc) are not yet supported. See Restrictions below.

Basic Expressions

Pattern	Meaning
`.`	Matches any character except new line (\n). Note that this means "." will also match \r, which might cause some confusion when you are editing a file with both \r and \n. To match all characters including new lines you can use [\s\S].
`(...)`	This marks a region for tagging a match. These tags can be accessed using the syntax \1 for the first tag, \2 for the second, and \3 \4 ... \9. These tags can be used within the current regular expression or in the replacement string in a search/replace.
`\1, \2, etc`	This refers to the first through ninth (\1 to \9) tagged region when replacing. For example, if the search string was Fred([1-9])XXX and the replace string was Sam\1YYY, when applied to Fred2XXX this would generate Sam2YYY. Note: As only 9 regions can be used you can safely use replace string \10\2 to produce "text from region 1"0"text from region 2".
`[...]`	This indicates a set of characters, for example, [abc] means any of the characters a, b or c. You can also use ranges, for example [a-z] for any lower case character.
`[^...]`	The complement of the characters in the set. For example, `[^A-Za-z]` means any character except an alphabetic character.
`^`	This matches the start of a line (unless used inside a set, see above).
`$`	This matches the end of a line.
`*`	This matches 0 or more times. For example, Sa*m matches Sm, Sam, Saam, Saaam and so on.
`+`	This matches 1 or more times. For example, Sa+m matches Sam, Saam, Saaam and so on.
`?`	This matches 0 or 1 occurrences. For example, Sa?m matches Sm, Sam.
`{n}`	This matches exactly n times. For example, 'Sa{2}m' matches Saam.
`{m,n}`	This matches at least m times at most n times (if n is excluded then any number of times). For example, 'Sa{2,3}m' matches Saam or Saaam. 'Sa{2,}m' is the same as 'Saa+m'
`*?, +?, ??, {n,m}?`	non-greedy matches -- matches the first valid match. Normally '<.*>' will match the whole of a string such as '<b>content</b>' -- but '<.*?>' will match '<b>' and '</b>'.
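
A few rows of the table above can be tried outside the editor. This is a minimal sketch using Python's re module, which is an assumption on my part: it is not the engine PN2 uses, but it agrees with the table on these particular patterns. The `<b>content</b>` sample string is made up for illustration.

```python
import re

# Tagged region reused in the replacement: Fred([1-9])XXX -> Sam\1YYY
renamed = re.sub(r"Fred([1-9])XXX", r"Sam\1YYY", "Fred2XXX")

# Greedy versus non-greedy on a hypothetical tagged string.
sample = "<b>content</b>"
greedy = re.findall(r"<.*>", sample)   # one match spanning the whole string
lazy = re.findall(r"<.*?>", sample)    # one match per tag

# Quantifier equivalence from the table: Sa{2,}m behaves like Saa+m.
words = ["Sm", "Sam", "Saam", "Saaam"]
with_braces = [w for w in words if re.fullmatch(r"Sa{2,}m", w)]
with_plus = [w for w in words if re.fullmatch(r"Saa+m", w)]

print(renamed)
print(greedy, lazy)
print(with_braces, with_plus)
```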

Tagging and Groups

Pattern	Meaning
`(...)`	A capture group. Accessible through \1 for the first group, \2 for the second and so on.
`(?:...)`	A Non-capture group.
`(?=...)`	Non-capture group -- Look ahead assertion. '(.*)(?=ton)' given 'Appleton' will match 'Apple'.
`(?<=...)`	Non-capture group -- Look behind assertion. '(?<=sir) (.*)' given 'sir William' will find ' William'.
`(?!...)`	Non-capture group -- negative look ahead assertion. '.(?!e)' given 'Apple' will find each letter with the exception of 'l' because it is followed by an 'e'.
`(?<!...)`	Non-capture group -- negative look behind assertion. '(?<!sir) (.*)' given 'mr William' will find ' William', but given 'sir William' it finds nothing because the space is preceded by 'sir'.
`(?P<name>...)`	Named capture group. Assign a name to a group for later use: '(?P<first>A[^\s]+)\s(?P=first)' will find 'Apple Apple'. Similar to '(A[^\s]+)\s\1' but uses names rather than group number.
`(?P=name)`	Match to named group. See (?P<name>...) for an example.
`(?#comment)`	Comment -- contents of the parentheses are ignored during matching.
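
The assertion and named-group rows can be checked with a short script. This sketch assumes Python's re module as a stand-in engine; it shares the `(?=...)`, `(?<=...)`, `(?!...)`, `(?P<name>...)` and `(?P=name)` syntax, though it may differ from PN2 elsewhere.

```python
import re

# Look ahead: capture everything that is followed by "ton".
ahead = re.search(r"(.*)(?=ton)", "Appleton").group(1)

# Look behind: the space must be preceded by "sir".
behind = re.search(r"(?<=sir) (.*)", "sir William").group(0)

# Negative look ahead: every character that is not followed by "e".
not_before_e = re.findall(r".(?!e)", "Apple")

# Named group plus a back-reference to it by name.
named = re.search(r"(?P<first>A[^\s]+)\s(?P=first)", "Apple Apple")
first_word = named.group("first")

print(ahead, repr(behind), not_before_e, first_word)
```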

Special Symbols

Pattern	Meaning
`\s`	Match whitespace. Note: this will match the end of line marker. Use `[[:blank:]]` when you need to avoid matching a newline character.
`\S`	Match non-whitespace
`\w`	Match word character
`\W`	Match non-word character
`\d`	Match numeric digit
`\D`	Match non-digit character
`\b`	Match word boundary. '\bW\w+' -- finds words that begin with a 'W'
`\B`	Match non-word boundary. '\Be\B' -- finds the letter 'e' only when it is in the middle of a word
`\<`	This matches the start of a word using Scintilla's definitions of words.
`\>`	This matches the end of a word using Scintilla's definition of words.
`\x`	This allows you to use a character x that would otherwise have a special meaning. For example, \[ would be interpreted as [ and not as the start of a character set.
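
The boundary and escaping rows can be illustrated as follows. This is a sketch in Python's re module, chosen here as an assumption rather than the engine PN2 uses. Python has no `\<` or `\>`, so only `\b` and `\B` are shown, and the sample sentences are invented.

```python
import re

text = "When Will we see the green West tree"

# \bW\w+ : words that begin with an upper case "W".
w_words = re.findall(r"\bW\w+", text)

# \Be\B : an "e" with word characters on both sides.
# Positions are reported so the two inner letters of "green" are visible.
inner_e = [m.start() for m in re.finditer(r"\Be\B", "eve green")]

# \[ is a literal bracket, not the start of a character set.
indexed = re.findall(r"\[\w+\]", "a[i] = b[j]")

print(w_words, inner_e, indexed)
```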

Character Classes

Pattern	Meaning
`[[:alpha:]]`	Match a letter character: [A-Za-z]
`[[:digit:]]`	Match a digit character: [0-9]
`[[:xdigit:]]`	Match a hexadecimal digit character: [0-9A-Fa-f]
`[[:alnum:]]`	Match an alphanumeric character: [0-9A-Za-z]
`[[:lower:]]`	Match a lower case character: [a-z]
`[[:upper:]]`	Match an upper case character: [A-Z]
`[[:blank:]]`	Match a blank (space or tab):[ \t]
`[[:space:]]`	Match a whitespace character: [ \t\r\n\v\f]
`[[:punct:]]`	Match a punctuation character: [-!"#$%&'()*+,./:;<=>?@[\]_`{|}~]
`[[:graph:]]`	Match Graphical character: [\x21-\x7E]
`[[:print:]]`	Match Printable character (graphical characters and spaces)
`[[:cntrl:]]`	Match control character
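
Not every engine accepts the `[[:name:]]` spelling. Python's re module, used here purely as an assumed example engine, does not. A rough workaround is to expand each class into the explicit set listed in the table. The `POSIX_SETS` table and the `expand` helper below are hypothetical names, and only some of the classes are covered.

```python
import re

# Explicit ASCII sets copied from the table above.
POSIX_SETS = {
    "alpha": "A-Za-z",
    "digit": "0-9",
    "xdigit": "0-9A-Fa-f",
    "alnum": "0-9A-Za-z",
    "lower": "a-z",
    "upper": "A-Z",
    "blank": r" \t",
    "space": r" \t\r\n\v\f",
}

def expand(pattern: str) -> str:
    """Rewrite each [:name:] inside a pattern to its explicit set."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_SETS[m.group(1)], pattern)

hex_pattern = expand(r"[[:xdigit:]]+")
print(hex_pattern)
print(re.findall(hex_pattern, "#1A2b3C #99"))
```

An unknown class name raises a KeyError here, which seemed preferable to silently passing an unsupported pattern through.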

Replacing

Regular Expressions support tagged expressions. This is accomplished using ( and ) to surround the text you want tagged, and then using \1 in the replace string to substitute the first matched text, \2 for the second, etc.

For example:

Text body	Search string	Replace string	Result
Hi my name is Fred	`my name is (.+)`	`my name is not \1`	Hi my name is not Fred
The quick brown fox jumped over the fat lazy dog	`brown (.+) jumped over the (.+)`	`brown \2 jumped over the \1`	The quick brown fat lazy dog jumped over the fox
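
The two rows above can be reproduced with any engine that supports \1 and \2 in the replacement. This sketch uses Python's `re.sub` as an assumed stand-in; the `replace` wrapper is a made-up helper that mirrors the column order of the table. Because `(.+)` is greedy, the second tag captures everything up to the end of the line.

```python
import re

def replace(text: str, search: str, replacement: str) -> str:
    # Same argument order as the table: text body, search string, replace string.
    return re.sub(search, replacement, text)

first = replace("Hi my name is Fred",
                r"my name is (.+)",
                r"my name is not \1")

second = replace("The quick brown fox jumped over the fat lazy dog",
                 r"brown (.+) jumped over the (.+)",
                 r"brown \2 jumped over the \1")

print(first)
print(second)
```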

Restrictions

Support for regular expressions in PN2 is currently limited: the supported patterns and syntax are a very small subset of the powerful expressions supported by Perl. The biggest restriction is that regular expressions match only within a single line; you cannot use multi-line regular expressions. As a workaround for the lack of multi-line search, you can use BackslashExpressions instead.

There are plans to improve this support by using the PCRE library (used elsewhere in PN2) to provide document searching. If you're interested in helping please make yourself known to the pn-discuss mailing list: PN Mailing Lists.

Examples

Description	Search	Replace
Remove leading whitespace on each line	`^[ \t]*`	(empty)
Change getVariable() to setVariable()	`get(\w+)`	`set\1`
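
Both rows can be tried in a script. This is a sketch in Python's re module, an assumption rather than PN2's engine. It needs `re.MULTILINE` for `^` to match at every line start, whereas PN2 already searches line by line. The replacement is `set\1` without a trailing `()`, because `\w+` does not consume the parentheses that already follow the name. The sample source text is invented.

```python
import re

source = "    value = getVariable()\n\tother = getName()"

# ^[ \t]* with an empty replacement strips leading spaces and tabs.
stripped = re.sub(r"^[ \t]*", "", source, flags=re.MULTILINE)

# get(\w+) -> set\1 ; the "()" after the name is left untouched.
renamed = re.sub(r"get(\w+)", r"set\1", stripped)

print(renamed)
```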

Breaking up a URL to display arguments:
Given a URL such as http://www.google.com/search?q=Programmers+Notepad&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

Description	Search	Replace
Break up a URL's arguments	`.*?[?&](\w+)=([\w-+%.:]+)(.*)$`	`\1: \2\n\3`

Place the cursor at the beginning of the line. Hit the Find Next button, and then repeatedly hit the Replace button. Note that hitting "Replace All" will not work correctly, since each search has to run through the result of the previous replace.

Result is a list of the URL parameters:

q: Programmers+Notepad
ie: utf-8
oe: utf-8
aq: t
rls: org.mozilla:en-US:official
client: firefox-a
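
The repeated Find Next and Replace procedure can be imitated with a loop. This sketch assumes Python's re module rather than PN2's engine. The hyphen is moved to the end of the character set because Python rejects a range written as `\w-+`, and the loop feeds the third tagged region back in as the next text to search, which is what pressing Replace again does in the editor.

```python
import re

url = ("http://www.google.com/search?q=Programmers+Notepad&ie=utf-8&oe=utf-8"
       "&aq=t&rls=org.mozilla:en-US:official&client=firefox-a")

# Group 1: argument name, group 2: its value, group 3: the rest of the line.
step = re.compile(r".*?[?&](\w+)=([\w+%.:-]+)(.*)$")

lines = []
rest = url
while True:
    m = step.match(rest)
    if m is None:
        break  # no "?name=value" or "&name=value" left
    lines.append(f"{m.group(1)}: {m.group(2)}")
    rest = m.group(3)  # continue with what the last replace left behind

print("\n".join(lines))
```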
