The exact meaning of $ in .NET 2.0 regular expressions under Multiline mode
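In a regular expression, $ normally denotes the end of the string (in .NET it also matches just before a final \n). With the RegexOptions.Multiline option, its meaning changes to the end of any line.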
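On Microsoft platforms, lines are separated by \r\n: a carriage return followed by a line feed. So where exactly does $ match: before the \r, after the \r, or after the \n? And if the string contains \r and \n characters that do not appear as consecutive pairs, can $ match before or after those characters?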
using System.Text.RegularExpressions;
using NUnit.Framework;
[TestFixture]
public class RegexTest
{
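// text1 and text2 are verbatim strings, so their line breaks are whatever the source file uses.
// The expected counts below assume this file is saved with \r\n line endings.
private readonly string text1 = @"String 1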
String 2
String 3";
private readonly string text2 = @"String 1
String 2
String 3
";
private readonly string text3 = "String 1\rString 2\rString 3\r";
private readonly string text4 = "String 1\nString 2\nString 3\n";
[Test]
public void BeforeReturn()
{
Regex r = new Regex(@"^String \d+$", RegexOptions.Multiline);
Assert.AreEqual(1, MatchCount(r, text1));
Assert.AreEqual("String 3", r.Match(text1).Value);
Assert.AreEqual(0, MatchCount(r, text2));
Assert.AreEqual(0, MatchCount(r, text3));
Assert.AreEqual(3, MatchCount(r, text4));
}
[Test]
public void BeforeNextLine()
{
Regex r = new Regex(@"^String \d+\r$", RegexOptions.Multiline);
// The last line does not end with \r, so it is not matched
Assert.AreEqual(2, MatchCount(r, text1));
Assert.AreEqual(3, MatchCount(r, text2));
Assert.AreEqual(0, MatchCount(r, text3));
Assert.AreEqual(0, MatchCount(r, text4));
}
[Test]
public void AfterNextLine()
{
Regex r = new Regex(@"^String \d+\r\n$", RegexOptions.Multiline);
Assert.AreEqual(0, MatchCount(r, text1));
Assert.AreEqual(1, MatchCount(r, text2));
// Note: here $ actually matches at the end of the following empty line, i.e. the end of the string
Assert.AreEqual("String 3\r\n", r.Match(text2).Value);
Assert.AreEqual(0, MatchCount(r, text3));
Assert.AreEqual(0, MatchCount(r, text4));
}
[Test]
public void FirstCharNextLine()
{
string s = "\nabc";
Regex r = new Regex("$", RegexOptions.Multiline);
// Even when \n is the first character, $ still matches before it
Assert.AreEqual(2, MatchCount(r, s));
}
[Test]
public void LastCharNextLine()
{
string s = "abc\n";
Regex r = new Regex("$", RegexOptions.Multiline);
// Even when the last character is \n, $ still matches after it
Assert.AreEqual(2, MatchCount(r, s));
}
[Test]
public void OnlyNextLine()
{
string s = "\n";
Regex r = new Regex("$", RegexOptions.Multiline);
// One match before the \n and one after it
Assert.AreEqual(2, MatchCount(r, s));
}
[Test]
public void Nothing()
{
Regex r = new Regex("$", RegexOptions.Multiline);
Assert.AreEqual(1, MatchCount(r, string.Empty));
}
int MatchCount(Regex r, string s)
{
return r.Matches(s).Count;
}
}
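The conclusion: $ matches the position immediately before a \n and the position at the end of the string. A \r gets no special treatment, so in a \r\n pair $ falls between the \r and the \n.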
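The conclusion can be illustrated with a minimal sketch. The class name LineEndDemo and its two helper methods are made up for this illustration and are not part of the tests above. EndPositions lists the indexes at which $ matches, and CountLines uses an optional \r before $ (a common workaround, not something the tests above verify) so that the same pattern accepts both \r\n and \n line endings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LineEndDemo
{
    // Returns the index of every position where $ matches in Multiline mode.
    public static List<int> EndPositions(string text)
    {
        Regex r = new Regex("$", RegexOptions.Multiline);
        List<int> positions = new List<int>();
        foreach (Match m in r.Matches(text))
        {
            positions.Add(m.Index);
        }
        return positions;
    }

    // Counts lines of the form "String <digits>", allowing an optional \r
    // in front of the line end so that \r\n and \n input behave the same.
    public static int CountLines(string text)
    {
        Regex r = new Regex(@"^String \d+\r?$", RegexOptions.Multiline);
        return r.Matches(text).Count;
    }

    public static void Main()
    {
        // In "a\r\nb" the \n is at index 2 and the string ends at index 4.
        foreach (int p in EndPositions("a\r\nb"))
        {
            Console.WriteLine(p);
        }
        Console.WriteLine(CountLines("String 1\r\nString 2\r\nString 3"));
        Console.WriteLine(CountLines("String 1\nString 2\nString 3\n"));
    }
}
```

With the \r? pattern, the matched Value still includes the trailing \r on \r\n input, so trim it or wrap the wanted part in a capturing group if only the line content is needed.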
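While debugging RegexTester, I also found that RichTextBox handles \r\n very strangely: inspecting its Text property with the text visualizer shows line breaks, yet calling Contains("\n") on it returns false! Time is short, so the cause of this will have to be investigated later. For now it at least confirms that when line breaks are involved, it is best to test regular expressions directly in code.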





