1. Count occurrences of a word in a string

Note that to perform the count, the Split method is first called to create an array of words. Split has a performance cost; if counting words is the only operation performed on the string, consider using the Matches or IndexOf method instead.

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: count occurrences of a word in a string
        const string text = @"Historically, the world of data and the world of objects" +
                            @" have not been well integrated. Programmers work in C# or Visual Basic" +
                            @" and also in SQL or XQuery. On the one side are concepts such as classes," +
                            @" objects, fields, inheritance, and .NET Framework APIs. On the other side" +
                            @" are tables, columns, rows, nodes, and separate languages for dealing with" +
                            @" them. Data types often require translation between the two worlds; there are" +
                            @" different standard functions. Because the object world has no notion of query, a" +
                            @" query can only be represented as a string without compile-time type checking or" +
                            @" IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to" +
                            @" objects in memory is often tedious and error-prone.";
        const string searchWord = "data";

        // Convert the string into an array of words.
        var source = text.Split(new[] { '.', '?', '!', ' ', ';', ':', ',' }, StringSplitOptions.RemoveEmptyEntries);

        // Create the query, comparing words case-insensitively.
        var query = from word in source
                    where string.Equals(word, searchWord, StringComparison.InvariantCultureIgnoreCase)
                    select word;

        // Count the matches.
        var wordCount = query.Count();
        Console.WriteLine($"{wordCount} occurrence(s) of the search word \"{searchWord}\" were found.");
        Console.Read();
        #endregion
    }
}
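The Matches alternative mentioned above can be sketched as follows. This is a minimal illustration, not part of the original sample: the WordCounter class and CountOccurrences method are hypothetical names, and the word-boundary pattern assumes the search word starts and ends with a letter or digit (a term such as "C#" would need a different pattern).

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Counts whole-word, case-insensitive occurrences of searchWord in text
    // without building an intermediate array of words.
    public static int CountOccurrences(string text, string searchWord)
    {
        // \b anchors the match at word boundaries; Regex.Escape guards
        // against search words that contain regex metacharacters.
        var pattern = $@"\b{Regex.Escape(searchWord)}\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        const string text = "Data types often require translation. Transferring data is tedious.";
        Console.WriteLine(CountOccurrences(text, "data")); // prints 2
    }
}
```

Regex matching uses its own case-insensitivity rules, so results may differ from the InvariantCultureIgnoreCase comparison for some non-ASCII text.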
2. Query for sentences that contain a specified set of words

This example shows how to find the sentences in a block of text that contain a match for every word in a specified set. Although the array of search terms is hard-coded here, it could also be populated dynamically at run time.

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: query for sentences that contain a specified set of words
        const string text = @"Historically, the world of data and the world of objects " +
                            @"have not been well integrated. Programmers work in C# or Visual Basic " +
                            @"and also in SQL or XQuery. On the one side are concepts such as classes, " +
                            @"objects, fields, inheritance, and .NET Framework APIs. On the other side " +
                            @"are tables, columns, rows, nodes, and separate languages for dealing with " +
                            @"them. Data types often require translation between the two worlds; there are " +
                            @"different standard functions. Because the object world has no notion of query, a " +
                            @"query can only be represented as a string without compile-time type checking or " +
                            @"IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to " +
                            @"objects in memory is often tedious and error-prone.";

        // Split the text block into an array of sentences.
        var sentences = text.Split('.', '?', '!');

        // Define the search terms; this list could be populated dynamically at run time.
        string[] wordsToMatch = { "Historically", "data", "integrated" };

        var query = from sentence in sentences
                    let words = sentence.Split(new char[] { '.', '?', '!', ' ', ';', ':', ',' }, StringSplitOptions.RemoveEmptyEntries)
                    // Remove duplicates, intersect with the search terms, and compare the counts.
                    where words.Distinct().Intersect(wordsToMatch).Count() == wordsToMatch.Length
                    select sentence;

        foreach (var sentence in query)
        {
            Console.WriteLine(sentence);
        }
        Console.Read();
        #endregion
    }
}
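The counting check in the where clause can be tried in isolation. The sketch below uses hypothetical names (SentenceMatcher, ContainsAllWords) and assumes wordsToMatch contains no duplicates; it shows what happens when punctuation is left out of the delimiter list.

```csharp
using System;
using System.Linq;

public static class SentenceMatcher
{
    private static readonly char[] Delimiters = { '.', '?', '!', ' ', ';', ':', ',' };

    // Returns true when every word in wordsToMatch occurs in the sentence.
    // Assumes wordsToMatch itself contains no duplicate entries.
    public static bool ContainsAllWords(string sentence, string[] wordsToMatch, char[] delimiters)
    {
        var words = sentence.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return words.Distinct().Intersect(wordsToMatch).Count() == wordsToMatch.Length;
    }

    public static void Main()
    {
        const string sentence =
            "Historically, the world of data and the world of objects have not been well integrated";
        string[] wordsToMatch = { "Historically", "data", "integrated" };

        // Splitting on punctuation and spaces isolates "Historically".
        Console.WriteLine(ContainsAllWords(sentence, wordsToMatch, Delimiters));    // True

        // Splitting on spaces only leaves the token "Historically," so the match fails.
        Console.WriteLine(ContainsAllWords(sentence, wordsToMatch, new[] { ' ' })); // False
    }
}
```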
At run time, the query first splits the text into sentences and then splits each sentence into an array of strings holding its words. For each such array, the Distinct<TSource> method removes all duplicate words, and the query then performs an Intersect<TSource> operation on the word array and the wordsToMatch array. If the count of the intersection equals the count of the wordsToMatch array, all of the search words were found in the sentence, and the original sentence is returned.

The call to Split uses punctuation marks as separators so that they are removed from the string. Without this, a token such as "Historically," would not match "Historically" in the wordsToMatch array. Depending on the kinds of punctuation in the source text, you may need additional separators.

3. Query for characters in a string

Because the String class implements the generic IEnumerable<T> interface, any string can be queried as a sequence of characters. However, this is not a common use of LINQ. For complex pattern matching, use the Regex class.

The following example queries a string to find the digits it contains.

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: query for characters in a string
        const string source = "ABCDE99F-J74-12-89A";

        // Select only the characters that are digits.
        var digits = from character in source
                     where char.IsDigit(character)
                     select character;

        Console.Write("Digit:");
        foreach (var digit in digits)
        {
            Console.Write($"{digit} ");
        }
        Console.WriteLine();

        // Select all characters before the first '-'.
        var query = source.TakeWhile(x => x != '-');
        foreach (var character in query)
        {
            Console.Write(character);
        }
        Console.Read();
        #endregion
    }
}
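Because a string is a sequence of characters, the same queries can be written compactly with method syntax. The following is a small sketch with illustrative names (CharQueries, CountDigits, PrefixBefore) that are not part of the original sample.

```csharp
using System;
using System.Linq;

public static class CharQueries
{
    // Counts the characters in s that are digits.
    public static int CountDigits(string s) => s.Count(char.IsDigit);

    // Returns the characters of s that precede the first occurrence of stop.
    public static string PrefixBefore(string s, char stop) =>
        new string(s.TakeWhile(c => c != stop).ToArray());

    public static void Main()
    {
        const string source = "ABCDE99F-J74-12-89A";
        Console.WriteLine(CountDigits(source));       // prints 8
        Console.WriteLine(PrefixBefore(source, '-')); // prints ABCDE99F
    }
}
```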
4. Combine regular expressions with LINQ queries

This example shows how to use the Regex class to create a regular expression for more complex matching in text strings. A LINQ query makes it easy to filter exactly the files you want to search with the regular expression and to shape the results.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: combine regular expressions with LINQ queries
        // Adjust this path to match your version of Visual Studio.
        const string folder = @"C:\Program Files (x86)\Microsoft Visual Studio\";
        var fileInfos = GetFiles(folder);

        // Create a regular expression that finds the w3.org and npmjs.org URLs.
        // The dots are escaped so that they match only a literal '.'.
        var searchTerm = new Regex(@"http://(www\.w3\.org|www\.npmjs\.org)");

        // Search every ".html" file; the where clause keeps only files with matches.
        // Note: the range variable in the inner query must be explicitly typed,
        // because MatchCollection is not a generic IEnumerable collection.
        var query = from fileInfo in fileInfos
                    where fileInfo.Extension == ".html"
                    let text = File.ReadAllText(fileInfo.FullName)
                    let matches = searchTerm.Matches(text)
                    where matches.Count > 0
                    select new
                    {
                        name = fileInfo.FullName,
                        matchValue = from Match match in matches select match.Value
                    };

        Console.WriteLine($"The term \"{searchTerm}\" was found in:");
        Console.WriteLine();
        foreach (var q in query)
        {
            // Trim the folder prefix from the path of the matching file.
            Console.WriteLine($"name==>{q.name.Substring(folder.Length - 1)}");

            // Print the matched values.
            foreach (var v in q.matchValue)
            {
                Console.WriteLine($"matchValue==>{v}");
            }

            // Print a blank line.
            Console.WriteLine();
        }
        Console.Read();
        #endregion
    }

    /// <summary>
    /// Gets file information for every file under the specified path.
    /// </summary>
    /// <param name="path">The root folder to search.</param>
    /// <returns>A list of FileInfo objects, including files in subfolders.</returns>
    private static IList<FileInfo> GetFiles(string path)
    {
        var files = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories);
        return files.Select(file => new FileInfo(file)).ToList();
    }
}
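The same Regex-plus-LINQ pattern can be tried without touching the file system. The sketch below is illustrative only: RegexQuery, FindMatches, and the in-memory "documents" dictionary are hypothetical stand-ins for the files read above. It uses Cast<Match>() as a method-syntax way of giving the MatchCollection an explicit element type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexQuery
{
    // For each document that contains at least one match, returns its name
    // together with the list of matched values.
    public static Dictionary<string, List<string>> FindMatches(
        IDictionary<string, string> documents, Regex searchTerm)
    {
        return documents
            .Select(doc => new
            {
                Name = doc.Key,
                // Cast<Match>() supplies the element type for the MatchCollection.
                Values = searchTerm.Matches(doc.Value)
                                   .Cast<Match>()
                                   .Select(m => m.Value)
                                   .ToList()
            })
            .Where(x => x.Values.Count > 0)
            .ToDictionary(x => x.Name, x => x.Values);
    }

    public static void Main()
    {
        var documents = new Dictionary<string, string>
        {
            ["a.html"] = "<a href=\"http://www.w3.org/TR\">spec</a>",
            ["b.html"] = "<p>no links here</p>"
        };
        var searchTerm = new Regex(@"http://(www\.w3\.org|www\.npmjs\.org)");

        foreach (var pair in FindMatches(documents, searchTerm))
        {
            Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
        }
    }
}
```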
5. Find the difference between two collections

This example shows how to use LINQ to compare two lists of strings and output the lines that are in text1.txt but not in text2.txt.

text1.txt:

Bankov, Peter
Holm, Michael
Garcia, Hugo
Potra, Cristina
Noriega, Fabricio
Aw, Kam Foo
Beebe, Ann
Toyoshima, Tim
Guy, Wey Yuan
Garcia, Debra

text2.txt:

Liu, Jinghao
Bankov, Peter
Holm, Michael
Garcia, Hugo
Beebe, Ann
Gilchrist, Beth
Myrcha, Jacek
Giakoumakis, Leo
McLin, Nkenge
El Yassir, Mehdi

using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: find the difference between two collections
        // Create the data sources.
        var text1 = File.ReadAllLines(@"..\..\text1.txt");
        var text2 = File.ReadAllLines(@"..\..\text2.txt");

        // Create the query; method syntax must be used here.
        var query = text1.Except(text2);

        // Execute the query.
        Console.WriteLine("The following lines are in text1.txt but not text2.txt");
        foreach (var name in query)
        {
            Console.WriteLine(name);
        }
        Console.Read();
        #endregion
    }
}
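Except compares strings with the default, case-sensitive comparer, so lines that differ only in case or surrounding whitespace count as different. If that is not wanted, one possible variation (a sketch; ListDifference and OnlyInFirst are hypothetical names) trims each line and passes StringComparer.OrdinalIgnoreCase:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListDifference
{
    // Returns the distinct lines of first that do not occur in second,
    // ignoring case and leading/trailing whitespace.
    public static List<string> OnlyInFirst(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Select(line => line.Trim())
                    .Except(second.Select(line => line.Trim()), StringComparer.OrdinalIgnoreCase)
                    .ToList();
    }

    public static void Main()
    {
        var text1 = new[] { "Bankov, Peter", "Potra, Cristina ", "Noriega, Fabricio" };
        var text2 = new[] { "BANKOV, PETER", "Liu, Jinghao" };

        foreach (var name in OnlyInFirst(text1, text2))
        {
            Console.WriteLine(name); // Potra, Cristina and Noriega, Fabricio
        }
    }
}
```

Like the original query, this returns each remaining line only once, because Except has set semantics.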
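Note: some kinds of query operations (such as Except<TSource>, Distinct<TSource>, Union<TSource>, and Concat<TSource>) can only be expressed in method-based syntax.

6. Sort or filter text data by any word or field

The following example shows how to sort lines of structured text (such as comma-separated values) by any field in the line. The field can be specified dynamically at run time. Assume that the fields in scores.csv represent a student's ID number followed by four test scores.

scores.csv:

111, 97, 92, 81, 60
112, 75, 84, 91, 39
113, 88, 94, 65, 91
114, 97, 89, 85, 82
115, 35, 72, 91, 70
116, 99, 86, 90, 94
117, 93, 92, 80, 87
118, 92, 90, 83, 78
119, 68, 79, 88, 92
120, 99, 82, 81, 79
121, 96, 85, 91, 60
122, 94, 92, 91, 91

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: sort or filter text data by any word or field
        // Create the data source.
        var scores = File.ReadAllLines(@"..\..\scores.csv");

        // Can be changed to any value from 0 to 4.
        const int sortIndex = 1;

        // Demonstrates returning a query variable from a method, not the query results.
        foreach (var score in SplitSortQuery(scores, sortIndex))
        {
            Console.WriteLine(score);
        }
        Console.Read();
        #endregion
    }

    /// <summary>
    /// Splits each line into fields and sorts the lines by one field, descending.
    /// </summary>
    /// <param name="scores">The lines of comma-separated text.</param>
    /// <param name="num">The zero-based index of the field to sort by.</param>
    /// <returns>A deferred query over the sorted lines.</returns>
    private static IEnumerable<string> SplitSortQuery(IEnumerable<string> scores, int num)
    {
        // Note: the fields are compared as strings, not as numbers.
        var query = from line in scores
                    let fields = line.Split(',')
                    orderby fields[num] descending
                    select line;
        return query;
    }
}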
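The query above orders the fields as strings, which happens to work for scores.csv because every value has the same number of digits. If the values could differ in length, a numeric sort is safer. The following sketch (NumericFieldSort and SortByField are hypothetical names) assumes every field at the chosen index parses as an integer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NumericFieldSort
{
    // Sorts comma-separated lines by the integer value of one field, descending.
    public static IEnumerable<string> SortByField(IEnumerable<string> lines, int index)
    {
        return from line in lines
               let fields = line.Split(',')
               // Trim removes the space after each comma before parsing.
               orderby int.Parse(fields[index].Trim()) descending
               select line;
    }

    public static void Main()
    {
        var lines = new[] { "1, 9", "2, 100", "3, 35" };
        foreach (var line in SortByField(lines, 1))
        {
            Console.WriteLine(line); // 2, 100 then 3, 35 then 1, 9
        }
    }
}
```

With a string comparison the same input would come out as 9, 35, 100 in descending order, because " 9" sorts after " 35" and " 100".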
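7. Reorder the fields of a delimited file

A comma-separated value (CSV) file is a text file that is often used to store spreadsheet data or other tabular data represented by rows and columns. By using the Split method to separate the fields, it is very easy to query and manipulate CSV files with LINQ. In fact, the same technique can be used to rearrange the parts of any structured line of text; it is not limited to CSV files.

In the following example, assume that the three columns represent students' "last name", "first name", and "ID", and that the lines are in alphabetical order by last name. The query produces a new sequence in which the ID column appears first, followed by a second column that combines the student's first name and last name. The lines are reordered by the ID field. The code below prints the result to the console and does not modify the original data.

spread.csv:

Adams,Terry,120
Fakhouri,Fadi,116
Feng,Hanying,117
Garcia,Cesar,114
Garcia,Debra,115
Garcia,Hugo,118
Mortensen,Sven,113
O'Donnell,Claire,112
Omelchenko,Svetlana,111
Tucker,Lance,119
Tucker,Michael,122
Zabokritski,Eugene,121

using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: reorder the fields of a delimited file
        // Data source.
        var lines = File.ReadAllLines(@"..\..\spread.csv");

        // Move column 2 of the old data to the front, then append columns 1 and 0 in reverse order.
        var query = from line in lines
                    let t = line.Split(',')
                    orderby t[2]
                    select $"{t[2]} {t[1]} {t[0]}";

        foreach (var item in query)
        {
            Console.WriteLine(item);
        }
        Console.Read();
        #endregion
    }
}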
8. Combine and compare string collections

This example shows how to merge files that contain lines of text and then sort the results. Specifically, it shows how to perform a simple concatenation, a union, and an intersection on the two sets of text lines.

Note: text1.txt and text2.txt are the same files as in section 5.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: combine and compare string collections
        var text1 = File.ReadAllLines(@"..\..\text1.txt");
        var text2 = File.ReadAllLines(@"..\..\text2.txt");

        // Simple concatenation and sort; duplicates are preserved.
        var concatQuery = text1.Concat(text2).OrderBy(x => x);
        OutputQueryResult(concatQuery, "Simple concatenate and sort, duplicates are preserved:");

        // Union based on the default string comparer; duplicate names are removed.
        var unionQuery = text1.Union(text2).OrderBy(x => x);
        OutputQueryResult(unionQuery, "Union removes duplicate names:");

        // Find the names that occur in both files.
        var intersectQuery = text1.Intersect(text2).OrderBy(x => x);
        OutputQueryResult(intersectQuery, "Merge based on intersect:");

        // Find the matching entries in each list, merge the two results with Concat,
        // then sort with the default string comparer.
        const string nameMatch = "Garcia";
        var matchQuery1 = from name in text1
                          let t = name.Split(',')
                          where t[0] == nameMatch
                          select name;
        var matchQuery2 = from name in text2
                          let t = name.Split(',')
                          where t[0] == nameMatch
                          select name;
        var temp = matchQuery1.Concat(matchQuery2).OrderBy(x => x);
        OutputQueryResult(temp, $"Concat based on partial name match \"{nameMatch}\":");
        Console.Read();
        #endregion
    }

    /// <summary>
    /// Prints the query results.
    /// </summary>
    /// <param name="results">The sequence of lines to print.</param>
    /// <param name="title">The heading printed before the lines.</param>
    private static void OutputQueryResult(IEnumerable<string> results, string title)
    {
        Console.WriteLine(Environment.NewLine + title);
        foreach (var result in results)
        {
            Console.WriteLine(result);
        }
        Console.WriteLine($"Total {results.Count()} names in list.");
    }
}
9. Populate object collections from multiple sources

This example merges the lines of two CSV files into a sequence of objects.

Do not try to join in-memory data or data in the file system with data that is still in a database. Such cross-domain joins can yield undefined results, because database queries and other kinds of sources may define join operations differently. In addition, if the amount of data in the database is large enough, such an operation risks an out-of-memory exception. To join database data with in-memory data, first call ToList or ToArray on the database query, and then perform the join on the returned collection.

using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: populate object collections from multiple sources
        // Each line of spread.csv contains a last name, a first name, and an ID,
        // separated by commas. For example: Omelchenko,Svetlana,111
        var names = File.ReadAllLines(@"..\..\spread.csv");

        // Each line of scores.csv contains an ID and four test scores,
        // separated by commas. For example: 111, 97, 92, 81, 60
        var scores = File.ReadAllLines(@"..\..\scores.csv");

        // Merge the data sources by using an anonymous type.
        // Note: the list of int exam scores is created dynamically.
        // The first item of the split string is skipped because it is the
        // student ID, not an exam score.
        var students = from name in names
                       let t1 = name.Split(',')
                       from score in scores
                       let t2 = score.Split(',')
                       where t1[2] == t2[0]
                       select new
                       {
                           // spread.csv stores the last name first, then the first name.
                           FirstName = t1[1],
                           LastName = t1[0],
                           ID = Convert.ToInt32(t1[2]),
                           // The inner range variable needs its own name; reusing
                           // "score" would conflict with the outer range variable.
                           ExamScores = (from examScore in t2.Skip(1) select Convert.ToInt32(examScore)).ToList()
                       };

        foreach (var student in students)
        {
            Console.WriteLine($"The average score of {student.FirstName} {student.LastName} is {student.ExamScores.Average()}.");
        }
        Console.Read();
        #endregion
    }
}
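The query above pairs every name line with every score line and then filters on the ID. As an alternative, the same match can be expressed with a join clause, which performs a keyed lookup instead. The sketch below introduces a hypothetical named Student type and StudentLoader class in place of the anonymous type, and assumes the same column layouts as spread.csv and scores.csv:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Student
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<int> ExamScores { get; set; }
}

public static class StudentLoader
{
    // Joins "LastName,FirstName,ID" lines with "ID, s1, s2, ..." lines on the ID.
    public static List<Student> Load(IEnumerable<string> names, IEnumerable<string> scores)
    {
        var query = from name in names
                    let n = name.Split(',')
                    join score in scores.Select(line => line.Split(','))
                        on n[2].Trim() equals score[0].Trim()
                    select new Student
                    {
                        Id = int.Parse(n[2].Trim()),
                        FirstName = n[1].Trim(),
                        LastName = n[0].Trim(),
                        // Skip the ID column; the rest are exam scores.
                        ExamScores = score.Skip(1).Select(s => int.Parse(s.Trim())).ToList()
                    };
        return query.ToList();
    }

    public static void Main()
    {
        var names = new[] { "Omelchenko,Svetlana,111" };
        var scores = new[] { "111, 97, 92, 81, 60" };

        foreach (var student in Load(names, scores))
        {
            Console.WriteLine($"{student.FirstName} {student.LastName}: {student.ExamScores.Average()}");
        }
    }
}
```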
10. Split a file into multiple files by using group

This example shows one way to merge the contents of two files and then create a set of new files that organize the data in a new way.

Note: text1.txt and text2.txt are the same files as in section 5.

using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: split a file into multiple files by using group
        var text1 = File.ReadAllLines(@"..\..\text1.txt");
        var text2 = File.ReadAllLines(@"..\..\text2.txt");

        // Union: concatenate and remove duplicate names.
        var mergeQuery = text1.Union(text2);

        // Group the names by the first letter of the last name.
        var query = from name in mergeQuery
                    let t = name.Split(',')
                    group name by t[0][0] into g
                    orderby g.Key
                    select g;

        // Note the nested foreach loops.
        foreach (var g in query)
        {
            var fileName = @"testFile_" + g.Key + ".txt";
            Console.WriteLine(g.Key + ":");

            // Write the group to its own file.
            using (var sw = new StreamWriter(fileName))
            {
                foreach (var name in g)
                {
                    sw.WriteLine(name);
                    Console.WriteLine(" " + name);
                }
            }
        }
        Console.Read();
        #endregion
    }
}
11. Join content from dissimilar files

This example shows how to join data from two comma-delimited files that share a common value used as the matching key. The technique is useful when you have to combine data from two spreadsheets, or from a spreadsheet and a file in another format, into a new file. The example can be modified to suit any kind of structured text.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: join content from dissimilar files
        var names = File.ReadAllLines(@"..\..\spread.csv");
        var scores = File.ReadAllLines(@"..\..\scores.csv");

        // This query joins the two dissimilar spreadsheets based on the ID.
        var query = from name in names
                    let t1 = name.Split(',')
                    from score in scores
                    let t2 = score.Split(',')
                    where t1[2] == t2[0]
                    orderby t1[0]
                    select $"{t1[0]},{t2[1]},{t2[2]},{t2[3]},{t2[4]}";

        // Output.
        OutputQueryResult(query, "Merge two spreadsheets:");
        Console.Read();
        #endregion
    }

    /// <summary>
    /// Prints the query results.
    /// </summary>
    /// <param name="results">The sequence of lines to print.</param>
    /// <param name="title">The heading printed before the lines.</param>
    private static void OutputQueryResult(IEnumerable<string> results, string title)
    {
        Console.WriteLine(Environment.NewLine + title);
        foreach (var result in results)
        {
            Console.WriteLine(result);
        }
        Console.WriteLine($"Total {results.Count()} names in list.");
    }
}
12. Compute column values in a CSV text file

This example shows how to perform aggregate computations such as Sum, Average, Min, and Max on the columns of a .csv file. The same approach can be applied to other kinds of structured text.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: compute column values in a CSV text file
        var scores = File.ReadAllLines(@"..\..\scores.csv");

        // Specify the column to compute.
        const int examNum = 3;

        // Single-column statistics; the +1 skips the first (ID) column.
        SingleColumn(scores, examNum + 1);
        Console.WriteLine();

        // Multi-column statistics.
        MultiColumns(scores);
        Console.Read();
        #endregion
    }

    /// <summary>
    /// Computes statistics for a single column.
    /// </summary>
    /// <param name="lines">The lines of the CSV file.</param>
    /// <param name="examNum">The zero-based index of the column to compute.</param>
    private static void SingleColumn(IEnumerable<string> lines, int examNum)
    {
        Console.WriteLine("Single Column Query:");

        // Query steps:
        // 1. Split the string.
        // 2. Convert the value of the chosen column to int.
        var query = from line in lines
                    let t = line.Split(',')
                    select Convert.ToInt32(t[examNum]);

        // Compute statistics for the chosen column.
        var average = query.Average();
        var max = query.Max();
        var min = query.Min();
        Console.WriteLine($"Exam #{examNum}: Average:{average:##.##} High Score:{max} Low Score:{min}");
    }

    /// <summary>
    /// Computes statistics for all score columns.
    /// </summary>
    /// <param name="lines">The lines of the CSV file.</param>
    private static void MultiColumns(IEnumerable<string> lines)
    {
        Console.WriteLine("Multi Column Query:");

        // Query steps:
        // 1. Split the string.
        // 2. Skip the ID column (the first column).
        // 3. Convert every score in the current line to int and select the
        //    whole sequence as one row of the result.
        var query1 = from line in lines
                     let t1 = line.Split(',')
                     let t2 = t1.Skip(1)
                     select (from t in t2 select Convert.ToInt32(t));

        // Execute the query and cache the results to improve performance.
        var results = query1.ToList();

        // Find the number of columns in the results.
        var count = results[0].Count();

        // Compute the statistics for each column.
        for (var i = 0; i < count; i++)
        {
            var query2 = from result in results select result.ElementAt(i);
            var average = query2.Average();
            var max = query2.Max();
            var min = query2.Min();

            // #1 means the first exam.
            Console.WriteLine($"Exam #{i + 1} Average: {average:##.##} High Score: {max} Low Score: {min}");
        }
    }
}
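In the multi-column query, ToList caches only the outer sequence; each inner row is still a deferred query, so its fields are re-parsed on every ElementAt call. One possible variation, sketched below with hypothetical names (ColumnStatistics, Compute), parses every row into an int array once and then aggregates column by column. It assumes the first column is an ID and every row has the same number of score columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnStatistics
{
    // Returns the average, maximum, and minimum of each score column.
    public static (double Average, int Max, int Min)[] Compute(IEnumerable<string> lines)
    {
        // Parse each line once: skip the ID column and convert the rest to int.
        int[][] rows = lines
            .Select(line => line.Split(',')
                                .Skip(1)
                                .Select(field => int.Parse(field.Trim()))
                                .ToArray())
            .ToArray();

        return Enumerable.Range(0, rows[0].Length)
            .Select(col =>
            {
                var column = rows.Select(row => row[col]).ToArray();
                return (Average: column.Average(), Max: column.Max(), Min: column.Min());
            })
            .ToArray();
    }

    public static void Main()
    {
        var lines = new[] { "111, 97, 92, 81, 60", "112, 75, 84, 91, 39" };
        var stats = Compute(lines);
        for (var i = 0; i < stats.Length; i++)
        {
            Console.WriteLine($"Exam #{i + 1} Average: {stats[i].Average:##.##} High Score: {stats[i].Max} Low Score: {stats[i].Min}");
        }
    }
}
```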