Fast Token Replacement in C#
http://www.codeproject.com/Articles/298519/Fast-Token-Replacement-in-Csharp
Introduction
FastReplacer is good for executing many Replace operations on a large string when performance is important.
The main idea is to avoid modifying existing text or allocating new memory every time a string is replaced.
We designed FastReplacer to help us on a project where we had to generate a large text with a large number of append and replace operations. The first version of the application took 20 seconds to generate the text using StringBuilder. The second, improved version, which used the String class, took 10 seconds. Then we implemented FastReplacer, and the duration dropped to 0.1 seconds.
Using the code
Using FastReplacer should feel intuitive to developers who have used StringBuilder before.
Add the classes FastReplacer and FastReplacerSnippet to your C# console application and copy the following code into Program.cs:
using System;
using Omega.Alpha.Common;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Tokens will be delimited with { and }.
            FastReplacer fr = new FastReplacer("{", "}");
            fr.Append("{OPTION} OR {ANOTHER}.");
            fr.Replace("{ANOTHER}", "NOT {OPTION}");
            // The text is now "{OPTION} OR NOT {OPTION}."
            fr.Replace("{OPTION}", "TO BE");
            // The text is now "TO BE OR NOT TO BE."
            Console.WriteLine(fr.ToString());
        }
    }
}
Note that only properly formatted tokens can be replaced using FastReplacer. This constraint is set to ensure good performance, since tokens can then be extracted from the text in one pass.
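The one-pass extraction mentioned above can be sketched roughly as follows. This is only an illustration, not the actual FastReplacer code: the class TokenScanner, the method FindTokens, the choice of FormatException, and the exact rules for what counts as an improperly formatted token are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of FastReplacer): finds all tokens in one pass.
public static class TokenScanner
{
    // Returns (start position, token text) pairs; throws on malformed input.
    public static List<KeyValuePair<int, string>> FindTokens(string text, string open, string close)
    {
        var tokens = new List<KeyValuePair<int, string>>();
        int tokenStart = -1; // -1 means "currently outside of a token"
        int i = 0;
        while (i < text.Length)
        {
            bool atOpen = string.CompareOrdinal(text, i, open, 0, open.Length) == 0;
            bool atClose = string.CompareOrdinal(text, i, close, 0, close.Length) == 0;
            if (tokenStart < 0 && atOpen)
            {
                tokenStart = i;
                i += open.Length;
            }
            else if (tokenStart < 0 && atClose)
            {
                throw new FormatException("Token end without token start at position " + i);
            }
            else if (tokenStart >= 0 && atClose)
            {
                i += close.Length;
                tokens.Add(new KeyValuePair<int, string>(tokenStart, text.Substring(tokenStart, i - tokenStart)));
                tokenStart = -1;
            }
            else if (tokenStart >= 0 && atOpen)
            {
                throw new FormatException("Nested token start at position " + i);
            }
            else
            {
                i++;
            }
        }
        if (tokenStart >= 0)
            throw new FormatException("Token started at position " + tokenStart + " is never closed");
        return tokens;
    }

    public static void Main()
    {
        foreach (var t in FindTokens("{OPTION} OR {ANOTHER}.", "{", "}"))
            Console.WriteLine("{0}: {1}", t.Key, t.Value);
        // Prints:
        // 0: {OPTION}
        // 12: {ANOTHER}
    }
}
```

Because every character is visited once and no backtracking is needed, the cost of the scan grows linearly with the length of the added text.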
Case-insensitive
For a case-insensitive replace operation, set the additional constructor parameter caseSensitive to false (default is true).
FastReplacer fr = new FastReplacer("{", "}", false); // Case-insensitive
fr.Append("{Token}, {token} and {TOKEN}.");
fr.Replace("{tOkEn}", "x"); // Text is "x, x and x."
Console.WriteLine(fr.ToString());
What is in the package
The algorithm is implemented in two C# classes on .NET Framework 4.0:
- FastReplacer – Class for generating strings with a fast Replace function.
- FastReplacerSnippet – Internal class that is used from FastReplacer. No need to use it directly.
The attached solution contains three projects for Visual Studio 2010:
- Omega.Alpha.Common – A class library with the FastReplacer class and the FastReplacerSnippet class.
- FastReplacerDemo – Demonstration console application. Contains samples and performance tests used in this article.
- FastReplacerTest – Unit tests for FastReplacer. All cool features can be seen in these tests.
Performance
Speed
Using String or StringBuilder to replace tokens in a large text takes more time because every call to a Replace function processes the whole text again, and String.Replace also generates a new string each time.
These tests were performed with FastReplacerDemo.exe (attached project) on a computer with Windows Experience Index: Processor 5.5, Memory 5.5.
ReplaceCount | TextLength | FastReplacer | String | StringBuilder |
100 | 907 | 0.0008 sec | 0.0006 sec | 0.0014 sec |
300 | 2707 | 0.0023 sec | 0.0044 sec | 0.0192 sec |
1000 | 10008 | 0.0081 sec | 0.0536 sec | 1.2130 sec |
3000 | 30008 | 0.0246 sec | 0.4515 sec | 43.5499 sec |
10000 | 110009 | 0.0894 sec | 5.9623 sec | 1677.5883 sec |
30000 | 330009 | 0.3649 sec | 60.9739 sec | Skipped |
100000 | 1200010 | 1.5461 sec | 652.8718 sec | Skipped |
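A comparison of this kind can be reproduced for the String and StringBuilder columns with a small timing loop. The sketch below is not the code from FastReplacerDemo: the class ReplaceTiming, the test text made of tokens {T0}, {T1}, ... and the replacement values are assumptions chosen for illustration, so absolute numbers will differ from the table above.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical timing sketch; the test data is made up for this example.
public static class ReplaceTiming
{
    // Builds "{T0} {T1} ... " with `count` distinct tokens.
    public static string BuildText(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
            sb.Append("{T").Append(i).Append("} ");
        return sb.ToString();
    }

    // One String.Replace call per token; each call allocates a new string.
    public static string ReplaceWithString(string text, int count)
    {
        for (int i = 0; i < count; i++)
            text = text.Replace("{T" + i + "}", "value" + i);
        return text;
    }

    // One StringBuilder.Replace call per token; each call scans the whole buffer.
    public static string ReplaceWithStringBuilder(string text, int count)
    {
        var sb = new StringBuilder(text);
        for (int i = 0; i < count; i++)
            sb.Replace("{T" + i + "}", "value" + i);
        return sb.ToString();
    }

    public static void Main()
    {
        foreach (int count in new[] { 100, 300, 1000 })
        {
            string text = BuildText(count);

            Stopwatch sw = Stopwatch.StartNew();
            ReplaceWithString(text, count);
            double stringSeconds = sw.Elapsed.TotalSeconds;

            sw.Restart();
            ReplaceWithStringBuilder(text, count);
            double builderSeconds = sw.Elapsed.TotalSeconds;

            Console.WriteLine("{0,6} | {1:F4} sec | {2:F4} sec", count, stringSeconds, builderSeconds);
        }
    }
}
```

Both methods run one Replace call per token over the whole text, which is the pattern that makes the total time grow much faster than the text length.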
Memory usage
Memory usage while working with FastReplacer is usually 3 times the memory size of the resulting text. This includes:
- Original strings sent to FastReplacer as an argument for the Append, Replace, InsertBefore, and InsertAfter functions.
- Temporary memory used while generating final text in the FastReplacer.ToString function.
- The final generated string.
The algorithm
Using the conventional string.Replace function or StringBuilder.Replace function to generate a large text takes O(n*m) time, where n is the number of replace operations executed and m is the text length, because the whole text is processed (and, for string.Replace, a new string is generated) every time the function is executed.
This chapter explains why FastReplacer takes O(n*log(n) + m) time for the same task.
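To get a feeling for the difference, the two bounds can be evaluated for one row of the speed table, n = 10000 replacements and m = 110009 characters. This is only rough arithmetic: it assumes a base-2 logarithm, treats the TextLength column as m, and ignores all constant factors, so it does not predict the measured times.

```latex
\begin{align*}
n \cdot m &= 10\,000 \times 110\,009 \approx 1.1 \times 10^{9} \\
n \log_2 n + m &\approx 10\,000 \times 13.3 + 110\,009 \approx 2.4 \times 10^{5}
\end{align*}
```

The operation counts differ by more than three orders of magnitude, which matches the direction, though not the exact size, of the gap seen in the measurements.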
Tokens
Tokens are recognized in the text every time new text is added (by the Append, Replace, InsertBefore, and InsertAfter functions). Positions of tokens are kept in a Dictionary for fast retrieval. That way, searching for the text to be replaced takes O(n*log(n) + m) time in total instead of O(n*m). That is good, but the following part of the algorithm has more impact.
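A token index of this kind could look roughly like the sketch below. It is not taken from FastReplacer: the class TokenIndex, the method Build, the hard-coded { and } delimiters in the regular expression, and the use of a string comparer to get case-insensitive lookups are assumptions made for illustration. The sketch also assumes the text has already been checked for properly formatted tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sketch (not the FastReplacer source): maps each token to its positions.
public static class TokenIndex
{
    public static Dictionary<string, List<int>> Build(string text, bool caseSensitive)
    {
        // The comparer decides whether "{Token}" and "{TOKEN}" share one entry.
        StringComparer comparer = caseSensitive
            ? StringComparer.Ordinal
            : StringComparer.OrdinalIgnoreCase;
        var index = new Dictionary<string, List<int>>(comparer);

        // Matches "{...}" with no braces inside; delimiters are fixed for brevity.
        foreach (Match m in Regex.Matches(text, @"\{[^{}]*\}"))
        {
            List<int> positions;
            if (!index.TryGetValue(m.Value, out positions))
            {
                positions = new List<int>();
                index.Add(m.Value, positions);
            }
            positions.Add(m.Index);
        }
        return index;
    }

    public static void Main()
    {
        var index = Build("{Token}, {token} and {TOKEN}.", false);
        // All three spellings are found with a single lookup.
        Console.WriteLine(string.Join(", ", index["{tOkEn}"]));
        // Prints: 0, 9, 21
    }
}
```

With such an index, a Replace call does not have to scan the text: it looks up the token once and gets every position where a new snippet has to be attached.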
Snippets
When the Replace function is called to replace a token with a new string, FastReplacer does not modify existing text. Instead, it keeps track of the position in the original text where the token should be cut out and the new string inserted. Next time the Replace function is called, it will do the same on the original text and in the new strings that were previously inserted. That creates a directed acyclic graph where every node (called FastReplacerSnippet) represents a string with information where it should be inserted in its parent text, and with an array of child nodes that should be inserted in that string.
Every replace operation takes O(log n) time to find the matching token (covered in the previous chapter) and O(1) to insert a new node in the data structure.
Generating the final string takes O(m) time because there is only one pass through the data structure to recursively collect all the parts that need to be concatenated.
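The snippet idea can be illustrated with a much simplified node class. The sketch below is not the real FastReplacerSnippet: the names Snippet, Cut, ReplaceRange, and AppendTo are hypothetical, token positions are passed in by hand instead of coming from the token index, and the children are sorted when the text is generated, which is a shortcut of this sketch rather than something the article describes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical, simplified snippet node; the real FastReplacerSnippet may differ.
public class Snippet
{
    private class Cut
    {
        public int Start;      // first character of the token that is cut out
        public int End;        // position just after the token
        public Snippet Child;  // text emitted in place of the token
    }

    private readonly string text;
    private readonly List<Cut> cuts = new List<Cut>();

    public Snippet(string text) { this.text = text; }

    // Records a replacement without touching the stored text.
    public Snippet ReplaceRange(int start, int length, string replacement)
    {
        var child = new Snippet(replacement);
        cuts.Add(new Cut { Start = start, End = start + length, Child = child });
        return child;
    }

    // Walks the structure once and appends all parts in order.
    public void AppendTo(StringBuilder sb)
    {
        cuts.Sort((a, b) => a.Start.CompareTo(b.Start));
        int pos = 0;
        foreach (Cut cut in cuts)
        {
            sb.Append(text, pos, cut.Start - pos); // unchanged text before the token
            cut.Child.AppendTo(sb);                // replacement, possibly with own children
            pos = cut.End;
        }
        sb.Append(text, pos, text.Length - pos);   // unchanged tail
    }

    public override string ToString()
    {
        var sb = new StringBuilder();
        AppendTo(sb);
        return sb.ToString();
    }
}

public static class SnippetDemo
{
    public static string Build()
    {
        var root = new Snippet("{OPTION} OR {ANOTHER}.");
        // Replace {ANOTHER} (position 12, length 9) with "NOT {OPTION}".
        Snippet another = root.ReplaceRange(12, 9, "NOT {OPTION}");
        // Replace {OPTION} in the root text and inside the inserted snippet.
        root.ReplaceRange(0, 8, "TO BE");
        another.ReplaceRange(4, 8, "TO BE");
        return root.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build()); // Prints: TO BE OR NOT TO BE.
    }
}
```

No string is copied or modified until ToString is called, and each stored character is appended to the result exactly once.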
Sample 1
For example, in the string “My {pet} likes {food}.”, if the token “{pet}” is replaced with “tortoise”, a snippet holding “tortoise” is attached to the original text at the position of “{pet}”, and the original text itself is left unmodified.
The final text will be composed by concatenating the text parts “My ”, “tortoise”, and “ likes {food}.”.
Sample 2
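A more complex example is the text “{OPTION} OR {ANOTHER}”. If the token “{ANOTHER}” is replaced with “NOT {OPTION}”, and then the token “{OPTION}” is replaced with “TO BE”, the snippet “NOT {OPTION}” is attached to the original text, and “TO BE” is attached at both occurrences of “{OPTION}”: the one in the original text and the one inside the previously inserted “NOT {OPTION}”.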
Constraints
When snippets of text are inserted, tokens are searched in every snippet separately. Tokens in every snippet must be properly formatted. A token cannot start in one snippet then end in another.
For example, you cannot insert a text that contains only the beginning of a token (e.g., “Hello {”) then append a text with the end of the token (e.g., “USERNAME}.”). Each of these function calls would fail because the token in each text is not properly formatted.
To ensure maximal consistency, FastReplacer will throw an exception if the inserted text contains an improperly formatted token.