Thursday, March 5, 2015

Fastest StringReplace for Delphi?

Not sure. At least this is the fastest pure pascal, unicode compatible, StringReplace() alternative I've ever tested. I wrote this based on some very old code I found in my HDD and I just don't know where it came from.
I also use a Pos() replacement for x64 (this came directly from FastCode project), because Delphi x64 Pos() function is DOG SLOW!!!

If someone has a faster StringReplace() implementation, written in pure pascal, I would be glad to know it :-)

This is my final conclusion:

In my tests it is at least 20% faster in small strings and up to 500% faster in most cases.
Some corner cases tests showed that this version can be 100 times (yes!) faster than standard StringReplace(), e.g. Length(OldPattern) = Length(NewPattern) and huge number of occurrences of OldPattern in the string. It also kicks TStringBuilder.Replace() butt in all tested scenarios.

Not bad at all, isn't it?

Download the unit here!

The comparison test case is here

Take a look at these results for instance. In this case we have:

  • A long string (S)
  • Length(SearchPattern) = 1
  • Length(ReplacePattern) > Length(SearchPattern)
  • Multiple occurrences of the SearchPattern within the string S
The results are quite impressive:
  • Standard StringReplace(): 0,7350 secs
  • StringReplace1: 0,2650 secs
  • StringReplace2: 0,2970 secs
  • StringReplace3: 2,1410 secs
  • StringReplace4: 0,4060 secs
  • FastStringReplace: 0,1090 secs
FastStringReplace is my custom implementation. It is more than 100% faster than the second place (StringReplace1, based on TStringBuilder.Replace() method). The most important: It is 7 times faster than standard StringReplece()! This difference gets bigger and bigger if S grows.
In some specific cases, FastStringReplace() may be slower than one of the implementations, but not by much (10% or less). You also have to notice that:

  • TStringBuilder can't handle case insensitive search/replace, so it is not a real replacement for StringReplace() without reviewing your code completely.
  • StringReplace4 implementation beats FastStringReplace in really small strings, but has the same limitation of TStringBuilder.
  • Standard StringReplace() is fast for small strings but dog slow if S is long. In really long strings (32 Kb or more), StringReplace() performance is almost prohibitive.

Ops! I almost forgot! This is also part of IntraWeb from version 14.0.38 and up.

About the license.... this code is FREE and there is no boring license. If you make any improvements (or fixes) please let me know. :-)

20 comments:

Anonymous said...

Excellent. Do get one error in 32bit Delphi XE2 at line 60:
i := {$IFDEF CPUX64}FastPos{$ELSE}Pos{$ENDIF}(xOldPattern, Str,i);
Easy enough to fix but sharing.
Many Thanks

Vincent Parrett said...

What is the license on this code?

Alexandre Caldas Machado said...

Hi Anonymous. You are correct. Thanks. The file is now fixed and should be compatible with D2009 to XE8.

Alexandre Caldas Machado said...

Hi Vincent,
I've included the license information in the file. In short: you are free to do anything you want with that code. If you find any bugs or make any improvements, I would be glad to receive a copy! :-)

Vincent Parrett said...

Possible optiomization, if the old pattern and the new pattern are the same length, result could just be copied once from the input string, since it will always be the same length as the input no matter how many replaces are done. Just a thought.

Vincent Parrett said...

Ignore my last comment, I see you are doing that already!

HydroByte said...

Hi,

I've tested in C++Builder (BDS 2006) reading a XML files obtained from a SOAP/HTTP/POST method response. I've got a computation result 133 times faster than regular StringReplace (from 16.0s to .12s).

With C++Builder (BDS 2006) just two changes:

Line 6 : add StrUtils
Line 60: change Pos to PosEx.

Nice job!

Alexandre Caldas Machado said...

Hi HydroByte,

thanks for the feedback!
It was a huge improvement in performance, indeed! :-)

Anonymous said...

Congratulation !! good job ! ReplaceStr was really to slow to manage big file ..
thx much, work fine with me

Franck (Paris)

baur said...

Good job!!!
very very fast code, thanks!!!

Janneman said...

Very nice, Alexandre.

I have improved the test program so that the interface more clearly shows what it does, and added the ability to test a case insensitive replace for the StringReplace() and FastStringReplace().
Of course only those two should be compared if someone runs the case insensitive test.

Email me at jcmdoggen at gmail so that I can send it to you (XE2 source).

KOOLdic said...

@Alexandre C. Machado
You're a super hero *****
admire you :)

Note: If you use procedure the performance is even better:

procedure ReplaceMe(var s: String);
begin
s := FastStringReplace(s, OldPattern, newPattern, [rfReplaceAll]);
end;

Carl Olsen said...

While this does not do as much, you might find this code to be considerably faster in many cases:

procedure Replace (Var InString: String;WhatToReplace,WhatToReplaceWith:String);
{
does a search and replace within InString. Replaces all occurrences
of "WhatToReplace" with "WhatToReplaceWith".
Instring: The string we are going to modify
WhatToReplace: The part of Instring that we are going to replace
WhatToReplaceWith: The string we will replace WhatToReplace with.

You can use this routine to delete characters by simply setting
WhatToReplaceWith:='';
}
var
ReplacePosition:Integer;
begin

if WhatToReplace=WhatToReplaceWith then exit;

ReplacePosition:=Pos(WhatToReplace,InString);
if ReplacePosition<>0 then
begin
repeat
Delete(InString,ReplacePosition,length(WhatToReplace));
Insert(WhatToReplaceWith,InString,ReplacePosition);
//ReplacePosition:=PosEx(WhatToReplace,InString,ReplacePosition);
ReplacePosition:=Pos(WhatToReplace,InString); // Remarkably, Pos is faster than PosEX, despite the ReplacePosition parameter in PosEx
until ReplacePosition=0;
end; // if

end; // procedure

I Ran it through your tester using this code:

h1 := GetTickCount;
for i := 1 to count do
begin
oldS:=s; // Memorize S
s:=res3; // to mimic what the function does to make test fair.
Replace(s, OldPattern, newPattern);
res3:=s;
s:=Olds; // restore S
end;
h2 := GetTickCount;
Log('REPLACE');

...and found it to be about 10x faster than the other functions in Delphi XE8, despite the extra code in the test to save and restore the value of "s". Of course, this does not have options like "ignore case", but if you don't need that, it is super fast. I suppose it could be adapted to have more options, or do an "uppercase" on everything prior to passing, etc., if you need an ignore case capability. I have not tried that, as in my case for writing this function I did not need it.

John Burgess said...

Amazing performance improvement "out of the box".

Was attempting to replace field markers with tabs and carriage returns in a text file with 750,000 records. StringReplace was still choking on it after 6 hours. This function did it in seconds.

Great work!

Pierre said...

Happy to find your code! Delphi XE6 StringReplace of 6MB string taking many !minutes! - less than a second for your code! Great job!

Anonymous said...


Do you have version of your FastPOs/FastPosEx that works solely with AnsiStrings as both the substring and the main String. The reason I ask is that I want to do a string search in a file of 1.5G single character/single byte and if I use your version with Unicode I get an out of memory error. I know it works with Delphi Pos but as you know it is notoriously slow.

Thanks

Regards

Zac

UngVannara said...

Amazing

Decten76 said...

Gents, the download links are dead!

Decten76 said...

anyone has a copy of Alex StrUtilsEx.pas? please send me one. Many thanks. Decten76@gmail.com.

Tiago Rodrigues said...

Is this stil available? I cant download the files