Thursday, March 12, 2015

Merge Sort for Delphi Revisited

A reader named dangph did me a great favor and created a complete test case for my MergeSort unit. He had already commented on my StackOverflow answer.

Well, he found that the algorithm was not stable after all. And he is correct. BTW, thank you!
So I did a little test to discover why it was not stable. Some hidden bug or a complete logic failure? Seems that the book algorithm has a bug... The author uses an "improved" Insertion sort algorithm when the element set is small enough. Take a look at the original code:

  // find the smallest element in the list
  IndexOfMin := FirstIndex;
  for i := FirstIndex to LastIndex do begin
    if (CompareFunc(ptrList^[i], ptrList^[IndexOfMin]) < 0) then begin
      IndexOfMin := i;
    end;
  end;
  if (FirstIndex <> IndexOfMin) then begin
    Temp := ptrList^[FirstIndex];
    ptrList^[FirstIndex] := ptrList^[IndexOfMin];
    ptrList^[IndexOfMin] := Temp;
  end;

Explanation for that, from the book:

A much better optimization is to scan the entire list for the smallest item and swap it into the first position (in essence, performing the first cycle of selection sort).

 But this approach just breaks the stability. Take a look at some real data used when debugging this algorithm. Please note the value column: the first number is the Key and the second number is the Index. The Index keeps the original position of this element in the list, when it was added/created:

This first picture is a snapshot of some part of the list, before entering that part of the code (from Index=50 to Index=62). Please note elements 50 and 60. Element 50 is the first of the list, and element 60 is the smallest element on the list. The piece of code above will swap them (put the smallest element of that part of the list in the first position, skipping the first cycle of selection sort). And this is the result:


Now, please note that elements 50 and 60 were indeed swapped. But this just broke the stability of the merge sort algorithm because, as you can see, there is another element (53) that has a bigger original index (it was originally ahead) than the element in position 50. In order to make this stable, Element 60 should be swapped with element 50, which in turn should be swapped with element 53.

To avoid all this trouble, I decided to use a simple insertion sort in place of the improved insertion sort of the book. Now the code passes all test cases! :-)

Please find the fixed algorithm and the test case project here.

Tuesday, March 10, 2015

So, .NET code is not slower than native Delphi code?

Just to prove once again that .NET IS slower than Delphi generated code in LOTS of scenarios:

In this post I created a few implementations for a StringReplace() replacement and found that my custom FastStringReplace() implementation was the fastest, correct?

Now, I want to check if it is REALLY fast. Why not compare with something like .NET super-duper-optimized-code?

In .NET, they told me somewhere that I could not do much more than string.Replace() method. So I did a simple WinForms application in C# using VS 2013 + .NET 4.5, using build configuration. The code is really simple:

for (int i=1; i <= 1000000; i++) {
    sourceString = originalString;
    s = sourceString.Replace(searchPattern, replacePattern);
}

You can test all the 6 test cases I did in my previous Delphi project. I won't give you all the numbers, just one:

Test case 1, 1000000 iterations:

C# code: 6.2 seconds

Delphi code: 0.98 seconds. Let's round it: 1.0 second.

Then, my Delphi native code runs 600% faster than C# code, correct?

So... There is no algorithm involved, no hidden compiler setting, no tricky cache thing, nothing. I just have one string and want to replace some parts of it and copy the new string to another variable.

But C# programmers will still notice that in .NET strings are immutable and this is creating a new string and blah, blah, blah. But Delphi code also creates a new string, because the whole content of the old string is copied to a new string and returned as the result. Of course the Delphi string is not immutable and there is a clear advantage here, but don't blame me! Blame Microsoft, Anders Hejlsberg, Obama, financial crisis...
Both C# and Delphi code are pretty basic and can be found in any software out there written in those languages.
Then another C# programmer I know told me: "Oh, but you didn't compare Delphi's standard StringReplace(), you used a custom code". Yes that's true. And that's the beauty behind non-managed, native languages like Delphi. I can use whatever I want, including assembly code. I'm not limited to classes and methods available in some framework. In fact I patched my Delphi program at runtime, and I don't even need to replace the standard StringReplace() calls with the FastStringReplace().

Also, in StackOverflow I found some "tips" to make the C# code faster: use StringBuilder class instead of string.Replace() method. But, creating a StringBuilder instance inside the loop makes no big difference.


Thursday, March 5, 2015

Fastest StringReplace for Delphi?

Not sure. At least this is the fastest pure pascal, unicode compatible, StringReplace() alternative I've ever tested. I wrote this based on some very old code I found in my HDD and I just don't know where it came from.
I also use a Pos() replacement for x64 (this came directly from FastCode project), because Delphi x64 Pos() function is DOG SLOW!!!

If someone has a faster StringReplace() implementation, written in pure pascal, I would be glad to know it :-)

This is my final conclusion:

In my tests it is at least 20% faster in small strings and up to 500% faster in most cases.
Some corner cases tests showed that this version can be 100 times (yes!) faster than standard StringReplace(), e.g. Length(OldPattern) = Length(NewPattern) and huge number of occurrences of OldPattern in the string. It also kicks TStringBuilder.Replace() butt in all tested scenarios.

Not bad at all, isn't it?

Download the unit here!

The comparison test case is here

Take a look at these results for instance. In this case we have:

  • A long string (S)
  • Length(SearchPattern) = 1
  • Length(ReplacePattern) > Length(SearchPattern)
  • Multiple occurrences of the SearchPattern within the string S
The results are quite impressive:
  • Standard StringReplace(): 0,7350 secs
  • StringReplace1: 0,2650 secs
  • StringReplace2: 0,2970 secs
  • StringReplace3: 2,1410 secs
  • StringReplace4: 0,4060 secs
  • FastStringReplace: 0,1090 secs
FastStringReplace is my custom implementation. It is more than 100% faster than the second place (StringReplace1, based on TStringBuilder.Replace() method). The most important: It is 7 times faster than standard StringReplece()! This difference gets bigger and bigger if S grows.
In some specific cases, FastStringReplace() may be slower than one of the implementations, but not by much (10% or less). You also have to notice that:

  • TStringBuilder can't handle case insensitive search/replace, so it is not a real replacement for StringReplace() without reviewing your code completely.
  • StringReplace4 implementation beats FastStringReplace in really small strings, but has the same limitation of TStringBuilder.
  • Standard StringReplace() is fast for small strings but dog slow if S is long. In really long strings (32 Kb or more), StringReplace() performance is almost prohibitive.

Ops! I almost forgot! This is also part of IntraWeb from version 14.0.38 and up.

About the license.... this code is FREE and there is no boring license. If you make any improvements (or fixes) please let me know. :-)

Wednesday, February 25, 2015

Merge Sort for Delphi

Today I was investigating a problem in TList sort order after calling the built-in Sort() method . A few objects were added to the list and then the Sort() method is called. In this specific case, happens that all objects were equally ranked, and once TList uses a Quick Sort algorithm, the order of elements in the list change, even though there is no need to sort them (they are equally ranked, so the comparison function/method always returns 0).

This is because Quick Sort algorithm is unstable, i.e. it does NOT keep the relative position of equally ranked elements within the list, after execution. Read more about sorting algorithms and stability here.
Lets see an example: We have a list containing only 3 objects. This object class (TMyObject) has a property named Key, used to sort the objects within the list. It also has another property named Index, where the original index within the list is kept:

List[0]: Key = 1, Index = 0;
List[1]: Key = 1, Index = 1;
List[2]: Key = 3, Index = 2;

The comparison function used is this:

function CompareItems(Data1, Data2: Pointer): Integer;
var
  Obj1, Obj2: TMyObject;
begin
  Obj1 := TMyObject(Data1);
  Obj2 := TMyObject(Data2);
  if Obj1.Key < Obj2.Key then
    Result := -1
  else
  if Obj1.Key > Obj2.Key then
    Result := 1
  else
    Result := 0;
end;

Using the built-in Quick Sort algorithm, your list may end like this:

List[0]: Key = 1, Index = 1;
List[1]: Key = 1, Index = 0;
List[2]: Key = 3, Index = 2;

As you can see, the Quick Sort algorithm may exchange elements 0 and 1 when sorting the list, because they both have Key = 1 (so the CompareItems() function will return 0 when comparing these elements).

This behavior of Quick Sort may or may not have some impact on your application. Well... In my specific problem, it HAS a severe impact. This is because the natural order of the list represents the creation order of the objects and I need this order to be preserved, if possible (if elements are equally ranked).

The best alternative seemed to be the Merge Sort algorithm, but I started looking for any fast and stable sorting algorithm for Delphi. Unfortunately, I found only a few and most of them were poorly implemented or required specialized data structures, like linked lists.

After digging a little more, I found an old Merge Sort implementation published in the - excellent - The Tomes of Delphi Algorithms and Data Structures book, by Julian Bucknall.

The implementation is a little old and needed some adjustments to make it run with newer versions of Delphi. In fact, I ended up with an implementation compatible with Delphi 2009 up to Delphi XE7. It probably will also run with older versions of Delphi, and newer versions as well.

Some interesting facts about all this sorting stuff:

  • Quick Sort has a worst-case running time of O(n2). On the other hand, Merge Sort does not have a worst-case scenario, and has a running time of n log n, every time.
  • Merge Sort requires an auxiliary list to do the sorting, when working with TList-like object classes. This imposes a higher memory requirement, but this is probably irrelevant nowadays.
  • The Merge Sort implementation is AS FAST AS Delphi's built-in implementation in most cases. In some cases, sorting HUGE lists with low dispersion, Merge Sort may be FASTER than Quick Sort.
  • Using this implementation to sort 1000000 items in my list I got:
    • Delphi's built-in Quick Sort:  0,38 seconds
    • Merge Sort:  0,34 seconds, i.e. more than 10% faster!
  • As already said, Merge Sort is STABLE. Under some circumstances this is important.

In the end, with all advantages of Merge Sort, I also created a TList and a TObjectList descendant redeclaring the Sort() method, using this same Merge Sort implementaion, instead of the default Quick Sort.

Download the merge sort Delphi units and test application here.

Wednesday, June 12, 2013

Why I use if MyString <> '' and not if Length(MyString) > 0 ?

1) if MyString <> '' then
2) if Length(MyString) > 0 then

Both forms are used to test whether a string is empty. Let's see what the Delphi compiler has to say:

Create a new project and add this code to it:

procedure Proc1;
var
   MyString: string;
begin
  if MyString <> '' then begin

  end;
end;

procedure Proc2;
var
  MyString: string;
begin
  if Length(MyString) > 0 then begin

  end;
end;

Put a breakpoint in front of both lines containing a if statement and run the program. When the program hits the breakpoint open the window: View -> Debug Windows - CPU Windows -> Disassembly. (I'm using Delphi XE3 here, but recent versions has the same functionality). Then we see this:

The code for the first if statement is this:

Unit1.pas.35: if MyString <> '' then begin

005ACE55 837DFC00 cmp dword ptr [ebp-$04],$00

The code for the second if statement is this:

Unit1.pas.49: if Length(MyString) > 0 then begin

005ACEF5 8B45FC mov eax,[ebp-$04]
005ACEF8 8945F4 mov [ebp-$0c],eax
005ACEFB 837DF400 cmp dword ptr [ebp-$0c],$00
005ACEFF 740B jz $005acf0c
005ACF01 8B45F4 mov eax,[ebp-$0c]
005ACF04 83E804 sub eax,$04
005ACF07 8B00 mov eax,[eax]
005ACF09 8945F4 mov [ebp-$0c],eax
005ACF0C 837DF400 cmp dword ptr [ebp-$0c],$00

One assembly instruction for the first, and five (or nine) for the second. If MyString is empty, the jump instruction (jz) will skip part of the code.

Time measures prove that the first form is faster than the second. Using Delphi XE3 (x86), if MyString is empty, the first form is slightly faster. If MyString is not empty (even 1 char in size), the first form is much faster (100% or more).





Tuesday, February 26, 2013

Why Mr. Delphi Hater should be called Mr. Dumbass Hater

Mr. Hater is a well known Delphi-related FUD spreader. From times to times some colleague or friend send me a link pointing to one of his infamous posts.
Mr. "dumbass" Hater pretends to be some kind of Delphi user defender, some kind of Batman of the Delphi community, but he turned out to be more like the Jokerman. His correspondents from the north pole, Atlantis and Bermuda triangle always send him "hot" information that he "generously" shares with us poor mortals. Unfortunately, he don't have a clue about most of the things that he intends to review or analyze. I think that positive criticisms are necessary, but what he does has nothing to do with it.

The last post that I read was this: http://delphihaters.blogspot.de/2013/01/delphi-hidden-settings.html

Some options are really funny, I must admit. The funnier part though is that Mr. Dumbass didn't publish my genuine question/comment about IntraWeb. All that I asked him was:

1) Provide me a test case where an IntraWeb application uses 2 Gb of memory without any explanation and without memory leaks caused by user code.

2) Provide me a test case where an IntraWeb application can't handle Unicode by default (HTML rendering/uploaded and downloaded file names/whatever)

3) Provide me a test case where an IntraWeb application takes forever to render a page (no need to take forever though. Just more than expected and I would be satisfied)

I'm not asking much from such a programming guru, right? But the comment wasn't approved by Mr. D. Can't you find/create such examples, Mr. D?

Another genuine question, Mr. D:
Why someone that keeps spreading FUD in the dark has a MODERATED blog??? You are blogging anonymously, so why do you want me to identify myself when adding comments to your blog?

My psychologist would say that Mr. Dumbass has some kind of complex with strange name. His crusade against IntraWeb, for instance, is more than a personal point of view. Not satisfied spreading FUD about Delphi and related products he also attacks the companies and the people behind it, what I find to be unacceptable. Probably someone really hurt Mr. Dumbass and he just can't live with that... Why don't you accept Delphi podcast invitation and tell us who did it to you, Mr. D?

PS: This is my personal point of view about the subject and is not endorsed or has nothing to do with Embarcadero, Atozed or whatever.



Wednesday, December 19, 2012

ShellExecute as administrator + Drag and Drop in Windows 7

This is a little tip that took me sometime to figure it out:
This kind of issue happened when I was debugging a IntraWeb application created with Delphi XE3 in Windows 7. I often run Delphi XE3 as local administrator. I know, I know... but sometimes it is required for proper ISAPI debugging.
This application renders a control that supports drag and drop from Windows Explorer, when the browser also supports it (Firefox, Chorme, etc.). The issue then is: When you start the browser using ShellExecute (or CreateProcess for that matter), from Delphi XE3 (or any other version or software) running with administrator credentials, the browser is also running as administrator, right? But Windows Explorer IS NOT. So, drag and drop from Windows Explorer to Firefox, for instance, won't work. Applications running with elevated privileges and applications running as regular user just won't do drag and drop... Simple, yes?