This is because Quick Sort algorithm is unstable, i.e. it does NOT keep the relative position of equally ranked elements within the list, after execution. Read more about sorting algorithms and stability here.
Lets see an example: We have a list containing only 3 objects. This object class (TMyObject) has a property named Key, used to sort the objects within the list. It also has another property named Index, where the original index within the list is kept:
List[0]: Key = 1, Index = 0;
List[1]: Key = 1, Index = 1;
List[2]: Key = 3, Index = 2;
The comparison function used is this:
function CompareItems(Data1, Data2: Pointer): Integer; var Obj1, Obj2: TMyObject; begin Obj1 := TMyObject(Data1); Obj2 := TMyObject(Data2); if Obj1.Key < Obj2.Key then Result := -1 else if Obj1.Key > Obj2.Key then Result := 1 else Result := 0; end;
Using the built-in Quick Sort algorithm, your list may end like this:
List[0]: Key = 1, Index = 1;
List[1]: Key = 1, Index = 0;
List[2]: Key = 3, Index = 2;
As you can see, the Quick Sort algorithm may exchange elements 0 and 1 when sorting the list, because they both have Key = 1 (so the CompareItems() function will return 0 when comparing these elements).
This behavior of Quick Sort may or may not have some impact on your application. Well... In my specific problem, it HAS a severe impact. This is because the natural order of the list represents the creation order of the objects and I need this order to be preserved, if possible (if elements are equally ranked).
The best alternative seemed to be the Merge Sort algorithm, but I started looking for any fast and stable sorting algorithm for Delphi. Unfortunately, I found only a few and most of them were poorly implemented or required specialized data structures, like linked lists.
After digging a little more, I found an old Merge Sort implementation published in the - excellent - The Tomes of Delphi Algorithms and Data Structures book, by Julian Bucknall.
The implementation is a little old and needed some adjustments to make it run with newer versions of Delphi. In fact, I ended up with an implementation compatible with Delphi 2009 up to Delphi XE7. It probably will also run with older versions of Delphi, and newer versions as well.
Some interesting facts about all this sorting stuff:
- Quick Sort has a worst-case running time of O(n2). On the other hand, Merge Sort does not have a worst-case scenario, and has a running time of n log n, every time.
- Merge Sort requires an auxiliary list to do the sorting, when working with TList-like object classes. This imposes a higher memory requirement, but this is probably irrelevant nowadays.
- The Merge Sort implementation is AS FAST AS Delphi's built-in implementation in most cases. In some cases, sorting HUGE lists with low dispersion, Merge Sort may be FASTER than Quick Sort.
- Using this implementation to sort 1000000 items in my list I got:
- Delphi's built-in Quick Sort: 0,38 seconds
- Merge Sort: 0,34 seconds, i.e. more than 10% faster!
- As already said, Merge Sort is STABLE. Under some circumstances this is important.
In the end, with all advantages of Merge Sort, I also created a TList and a TObjectList descendant redeclaring the Sort() method, using this same Merge Sort implementaion, instead of the default Quick Sort.
Download the merge sort Delphi units and test application from Bitbucket repository here.
7 comments:
Unfortunately it doesn't appear to be completely stable. I wrote some DUnit tests, and one of them fails (TTestCaseMergeSort.Test10000).
http://pastie.org/10020369
If you remove the calls to the Insertion Sort, then it becomes stable. But I'm not sure I trust it now.
Hi dangph,
thanks for your feedback. I'll take a look at your test case and I'll probably post the results here, OK?
Sure, please do.
Hi dangph,
the code is fixed. Please check my new post about this subject. If you can give me your name, I can include it in the unit credits :-)
Thank you very much!
Great! I notice you simplified the insertion sort. Was the original one from the book?
Thanks, but I'd prefer not to use my real name because I don't link this account with my real name.
Yes, the original Insertion Sort comes from the book. It is a modified version meant to be faster. But it clearly breaks the stability of the merge sort algorithm.
Post a Comment