Quicksort Guide, Meaning , Facts, Information and Description
Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare that, on average, needs Θ(n log n) comparisons to sort n items, while requiring Θ(n2) comparisons in the worst-case.Quicksort's inner loop is such that it is usually easy to implement very efficiently on most computer architectures, which makes it significantly faster in practice than other Θ(n log n) algorithms that can sort in place or nearly so in the average case (recursively-implemented quicksort is not, as is sometimes regarded, an in-place algorithm, requiring Θ(log n) on average, and Θ(n) in the worst case, of stack space for recursion.)
Performance and algorithm details
Because of its good average performance and simple implementation, Quicksort is one of the most popular sorting algorithms in use. It is an unstable sort in that it doesn't preserve any ordering that is already between elements of the same value. Quicksort's worst-case performance is Θ(n2); much worse than some other sorting algorithms such as heapsort or merge sort. However, if pivots are chosen randomly, most bad choices of pivots are unlikely; the worst-case has only probability 1/nfactorial of occurring.
The Quicksort algorithm uses a recursive divide and conquer strategy to sort a list. The steps are:
- Pick a pivot element from the list.
- Reorder the list so that all elements less than the pivot precede all elements greater than the pivot. This means that the pivot is in its final place; the algorithm puts at least one element in its final place on each pass over the list. This step is commonly referred to as "partitioning".
- Recursively sort the sub-list of elements less than the pivot and the sub-list of elements greater than the pivot. If one of the sub-lists is empty or contains one element, it can be ignored.
function partition(a, left, right, pivotIndex)
pivotValue := a[pivotIndex]
swap(a[pivotIndex], a[right]) ''// Move pivot to end
storeIndex := left
for i from left to right-1
if a[i] <= pivotValue
swap(a[storeIndex], a[i])
storeIndex := storeIndex + 1
swap(a[right], a[storeIndex]) ''// Move pivot to its final place
return storeIndex
function quicksort(a, left, right)
if right > left
select a pivot value a[pivotIndex]
pivotNewIndex := partition(a, left, right, pivotIndex)
quicksort(a, left, pivotNewIndex-1)
quicksort(a, pivotNewIndex+1, right)
The inner loop which performs the partition is amenable to optimization, for two main reasons:
- All comparisons are being done with a single pivot value, which can be stored in a register.
- The list is being traversed sequentially, which for arrays produces very good locality of reference and cache behavior.
The most crucial problem of Quicksort is the choice of pivot element. A naïve implementation of Quicksort, like the ones below, will be terribly inefficient for certain inputs. For example, if the pivot always turns out to be the smallest element in the list, then Quicksort degenerates to Selection sort with Θ(n2) running time. A secondary problem is the recursion depth. This becomes linear, and the stack requires Θ(n) extra space.
Note that the partition procedure only requires the ability to traverse the list sequentially; therefore, quicksort is not confined to operating on arrays (it can be used, for example, on linked lists). Choosing a good pivot, however, benefits from random access, as we will see.
The worst-case behavior of quicksort is not merely a theoretical problem. When quicksort is used in web services, for example, it is possible for an attacker to deliberately exploit the worst case performance and choose data which will cause a slow running time or maximize the chance of running out of stack space. See competitive analysis for more discussion of this issue.
Sorted or partially sorted data is quite common in practice and the naïve implementation which selects the first element as the pivot does poorly with such data. To avoid this problem the middle element can be used. This works well in practice but attacks can cause worst-case performance.
A better optimization can be to select the median element of the first, middle and last elements as the pivot. Adding two randomly selected elements resists chosen data attacks, more so if a cryptographically sound random number generator is used to reduce the chance of an attacker predicting the "random" elements. The use of the fixed elements reduces the chance of bad luck causing a poor pivot selection for partially sorted data when not under attack. These steps increase overhead, so it may be worth skipping them once the partitions grow small and the penalty for poor pivot selection drops.
Finding the true median value and using it as the pivot can be done if the number of elements is large enough to make it necessary but this is seldom done in practice.
Another optimization is to switch to a different sorting algorithm once the list becomes small, perhaps ten or less elements. Selection sort might be inefficient for large data sets, but it is often faster than Quicksort on small lists.
One widely used implementation of quicksort, that in the 1997 Microsoft C library, used a cutoff of 8 elements before switching to insertion sort, asserting that testing had shown that to be a good choice. It used the middle element for the partition value, asserting that testing had shown that the median of three algorithm did not, in general, increase performance.
In datasets which contain a lot of equal elements, quicksort can degenerate to worst case time complexity in sorting the 'bottom tier' of partitions. A good variation in such cases is to test separately for equal elements and store these in a 'fat pivot' in the center of the partition. An implementation of this variation implemented in C is shown below.
Sedgewick (1978) suggested an enhancement to the use of simple sorts for small numbers of elements, which reduced the number of instructions required by postponing the simple sorts until the quicksort had finished, then running an insertion sort over the whole array.
LaMarca and Ladner (1997) consider "The Influence of Caches on the Performance of Sorting", a very significant issue in microprocessor systems with multi-level caches and high cache miss times. They conclude that while the Sedgewick optimization decreases the number of instructions, it also decreases locality of cache references and worsens performance compared to doing the simple sort when the need for it is first encountered. However, the effect was not dramatic and they suggested that it was starting to become more significant with more than 4 million 64 bit float elements. This work is cited by Musser, following. Their work showed greater improvements for other sorting types.
Because recursion requires additional memory, Quicksort has been implemented in a non-recursive, iterative form. This has the advantage of predictable memory use regardless of input, and the disadvantage of considerably greater code complexity. Those considering iterative implementations of Quicksort would do well to also consider Introsort or especially Heapsort.
A simple alternative for reducing Quicksort's memory consumption uses true recursion only on the smaller of the two sublists and tail recursion on the larger. This limits the additional storage of Quicksort to O(log n). The procedure quicksort in the preceding pseudocode would be rewritten as
An optimization of quicksort which is becoming widely used is introspective sort, often called introsort (Musser 1997). This starts with quicksort and switches to heapsort when the recursion depth exceeds a preset value. This overcomes the overhead of increasingly complex pivot selection techniques while ensuring O(n log n) worst-case performance. Musser reported that on a median-of-3 killer sequence of 100,000 elements running time was 1/200th that of median-of-3 quicksort. Musser also considered the effect of Sedgewick delayed small sorting on caches, reporting that it could double the number of cache misses but that its performance with double-ended queues was significantly better and it should be retained for template libraries, in part because the gain in other cases from doing the sorts immediately was not great.
The June 2000 SGI C++ Standard Template Library stl_algo.c implementation of unstable sort uses the Musser introsort approach with the recursion depth to switch to heapsort passed as a parameter, median-of-3 pivot selection and the Sedgewick final insertion sort pass. The element threshold for switching to the simple insertion sort was 16.
The C++ STL implementations generally significantly (several times as fast) outperform the C implementation because they are implemented to allow inlining, while the generic C equivalent must use function calls for comparisons. This advantage could be compensated for by using custom versions of the sort function, at the cost of losing the advantage of a totally generic library function.
The most direct competitor of Quicksort is heapsort. Heapsort is typically somewhat slower than Quicksort, but the worst-case running time is always O(n log n). Quicksort is usually faster, though there remains the chance of worst case performance except in the introsort variant. If it's known in advance that heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it. Heapsort also has the important advantage of using only constant additional space (heapsort is in-place), whereas even the best variant of Quicksort uses O(log n) space. However, heapsort requires efficient random access to be practical.
Quicksort is a space-optimized version of the binary tree sort. Instead of inserting items sequentially into an explicit tree, Quicksort organizes them concurrently into a tree that is implied by the recursive calls. The algorithms make exactly the same comparisons, but in a different order.
A simple selection algorithm, which chooses the kth smallest of a list of elements, works nearly the same as quicksort, except instead of recursing on both sublists, it only recurses on the sublist which contains the desired element. This small change changes the average complexity to linear or O(n) time. A variation on this algorithm brings the worst-case time down to O(n) (see selection algorithm for more information).
Conversely, once we know a worst-case O(n) selection algorithm is available, we can use it to find the ideal pivot (the median) at every step of Quicksort, producing a variant with worst-case O(n log n) running time. This variant is considerably slower on average, however.
Sample quicksort implementations in various languages, sorted by number of non-comment lines of code. Samples are written in a non-contrived style, characteristic of the respective languages.
The following Python implementation uses a more efficient partitioning strategy:
A more elegant implementation:
This is a straightforward implementation. It is certainly possible to come up with a more efficient one, but it will probably not be as clear as this one:
This is a straightforward implementation based off the AppleScript example. It is certainly possible to come up with a more efficient one, but it will probably not be as clear as this one:
The following implementation uses a 'fat pivot':
The following Java implementation uses a randomly selected pivot. Analogously to the Erlang solution above, a user-supplied
Example of QuickSort using delegates. Pass the array of objects in the constructor of the class. For comparing other type of objects rewrite your own compare function instead of CompareInt.
The following is an example of using QuickSort to sort an array of strings.
The call produces 3 words of stack per recursive call and is able
to take advantage of its knowledge of its own behavior. A more efficient
implementation would sort small ranges by a more efficient method. If an implementation obeying standard calling conventions were needed, a simple wrapper could be written for the initial call to the above function that saves the appropriate registers.
Lets first define quicksort predicate without using Tail_recursion
And now more elegant version, using Tail_recursion
This is an Article on Quicksort. Page Contains Information, Facts Details or Explanation Guide About Quicksort Choosing a better pivot
Other optimizations
function quicksort(a, left, right)
while right > left
select a pivot value a[pivotIndex]
pivotNewIndex := partition(a, left, right, pivotIndex)
if (pivotNewIndex-1) - left < right - (pivotNewIndex+1)
quicksort(a, left, pivotNewIndex-1)
left := pivotNewIndex+1
else
quicksort(a, pivotNewIndex+1, right)
right := pivotNewIndex-1Introsort optimization
Competitive sorting algorithms
Relationship to selection
Sample implementations
J
\n sort =: ]`(($:@: ((}.<:{.)#}.)) \n ,{., \n ($:@: ((}.> {.)#}.))) @. (*@#)\nJoy
\n DEFINE sort == [small][]\n [uncons [>] split]\n [[swap] dip cons concat] binrec .\n
Miranda
\n sort [] = [] \n sort (pivot:rest) = sort [ y | y <- rest; y <= pivot ] \n ++ [pivot] ++\n sort [ y | y <- rest; y > pivot ]\n
NGL
\n sort () == id\n sort pivot,,rest == ( self : rest[rest <= pivot] )\n , pivot , \n ( self : rest[rest > pivot] )\n
Haskell
The following Haskell code is almost self explanatory but can suffer from inefficiencies because it crawls through the list "rest" twice, once for each list comprehension. A smart implementation can perform optimizations to prevent this inefficiency, but these are not required by the language.\n sort :: (Ord a) => [a] -> [a]\n \n sort [] = []\n sort (pivot:rest) = sort [y | y <- rest, y < pivot]\n ++ [pivot] ++ \n sort [y | y <- rest, y >=pivot]\n
Erlang
The following Erlang code sorts lists of items of any type.\nqsort([]) -> [];\nqsort([Pivot|Rest]) ->\n qsort([ X || X <- Rest, X < Pivot]) ++ [Pivot] ++ qsort([ Y || Y <- Rest, Y >= Pivot]).\n
OCaml
\n''val sort : 'a list -> 'a list =
Common Lisp
\n(defun partition (fun array)\n (list (remove-if-not fun array) (remove-if fun array)))\n \n(defun sort (array)\n (if (null array) nil\n (let ((part (partition (lambda (x) (< x (car array))) (cdr array))))\n (append (sort (car part)) (cons (car array) (sort (cadr part)))))))\n
Ruby
\ndef sort(array)\n return [] if array.empty?\n pivot = array[0]\n left, right = array[1..-1].partition { |y| y <= pivot }.map { |a| sort(a) }\n left + [ pivot ] + right\nend\nC++
\n#include
Python
\ndef partition(array, begin, end, cmp):\n while begin < end:\n while begin < end:\n if cmp(array[begin], array[end]):\n (array[begin], array[end]) = (array[end], array[begin])\n break\n end -= 1\n while begin < end:\n if cmp(array[begin], array[end]):\n (array[begin], array[end]) = (array[end], array[begin])\n break\n begin += 1\n return begin\n\ndef sort(array, cmp=lambda x, y: x > y, begin=None, end=None):\n if begin is None: begin = 0\n if end is None: end = len(array)\n if begin < end:\n i = partition(array, begin, end-1, cmp)\n sort(array, cmp, begin, i)\n sort(array, cmp, i+1, end)\n
\n def qsort(L):\n if L == []: return []\n return qsort([x for x in L[1:] if x< L[0]]) + L[0:1] + \\\n qsort([x for x in L[1:] if x>=L[0]])\n
AppleScript
\n on sort( array, left, right )\n set i to left\n set j to right\n set v to item ( ( left + right ) div 2 ) of array -- pivot\n repeat while ( j > i )\n repeat while ( item i of array < v )\n set i to i + 1\n end repeat\n repeat while ( item j of array > v )\n set j to j - 1\n end repeat\n if ( not i > j ) then\n tell array to set { item i, item j } to { item j, item i } -- swap\n set i to i + 1\n set j to j - 1\n end if\n end repeat \n if ( left < j ) then sort( array, left, j )\n if ( right > i ) then sort( array, i, right )\n end sort\nAutoIt v3
\n Func sort( ByRef $array, $left, $right )\n $i = $left\n $j = $right\n $v = $array[Round( ( $left + $right ) / 2 ,0)]\n While ( $j > $i )\n While ($array[$i] < $v )\n $i = $i + 1\n WEnd\n While ( $array[$j] > $v )\n $j = $j - 1\n WEnd\n If ( NOT ($i > $j) ) then\n swap($array[$i], $array[$j])\n $i = $i + 1\n $j = $j - 1\n EndIf\n WEnd\n if ( $left < $j ) then sort( $array, $left, $j )\n if ( $right > $i ) then sort( $array, $i, $right )\n EndFunc\n
C
This implementation is limited to arrays of integers:\nvoid sort(int array[], int begin, int end) {\n if (end > begin) {\n int pivot = array[begin];\n int l = begin + 1;\n int r = end;\n while(l < r) {\n if (array[l] <= pivot) {\n l++;\n } else {\n r--;\n swap(array[l], array[r]); \n }\n }\n l--;\n swap(array[begin], array[l]);\n sort(array, begin, l);\n sort(array, r, end);\n }\n}\n\nvoid sort(int array[], int begin, int end) {\n int pivot = array[begin];\n int i = begin + 1, j = end, k = end;\n int t;\n\n while (i < j) {\n if (array[i] < pivot) i++;\n else if (array[i] > pivot) {\n j--; k--;\n t = array[i];\n array[i] = array[j];\n array[j] = array[k];\n array[k] = t; }\n else {\n j--;\n swap(array[i], array[j]);\n } }\n i--;\n swap(array[begin], array[i]); \n if (i - begin > 1)\n sort(array, begin, i);\n if (end - k > 1)\n sort(array, k, end); \n}\nJava
Comparator determines the partial ordering of array elements:\nimport java.util.Comparator;\nimport java.util.Random;\n\npublic class Quicksort {\n public static final Random RND = new Random(); \n private void swap(Object[] array, int i, int j) {\n Object tmp = array[i];\n array[i] = array[j];\n array[j] = tmp;\n }\n private int partition(Object[] array, int begin, int end, Comparator cmp) {\n int index = begin + RND.nextInt(end - begin + 1);\n Object pivot = array[index];\n swap(array, index, end); \n for (int i = index = begin; i < end; ++ i) {\n if (cmp.compare(array[i], pivot) <= 0) {\n swap(array, index++, i);\n } }\n swap(array, index, end); \n return (index);\n }\n private void qsort(Object[] array, int begin, int end, Comparator cmp) {\n if (end > begin) {\n int index = partition(array, begin, end, cmp);\n qsort(array, begin, index - 1, cmp);\n qsort(array, index + 1, end, cmp);\n } }\n public void sort(Object[] array, Comparator cmp) {\n qsort(array, 0, array.length - 1, cmp);\n} }\nC#
The following C# implementation uses a random pivot and is limited to integer arrays; for other value types, replace all instances of int[] with the appropriate type (for example, decimal[]). For object[] comparison, create a delegate to your custom object compare function and pass it as an added parameter to both methods:\nclass Quicksort {\n private void swap(int[] Array, int Left, int Right) {\n int temp = Array[Right];\n Array[Right] = Array[Left];\n Array[Left] = temp;\n }\n\n public void sort(int[] Array, int Left, int Right) {\n int LHold = Left;\n int RHold = Right;\n Random ObjRan = new Random();\n int Pivot = ObjRan.Next(Left,Right);\n swap(Array,Pivot,Left);\n Pivot = Left;\n Left++;\n\n while (Right >= Left) {\n if (Array[Left] >= Array[Pivot] && Array[Right] < Array[Pivot])\n swap(Array, Left, Right);\n else if (Array[Left] >= Array[Pivot])\n Right--;\n else if (Array[Right] < Array[Pivot])\n Left++;\n else {\n Right--;\n Left++;\n } } \n swap(Array, Pivot, Right);\n Pivot = Right; \n if (Pivot > LHold)\n sort(Array, LHold, Pivot);\n if (RHold > Pivot+1)\n sort(Array, Pivot+1, RHold);\n} }\n\nclass QuickSort {\n private delegate int CmpOp(object Left, object Right);\n private void swap(object[] Array, int Left, int Right, CmpOp Cmp) {\n object tempObj = Array[Left];\n Array[Left] = Array[Right];\n Array[Right] = tempObj;\n }\n private int CmpInt(object Left, object Right) {\n if ((int) Left < (int) Right)\n return -1;\n else \n return -2;\n }\n public QuickSort(object[] Array) {\n CmpOp Cmp = new CmpOp(CmpInt);\n Sort(Array, 0, Array.Length-1, Cmp); \n }\n private void Sort(object[] Array, int Left, int Right, CmpOp Cmp) {\n int LHold = Left;\n int RHold = Right;\n Random ObjRan = new Random();\n int Pivot = ObjRan.Next(Left,Right);\n swap(Array, Pivot, Left, Cmp);\n Pivot = Left;\n Left++;\n\n while (Right >= Left) {\n if (Cmp(Array[Left], Array[Pivot])!= -1 && Cmp(Array[Right], ArrObj[Pivot])== -1)\n swap(Array, Left, Right, Cmp);\n else if (Cmp(Array[Left], Array[Pivot]) != -1)\n Right--;\n else if (Cmp(Array[Right],Array[Pivot]) == -1)\n Left++;\n else {\n Right--;\n Left++;\n } } \n swap(Array, Pivot, Right, Cmp);\n Pivot = Right;\n\n if (Pivot > LHold)\n Sort(Array, LHold, Pivot, Cmp);\n if (RHold > Pivot+1)\n Sort(Array, Pivot+1,RHold, Cmp);\n} }\n\nclass Quicksort {\n private void quickSwap(string[] Array, int Left, int Right) \n {\n string Temp = Array[Right];\n Array[Right] = Array[Left];\n Array[Left] = Temp;\n }\n\n public void quickSort(string[] Array, int Left, int Right) \n {\n int LHold = Left;\n int RHold = Right;\n Random ObjRan = new Random();\n int Pivot = ObjRan.Next(Left, Right);\n quickSwap(Array, Pivot, Left);\n Pivot = Left;\n Left++;\n\n while (Right >= Left) \n {\n int cmpLeftVal = Array[Left].CompareTo(Array[Pivot]);\n int cmpRightVal = Array[Right].CompareTo(Array[Pivot]);\n\n if ((cmpLeftVal >= 0) && (cmpRightVal < 0))\n {\n quickSwap(Array, Left, Right);\n }\n else \n {\n if (cmpLeftVal >= 0)\n {\n Right--;\n }\n else \n {\n if (cmpRightVal < 0)\n {\n Left++;\n }\n else \n {\n Right--;\n Left++;\n }\n }\n }\n } \n quickSwap(Array, Pivot, Right);\n Pivot = Right; \n if (Pivot > LHold)\n {\n quickSort(Array, LHold, Pivot);\n }\n if (RHold > Pivot + 1)\n {\n quickSort(Array, Pivot + 1, RHold);\n }\n }\n}\nARM assembly language
This ARM RISC assembly language implementation for sorting an array of
32-bit integers demonstrates how well quicksort takes advantage of the
register model and capabilities of a typical machine instruction set
(note that this particular implementation does not meet standard
calling conventions and may use more than O(log n) space):\n qsort: @ Takes three parameters:\n @ a: Pointer to base of array a to be sorted (arrives in r0)\n @ left: First of the range of indexes to sort (arrives in r1)\n @ right: One past last of range of indexes to sort (arrives in r2)\n @ This function destroys: r1, r2, r3, r5, r7\n stmfd sp!, {r4, r6, lr} @ Save r4 and r6 for caller\n mov r6, r2 @ r6 <- right\n qsort_tailcall_entry:\n sub r7, r6, r1 @ If right - left <= 1 (already sorted),\n cmp r7, #1\n ldmlefd sp!, {r4, r6, pc} @ Return, restoring r4 and r6\n ldr r7, [r0, r1, asl #2] @ r7 <- a[left], gets pivot element\n add r2, r1, #1 @ l <- left + 1\n mov r4, r6 @ r <- right\n partition_loop:\n ldr r3, [r0, r2, asl #2] @ r3 <- a[l]\n cmp r3, r7 @ If a[l] <= pivot_element,\n addle r2, r2, #1 @ ... increment l, and\n ble partition_test @ ... continue to next iteration.\n sub r4, r4, #1 @ Otherwise, decrement r,\n ldr r5, [r0, r4, asl #2] @ ... and swap a[l] and a[r].\n str r5, [r0, r2, asl #2]\n str r3, [r0, r4, asl #2]\n partition_test:\n cmp r2, r4 @ If l < r,\n blt partition_loop @ ... continue iterating.\n partition_finish:\n sub r2, r2, #1 @ Decrement l\n ldr r3, [r0, r2, asl #2] @ Swap a[l] and pivot\n str r3, [r0, r1, asl #2]\n str r7, [r0, r2, asl #2]\n bl qsort @ Call self recursively on left part,\n @ with args a (r0), left (r1), r (r2),\n @ also preserves r4 and r6\n mov r1, r4\n b qsort_tailcall_entry @ Tail-call self on right part,\n @ with args a (r0), l (r1), right (r6)\nProlog
\nappend([], L, L).\nappend([H | L1], L2, [H | Result]) :- append(L1, L2, Result).\n\npartition([], _, [], []).\npartition([H | T], X, [H | Left], Right) :- H =< X, partition(T, X, Left, Right).\npartition([H | T], X, Left, [H | Right]) :- H > X, partition(T, X, Left, Right).\n\nqsort([],[]).\nqsort([H | Tail], Sorted) :-\n partition(Tail, H, Left, Right),\n qsort(Left, SortedLeft),\n qsort(Right, SortedRight),\n append(SortedLeft, [X | SortedRight], Sorted).\n
\nqsort(List, Sorted) :- qsort(List, [], Sorted).\n\nqsort([], Acc, Acc).\nqsort([H | Tail], Acc, Sorted) :-\n partition(Tail, H, Left, Right),\n qsort(Right, Acc, SortedRight),\n qsort(Left, [H | SortedRight], Sorted).\n
External links
References
