目录

Sorting


Data Structure

Sorting

QuickSort

Idea

$Process\ after\ dividing \rightarrow$(like MergeSort in thought)

Divide the sequence into two subsequences $$ S=S_L+S_R $$ Downsizing $$ max(|S_L|,|S_R|) < n $$ Be independent of each other $$ max(S_L) \leq min(S_R) $$ After the subsequences are sorted recursively, the original sequence is ordered naturally $$ Sorted(S)=Sorted(S_L)+Sorted(S_R) $$ A trivial solution: With a single element, is itself a solution$\rightarrow Recursive\ base\newline$

Problem

How to delimit subsequence?

Pivot

None of the left/right elements are bigger/smaller than it

$\rightarrow$ With the pivot as the boundary, the partition of the sequence will be realized naturally $$ [l0,hi)=[lo,mi)+[mi]+(mi,hi) $$ Key point: Construct Pivot!

1
2
3
4
5
6
7
template <typename T> 
void Vector<T>::quickSort ( Rank lo, Rank hi ) { //0 <= lo < hi <= size
   if ( hi - lo < 2 ) return; //Recursive base
   Rank mi = partition ( lo, hi ); //Key point
   quickSort ( lo, mi ); 
   quickSort ( mi + 1, hi ); 
}

Partition

Difficulty: pivot may not exist in the original sequence

Good news: The pivot must be in place

In an ordered sequence, all elements are pivots and vice versa

Construct Pivot

Analysis

Unstable $$ \begin{align} &Example:\newline &Before:[…5_a…5_b…]\newline &After:[…5_b…5_a…]\newline \end{align} $$

In-place

$O(1)$ nice!

In-time
Best case

The pivot is always (near) in the center

$$ T(n)=2T(\frac{n-1}{2})+O(n)\newline T(n)=O(n\log n)\newline $$

Worst case

Each division is extremely uneven $$ T(n)=T(n-1)+T(0)+O(n)\newline T(n)=O(n^2)\newline $$

Reduce the worst case probability

Random selection

Middle of three(Randomly select three elements in the sequence and take the middle value)

But it can only reduce the probability, not eliminate

Average

$O(n\log n)\newline$

Take a uniformly independent distributed sequence as an example $$ \begin{align} &order\ the\ rank\ of\ the\ pivot\ is\ k\newline &\rightarrow probability:\frac{1}{n}\newline\newline &then\ T(n)=(n+1)+(\frac{1}{n}) * \sum_{k=0}^{n-1}[T(k)+T(n-k-1)]\newline &It’s\ easy\ to\ see\ that\rightarrow T(k)=T(n-k-1)\newline &T(n)=(n+1)+(\frac{2}{n}) * \sum_{k=0}^{n-1}T(k)\newline &n*T(n)=n * (n+1)+2 * \sum_{k=0}^{n-1}T(k)\ (1)\newline\newline &make\ n = n - 1\newline &(n-1) * T(n-1)=(n-1) * n+2 * \sum_{k=0}^{n-2}T(k)\ (2)\newline\newline &Then\ (1)-(2)\newline &n * T(n) - (n - 1) * T(n - 1) = 2 * n + 2 * T(n-1)\newline &n * T(n) - (n + 1) * T(n - 1)=2 * n\newline &\frac {T(n)}{n + 1} = \frac {T(n - 1)}{n}+\frac {2}{n+1}\newline &=\frac {2}{n+1} + \frac {2}{n} + \frac {2}{n-1} + … + \frac {2}{2} + \frac{T(0)}{1}\newline &=(2\ln2) * \log n \approx 1.39 * \log n\newline \end{align} $$

Algorithm of optimization(LGU)

$$ S=[l0]+L(lo,mi]+G(mi,k)+U[l,hi]\newline L < pivot\leq G\newline $$

code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
//pivot construction algorithm;
//by adjusting the element position to construct the pivot of the interval [lo, hi] and return its rank
template <typename T>
Rank Vector<T>::partition ( Rank lo, Rank hi ) { //LGU
   swap ( _elem[lo], _elem[ lo + rand() % ( hi - lo ) ] ); //Swap any element with the first element
   T pivot = _elem[lo]; //Take the first element as the candidate pivot 
    					//-- after the above exchange, equivalent to random selection
   int mi = lo;
   //++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   //   ---- L < [lo] ----- ] ----- [lo] <= G --- ] [ ----- Unknown -------
   // X x . . . . . . . . . x . . . . . . . . . . . x . . . . . . . . . . x X
   // |                     |                       |                       |
   // lo (pivot candidate)  mi                      k                       hi
   //++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
   for ( Rank k = lo + 1; k < hi; k++ ) //Scan from left to right
      if ( _elem[k] < pivot ) //If the current element _elem[k] is less than pivot
         swap ( _elem[++mi], _elem[k] ); //After swapping _elem[k] to the original mi, 
    									 //the L subsequence expands to the right
    
   //++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   //   --------- L < [lo] ---------- ] ------------- [lo] <= G ----------]
   // X x . . . . . . . . . . . . . . x . . . . . . . . . . . . . . . . . x X
   // |                               |                                     |
   // lo (pivot candidate)            mi                                    hi
   //++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
   swap ( _elem[lo], _elem[mi] ); //Candidate pivot homing
   return mi; //return its rank
}
In-place

$O(1)\newline$

In-time

$O(n)\newline$

Selection

The time cost of sorting is too high and you need to find other methods

K-selection

In any set of comparably sized elements, how do you go from small to large to find the elements of order k?

Example

Excel: large(range, rank)

Median

The length of the sorted sequence S is n$(range\ from\ 0\ to\ n - 1)$, the element whose order is $\lfloor \frac {n}{2} \rfloor$ is called median

Example

Excel: median(range)

Mode

In an unordered sequence, if more than half of the elements are m, m is called the mode

Example $$ \begin{align} &[3,5,2,3,3]\rightarrow mode\ is\ 3\newline &but\ in\ [3,5,2,3,3,0]\rightarrow no\ mode\newline \end{align} $$

mode algorithm

Trivial algorithm: sort + traversal

but if we want that $\rightarrow T(n) \leq O(n)\ and\ S(n)\leq O(1)?\newline$

Necessary condition: If mode exists, then the mode must also be the median

The mode, if it exists, must be frequent

$Process\ after\ reducing\newline$

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
template <typename T> 
T majEleCandidate ( Vector<T> A ) { 
   T maj;
   for ( int c = 0, i = 0; i < A.size(); i++ )
      if ( 0 == c ) { 
         maj = A[i]; 
          c = 1; 
      } else 
         maj == A[i] ? c++ : c--; 
   return maj;
}

General purpose algorithm

quickSelect

$Process\ after\ reducing\newline$

Partition

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
template <typename T> 
void quickSelect ( Vector<T> & A, Rank k ) {
   for ( Rank lo = 0, hi = A.size() - 1; lo < hi; ) {
      Rank i = lo, j = hi; 
      T pivot = A[lo];
      while ( i < j ) { //O(hi - lo + 1) = O(n)
         while ( ( i < j ) && ( pivot <= A[j] ) ) j--; A[i] = A[j];
         while ( ( i < j ) && ( A[i] <= pivot ) ) i++; A[j] = A[i];
      } //assert: quit with i == j
      A[i] = pivot; // A[0,i) <= A[i] <= A(i, n)
      if ( k <= i ) hi = i - 1;
      if ( i <= k ) lo = i + 1;
   } //A[k] is now a pivot
}
Worst case

$T(n)=O(n)+T(\frac {n}{Q})+T(\frac{3n}{4})\newline$

$if\ we\ want\ that\ T(n)=O(n)\newline$

$we\ should\ make\ \frac{n}{Q}+\frac{3n}{4}<n \newline$

$so\ \frac{1}{Q}+\frac{3}{4}<1\newline$

$make\ Q=5\newline$

$T(n)=cn+T(\frac{n}{5})+T(\frac{3n}{4})\newline$

$then\ T(n)=O(20cn)=O(n)\newline$

linearSelect

Let Q be a small constant

  1. if ( n = |A| < Q ) return trivialSelect( A, k )

  2. else divide A evenly into n/Q subsequences (each of size Q)

  3. Sort each subsequence and determine n/Q medians //e.g. by insertionsort

  4. Call linearSelect() to find M, median of the medians //by recursion

  5. Let L/E/G = { x </=/> M | x $\in$ A }

1
2
3
	if ( k <= |L| ) return linearSelect( L, k ) 
  	if ( k <= |L| + |E| ) return M 
  	return linearSelect( G, k - |L| - |E| )
In-time

$T(n)=O(n)\newline$

ShellSort

Donald L.Shell,1959

Idea

Think of the whole sequence as a matrix, with each column sorted individually$\rightarrow(w-sorting)\newline$

step sequence

An inverted sequence of the width of each matrix

$set(step)={w_1=1,w_2,w_3,…w_k,…}\newline$

diminishing increment

Algorithm

$Call-by-rank\newline$

$The\ width\ of\ the\ matrix\ is\ w,then\ the\ elements\ in\ the\ i^{th}\ column\ is\ a[i+kw],0\leq k <\frac {n}{w}\newline$

Internal sort $\rightarrow$Input sensitive algorithm

$\rightarrow insertionsort(The\ running\ time\ depends\ on\ the\ total\ number\ of\ inversions)\newline$

The biggest factor

$\rightarrow set(step)={w_1=1,w_2,w_3,…w_k,…}\newline$

Example

$set(step)_{shell}={1,2,4,8,…,2^k,…}\newline$

worst-case: $O(n\log n)\newline$

best-case: $O(n)\newline$

code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
template <typename T>
void Vector<T>::shellSort ( Rank lo, Rank hi ) { //0 <= lo < hi <= size <= 2^30
   /*DSA*/ printf ( "\tSHELLsort [%3d, %3d)\n", lo, hi );
   for ( Rank d = 0x3FFFFFFF; 0 < d; d >>= 1 ) //PS Sequence: { 1, 3, 7, 15, ..., 1073741823 }
      for ( Rank j = lo + d; j < hi; j++ ) { //for each j in [lo+d, hi)
         T x = _elem[j]; Rank i = j - d;
         while ( lo <= i && _elem[i] > x )
            { _elem[i + d] = _elem[i]; i -= d; }
         _elem[i + d] = x; //insert [j] into its subsequence
      }
}

Code by Python(use numpy)

1
import numpy as np

SelectionSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
def SelectionSort(x: np.ndarray):
    length = x.size
    CountOfCompare, CountOfMove = 0, 0

    for i in range(length):
        minPosition = i

        for j in range(i + 1, length):
            minPosition = j if x[j] < x[minPosition] else minPosition
            CountOfCompare = CountOfCompare + 1

        if minPosition != i:
            CountOfMove = CountOfMove + 1
            temp = x[i]
            x[i] = x[minPosition]
            x[minPosition] = temp

    return CountOfCompare, CountOfMove

BubbleSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
def BubbleSort(x: np.ndarray):
    length = x.size
    CountOfCompare, CountOfMove = 0, 0

    for i in range(length - 1, 0, -1):
        flag = True

        for j in range(i):
            CountOfCompare = CountOfCompare + 1
            if x[j] > x[j + 1]:
                temp = x[j]
                x[j] = x[j + 1]
                x[j + 1] = temp
                CountOfMove = CountOfMove + 1
                flag = False

        if flag:
            break

    return CountOfCompare, CountOfMove

MergeSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
def MergeSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0

    def mergeSort(arr: np.ndarray, lo: int, hi: int):
        if hi - lo < 2: return
        mi = (lo + hi) >> 1
        mergeSort(arr, lo, mi)
        mergeSort(arr, mi, hi)
        merge(arr, lo, mi, hi)

    def merge(Arr: np.ndarray, low: int, middle: int, high: int):
        nonlocal CountOfCompare, CountOfMove
        i, j, k = 0, 0, 0
        B = Arr[low:middle].copy()
        lb, lc = middle - low, high - middle

        while j < lb and k < lc:
            CountOfCompare = CountOfCompare + 1
            CountOfMove = CountOfMove + 1
            if B[j] <= Arr[middle + k]:
                Arr[low + i] = B[j]
                j = j + 1
            else:
                Arr[low + i] = Arr[middle + k]
                k = k + 1
            i = i + 1

        while j < lb:
            CountOfMove = CountOfMove + 1
            Arr[low + i] = B[j]
            i = i + 1
            j = j + 1

    mergeSort(x, 0, x.size)
    return CountOfCompare, CountOfMove

BinaryInsertSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
def BinaryInsertSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0

    def Search(arr: np.ndarray, Val, lo: int, hi: int):
        nonlocal CountOfCompare

        while lo < hi:
            mi = (lo + hi) >> 1
            CountOfCompare = CountOfCompare + 1
            if Val < arr[mi]:
                hi = mi
            else:
                lo = mi + 1

        return lo - 1

    def Insert(Arr: np.ndarray, val, Rank: int, Length: int):
        nonlocal CountOfMove
        for _ in range(Length, Rank, -1):
            CountOfMove = CountOfMove + 1
            Arr[_] = Arr[_ - 1]
        Arr[Rank + 1] = val
        CountOfMove = CountOfMove + 1

    for i in range(x.size):
        Local = Search(x, x[i], 0, i)
        Insert(x, x[i], Local, i)

    return CountOfCompare, CountOfMove

InsertSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
def InsertSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0

    for i in range(x.size):
        key = x[i]

        j = i - 1
        while j >= 0:
            CountOfCompare = CountOfCompare + 1
            if x[j] > key:
                j = j - 1
            else:
                break
        for _ in range(i, j, -1):
            CountOfMove = CountOfMove + 1
            x[_] = x[_ - 1]

        x[j + 1] = key
        CountOfMove = CountOfMove + 1

    return CountOfCompare, CountOfMove

HeapSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
def HeapSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0
    length = x.size

    def BuildHeap(Arr: np.ndarray):
        nonlocal length

        for i in range(length // 2 - 1, -1, -1):
            heapify(Arr, i, length)

    def heapify(arr: np.ndarray, i: int, n: int):
        nonlocal CountOfCompare, CountOfMove

        largest = i
        leftChild = i * 2 + 1
        rightChild = i * 2 + 2

        CountOfCompare = CountOfCompare + 2
        largest = leftChild if leftChild < n and arr[largest] < arr[leftChild] else largest
        largest = rightChild if rightChild < n and arr[largest] < arr[rightChild] else largest

        CountOfCompare = CountOfCompare + 1
        if largest != i:
            CountOfMove = CountOfMove + 1
            temp = arr[i]
            arr[i] = arr[largest]
            arr[largest] = temp
            heapify(arr, largest, n)

    def heap_sort(Array: np.ndarray):
        nonlocal length, CountOfMove
        BuildHeap(Array)

        for i in range(length - 1, 0, -1):
            CountOfMove = CountOfMove + 1
            temp = Array[0]
            Array[0] = Array[i]
            Array[i] = temp
            heapify(Array, 0, i)

    heap_sort(x)
    return CountOfCompare, CountOfMove

ShellSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
def ShellSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0

    n = x.size
    inc = n // 2

    while inc > 0:
        for i in range(inc, n):
            key = x[i]
            j = i
            while j >= inc and key < x[j - inc]:
                CountOfCompare = CountOfCompare + 1
                CountOfMove = CountOfMove + 1
                x[j] = x[j - inc]
                j = j - inc
            CountOfMove = CountOfMove + 1
            x[j] = key
        inc = inc // 2
    return CountOfCompare, CountOfMove

QuickSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
def QuickSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0

    def qsort(arr: np.ndarray, lo: int, hi: int):
        if lo < hi:
            pi = partition(arr, lo, hi)

            qsort(arr, lo, pi - 1)
            qsort(arr, pi + 1, hi)

    def partition(Arr: np.ndarray, low: int, high: int):
        nonlocal CountOfCompare, CountOfMove
        i = low - 1
        pivot = Arr[high]

        for j in range(low, high):
            CountOfCompare = CountOfCompare + 1
            if Arr[j] <= pivot:
                i = i + 1
                temp = Arr[i]
                Arr[i] = Arr[j]
                Arr[j] = temp
                CountOfMove = CountOfMove + 1

        CountOfMove = CountOfMove + 1
        temp = Arr[i + 1]
        Arr[i + 1] = Arr[high]
        Arr[high] = temp

        return i + 1

    qsort(x, 0, x.size - 1)
    return CountOfCompare, CountOfMove

TwoWay_InsertionSort

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
def TwoWay_InsertionSort(x: np.ndarray):
    CountOfCompare, CountOfMove = 0, 0
    n = x.size
    if n == 0:
        return CountOfCompare, CountOfMove
    first, final = 0, 0
    temp = np.array(np.zeros(n))
    temp[0] = x[0]

    for i in range(1, n):
        if x[i] < temp[first]:
            CountOfCompare = CountOfCompare + 1
            CountOfMove = CountOfMove + 1
            first = (first - 1 + n) % n
            temp[first] = x[i]
        elif x[i] > temp[final]:
            CountOfCompare = CountOfCompare + 2
            CountOfMove = CountOfMove + 1
            final = (final + 1 + n) % n
            temp[final] = x[i]
        else:
            CountOfCompare = CountOfCompare + 3
            k = (final + 1 + n) % n

            while temp[(k - 1 + n) % n] > x[i]:
                CountOfCompare = CountOfCompare + 1
                CountOfMove = CountOfMove + 1
                temp[(k + n) % n] = temp[(k - 1 + n) % n]
                k = (k - 1 + n) % n

            CountOfMove = CountOfMove + 1
            temp[(k + n) % n] = x[i]
            final = (final + 1 + n) % n

    for k in range(n):
        CountOfMove = CountOfMove + 1
        x[k] = temp[(first + k) % n]

    return CountOfCompare, CountOfMove