Selection Sorting a Linked List

I’ve always liked the selection sort algorithm. I’m not sure why, I think it’s the frank simplicity of it. Unfortunately, it is amongst the slowest of sorting algorithms, and is firmly in the realm of theoretic interest over practical use. Yet still it is studied because it’s useful instruction for how to construct a more complicated algorithm from basic array operations.

 void selectionsort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i+1; j < n; j++) {
            if (a[min] > a[j])
                min = j;
        }
        int t = a[i]; a[i] = a[min]; a[min] = t;  
    }
}

When applied to linked lists, it’s implementation becomes more interesting. In an array, we were able to efficiently swap the minimum element into its sorted position. Swapping values between nodes in a linked list does not make sense if the object becoming move is larger than the size of a pointer, as you can rearrange a linked list solely through pointer swaps (as you should). So how do you efficiently implement selection sort on a linked list?

A Simple Plan

The basic plan for selection sort on a linked list is the same as it was with arrays, but for a few key differences. In the array version above, we search for the minimum in the unsorted portion, and build the sorted portion from left to right. In the linked list version, we will be searching for the maximum value, and build the sorted list right to left.

The fundamental process is as follows: designate an empty list as the sorted list, while the input (unsorted) list still has values, find the node with the maximum value in the unsorted list, removed it, and add it to the front of the sorted list. when their are no unsorted nodes left, return the sorted list.

 struct node {
    int info;
    node* next;
    node(int i = 0, node* n = nullptr) : info(i), next(n) { }
};
typedef node* nodeptr;

nodeptr selectionsort(nodeptr head) {
    nodeptr sorted = nullptr;
    while (head != nullptr) {
        nodeptr max = new node(INT_MIN, nullptr);
        head = removeMax(head, &max);
        max->next = sorted;
        sorted = max;
    }
    return sorted;
}

Simple enough right? The tricky bit lies in finding and then removing the next node to be added from the unsorted list.

A Devil in The Details

Taken separately, the two steps are simple, but we want to do both operations in just one traversal of the list if possible. In our case, we must traverse the entire list to ensure we are finding the right node to be removed. If we must then traverse the list again to remove the node once it has been found, as in the code below, the result will be not just very inefficient, but terribly slow for all but the smallest of lists.

//simple
nodeptr findMax(nodeptr h) {
    nodeptr max = h;
    for (nodeptr t = h; t != nullptr; t = t->next) {
        if (max->info < t->info)
            max = t;
    }
    return max;
}
//simple
nodeptr removeNode(nodeptr h, nodeptr toRemove) {
    if (h == toRemove) {
        h = h->next;
        return h;
    }
    nodeptr x = h;
    while (x->next && x->next != toRemove)
        x = x->next;
    if (x->next == toRemove) {
        x->next = toRemove->next;
    }
    return h;
}
//inefficient
nodeptr removeMax(nodeptr* h) {
    nodeptr max = findMax(*h);
    *h = removeNode(*h, max);
    return max;
}

nodeptr selectionsort(nodeptr head) {
    nodeptr sorted = nullptr;
    while (head != nullptr) {
        nodeptr max = removeMax(&head);
        max->next = sorted;
        sorted = max;
    }
    return sorted;
}

So how do we prevent this already poor performing algorithm from performing any worse? It’s actually simpler than you might think.

Recursion To the Rescue

Since finding the maximum value in a list necessitates a traversal of the entire unsorted portion of the list, we can use recursion to help us along with this by using the pre-order portion of the procedure to find the maximum during the traversal. Then, when the stack unwinds we will encounter the node during the post-order traversal back up the list and so use that opportunity to remove it from the list. By passing the maximum value by reference, we can return the node to be added to the sorted list, while preserving the unsorted portion to be further processed.

 nodeptr removeMax(nodeptr h, nodeptr* max) {
    if (h == nullptr) {
        return h;
    }
    if ((*max)->info < h->info) {
        *max = h;
    }
    h->next = removeMax(h->next, max);
    if (h == *max) {
        nodeptr t = h;
        h = h->next;
        t->next = nullptr;
        max = &t;
    }
    return h;
}

nodeptr selectionsort(nodeptr head) {
    nodeptr sorted = nullptr;
    while (head != nullptr) {
        //nodeptr max = removeMax(&head);
        nodeptr max = new node(-1337, nullptr);
        head = removeMax(head, &max);
        max->next = sorted;
        sorted = max;
    }
    return sorted;
}

Now having seen both versions, it should be noted that neither approach is less lines of code than, nor more straight forward than to just implement a merge sort for the linked list instead, which in all cases will do the job both better and faster.

Until Next Time, Happy Hacking!