Top-Down Deletion for Red/Black Trees
Depending on which resource you first encounter an algorithmic description of Red/Black trees in, the code will likely have descended from one of two lineages: Those which descend from the original Sedgewick/Guibas paper[1] and Sedgewicks venerable "Algorithms" book, and those which descend from the listing in "Introduction to Algorithms" by Cormen et. Al. *(CLRS)[2]. Prior to 2008, this meant either top down without parent pointers, or bottom up with parent pointers. Both lineages model 2-3-4 trees with red links slanting in any direction. For the most part, both lineages were presented as iterative algorithms, with the exception of the implementation presented in the 3rd edition of "Algorithms", which was recursive.

A Red/Black Tree
With few exceptions, every production-grade implementation of Red/Black Trees, such as the versions found in the Linux Kernel and in the C++ STL, are derived from the CLRS lineage. There are a few different reasons which contribute to the CLRS implementation being dominant. The first and most cited reason is that number of rotation performed for each insertion and deletion is bounded in the bottom up algorithms, requiring at most one rotation for insertion, and at most 2 for deletion. Other methods may require up to lgN rotations like AVL trees do. Another frequently cited reason for the use of parent pointers applies to all BST's in that they allow for the implementation of iterator objects which use O(1) space instead of the O(height(T)) needed for storing their path for backtracking. And lastly, albeit a somewhat less important reason is that while there was an (incomplete) description of the algorithm in the 1978 paper, none of Sedgewicks books included a deletion algorithm!
This last issue was finally addressed (at least partially) In 2008 when Sedgewick revived an old idea first leveraged by Arne Anderson to enforce a constraint on the direction which red links leaned[4]. The result was Left Leaning Red/Black trees, which replaced the original Red/Black tree in the fourth edition of his book "Algorithms". And unlike the first three editions of the text, this one included a full implementation of the delete algorithm.
I said it was partially addressed because while it is a top down algorithm, as presented it unfortunately only works for the 2-3 tree based variant of the left-leaning trees. It is also not a "true" top-down algorithm as it still requires rotations on the way back up the tree. Something is better than nothing however, and while not exactly the same as the original, the algorithm from the 2008 paper does follow the same general idea originally proposed in the 1978 paper and later reiterated as an exercise of the 3rd edition of "Algorithms".
After alot of research, frustration, and false starts, it was ultimately the 2008 paper which would act as a sort of "rosetta stone" between the high level description presented as an exercise in the 3rd edition book, and the not-so-clear pseudocode of the 1978 paper, and enabled me to implement the single pass top dop algorithm I'm going to present in this article. So without further ado, let's get to it.
The Driving Idea
Top down deletion relies on a simple but clever premise which Sedgewick described in the original 1978 paper. The premise arises naturally from a few observations about the nature of both Red/Black trees and the wider family of binary search tree's.
1) We know that deleting a leaf node from a binary search tree requires no additional work besides removing the leaf.
2) We know that deleting a red node from a red/black tree does not effect the black height of the tree and so does not leave the tree in an unbalanced state.
It then holds that if we can reduce every deletion to such a case of deleting a red leaf then we should be able to delete any value from the tree in a single top-down pass. We know from the fundamental theorem of rotations in binary search trees that any BST can be converted to any other BST through the careful application of rotations, so it would follow that we should be able to transform a red black tree through a combination of rotations and color flips as well.
An illustration of pushing redness down the search path, as depicted in Algorithms[4].
This approach was repeated as two exercises in the 3rd edition of "Algorithms"[6], where the reader was first encouraged to "develop an implementation of the search procedure that through the use of rotations and colorflips ensured that the current node or one of it's children is red." We accomplish this by "pushing" a red node down the tree from the root towards the leaf through a series of color flips and rotations, using what is essentially the reverse logic of insertion. The reader is then encourage to use this search procedure to implement deletion by ensuring that the node being removed from the tree was always red.
Armed with a high level idea of the algorithm, lets take a closer look at top down deletion in BST's to see how we can apply the aformentioned idea.
Top-Down BST Deletion
What's great about this method, is that it is almost identical to normal deletion in a binary search tree. So long as we maintain the invariant that the node to be deleted is a red leaf, everything proceeds identically to a normal top down deletion. I would be remiss if i didnt say that "so long as..." is doing an outsized amount of work, but we'll cover it all and in the end its actually not so bad.
To help maintain our invariant, we begin by coloring the root node red.
void erase(K data) {
if (empty())
return;
if (isBlack(root->left) && isBlack(root->right)) {
root->color = red;
}
root = eraseR(root, data);
if (root != nullptr) root->color = black;
}
Since we want to maintain this invariant at all times, it is performed as soon as we know the current root is not null. And before we do any comparisons. Only after the path is prepared do we continue searching down the tree for the desired value.
link eraseR(link h, K key) {
if (h == nullptr)
return h;
h = pushRedDown(h, key);
if (key < h->info) {
h->left = eraseR(h->left, key);
} else if (key > h->info) {
h->right = eraseR(h->right, key);
} else {
if (h->right == nullptr) {
link t = h;
h = h->left;
delete t;
} else {
link tmp = min(h->right);
h->info = tmp->info;
h->right = eraseR(h->right, h->info);
}
}
return balance(h);
}
When we find the value we want to delete from the tree, we proceed one of two ways based on it's position in the greater tree. If the key we're looking for also happens to be a leaf node, we simply remove it and were done, safe in the knowledge that it was a red leaf. If the key we want is at an internal node or has one child, we replace it's value by that of it's in-order successor - not the whole node mind you, just the data - and then continue moving down the tree to finish by deleting the node we just copied the replacement data from.
The call to balance() at the end seems to indicate I was lying about this being a truly top down algorithm: I assure you that in the rare cases any of the cases match, it is only at the very bottom of the path and immediately related to the node just removed.
And now to address the elephant in the room...
Pushing Red Nodes "Down"
This is where all the "magic" happens in top down balancing. We have 2 main "cases" we need to handle, with each case having left and right mirrors. When we prepare for insertion, we are predominantly concerned with the color of the current-nodes parent-nodes sibling node. Take a moment to re-read that because its important.
When we prepare the path for deletion we are concerned with current nodes sibling color. It's critical to realize that when we call pushRedDown(), we're passing it the parent of what will become the current node. Remember: we're preparing the path for the future to make deletion easier. To help keep clear which node is which and their respective roles in the tree, we assign aliases so the target node - the node that will become the "current" node is labeled x, and it's sibling is labeled s.
link pushRedDown(link p, K key) {
link x, s;
if (key < p->info) {
x = p->left;
s = p->right;
} else {
x = p->right;
s = p->left;
}
if (isBlack(x) && isRed(s)) {
p = (key < p->info) ? rotL(p):rotR(p);
}
x = (key < p->info) ? p->left:p->right;
if (x && isBlack(x) && isBlack(x->left) && isBlack(x->right)) {
p = (key < p->info) ? pushLeft(p):pushRight(p);
}
return p;
}
With everything labeled we can start our balancing preparations. The first case is the "simple" case: The target node is black, and it's sibling is red. This situation is resolved with a single rotation towards the direction of traversal. The rotations used are the same as for insertion, but the colorFlip code has been slightly modified.
link rotL(link h) {
link x = h->right; h->right = x->left; x->left = h;
x->color = h->color;
h->color = red;
return x;
}
link rotR(link h) {
link x = h->left; h->left = x->right; x->right = h;
x->color = h->color;
h->color = red;
return x;
}
link colorFlip(link h) {
h->color = !h->color;
if (h->left)
h->left->color = !h->left->color;
if (h->right)
h->right->color = !h->right->color;
return h;
}
After performing this rotation we need to re-establish which node is which before testing for the second case. Case 2 occurs If the sibling was not red or the rotation did not fully resolve our issue. We check to see if the target node is a 2-node, meaning its colored black and so are its children.
link pushRight(link h) {
h = colorFlip(h);
if (h->left) {
if (isRed(h->left) && isRed(h->left->right)) {
h->left = rotL(h->left);
h = rotR(h);
h = colorFlip(h);
} else if (isRed(h->left) && isRed(h->left->left)) {
h = rotR(h);
h = colorFlip(h);
}
}
return h;
}
link pushLeft(link h) {
h = colorFlip(h);
if (h->right) {
if (isRed(h->right) && isRed(h->right->left)) {
h->right = rotR(h->right);
h = rotL(h);
h = colorFlip(h);
} else if (isRed(h->right) && isRed(h->right->right)) {
h = rotL(h);
h = colorFlip(h);
}
}
return h;
}
If that code looks familiar, its because they are the same cases handled by insertion, with the additional steps of performing colorFlips both before and after the rotations. That's it, really. Speaking of which, we do have to add one more special case to the balance() routine which may be needed to fix the very bottom of the bath after removing the leaf. That is the last case at the bottom, which checks if both children are red and the right grandchild is also red.
link balance(link h) {
if (isRed(h->left) && isRed(h->right)) {
h = colorFlip(h);
}
if (isRed(h->right)) {
if (isRed(h->right->left)) {
h->right = rotR(h->right);
h = rotL(h);
} else if (isRed(h->right->right)) {
h = rotL(h);
}
} else if (isRed(h->left)) {
if (isRed(h->left->right)) {
h->left = rotL(h->left);
h = rotR(h);
} else if (isRed(h->left->left)) {
h = rotR(h);
}
}
if (isRed(h->left) && isRed(h->right) && isRed(h->right->right)) {
if (h->right && h->right->left)
h->right = rotR(h->right);
h = rotL(h);
h = colorFlip(h);
}
return h;
}
I hope you got something from this, as the personal journey to bring this algorithm to the public ended up being quite the personal journey. Until next time, Happy Hacking!
Further Reading
[1] Sedgewick, Guibas "A dichromatic framework for balanced trees", 1978
[2] Cormen, Et. Al. "Introduction to Algorithms", 1989
[3] Anderson, Arne "Balanced Search Trees Made Simple", 1993
[4] Sedgewick, R. and Wayne, K. "Algorithms", 2011 (4th ed.)
[5] Sedgewick "Algorithms", 1992 (2nd ed.)
[6] Sedgewick, R. and Van Wyck, C. "Algorithms" 1998 (3rd ed.)
[7] Sedgewick, R. "Left Leaning Red Black Trees", 2008
-
Top-Down Deletion for Red/Black Trees
-
Implementing Closures in Bytecode: Heap Allocated Activation Records & Access Links
-
Pascal & Bernoulli & Floyd: Triangles
-
A Quick tour of MGCLex
-
Compiling Regular Expressions for "The VM Approach"
-
Composable Linked Digraphs: An efficient NFA Data Structure for Thompsons Construction
-
Improving the Space Efficiency of Suffix Arrays
-
Augmenting B+ Trees For Order Statistics
-
Top-Down AST Construction of Regular Expressions with Recursive Descent
-
Balanced Deletion for in-memory B+ Trees
Leave A Comment