Saturday, July 14, 2012

Red – Black Trees (Cây Đỏ Đen)

Red Black trees were proposed by Rudolf Bayer in 1972, and refined by Leonidas J. Guibas and Robert Sedgewick in 1978. Guibas and Redgewick presented the coloring scheme and the name RB tree sticks. RB trees are found in many practical search structures. C++’s std::map and std::set are typically implemented with a red-black tree, so are symbol tables in many operating systems and other languages.



Rigorously, a red-black tree is a BST with the following properties:

· (Bicolor property) All nodes are colored red or black

· (Black root and leaves) The root and the leaves (the NULL nodes) are colored black

· (Black height property) For every internal node v, the number of black nodes we encounter on any path from v down to one of its leaves are the same. This number is called the black height of v. Note that the black height of a node does not count the node’s own color.

· (Black parent property) A red node must have a black parent. Or, equivalently, every red node must have two black children. This is the property that ensures the sparsity of red nodes.

Here is an example of what a RB tree looks like:

clip_image002
RB trees have logarithmic heights

Consider any RB tree T. Suppose we lump together every red node with its parent (which must be black); then, we obtain a tree T'which is no longer a binary tree. The tree T' will be a (2,4)-tree (or 2-3-4 tree), where each internal node has 2, 3, or 4 children. If the RB tree shown above is T, then its 2-3-4 counterpart is

clip_image004

Let h be the height of T and h’ be the height of T’. Also, let n be the number of keys in T. First of all, we claim that the number of black leaves (squares in the pictures) is exactly n+1. This is because the RB tree is a full binary tree and from that we can prove this claim by induction. (Sketch: if the left subtree of the root has a keys and the right subtree has b keys, then there are totally a+b+1 keys in the tree and (a+1)+(b+1) leaves by the induction hypothesis.)

clip_image005Now, due to the 2 to 4 branching factors, the number of leaves varies between 2h’ and 4h’:

By the black parent property, the height of T’ is at least half the height of T. Hence,

clip_image006
How to maintain the RB properties

After an insert. We will always color the newly inserted node red, unless it is the root in which case we color it black. Call the newly inserted node z.

If z or its parent is black, then there is nothing to do. All properties are still satisfied.

Now, suppose z‘s parent is red. Then, we have the double red problem. We fix this problem by considering the following cases.

· If z‘s uncle is red, then we recolor z‘s parent and uncle black, grandparent red (it had to be black before), and consider the grandparent the new z. We potentially have a new double red problem but it is now higher up in the tree. If z‘s grand parent is the root then we color it black and we are done. The following picture illustrates this case.

clip_image008

· If z‘s uncle is black then there are two subcases which can be resolved with either a single rotation or a double rotation as seen in the following pictures.

clip_image010

clip_image012

After a delete. When we delete a node from a BST, we either splice it or splice its successor. Let z be the children of the node that got spliced. If we spliced a red node, then we’re done; there is nothing to fix. Suppose we spliced a black node. In this case, we will violate the black height property of all ancestors of z (except for the trivial case when z is the root or when z‘s parent is the root).

Conceptually, if z was allowed to have “double blackness” then the black height property is not violated. Note that z‘s sibling cannot be a leaf node; thus, the sibling must have two children. We consider three cases as follows.

· z‘s sibling is black with a red child. In this case, we do a single or double rotation and color that child black. In essence we give that red child one of z‘s “blackness”. The fact that z is double black also indicates that the subtree rooted at z is a little short on height; and, the fact that the sibling’s side has a red node means that side has extra height to spare. Overall, a rotation toward z‘s side makes sense. The following pictures illustrate the two sub-cases:

clip_image014

clip_image016

· z‘s sibling is black with both black children and the parent is red. In this case, we color the sibling red, parent black, and we are done:

clip_image018

· z‘s sibling is black with both black children and the parent is black. In this case, we color the sibling red, parent double black, and move the double black problem up the tree.

clip_image020

· Finally, if z‘s sibling is red we perform a single rotation, give the sibling the parent’s color, and the parent red color. As z‘s new sibling is black, we are back to one of the previous cases.

clip_image022

No comments: