algorithm Tutorial => Optimal, disjoint-set based implementation

Example

We can do two things to improve the simple and sub-optimal disjoint-set subalgorithms:

Path compression heuristic: findSet does not need to ever handle a tree with height bigger than 2. If it ends up iterating such a tree, it can link the lower nodes directly to the root, optimizing future traversals;
```
subalgo findSet(v: a node):
    if v.parent != v
        v.parent = findSet(v.parent)
    return v.parent
```

Height-based merging heuristic: for each node, store the height of its subtree. When merging, make the taller tree the parent of the smaller one, thus not increasing anyone's height.

subalgo unionSet(u, v: nodes):
    vRoot = findSet(v)
    uRoot = findSet(u)

    if vRoot == uRoot:
        return

    if vRoot.height < uRoot.height:
        vRoot.parent = uRoot
    else if vRoot.height > uRoot.height:
        uRoot.parent = vRoot
    else:
        uRoot.parent = vRoot
        uRoot.height =  uRoot.height + 1

This leads to O(alpha(n)) time for each operation, where alpha is the inverse of the fast-growing Ackermann function, thus it is very slow growing, and can be considered O(1) for practical purposes.

This makes the entire Kruskal's algorithm O(m log m + m) = O(m log m), because of the initial sorting.

Note

Path compression may reduce the height of the tree, hence comparing heights of the trees during union operation might not be a trivial task. Hence to avoid the complexity of storing and calculating the height of the trees the resulting parent can be picked randomly:

    subalgo unionSet(u, v: nodes):
        vRoot = findSet(v)
        uRoot = findSet(u)

        if vRoot == uRoot:
            return
        if random() % 2 == 0:
            vRoot.parent = uRoot
        else:
            uRoot.parent = vRoot

In practice this randomised algorithm together with path compression for findSet operation will result in comparable performance, yet much simpler to implement.

PDF - Download algorithm for free

Previous Next

algorithm

Fastest Entity Framework Extensions

Example

Got any algorithm Question?

algorithm

algorithm Kruskal's Algorithm Optimal, disjoint-set based implementation

Fastest Entity Framework Extensions

Example

Got any algorithm Question?