Register | Login

Stacking Code

public interface IBlog { string Dump(Stream consciousness); }

Priority Queue

Thursday, 14 November, 2013 @ 8:00 AM < Adam Boddington
Tags: C#, Collections

You can find all the code for this priority queue, including unit tests, over on Bitbucket.

In a recent coding challenge I was presented with the well-flogged "find the shortest path between two nodes" problem. That's pretty quick to solve with a naive implementation of Dijkstra's algorithm, but this particular problem involved large datasets, so performance was a factor. To take my solution from O(N^2) to O(N*log(N)), I had to add a minimum priority queue to the mix.

Operations needed were:

  • T Dequeue(): Take the most precedent item from the queue and return it (in a minimum priority queue, that's any item with the smallest priority value).
  • void Reprioritise(T item): Reorder an item in the queue based on its new priority value.

The queue will be initialised with all the items, so Enqueue isn't necessary. The items will also have all the same priority to start with, so an initial sort isn't necessary either. Finally, the queue must be able to store multiple objects with the same priority.

After casting about for a suitable collection in .NET that could do the job, it appeared that nothing was immediately appropriate. All of the sorted collections, SortedDictionary, SortedList, and SortedSet, disallow objects that compare identically, i.e. return an IComparer<T>.Compare result of zero. That can be worked around by comparing priority first and then another unique value second, ensuring Compare never returns zero, but that didn't occur to me at the time. So I ended up rolling my own priority queue. It's not rocket science, but I thought it was interesting enough to spruce up and share after the challenge.

What is a Priority Queue?

Simply put, a collection of items that stores (or knows how to find) a priority value for each item. Items with the most precedent priority are presented before others. For a minimum priority queue, that's items with the lowest priority value. For a maximum priority queue, that's items with the highest priority value. Items with the same priority value can be presented in any order (usually).

Priority queues can be implemented a number of ways, but the most common way is with a binary heap.

What is a Heap?

A heap is a tree that ensures every node is less than or equal to its children (min-heap), or greater than or equal to its children (max-heap). By ensuring that simple rule is true from the root of the tree down, it's well suited for a queue that wants to ensure the most precedent item is first.

What is a Binary Heap?

A binary heap is a heap that uses a complete binary tree. That means each node has at most two children, every level (except potentially the last) is completely filled, and every level is filled from left to right.

The neat thing about a complete binary tree is that it can be easily represented in memory with a simple array or list. Given the index of any item in a heap array, the address of its children are:

  • index * 2 + 1
  • index * 2 + 2

Given the index of any item in a heap array, the address of its parent is:

  • (index - 1) / 2

When finding the parent of a second child we get an odd number before it is halved, but thanks to the way integer math works (fractions are truncated), it results in the same parent as its sibling.

So in a heap array, item 0 is the root, 1 and 2 are its children, 3 and 4 are the children of 1, 5 and 6 are the children of 2, 7 and 8 are the children of 3, and so on.

Usage

The priority queue I built is similiar to SortedSet<T> in that it only accepts one type, the type of the object being stored (there is no separate key for priority), and it uses IComparer<T> to make comparisons between objects. Where it differs is that those comparisons can be equal, i.e. it allows multiple objects with the same priority. It also allows the same object being added more than once, although that feature wasn't required for Dijkstra's algorithm.

Here's an example of some classes that might be used with the priority queue:

public class Node
{
    // This doesn't have to be an int
    // It can be an enum, string, anything really
    public int Distance { get; set; } // Priority

    ...
}

// The priority comparer
public class NodeDistanceComparer : IComparer<Node>
{
    public int Compare(Node x, Node y)
    {
        EnsureNodeNotNull(x);
        EnsureNodeNotNull(y);

        return x.Distance.CompareTo(y.Distance);
    }
}

Usage could look something like this:

var pq = new MinimumPriorityQueue(nodes, new NodeDistanceComparer());

while (pq.Count > 0)
{
    Node node = pq.Dequeue();

    ...
}

Operations

Whatever a priority queue does, it has to keep its heap valid. Here are some of the operations common to heaps and priority queues that were implemented to keep everything copasetic.

Sift Down

Sifting down is the process of taking an item in the heap, comparing it with the most precedent of its children, swapping it if required, then repeating the process down the heap until a swap is not required, or the end of the heap is reached.

private void SiftDown(int index)
{
    while (index < Count)
    {
        int child1Index = GetChild1Index(index);
        int child2Index = GetChild2Index(index);

        // We can stop if there are no children
        if (child1Index >= Count)
            break;

        // Which child is precedent?
        // Make sure there is a child 2 before comparing
        if (child2Index == Count || IsPrecedent(child1Index, child2Index))
        {
            if (!IsPrecedent(child1Index, index))
                break;

            Switch(child1Index, index);
            index = child1Index;
        }
        else
        {
            if (!IsPrecedent(child2Index, index))
                break;

            Switch(child2Index, index);
            index = child2Index;
        }
    }
}

Sift Up

Sifting up is the process of taking an item in the heap, comparing it with its parent, swapping it if required, then repeating the process up the heap until a swap is not required, or the beginning of the heap is reached.

private void SiftUp(int index)
{
    while (index > 0)
    {
        int parentIndex = GetParentIndex(index);

        if (!IsPrecedent(index, parentIndex))
            break;

        Switch(index, parentIndex);
        index = parentIndex;
    }
}

Heapify

Heapifying is the act of transforming a random collection of items into a valid heap, usually only done when a heap needs to be created from an existing set of items. It can be done in a variety of ways, but working from the last parent node to the root, performing a sift down on each item, is the fastest way to do it. You can read why here.

private void Heapify()
{
    int index = GetParentIndex(Count - 1); // Last parent

    while (index >= 0)
        SiftDown(index--);
}

Enqueue

Enqueuing an item is done by adding it to the end of the heap and sifting it up. It might be tempting to discard Heapify and just use Enqueue when creating a heap from an existing collection. It works, but Heapify is faster for the reasons linked above.

Remove

Removing an item is done by swapping it with the end of the heap, removing it from the end, then sifting down and up at the swapped position. Because it was the last item in the heap, it's easy to think only sifting down is required, but it is possible that a sift up might be required if the last item has moved to a less precedent branch of the tree. Calling sift up after sift down is fine, only one will do anything (if either).

Here's an example of a min-heap that requires sifting up after an item is removed.

{ 1, 4, 2, 5, 8, 7, 3 } // We want to remove 5
{ 1, 4, 2, 3, 8, 7, 5 } // Swapped 5 with 3
{ 1, 4, 2, 3, 8, 7 }    // Removed 5, but 3 is smaller than its parent 4
{ 1, 3, 2, 4, 8, 7 }    // Sift up 3, we have a valid min-heap again

Dequeue and Peek

Nothing special here. Both inspect the root of the heap and return it if there is one. Dequeue also removes the root using Remove.

Bells and Whistles

Here are some extras used to make the implementation easier.

Behaviour

The only difference between a minimum priority queue and a maximum priority queue is in the comparison of the priority value. Does the smaller one come first, or the bigger one? The PriorityQueue<T> class can be either a minimum or a maximum priority queue by setting the appropriate behaviour.

public enum PriorityQueueBehaviour
{
    Minimum = 0,
    Maximum = 1
}

public class PriorityQueue<T> : IEnumerable<T>
{
    // Minimum is "normal", maximum is the reverse
    private static readonly int[] BehaviourModifiers = new[] { 1, -1 };

    ...

    public PriorityQueueBehaviour Behaviour { get; private set; }

    ...

    private int BehaviourModifier
    {
        get { return BehaviourModifiers[(int)Behaviour]; }
    }

    ...

    private bool IsPrecedent(int index1, int index2)
    {
        T item1 = _heap[index1];
        T item2 = _heap[index2];

        return IsPrecedent(item1, item2);
    }

    private bool IsPrecedent(T item1, T item2)
    {
        int compare = _comparer.Compare(item1, item2);
        // Minimum: do nothing
        // Maximum: multiply by -1
        compare *= BehaviourModifier;

        return compare < 0;
    }

    ...
}

Index

Sifting in an array is fast when we can easily calculate where to go next. Doing things at the beginning and end of an array is also fast. What isn't fast is searching for an item in an unsorted array. And a heap is unsorted. So an index was added to keep track of where each object is in the heap. Finding the position of any item in the heap is now close to an O(1) operation (depending on duplicates). All operations involving the index are close to O(1) unless the priority queue exceeds its capacity. If you know the capacity of your priority queue beforehand, it's a good idea to set it.

public class PriorityQueue<T> : IEnumerable<T>
{
    ...

    private readonly List<T> _heap;
    // An index of all the items in the heap and where in the heap they
    // currently are
    private readonly MultiValueIndex<T> _index;

    ...

    public void Enqueue(T item)
    {
        EnsureItemNotNull(item);

        _heap.Add(item);
        _index.AddIndex(item, Count - 1);
        SiftUp(Count - 1);
    }

    ...

    public void Remove(T item)
    {
        EnsureItemNotNull(item);
        EnsureItemExists(item);

        int index = _index.GetIndexes(item).First();
        Switch(index, Count - 1);
        _index.RemoveIndex(item, Count - 1);
        _heap.RemoveAt(Count - 1);

        if (Count == 0 || index == Count)
            return;

        SiftDown(index);
        SiftUp(index);
    }

    public void Reprioritise(T item)
    {
        EnsureItemNotNull(item);
        EnsureItemExists(item);

        int count = _index.GetIndexes(item).Count();

        for (int i = 0; i < count; i++)
            Remove(item);

        for (int i = 0; i < count; i++)
            Enqueue(item);
    }

    ...

    private void Switch(int index1, int index2)
    {
        T item1 = _heap[index1];
        T item2 = _heap[index2];

        _heap[index1] = item2;
        _heap[index2] = item1;

        _index.ReplaceIndex(item1, index1, index2);
        _index.ReplaceIndex(item2, index2, index1);
    }
}

There are 0 comments.


Comments

Leave a Comment

Please register or login to leave a comment.


Older
Roy Osherove's TDD Kata 1

Newer
Codility Challenges

Older
Roy Osherove's TDD Kata 1

Newer
Codility Challenges

browse with Pivot


About


Projects

Building Neno


RSS
Recent Posts

Codility Nitrogenium Challenge
OS X Lock
HACT '13
Codility Challenges
Priority Queue


Tags

Architecture (13)
ASP.NET (2)
ASP.NET MVC (13)
Brisbane Flood (1)
Building Neno (38)
C# (4)
Challenges (3)
Collections (1)
Communicator (1)
Concurrency Control (2)
Configuration (1)
CSS (5)
DataAnnotations (2)
Database (1)
DotNetOpenAuth (2)
Entity Framework (1)
FluentNHibernate (2)
Inversion of Control (5)
JavaScript (1)
jQuery (4)
Kata (2)
Linq (7)
Markdown (4)
Mercurial (5)
NHibernate (20)
Ninject (2)
OpenID (3)
OS X (1)
Pivot (6)
PowerShell (8)
Prettify (2)
RSS (1)
Spring (3)
SQL Server (5)
T-SQL (2)
Validation (2)
Vim (1)
Visual Studio (2)
Windows Forms (3)
Windows Service (1)


Archives


Powered by Neno, ASP.NET MVC, NHibernate, and small furry mammals. Copyright 2010 - 2011 Adam Boddington.
Version 1.0 Alpha (d9e7e4b68c07), Build Date Sunday, 30 January, 2011 @ 11:37 AM