Heap

HomePage | Recent changes | View source | Discuss this page | Page history | Log in |

Printable version | Disclaimers | Privacy policy

In computer science, there is a data structure called a heap. (There's also something else called a heap, which is memory that can be dynamically allocated; the two are unrelated.) The heap data structure is a binary tree in which the value stored at each node is no less than the value stored at either of its two children. (This is called the "heap property".)

A heap is one way to implement a priority queue and is used in the heapsort sort algorithm.

Heaps are commonly represented by arrays of values; the root of the tree is item 1 in the array, and the children of the item n are the items at 2n and 2n+1; so item 1's children are items 2 and 3, 2's children are items 4 and 5, etc. This allows you to regard any 1-based array as a possibly-screwed-up heap.


        1         
       / \        
      /   \       
     2     3      
    / \    /\     
   4   5  6  7    
  /\   /\         
 8  9 10 11       


If you regard the array as a screwed-up heap, you can fix it in O(n) time by restoring the heap property from the bottom up. Consider each trio containing a node and its two children, starting from the end of the array; if the greatest value of the three nodes is not in the top node, exchange it with the top node. This puts the former top node at the top of a subtree which may contain nodes greater than it is; so we must compare it with its new children and possibly repeat the process.

It appears offhand to me that this should make this process take O(n lg n) time, but all the references I can find say it's O(n), although I can't understand their proofs. Once you have a heap, you can quickly find the node with the greatest value in it; it will be the node at the root of the tree. If you want to remove this node (as is necessary for a priority queue), in the array representation, you can swap that node's value with the value of the last node in the array (node 11 in the above example), shorten the array by 1 (removing that last node from the heap), and then restore the heap property, which still holds everywhere but at the top of the heap. The value at the top of the tree is now less than at least one of its children; swap it with the greater of its two children, and the heap property will be true at the top, but now possibly not where the top node was just swapped to; so you have to repeat the process until it is greater than either of its two children, which probably means until it reaches the bottom of the heap --- up to lg n swaps.


So removing the largest node from a heap takes O(lg n) steps.