Lecture 8: Binary Heap#

Trees in Arrays#

The previous lectures stored binary trees using pointer-like structures, in which each node contains references to its children. If the tree is a complete binary tree, there is a useful array-based alternative.

Definition. A binary tree is complete if every level, except possibly the last, is completely filled, and all the leaves on the last level are placed as far to the left as possible.

A complete binary tree is one that can be obtained by filling the nodes starting with the root, and then each next level in turn, always from the left, until one runs out of nodes. A complete binary tree has the minimum possible height for its size \(n\), namely \(\lfloor \log_2 n \rfloor\), so it is always balanced.
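As a quick check of the height claim, the following minimal sketch fills levels one at a time and compares the resulting height with \(\lfloor \log_2 n \rfloor\). The function name `complete_tree_height` is illustrative and not part of the lecture.

```python
def complete_tree_height(n):
    """Height (edges on the longest root-to-leaf path) of a complete
    binary tree with n >= 1 nodes, found by filling whole levels."""
    height = 0
    capacity = 1      # number of nodes that fit in levels 0..height
    level_width = 1   # number of nodes on the current last level
    while capacity < n:
        level_width *= 2
        capacity += level_width
        height += 1
    return height

for n in (1, 2, 3, 4, 7, 8, 15, 16):
    # n.bit_length() - 1 equals floor(log2(n)) for positive integers
    assert complete_tree_height(n) == n.bit_length() - 1
```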

One type of complete binary tree: the max-heap. Its nodes are stored level by level, from left to right:

  • positions in a 0-indexed array P,

    • root is at \(0\)

    • left(i) = \(2i+1\)

    • right(i) = \(2i+2\)

    • parent(i) = \(\lfloor \frac{i-1}{2} \rfloor\)

  • Storing a general binary tree as an array is not efficient

    • if the tree is not complete, space must be reserved in the array for every possible node position, including the empty ones

    • for a binary search tree, an insertion or deletion may involve shifting large portions of the array
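The index rules above can be tried out on a small array. This is a minimal sketch; the helper `levels` and the sample values are illustrative assumptions, not part of the lecture.

```python
def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def parent(i):
    # integer division implements floor((i - 1) / 2) for i >= 1
    return (i - 1) // 2

def levels(P):
    """Split a level-order array into the levels of its complete binary tree."""
    out, start, width = [], 0, 1
    while start < len(P):
        out.append(P[start:start + width])
        start += width
        width *= 2
    return out

P = [16, 14, 10, 8, 7, 9, 3]
print(levels(P))                # [[16], [14, 10], [8, 7, 9, 3]]
print(P[left(1)], P[right(1)])  # 8 7  (children of the node holding 14)
print(P[parent(5)])             # 10   (parent of the node holding 9)
```

No pointers are stored: the tree shape is implied entirely by the positions in the array.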

Heaps#

The (binary) heap data structure is an array object that we can view as a nearly complete binary tree.

  • Each node of the tree corresponds to an element of the array.

  • The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.

  • An array A that represents a heap is an object with two attributes: A.length, which (as usual) gives the number of elements in the array, and A.size, which gives the number of heap elements stored within array A. Only A[0..A.size-1] are valid heap elements.

  • Given the index i, we can easily calculate the indices of its parent, left child and right child based on the way they are stored.

Max-heap:

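  • max-heap property: for every node i except the root, \(A[parent(i)] \geq A[i]\)

  • a max-heap is an array satisfying the max-heap property at all nodes, so its largest element is at the root

  • the min-heap is symmetric: for every node i except the root, \(A[parent(i)] \leq A[i]\)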
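The max-heap property can be checked directly from its definition. The sketch below is illustrative; the name `is_max_heap` and its optional `size` argument are assumptions, not part of the lecture.

```python
def is_max_heap(A, size=None):
    """Return True if A[0:size] satisfies the max-heap property."""
    if size is None:
        size = len(A)
    # node 0 is the root and has no parent, so start at index 1
    return all(A[(i - 1) // 2] >= A[i] for i in range(1, size))

print(is_max_heap([16, 14, 10, 8, 7, 9, 3]))   # True
print(is_max_heap([16, 14, 10, 8, 15, 9, 3]))  # False: 15 is larger than its parent 14
```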

Heap Operations#

import math
class P():
    def __init__(self, n):
        self.length = n
        self.size = 0
        self.A = [0]*n
    
    def left(self, i):
        return 2*i + 1
    
    def right(self, i):
        return 2*i + 2
    
    def parent(self, i):
        return math.floor((i-1)/2)
    
    # The height of a non-empty heap with n = size elements is $floor(log2(n))$. Node $i$ lies at
    # depth $floor(log2(i+1))$, so its height is at most $floor(log2(n)) - floor(log2(i+1))$.
    def getHeight(self, i):
        return math.floor(math.log2(self.size)) - math.floor(math.log2(i+1))
    
    # restore the max-heap property at the `i`th node of heap `P` in $O(log(n))$ time, moving from the top down
    # check whether A[i] >= A[j] for every existing child j in {left(i), right(i)}
    #    - if not, swap A[i] with the larger child A[j] and recursively call `heapifyDown(j)`.
    # children with index >= size lie outside the heap and are ignored
    def isValid(self, parent, left, right):
        return all(self.A[parent] >= self.A[c] for c in (left, right) if c < self.size)

    def heapifyDown(self, i):
        left = self.left(i)
        right = self.right(i)

        if not self.isValid(i, left, right):
            # the left child must exist here; pick the larger of the existing children
            j = left
            if right < self.size and self.A[right] > self.A[left]:
                j = right
            self.A[i], self.A[j] = self.A[j], self.A[i]
            self.heapifyDown(j)

    # restore the max-heap property at the `i`th node of heap `P` in $O(log(n))$ time, moving from the bottom up
    # check whether A[parent(i)] >= A[i]
    #    - if not, swap A[i] with A[parent(i)] and recursively call `heapifyUp(parent(i))`.
    # the root (i == 0) has no parent, so the recursion stops there
    def heapifyUp(self, i):
        if i == 0:
            return
        parent = self.parent(i)

        if self.A[i] > self.A[parent]:
            self.A[i], self.A[parent] = self.A[parent], self.A[i]
            self.heapifyUp(parent)

    # insert an item with key k in the heap A ~ $O(log(n))$
    #   - put the key in the first free slot, A[size] = k, growing the array if it is full
    #   - increase size by 1 and call heapifyUp(size - 1)
    def insert(self, key):
        if self.size == self.length:
            self.A.append(key)
            self.length += 1
        else:
            self.A[self.size] = key
        self.size += 1

        self.heapifyUp(self.size - 1)

    def getMax(self):
        return self.A[0]
    
    # remove and return the last heap element, A[size - 1]
    def pop(self):
        last = self.A.pop(self.size - 1)
        self.size -= 1
        self.length -= 1
        return last

    # deleteMax:
    #  only the last element of a dynamic array is cheap to delete, but the max of a max-heap is at the root
    #    - removing the first element of a dynamic array normally takes $O(n)$ time. Can we do it in $O(log(n))$ time?
    #    - algorithm
    #        - swap the max at the root node $i = 0$ with the last item at node $size-1$ in the heap array, then delete the last item
    #        - update the heap size by -1
    #        - `heapifyDown(0)` after swapping to restore the max-heap property
    #        - return the deleted key
    def deleteMax(self):
        if self.size == 0:
            raise IndexError("deleteMax from an empty heap")
        last = self.size - 1
        self.A[0], self.A[last] = self.A[last], self.A[0]
        max_key = self.A[last]
        self.pop()
        self.heapifyDown(0)

        return max_key
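
To see the insert and delete-max steps at work, here is a compact, self-contained sketch on a plain Python list. It is not the class above: the functions `sift_up`, `sift_down`, `insert` and `delete_max` are illustrative names, written iteratively, and the heap is assumed to occupy the whole list.

```python
def sift_up(A, i):
    # move A[i] up while it is larger than its parent
    while i > 0 and A[(i - 1) // 2] < A[i]:
        p = (i - 1) // 2
        A[i], A[p] = A[p], A[i]
        i = p

def sift_down(A, i, size):
    # move A[i] down while one of its children inside A[0:size] is larger
    while True:
        largest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < size and A[c] > A[largest]:
                largest = c
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def insert(A, key):
    A.append(key)
    sift_up(A, len(A) - 1)

def delete_max(A):
    # assumes A is non-empty
    A[0], A[-1] = A[-1], A[0]
    top = A.pop()
    sift_down(A, 0, len(A))
    return top

heap = []
for key in [3, 9, 2, 7, 5]:
    insert(heap, key)
print(heap)              # [9, 7, 2, 3, 5]
print(delete_max(heap))  # 9
print(heap)              # [7, 5, 2, 3]
```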

Heap Sort#

HEAPSORT: heapsort(A):

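  • an in-place sorting algorithm that runs in \(O(n \log n)\) time. Note that merge sort is also \(O(n \log n)\) but requires additional \(O(n)\) space.

  • algorithm

    • build a max-heap from A

    • for i in range(n - 1, 0, -1)

      • deleteMax(A): swap the maximum A[0] with A[i], shrink the heap size by one, and heapifyDown(A, 0)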
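
A minimal in-place sketch of heapsort follows. It assumes the input is an arbitrary list, so it first builds a max-heap bottom-up (a standard step that the outline above does not spell out); `sift_down` is an illustrative helper name, not the lecture's method.

```python
def sift_down(A, i, size):
    # move A[i] down while one of its children inside A[0:size] is larger
    while True:
        largest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < size and A[c] > A[largest]:
                largest = c
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def heapsort(A):
    n = len(A)
    # assumed build step: heapify every internal node, last one first
    for i in range(n // 2 - 1, -1, -1):
        sift_down(A, i, n)
    # repeatedly move the current maximum to the end of the shrinking heap
    for end in range(n - 1, 0, -1):
        A[0], A[end] = A[end], A[0]
        sift_down(A, 0, end)
    return A

data = [5, 2, 9, 1, 5, 6]
heapsort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```

The sorted values accumulate at the end of the same array, which is why no extra \(O(n)\) buffer is needed.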

Priority Queue#

  • one of the most popular applications of a heap: as an efficient priority queue

    • a priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key.

    • when we use a heap to implement a priority queue, we often need to store a handle to the corresponding application object in each heap element. The exact makeup of the handle (such as a pointer or an integer) depends on the application. Similarly, we need to store a handle to the corresponding heap element in each application object.

    • operations

      • get_max(A) -> get max of a max-heap A in \(O(1)\)

        • return \(A[0]\)

      • delete_max(A) -> same as a max-heap

      • insert(A, k) -> insert an item with key \(k\) in the heap A ~ \(O(\log n)\)

        • append the item with key \(k\) to the end of the heap, which currently holds \(n\) items at indices \(0\) to \(n-1\): \(A[n] = k\)

        • max_heapify_up(A, n)

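      • delete_root() -> to use a binary heap as a priority queue, we regularly need to delete the root, i.e. remove the node with the highest priority.

        • this is the same as delete_max()

      • delete(A, k) -> delete an item with key \(k\) in the heap A ~ \(O(\log n)\), provided its index is known (for example through a handle); otherwise finding the item first takes \(O(n)\)

        • same idea as delete_max in a max-heap: swap the item with the last item, remove the last item, then restore the max-heap property at the swapped position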
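
For comparison, the priority-queue operations above can be sketched with Python's standard `heapq` module. Note the swapped-in technique: `heapq` provides a min-heap, so this sketch negates the keys to get max-first behavior. The class name `MaxPriorityQueue` and the sample tasks are illustrative assumptions, and arbitrary `delete(A, k)` is not covered.

```python
import heapq
import itertools

class MaxPriorityQueue:
    def __init__(self):
        self._heap = []
        # tie-breaker so that items with equal keys are never compared directly
        self._counter = itertools.count()

    def insert(self, key, item):
        # negate the key: the smallest negated key is the largest original key
        heapq.heappush(self._heap, (-key, next(self._counter), item))

    def get_max(self):
        key, _, item = self._heap[0]
        return -key, item

    def delete_max(self):
        key, _, item = heapq.heappop(self._heap)
        return -key, item

    def __len__(self):
        return len(self._heap)

pq = MaxPriorityQueue()
pq.insert(2, "write report")
pq.insert(5, "fix outage")
pq.insert(3, "review patch")
print(pq.get_max())     # (5, 'fix outage')
print(pq.delete_max())  # (5, 'fix outage')
print(pq.delete_max())  # (3, 'review patch')
```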