Lecture 8: Binary Heap#
Trees in Arrays#
The previous lectures stored binary trees with pointer-like structures, in which each node holds references to its children.
If the tree is a complete binary tree, there is a useful array-based alternative.
Definition. A binary tree is complete if every level, except possibly the last, is completely filled, and all the leaves on the last level are placed as far to the left as possible.
A complete binary tree is one that can be obtained by filling the nodes starting with the root, and then each next level in turn, always from the left, until one runs out of nodes. Complete binary trees always have minimal height for their size \(n\), namely \(\lfloor \log_2 n \rfloor\), and are therefore as balanced as a binary tree of that size can be.
A complete binary tree can be stored level by level in an array P, using 0-based indices:
root is at index \(0\)
left(i) = \(2i+1\)
right(i) = \(2i+2\)
parent(i) = \(\lfloor \frac{i-1}{2} \rfloor\)
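As a small illustration of the index arithmetic above, the sketch below stores a seven-node complete tree level by level in a Python list and walks from one node to its relatives. The list `tree` and the free-standing helper functions are chosen for this example only.

```python
def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def parent(i):
    return (i - 1) // 2

# Level-order layout of the complete tree
#          'a'
#        /     \
#      'b'     'c'
#     /  \    /  \
#   'd'  'e' 'f'  'g'
tree = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

i = 1  # node 'b'
print(tree[left(i)], tree[right(i)], tree[parent(i)])  # prints: d e a
```

Because the positions are computed rather than stored, no child or parent references are needed.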
Storing a binary tree in an array is not always efficient:
if the tree is not complete, space must be reserved in the array for every possible node position, including the missing ones
for a binary search tree, an insertion or deletion may involve shifting large portions of the array
Heaps#
The (binary) heap data structure is an array object that we can view as a nearly complete binary tree.
Each node of the tree corresponds to an element of the array.
The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.
An array A that represents a heap is an object with two attributes: A.length, which (as usual) gives the number of elements in the array, and A.size, which represents how many elements in the heap are stored within array A.
Given the index i of a node, we can easily calculate the indices of its parent, left child and right child from the way they are stored.
Max-heap:
max-heap property: for every node i except the root, \(A[parent(i)] \ge A[i]\)
a max-heap is an array satisfying the max-heap property at all nodes
Min-heap is symmetric: for every node i except the root, \(A[parent(i)] \le A[i]\)
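The max-heap property can be checked directly from its definition. The function below is a minimal sketch; its name `is_max_heap` and the optional `size` parameter (standing in for A.size) are chosen for this example.

```python
def is_max_heap(A, size=None):
    """Return True if A[0:size] satisfies the max-heap property."""
    if size is None:
        size = len(A)
    # Every node except the root must not exceed its parent.
    return all(A[(i - 1) // 2] >= A[i] for i in range(1, size))

print(is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # True
print(is_max_heap([16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))  # False: 14 exceeds its parent 4
```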
Heap Operations#
import math

class P():
    def __init__(self, n):
        self.length = n      # capacity: number of slots in the array
        self.size = 0        # number of slots currently used by the heap
        self.A = [0]*n

    def left(self, i):
        return 2*i + 1

    def right(self, i):
        return 2*i + 2

    def parent(self, i):
        return (i-1) // 2

    # The height of the tree is $floor(log_2(n))$, where n is the heap size (n >= 1).
    # The height of node $i$ is at most $floor(log_2(n)) - floor(log_2(i+1))$, i.e. tree height minus depth of $i$;
    # it is one less for nodes whose subtree does not reach the last level.
    def getHeight(self, i):
        return math.floor(math.log2(self.size)) - math.floor(math.log2(i+1))

    # maintain the max-heap property at node `i` of heap `P` in $O(log(n))$ time, from top to bottom
    # check whether A[i] >= A[j] for every existing child j in {left(i), right(i)}
    # - if not, swap A[i] with the larger child A[j] and recursively `heapifyDown(j)`.
    def isValid(self, parent, left, right):
        # an index >= size means that child does not exist
        return ((left >= self.size or self.A[parent] >= self.A[left]) and
                (right >= self.size or self.A[parent] >= self.A[right]))

    def heapifyDown(self, i):
        left = self.left(i)
        right = self.right(i)
        if not self.isValid(i, left, right):
            # a violation implies the left child exists; pick the larger child
            j = left
            if right < self.size and self.A[right] > self.A[left]:
                j = right
            self.A[i], self.A[j] = self.A[j], self.A[i]
            self.heapifyDown(j)

    # maintain the max-heap property at node `i` of heap `P` in $O(log(n))$ time, from bottom to top
    # check whether A[parent(i)] >= A[i]
    # - if not, swap A[i] with A[parent(i)] and recursively `heapifyUp(parent(i))`.
    def heapifyUp(self, i):
        parent = self.parent(i)
        if i > 0 and self.A[i] > self.A[parent]:
            self.A[i], self.A[parent] = self.A[parent], self.A[i]
            self.heapifyUp(parent)

    # insert an item with key k in the heap A ~ $O(logn)$
    # - store the item with key k at the end of the heap: A[n] = k
    # - heapifyUp(n)
    def insert(self, key):
        if self.size == self.length:
            # the array is full: grow it by one slot
            self.A.append(key)
            self.length += 1
        else:
            self.A[self.size] = key
        self.size += 1
        self.heapifyUp(self.size-1)

    def getMax(self):
        return self.A[0]

    # remove and return the last item of the heap; its slot stays in the array
    def pop(self):
        self.size -= 1
        return self.A[self.size]

    # deleteMax:
    # can only easily delete the last element in a dynamic array, but the max of a max-heap is at the root
    # - normally it requires $O(n)$ time to remove the first element of a dynamic array. can we do it in $O(logn)$ time?
    # - algorithm
    #   - swap the max at root node $i = 0$ with the last item at node $n-1$ in the heap array, and then delete the last
    #   - update heap size by -1
    #   - `heapifyDown(0)` after swapping to maintain the `max-heap property`
    #   - return the deleted key
    def deleteMax(self):
        last = self.size - 1
        self.A[0], self.A[last] = self.A[last], self.A[0]
        maxKey = self.pop()
        self.heapifyDown(0)
        return maxKey
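A quick way to see insertion and deletion at work is to run them on a small input. The sketch below is a self-contained, list-based variant rather than the class above: the function names `sift_up`, `sift_down`, `heap_insert` and `heap_delete_max` are chosen for this example, the heap size is simply the length of the Python list, and loops replace the recursion.

```python
def sift_up(A, i):
    # Move A[i] up while it is larger than its parent.
    while i > 0:
        p = (i - 1) // 2
        if A[i] <= A[p]:
            break
        A[i], A[p] = A[p], A[i]
        i = p

def sift_down(A, i, size):
    # Move A[i] down while a child inside A[0:size] is larger.
    while True:
        largest = i
        l, r = 2 * i + 1, 2 * i + 2
        if l < size and A[l] > A[largest]:
            largest = l
        if r < size and A[r] > A[largest]:
            largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def heap_insert(A, key):
    A.append(key)
    sift_up(A, len(A) - 1)

def heap_delete_max(A):
    # Swap the root with the last item, remove it, then repair the root.
    A[0], A[-1] = A[-1], A[0]
    top = A.pop()
    sift_down(A, 0, len(A))
    return top

heap = []
for key in [3, 9, 2, 7, 5]:
    heap_insert(heap, key)

print(heap[0])                                     # 9, the maximum
print([heap_delete_max(heap) for _ in range(5)])   # [9, 7, 5, 3, 2]
```

Each insertion or deletion touches at most one node per level, which is where the \(O(\log n)\) bound comes from.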
Heap Sort#
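HEAPSORT: heapsort(A): an in-place sorting algorithm that runs in \(O(n \log n)\) time. Note that merge_sort is also \(O(n \log n)\) but requires an additional \(O(n)\) space.
algorithm: build a max-heap from A, then repeatedly delete the maximum. Each deleteMax moves the current maximum to the last position of the heap and shrinks the heap by one, so the slots vacated at the end of the array fill up with the keys in ascending order.
for i in range(n, 0, -1):
    deleteMax(A)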
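The loop above can be turned into a runnable sketch as follows. Two details are assumptions of this example rather than part of the notes: the helper name `sift_down`, and the bottom-up way the initial max-heap is built (inserting the elements one at a time would also work).

```python
def sift_down(A, i, size):
    # Move A[i] down while a child inside A[0:size] is larger.
    while True:
        largest = i
        l, r = 2 * i + 1, 2 * i + 2
        if l < size and A[l] > A[largest]:
            largest = l
        if r < size and A[r] > A[largest]:
            largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def heapsort(A):
    n = len(A)
    # Build a max-heap bottom-up; the leaves are already one-element heaps.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(A, i, n)
    # Move the current maximum behind the shrinking heap, then repair the root.
    for end in range(n - 1, 0, -1):
        A[0], A[end] = A[end], A[0]
        sift_down(A, 0, end)
    return A

print(heapsort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

The sort needs only a constant amount of extra memory because the sorted suffix grows inside the same list that holds the heap.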
Priority Queue#
One of the most popular applications of a heap is as an efficient priority queue.
A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key.
When we use a heap to implement a priority queue, we often need to store a handle to the corresponding application object in each heap element. The exact makeup of the handle (such as a pointer or an integer) depends on the application. Similarly, we need to store a handle to the corresponding heap element in each application object.
operations
get_max(A) -> get the max of a max-heap A in \(O(1)\): return \(A[0]\)
delete_max(A) -> same as deleteMax on a max-heap, \(O(\log n)\)
insert(A, k) -> insert an item with key \(k\) into the heap A in \(O(\log n)\): store the item with key \(k\) at the end of the heap, \(A[n] = k\), then call max_heapify_up(A, n)
delete_root() -> to use a binary heap as a priority queue, we regularly need to delete the root, i.e. remove the node with the highest priority; this is the same as delete_max()
delete(A, k) -> delete an item with key \(k\) from the heap A in \(O(\log n)\), once its index is known through a handle: swap it with the last item, shrink the heap by one, then restore the max-heap property upward or downward from that index
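The handle idea can be sketched with a dictionary that records where each item currently sits in the heap array. Everything named here (`MaxPriorityQueue`, `pos`, `_swap`, `_up`, `_down`) is hypothetical and chosen for this example, and the sketch assumes that items are hashable and unique.

```python
class MaxPriorityQueue:
    """Max-priority queue sketch: a binary heap plus an item -> index map."""

    def __init__(self):
        self.heap = []   # (key, item) pairs in heap order
        self.pos = {}    # handle: item -> current index in self.heap

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _up(self, i):
        while i > 0 and self.heap[i][0] > self.heap[(i - 1) // 2][0]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _down(self, i):
        n = len(self.heap)
        while True:
            largest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self.heap[c][0] > self.heap[largest][0]:
                    largest = c
            if largest == i:
                return
            self._swap(i, largest)
            i = largest

    def insert(self, item, key):
        self.heap.append((key, item))
        self.pos[item] = len(self.heap) - 1
        self._up(len(self.heap) - 1)

    def get_max(self):
        return self.heap[0][1]

    def delete(self, item):
        i = self.pos[item]            # O(1) lookup through the handle
        last = len(self.heap) - 1
        self._swap(i, last)
        key, _ = self.heap.pop()
        del self.pos[item]
        if i < last:
            # The item moved into slot i may need to travel either way.
            self._up(i)
            self._down(i)
        return key

    def delete_max(self):
        item = self.get_max()
        self.delete(item)
        return item


pq = MaxPriorityQueue()
pq.insert("write report", 2)
pq.insert("fix outage", 9)
pq.insert("reply to email", 5)
pq.delete("reply to email")   # removed through its handle, no search needed
print(pq.delete_max())        # fix outage
print(pq.delete_max())        # write report
```

Without the `pos` map, finding an arbitrary item would take \(O(n)\) time, so the map is what keeps delete at \(O(\log n)\).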