_    _                  _ _   _                   
	       / \  | | __ _  ___  _ __(_) |_| |__  _ __ ___  ___ 
	      / _ \ | |/ _` |/ _ \| '__| | __| '_ \| '_ ` _ \/ __|
	     / ___ \| | (_| | (_) | |  | | |_| | | | | | | | \__ \
	    /_/   \_\_|\__, |\___/|_|  |_|\__|_| |_|_| |_| |_|___/
		       |___/       

==============================================================================
			      SORTING ALGORITHMS
==============================================================================

Selection Sort:

    Always O(n^2) regardless of data
    Suitable on linked lists where removal and addition are cheap
    Generally dominated by insertion sort, which does half as many comps on avg
    Only advantage is when swaps are expensive; requires O(n) not O(n^2) for ins
    Primarily of historical interest

  Algorithm:

    Find the smallest element
    Swap it with the first element
    Repeat until done

Insertion Sort:

    Desirable when data are already almost sorted; otherwise O(n^2)
    Stable, in-place
    Fewer comparisons than selection sort but more writes
    May be faster than quicksort or mergesort for smallish number of elements
    When comparisons are expensive, a variant with binary search can be used
    CLR page 8

  Algorithm:

    Start at item 2
    Move the item left until it is not smaller than the item left of it
    Repeat until done

Shell Sort:

    Variant of Insertion Sort which moves in decreasing gaps.
    Running time order is a function of gap series.
    O(n^4/3) and O(n*lg(n)^2) is known.
    There is a proof that n*lg(n) cannot be achieved.

    For gap k, sorts 1st, kth, 2kth, elements.  Then sorts 2nd, k+1th, etc.

    A favorable sequence is 1, 4, 10, 23, 57, 132, 301, 701, 1750.

Merge Sort:

  Algorithm:

    If there is more than one element:
	Sort first half (recursive call)
	Sort second half (recursive call)
	Merge two halves

  Notes:

    Requires additional storage (not in-place).  Recursive.
    CLR page 13

Heap Sort:

  O(n lg n).  Generally dominated by quick sort in practice.
  See section on heaps for details.

Quick Sort:
    
    O(n^2) worst case, but O(n lg n) average case, and constant factors low.
    In-place memory usage.  Good performance requires balanced partitioning.

    Naive algorithm has worst-case time on already sorted input by using
    initial element as the pivot.

    CLR page 153; entirety of Chapter 8.

  Quicksort:
    If input array is greater than one element
	Partition it, getting divider as return value
	Run quicksort on each side

  Partition:
    Set pivot = first element (other more intelligent methods possible)
    Take two indices, one at the front and one at the back.
    Advance the front index until it points to an element >= pivot
    Retreat the back index until it points to an element <= pivot
    If the indices have crossed, return index location
    Otherwise, swap elements and repeat

Radix Sort:
    
    Sort on least significant digit first (an O(n) operation).  Then
    next least, etc.  Of course a stable sort must be used.  Not in-place.

    Note that time is a function of the size of the numbers and not just
    the number of items.

    CLR page 178.

Bucket Sort:

    Assumes input is randomly distributed over an interval.  Then divides
    this interval into n buckets and puts the n input numbers into them.
    Use any kind of sort within the bucket, since the number of items will
    be small.  A linked list for buckets is common.

    If the input is *not* evenly distributed, the within-bucket sort
    can dominate.
    
    CLR page 180.

==============================================================================
		     PARTIAL SORTING/SELECTION ALGORITHMS
==============================================================================

Min and max:

    There is a cute trick to reduce the number of comparisons when finding
    both min and max.  Compare each two elements pairwise, and then compare
    only the larger with the max and the smaller with the min.  This gets
    you 3/2n comparisons instead of 2n.  CLR page 186.

Selection Algorithm for Ith Largest Element:

    Proceed like a (randomized) quicksort, but each time recurse only on the
    half which will contain the desired answer.  CLR page 187.

    There is further discussion on ways of guaranteeing a good pivot.

==============================================================================
			MISCELLANEOUS DATA STRUCTURES
==============================================================================

STACKS, QUEUES
~~~~~~~~~~~~~~
CLR chapter 11.  Queues generally implemened in a ring buffer.
Also consider deques, not in CLR.

LINKED LISTS
~~~~~~~~~~~~
Remember that you can make a ring buffer with a sentinel between
the beginning and end, which is sometimes convenient.  Consider the merits
of singly vs. doubly linked lists.  Insertion and deletion are cheap;
indexing is expensive.

Skip lists:

    The bottom layer is an ordinary linked list.  Each higher layer
    contains an element with some probability, often 1/2 or 1/4.
    The expected search time is O(log N); P can be varied to trade
    storage space against search time.

    They have properties comparable to trees with random insertions, but
    without requiring insertions to be random.

    The best place to start a search is the level where we expect 1/p
    nodes, or log_1/p(n).  However, starting at the highest level is
    also generally OK and adds only a small constant factor.

    It is also suggested that the level of a new node could be constrained
    to be at most 1 higher than the existing maximum level, which works
    well in practice but is difficult to analyze.

    The absolute fastest search is with p=1/e, but it is only 94% of the 1/2
    and 1/4 cases.  1/4 is just as fast on average as 1/2, with less storage,
    but the variability is higher.

    See ftp://ftp.cs.umd.edu/pub/skipLists/skiplists.pdf

DISJOINT SETS
~~~~~~~~~~~~~
CLR page 440.  Used in Kruskal's algorithm for minimum spanning trees.
The set has a representative item, which may be arbitrary or may have a
specific algorithm for its selection.  The operations make a set from a
singleton, unite sets, and find the set containing an item.

Linked List Representation:

    Representative is first object.  Each object contains a pointer
    back to the representative.  Thus find-set is trivial because it
    gets the representative from the object (note that you must have
    the set object, not a copy of the key).  Union then just appends
    one linked list to the other, but it must update the representative
    pointers, taking O(m) time.  As a heuristic for a minor improvement,
    always add the smaller set to the larger.  For m operations, of
    which n are make-set, this gives O(m+n lg n); CLR 445.

Forest Representation:

    Each tree is a set.  Each member points only to parent, not to root
    of tree.  Find-set then requires walking up the tree to the root.
    Union points the root of one tree at the root of the other tree (they're
    not binary trees).  This by itself is not necessarily an improvement,
    as no assertions about favorable height of trees can be made; they
    could be degenerate (linked lists).

  Union by Rank:

    Make the root of the tree with fewer nodes point at the root of the
    tree with more nodes instead of the other way around (CLR 447).  To do
    this efficiently, we must maintain the rank as we go instead of trying
    to compute it when doing the union.
    
    The rank is an upper bound on the height, but may conceivably
    overestimate it.  If the ranks of the trees being joined are exactly
    equal, the rank of the new root will go up by 1.

  Path Compression:

    When performing a find, update all nodes on the path to point directly
    to the root.  Ranks records are unchanged, and thus could conceivably
    become an overestimate.  The compression is a two-pass procedure.

  Running Time:

    With both heuristics, O(m alpha(m,n)) for m operations on n elements.
    Alpha is the Ackermann function inverse, <= 4.  Proof is difficult.
    CLR p 453 presents a weaker proof.

HASH TABLES
~~~~~~~~~~~
Overview:

    O(1) if you get it right, but of course O(n) if your hash sucks and
    everything goes in one bin.  In Chaining, singly linked list is used to
    store multiple elements in one bin.  A doubly-linked list makes deletion
    cheap if you already have a pointer to the element.

    Consider a table with n elements and m slots; the hash factor alpha
    is n/m.  An unsuccessful search is then theta(1+alpha) on average.

    Alternatively, you can store everything in the table itself, by moving
    onward when you have a collision (linearly, quadratic, double
    hashing, etc).  Note that special behavior is required on deletion
    so that subsequent values are not hidden by the empty slot stopping the
    search.  CLR prefers double-hashing, which generates the most sequences,
    and describes it on page 235.

Hash Functions:

    CLR has a very short discussion of hash functions on page 226.

    It is common to interpret the key as a natural number, such as by taking
    a string as a number in some radix.  One simple approach is dividing this
    number by a prime which is not too close to a power of 2 (CLR 228) and
    taking the remainder.  The second is to multipley by a floating point
    number A and then multiply by an integer m, often a power of 2.  Without
    explanation, (sqrt(5)-1)/2 is suggested for A, with a reference to Knuth.

    For double hash functions to support open addressing, an example is:
	m is prime, m' is slightly less than m
	h1(k) = k mod m
	h2(k) = 1 + (k mod m')
	h(k,i) = (h1(k) + ih2(k)) mod m

Birthday Problem:

    As a very rough approximation, there is ~50% probability of a collision
    once there are 2^(N/2) things in an N bit hash.

    A usable approximation is
    
	p(n) = 1 - e^(-n^2/(2*size))

	For n=2^32 and size=2^64, this gives p=1-e^(-.5) = 0.4

    Working this the other way around, 0.5 = e^(-0.6931), or
	n^2 = 1.39 * size
    so for a 64-bit hash, 2^(32.24) things result in 50% collision probability.
    http://en.wikipedia.org/wiki/Birthday_problem has several other formulae

Consistent Hashing:

    From the paper "Consistent Hashing and Random Trees".  Assume that each
    person can see at least a constant fraction of the buckets which exist.
    The desired properties are balance (even assignment of items to visible
    buckets), monotonicity (when new buckets become visible, things may move
    to the new ones, but not one of the old ones), spread (the number of
    different buckets to which something gets assigned should be small),
    and load (the buckets should all have the same number of items).

    Suppose rb and rx are random functions.  rb maps buckets to the unit
    interval and rx maps items.  fv(i) is the bucket b in V which minimizes
    the distance between rb(b) and rx(i).  If there are at most C buckets
    across all views, each bucket is actually replicated k*log(C) times, which
    are then randomly mapped with rb().

    A simple implementation uses a binary search tree to store the
    correspondence between unit interval segments and buckets.  There are
    k*C*log(C) intervals, so the depth is log(C); inserting or removing k
    points takes log^2(C).

    To get the hash computation time down to O(1), divide the interval into
    k*C*log(C) equal length segments, and keep a separate tree for each,
    with expected size O(1).  Since these must shrink as buckets are added,
    use intervals of length 1/2^x <= 1/(k*C*log(C), choosing the largest x.
    Then divide them in half.  It is also important that the expected length
    of a run of empty intervals is small, since they must all be updated when
    a bucket is added.

==============================================================================
				    HEAPS
==============================================================================

CLR page 401:
Procedure	Binary Heap	Binomial Heap	Fibonacci Heap
		(worst)		(worst)		(amortized)

Make-Heap	1		1		1
Insert		lg n		lg n		1
Minimum		1		lg n		1
Extract-Min	lg n		lg n		lg n
Union		n		lg n		1
Decrease-Key	lg n		lg n		1
Delete		lg n		lg n		lg n

BINARY HEAPS
~~~~~~~~~~~~
    The heap is organized such that each child is less than its parent.
    It can be stored in an array, with the first row being element 1,
    the second row being elements 2 to 3, etc.  Thus, the left side must
    be populated first.

    Generally, the heap sort is dominated by the quick sort, but heaps
    are more useful as priority queues.
    
    CLR chapter 7.

Heapify:

    Input is a location and heap.  Assumes children are proper heaps, but
    item itself might be too small.

    O(n lg(n))  (from height of heap)

  Algorithm:

    If there is a left child:
	Set largest= the larger of self and left child top
    If there is a right child:
	If top of right child > largest
	    largest=top of right child
    If largest is not self
	Exchange
	Recursively call on new largest
    
Build-Heap:

    Repeatedly calls heapify to make a heap from an unordered array.
    Start at the element which is half the length, and call heapify
    on everything from there decreasing to the first element.

    Running time is O(n), proven by showing that the running time
    of heapify is lower at nodes close to the bottom.

    CLR page 145

Heap Sort:

    O (n lg n)
    
    Build the heap
    For i in last-element to element 2:
	exchange first and ith element (puts largest element last)
	Decrease global heap size by 1
	heapify starting at first element (puts new largest element first)

Heap Insert:

    O(lg n)

    Put new element in at end.  Bubble it up until it is larger than its
    children.

  More formally: 
    Increase length by 1
    i=heap size
    while i>1 and parent of element i < new value
	set element i = element i's parent
	i = parent(i)
    element i = new value

Heap Extract:

    O(lg n)

    Take off first element
    Move last element to front
    Rerun heapfify()

BINOMIAL HEAPS
~~~~~~~~~~~~~~
Binomial heaps are interesting when we need to take the union of them, which
can be done in lg(n) instead of n.  Otherwise, just use ordinary binary
heaps.  CLR chapter 20, page 400.

A binomial heap is made up of binomial trees.  There is at most one tree
of each degree.  The trees are heap-ordered - the key of a node is greater
than or equal to its parent. (isn't this the opposite convention from
that used in describing binary heaps?) See also section on binomial trees.

Representation: Keeps track of parent, left child, and right sibling, and notes
own degree.  Roots of trees are in singly linked list in increasing degree,
giving an alternative meaning to the sibling field.

Make-binomial-heap:
    Creates an empty heap with one null pointer.

Minimum:
    The minimum will be at the root of a tree, so just search the initial
    linked list, which will be of length order lg(n).

Binomial-Link (child y,parent z):
    Set y'z parent to z
    Set y's sibling to z's child
    Set z's child to y
    Increment z's degree

Union (a, b):
    We will first merge the root lists.  This could leave as many as two
    trees of each degree, and we need at most one.  Thus, for each two that
    are the same degree, we merge them into one of the next higher degree,
    working our way upward.  It gets a little complicated because after
    joining two trees, we might temporarily have THREE of the same degree,
    and we have to leave the first as-is and merge the last two.

    e.g. 1,1,2,2,3 -> 2,2,2,3 -> 2,3,3 -> 2,4

    Time proportional to number of trees, or lg(n)

  Details:
    Create a new heap H.
    Merge a and b into H, joining root lists in monotonically inc. degree order
    if H is empty, return

    Set prevx to null
    set x to head of H
    set nextx to x's sibling

    while nextx:
	if degree(x) != degree(nextx) ;1 of this degree
	OR (sibling(nextx) exists AND degree(sibling(nextx))==degree(x)) ;3
	    ; Don't need to merge.  Advance
	    prevx = x
	    x = nextx
	else if key(x) <= key(nextx)
	    ; make nextx a child of x
	    sibling(x)=sibling(nextx)
	    binomial-link(nextx, x)
	else
	    if not prevx:
		head = nextx
	    else
		sibling(prevx)=nextx
	    binomial-link(x, nextx)
	    x=nextx

	nextx = sibling(x)

Insert:
    Make a new heap to contain the element.
    Set its parent, child, and sibling to null
    Set its degree to 0
    Union(existing heap, new one element heap)

Extract-min:
    Find the tree with the smallest root
    Make a new heap out of the remainder of this tree:
	The linked list will be the root's children in reverse order
    Unite the old heap (less the removed tree) and the new heap.

    (CLR page 413, with illustrative figure)
    All O(lg n), proportional to number of trees

Decrease-Key:
    Error if key has gone up
    Otherwise, bubble node up until it is not smaller than parent

Delete:
    Decrease key to -infinity
    Extract minimum (cute eh?)

FIBONACCI HEAPS
~~~~~~~~~~~~~~~
Insert, minimum, and union are O(1).  Extract-min and delete are O(lg n).
Decrease-key is O(1) amortized, significant as a basis in graph algorithms
(see for example Dijkstra's algorithm).  CLR chapter 21, p420.

...

==============================================================================
				    TREES
==============================================================================

Most operations are O(h), where h is height.  If the tree is balanced,
this is lg(n), but it can be n if the tree is degenerate (think of
inserting in sorted order).

BINARY TREES
~~~~~~~~~~~~
Consider how to store equal values - CLR lets them be the left or right child.
If you do not store pointers to parent, successor (and thus deletion) may be
problematic.

Recursive search is trivial; so are min and max.

Search:
    If equal, return
    If empty, return not found
    If value less than current node, search left child
    Otherwise search right child

Successor:
    If right child is not empty, return its minimum.
    Otherwise, move to parent until you find a parent to whom you are the
    left child instead of the right.

    O(h).  Requires ability to move up to parent.

Insertion:
    Like a search, but when you get to the bottom, you insert a new node.

Deletion:
    One of the more complex operations.  CLR page 251.

  Algorithm:
    Node z is to be removed; p is its parent.

    If z has no children:
	Modify p to replace z with a null
    If z has one child:
	Make this child a direct child of p
    If z has two children:
	Take z's successor y (which has no left child - this is always true)
	Remove y
	Replace z with y

RED-BLACK TREES
~~~~~~~~~~~~~~~
Stores one bit of color on each node.  Assurance is that no path is more
than twice as long as any other.

Coloring Invariants:
    Every node is either red or black
    Every leaf is black and is NIL
    If a node is red, both its children are black
    Every simple path to a leaf contains the same number of black nodes
    (some definitions also give the root as always black.  CLR does not.)

    Max height 2*lg(n+1); proof CLR page 264.  General idea is that you
    cannot have two red nodes in a row, so the shortest is all black and
    the longest alternates.

Other variants exist in which leaves are not always NIL.  This changes some
of the fundamentals and is considered more complex.  There are also variants
in which the edges are spoken of as colored instead of notes.  Analgous
to B-trees of order 4 (discussed on wp article)

Basic operation for restoring red-black property after insertions and
deletions is "rotation":

       y                       x
     x/ \g       <------>    a/ \y
   a/ \b                       b/ \g

Left-Rotate:
    y = right(x)
    right(x) = left(y)
    if left(y)
	parent(left(y))=x
    parent(y)=parent(x)
    if x has no parent:
	root=y
    else
	if x is its parent's left child
	    left(parent(x))=y
	else
	    right(parent(x))=y
    left(y)=x
    parent(x)=y

The basic idea is to insert and delete as in a normal binary tree, but then
fix up the tree afterwards.

Insert:
    Insert new node x as in binomial tree
    Color x red
    While x is not the root and x's parent is red:
	If x's parent is the left child of its parent:
	    y = x's parent's right child
	    if y is red:
		Color x's parent black
		Color y black
		Color x's grandparent red
		Move x up two levels (to its current grandparent)
	    else
		if x is the right child of its parent:
		    x = x's parent
		    left-rotate at x
		color x's parent black
		color x's grandparent red
		right-rotate at x's grandparent
	Else:
	    (exactly the same thing, mirrored)
    Color root black.

    O(lg(n)) CLR 268

Delete:
    Keep in mind that on the binary tree, leaves have null pointers to their
    children, but in the red-black tree, the leaves are special nil
    sentinels colored black.

    The transcription of the algorithm here is omitted for reasons of time
    and space.  It is easy to look up online or in standard references.

Augmentations for Order:

    Augmentations to find ith largest element, and the rank of an element you
    have, are presented in CLR chatper 15 (page 281).  Augment the data by
    including the number of nodes in the subtree.  Finding the ith largest is
    obvious.  To find the rank of an element you have, move up, adding the
    value of the left child each time you are a right child of your parent.
    The information can be maintained without increasing the order of the
    updating algorithms.

    Add 1 if you want to count starting at 1.

INTERVAL TREES
~~~~~~~~~~~~~~

CLR describes augmenting an RB tree (or presumably a normal BST) for
intervals:

    Store both endpoints of interval in node; order by low endpoint.  Augment
    by storing maximum high endpoint of any entry in subtree.  When
    searching, go left if there is a left child and the maximum of all
    things in the left subtree is >= the left end of the search interval.
    CLR 290

    This finds ONE interval which overlaps, not ALL, and is thus not the
    most useful.

For a completely different take on interval trees, see
http://www.dgp.toronto.edu/people/JamesStewart/378notes/22intervals/

    The approach here is to break up the space into "elementary" intervals,
    which have no interval endpoints within them (thus each new interval
    creates two new breaks, if the endpoints don't match any other).
    Notionally, each leaf contains one elementary interval and is annotated
    with the intervals that contain it, but this is inefficient.  Instead,
    think of internal nodes as "spanning" the union of all their leaves.
    Store an interval in the highest internal node which is completely
    inside it; each interval can be stored at most twice at each level.
    To query all intervals which overlap a number k, simply walk down to the
    leaf containing k, accumulating all the intervals encountered along
    the way.

B-TREES
~~~~~~~
Designed for use on disk or other slow-access data storage media.  B-trees
can have a branching factor greater than 2; they are not restricted to be
binary.  Each node will store one fewer value than the number of its
children, and are commonly sized to match a disk page.  CLR chapter 19, page
381.

Properties:

    Every node has the number of keys k, the keys, and a flag for being a leaf.
    Internal nodes contain k+1 pointers to children
    The keys separate the groups of children
    Every leaf has the same depth
    Let t be the minimum degree.  Nodes must have t-1 to 2t-1 keys inclusive.
	The root is exempt from the minimum.

Search:
    Straightforward generalization of tree search.  CLR presents search
    of keys as linear, not binary.  Page 388.

Create:
    Create a node.  Set it as leaf.  Set its number of keys to 0.  Write.

Split-Child:
    Inputs: A nonfull internal node x, and index i, and a full node y which
    is the ith child of x.  If y is the root, the tree grows in depth.

    Move median key from full node to parent.  Move second half of keys
    (and children if applicable) to new node.  Insert old median key
    and new node as child after it into parent.  CLR page 390.

Insert:
    This one-pass algorithm splits all full nodes when descending through them,
    so that we do not have to backtrack later.  CLR page 392.

  Algorithm:
    If root is full, allocate new root and make existing root its only child.
    Then call split-child on it.

    If we are at a leaf:
	Insert the new key, shifting the others over.  Return.
    If not:
	Descend to the appropriate child
	If it is full, split it immediately and move to appropriate new node
	Recursively call insert at new location

Delete:
    This one-pass algorithm makes sure all nodes have degree at least t
    (one more than the min) when going through them, to reduce backtracking
    (but it cannot be completely eliminated).  CLR page 395.

    Arguments: key k to delete, node x (initially the root)
  Case 1:
    If key k is in x, and x is a leaf, delete it.  Done.
  Case 2:
    If key k is in x, and x is an internal node:
   2a: If x's child y, preceding key k, has at least t keys, find the
       predecessor k' of k in tree at y.  Recursively delete it, and
       replace k with k'.
   2b: If 2a is not met because there aren't enough keys in y, look
       at the child on the other side (call it z )instead.
   2c: If they both have only t-1 keys, merge k and all of z into y.
       Delete k and pointer to z from x.  Free z and recursively
       delete k from y.
  Case 3:
    If key k is not in x:
	Find the child which must contain k, assuming k exists.
   3a: If this child has only t-1 keys, but a sibling has t or more,
       shift a key downward from x into the child and replace it with
       one from its sibling.
   3b: If this child and all its siblings have only t-1 keys, merge the
       child with a sibling, moving a key down from x to be the new median.
    Then recursively delete on the new child.

BINOMIAL TREES
~~~~~~~~~~~~~~
For a tree of order 0, there is one node.
For a tree of order k>0, its k-1 children are trees of orders 0 through k-1.
The number of nodes must be a power of 2.
CLR page 401, but I found its explanation somewhat opaque.

HIGHER DIMENSIONAL TREES
~~~~~~~~~~~~~~~~~~~~~~~~
For quadtrees, octrees, and R-trees, see the separate section on geometric
algorithms.

==============================================================================
				    GRAPHS
==============================================================================

TERMINOLOGY AND REPRESENTATION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Let the graph G have set V of vertices and set E of edges.
Standard representations are adjacency lists or adjacency matrix.  The
list is generally preferred for sparse graphs.  For undirected graphs,
you can store only the upper or lower diagonal of the matrix, and for
unweighted graphs, only one bit per entry is required.

BREADTH FIRST SEARCH
~~~~~~~~~~~~~~~~~~~~
    Works on directed and undirected graphs which are unweighted.  Produces a
    distance to each vertex and a tree with the shortest path.  Uses a FIFO to
    manage vertices.  In this algorithm, the tree is implicitly created from the
    list of parents.  Running time is O(V+E).  CLR page 470.

  Algorithm:

    Color all vertices white.
    Set distance to all vertices to infinity.
    Set parent of each vertex to nil.

    Put source vertex s at root of tree.
    Insert s into queue.

    While queue is not empty:
      Take vertex u from head of queue
      For each vertex v adjaced to u:
	if v is white:
	  color v grey
	  distance[v] = distance[u]+1
	  parent[v] = u

      Pop u from the queue
      Color u black

DEPTH FIRST SEARCH
~~~~~~~~~~~~~~~~~~
Multiple trees may be found (a forest).  Each node gets two
timestamps; when it is greyed and when it is blacked.  Time is
a global variable.  Running time O(V+E).  CLR page 478.

DFS:
    Color all vertices white; set predecessor to nil

    For each vertex:
	If vertex is white:
	    DFS-visit(vertex)

DFS-visit (vertex u):
    Color u grey
    set timestamp on disovered_time(u)
    Increment global time
    For each v adjacent to u:
	If v is white:
	    set parent(v)=u
	    DFS-visit(v)
    color u black
    set timestamp on finished_time(u)

Classification of Edges:

    Tree edges are edges in the forst (white when found)
    Back edges are those connecting to an ancestor (gray when found)
    Forward edges are nontree edges to a descendent (black when found)
    Cross edges are all others (black when found)
    
    For undirected graph, convention is to pick first from list which applies.
    CLR page 482

Topological Sort:

    Sorts a DAG such that if there is an edge (u,v), u appears before v.
    Of course this does not work if it is cyclic.  CLR 486.

  Algorithm:
    Perform a depth-first search to get finishing times
    As each vertex is finished, insert it into the front of a linked list

Strongly Connected Components:

    A strongly connected component is one in which each vertex is
    reachable from the others.  Thus this is applicable only to
    directed, cyclic graphs.  We will need the transpose of the graph,
    which is obtained by reversing the direction of all edges.  Proof
    is nonobvious; CLR page 489.

  Algorithm:
    Call DFS on original graph, getting finishing times
    Compute transpose of graph
    Call variant DFS on transpose, visiting nodes in order of
	decreasing finishing time from original DFS
    Output vertices of each tree in this search as strongly con. component

MINIMUM SPANNING TREES
~~~~~~~~~~~~~~~~~~~~~~
The minimum spanning tree is the set of edges which reach all nodes at minimum
cost.  It is always acyclic.  Greedy algorithms are optimal here.  CLR 499; HL
363.

Trivial "generic" algorithm framework:
    Set A to null set
    While A is not a minimum spanning tree:
	Find an edge (u,v) that is "safe" (in tree) and add it to A
    Return A

Selecting an edge:
    Cut the graph in a way that does not cut an edge in A ("respects A").  The
    lowest-cost edge crossing the cut is safe to add to A.

    Proof CLR page 501, not entirely intuitive to me.

Kruskal's Algorithm:
    A is a forest.  Algorithm requires selection of a disjoint-set data
    structure.  CLR 504.  O(E lg E) with correct set representation.

  Algorithm:
    Set A to null set
    Make a one-item set of each vertex
    Sort edges by weight
    For each edge (u,v) in order of nondecreasing weight:
	If u and v are not in the same set:
	    Join the set with u and the set with v
	    Add (u,v) to A

Prim's Algorithm:
    A is maintained as a single tree.  A priority queue is used to store
    vertices to add, and gives an example of why we care about a
    decrease-key operation in heaps.  O (E lg V), improvable to O(E + V lg
    V) with Fibonacci heaps.  CLR 509; HL 363.  The HL treatment is
    graphically intuitive but does not address the details of the required
    data structures and the performance of various choices.

  Algorithm:
    Arbitrary starting node argument r
    Insert all vertices into queue, with infinite key
    Set r's key in queue to 0
    Set r's parent to nil
    While queue is not empty:
	Extract minimum value from queue, call it u
	For each v adjacent to u:
	    if v is in the queue and weight (u,v) is less than v's key:
		Decrease v's key in the queue to weight(u,v)
		set parent of v to u

SINGLE SOURCE SHORTEST PATH
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some similarities to breadth-first search.  Also, maintain similar
list of parents/predecessors.  Shortest paths may not be unique.

If one cares about only a single destination, one can stop the algorithm
when it is reached, but it is not possible to do asymptotically better
by looking for only one destination than for enumerating all the nodes.
Some algorithms allow negative weights as long as there are no negative-weight
cycles reachable from the starting node.

HL 359 (cursory), CLR chapter 25, p514.

We will maintain a bound on the shortest-path weight to each node.  We will
then relax it (reduce it) if we can reach it with lower cost from one of the
adjacent nodes based on *that* node's estimate.

Dijkstra's Algorithm:
    Requires all weights nonnegative.
    Will maintain set S of vertices whose final weights are determined and
    a priority queue Q that contains all vertices NOT in S.  CLR 527.

    Queue in linear array:   O(V^2+E) = O(V^2)
    Queue in binary heap:    O(E lg V)
    Queue in Fibonacci heap: O(V lg V + E)
    The historical development of the Fibonacci heap was motivated by this
    algorithm, with far more decrease-key than extract-min calls.

  Algorithm:
    Set all distances infinite; all parents nil.
    Set S to empty set.
    Put all vertices in Q, prioritized on distance

    While Q is not empty:
	Remove minimum from Q and store it in u
	Add u to set S
	For each vertex v adjacent to u:
	    if distance(v) > distance(u) + weight(u,v):
		parent(v)=u
		distance(v) = distance(u) + weight(u,v)

Bellman-Ford Algorithm:
    Allows negative weights; can detect negative-weight cycle.  CLR 532;
    note their loop variable i is unused.  Generally slower O(VE)

  Algorithm:
    Set all distances infinite; all parents nil.

    Repeat |V|-1 times: (relaxation loop)
	For each edge (u,v):
	    if distance(v) > distance(u) + weight(u,v):
		parent(v)=u
		distance(v) = distance(u) + weight(u,v)

    For each edge (u,v): (cycle detection loop)
	if distance(v) > distance(u) + weight(u,v):
	    throw negative cycle exception
    
Shortest Path in DAGs:
    Can be significantly optimized by knowing it is acyclic.  CLR 536.

  Algorithm:
    Topologically sort vertices
    Set all distances infinite; all parents nil.
    For each vertex u in topological order:
	For each vertex v adjacent to u:
	    if distance(v) > distance(u) + weight(u,v):
		parent(v)=u
		distance(v) = distance(u) + weight(u,v)

SINGLE PAIR SHORTEST PATH
~~~~~~~~~~~~~~~~~~~~~~~~~
These algorithms take advantage of the fact that there is a single destination
to try to search the most promising routes first.  They are not addressed
in CLR or HL, but see also dynamic programming techniques.

Best-First Search (A*, A-Star):
    We need an admissible (does not overestimate) heuristic.  In geometric
    problems, this can be the great-circle distance.  The algorithm can be
    seen as a version of Dijkstra's algorithm with an added heuristic to
    prune dominated possibilities.  It can also be seen as the most general
    search, with breadth-first, depth-first, and Dijkstra's algorithm all
    being special cases with certain heuristics.

    Most significant drawback can be memory usage.
    
    Use of "closed list" is optional if a solution is guaranteed to exist or
    if there is another way of rejecting cycles.  If it is used, the graph
    must be monotonic (e.g.  A->B->C costs no less than A->C) - this is
    obviously satisfied for map routes but might not be the case for
    arbitrary graphs.

    G cost: Actual shortest distance from source to node
    H cost: heuristic (low estimate) cost from node to destination
    F cost: G+H

  Algorithm (http://en.wikipedia.org/wiki/A*_search_algorithm):
    Set closed list to empty set
    Set open list to source node
    g(source)=0

    while open list is not empty:
	set x to node with lowest F score
	if x is dest:
	    return
	remove x from open list
	add x to closed list
	for each y which is a neighbor of x and is not in closed list:
	    temp_g_score = g(x) + weight(x,y)
	    temp_better = 0
	    if y is not in open list:
		insert y into open list
		calculate heuristic h(y)
		temp_better = 1
	    else if temp_g_score < g(y):
		temp_better = 1

	    if temp_better:
		parent(y) = x
		g(y) = temp_g
		f(y) = g(y) + h(y)

  Algorithm (http://www.autonlab.org/tutorials/astar.html):
    Set priority queue empty
    Set visited list empty
    Put source into queue with priority F
    While priority queue is not empty:
	Remove node with lowest F.  Call it n
	If n is the destination, return success
	For each m that is a successor of (adjacent to) n:
	    f(m)=g(m)+h(m) = g(n)+weight(n,m)+h(m)
	    if m has not been seen
	     OR m has lower F now than from before
	     OR m is in the priority queue with heigher F:
		Put m in priority queue with f(m)
		Add/update visited list to include m,f(m),parent(m)=n
	    else take no action on m
    If queue is empty and we have not returned already, fail!


ALL PAIRS SHORTEST PATH
~~~~~~~~~~~~~~~~~~~~~~~
It is possible to do fundamentally better than running a single-source
algorithm for every vertex.  CLR chapter 26, page 550.

Matrix Algorithm Concept:
    First construct all shortest paths of length 1.  Then recursively add 1
    to the maximum number of edges and recalculate until reaching |V|-1.

Extend-shortest-path:
    Input matrix D distances, W weights
    n = rows(D)
    D' = new n by n matrix
    for i in 1 to n:
	for j in 1 to n:
	    d'(i,j) = infinity
		for k = 1 to n
		    d'(i,j) = min(d'(i,j), d(i,k)+w(k,j))

Naive-all-pairs-shortest-path:
    Start with D=W
    Call extend on D,W n-2 times.

To do better, instead of processing by W each time, we can square W,
then square that, etc, until reaching a power of 2 which is equal
to or greater than |V|-1.  We thus get a runtime of O(V^3 lg V)
    
Faster-all-pairs-shortest-path:
    Start with D=W
    for(m=1; m*=2; m < n-1):
	Dnew = extend-shortest-path(D,D) ; note D,D *not* D,W

Floyd-Warshall:
    Achives runtime of O(V^3).  Basic concept is to construct
    shortest paths using vertices {1..k}, and then extend to k+1.

    d(0,i,j) = w (i,j)
    d(k,i,j) = min(d(k-1,i,j), d(k-1, i, j) + d(k-1, k, j)

    It remains to actually construct the paths in addition to calculating
    the distances.  One way is to calculate a matrix of predecessor vertices
    on the shortest path.  It is updated by replacing a value with k when
    the corresponding min distance changes.

    CLR page 558.  A variant for calculating transitive closures only
    (reachable nodes) is given on 562.

Johnson's:
    Algorithm for sparse graphs o(V^2 lg V + VE), so faster if E is
    smaller than V^2.  Uses both Bellman-Ford and Dijkstra as components;
    see CLR 569.

MAXIMUM FLOW
~~~~~~~~~~~~

...

==============================================================================
			     GEOMETRIC ALGORITHMS
==============================================================================

LINE SEGMENTS
~~~~~~~~~~~~~

Cross Product:

    Remember it can be found as the determinant of the matrix:

	[x1   x2]
	[y1   y2]  = x1y2-x2y1

Direction of Turn:

    Take a line segment p0-p1 and p0-p2.  Translate p0 to the origin.
    Then calculate the cross product of p1 and p2.  If it is positive, 
    p0-p1 is clockwise from p0-p2.

Two Line Segment Intersection:

    First, check whether bounding boxes intersect:

	max(x1,x2) >= min(x3,x4) AND
	max(x3,x4) >= min(x1,x2) AND
	max(y1,y2) >= min(y3,y4) AND
	max(y3,y4) >= min(y1,y2)

    Then, if this is true, check line segments for straddling each other
    with cross products.  They intersect if p1-p3 and p1-p4 have different
    orientations relative to p1-p2:

	sign((p3-p1) X (p2-p1)) != sign((p4-p1) X (p2-p1))

    If either of the cross products are exactly zero, the endpoint of
    one is on the other.  If they're both zero, they are colinear.
    CLR 888
	
Many Line Segment Intersection:

    Do any to segments in an arbitrarily large set intersect?  O(n lg n)
    Simplified algorithm assumes no vertical segments and no three
    segments intersecting at one point.  CLR 892.

    The idea is to move a "sweep line" left to right (or any other
    direction).  Each time an endpoint is encountered, check to see whether
    any of the segments which are intersected by the sweep linehave changed
    order relative to each other.  If so, they crossed.

CONVEX HULLS
~~~~~~~~~~~~

Graham's Scan:

    O(n lg n).  CLR 900.  Uses a stack to keep track of verticies which
    are tentatively on the convex hull.

    Start with p0 as the bottommost point (or leftmost, etc)
    Sort points p1 through pn by angle to p0 in clockwise order
    Push p0, p1, p2 onto stack
    for i in 3 to n:
	while the angle from 2nd-from-top, top-in-stack, and pi is nonleft:
	    pop from stack
	push pi onto stack
    return points in stack

Jarvis' March:

    O(nh) where h is the number of points on the hull.

    Start with lowest point.  Calculate the polar angle of all the other
    points to this one.  Pick the one with the lowest value.  Then,
    recalculate with regard to that point.  Repeat until getting back
    to the original one.

    Envision it as sweeping an scan arm in until it hits a point each time,
    then moving to that point and repeating.  CLR 906 advocates an
    approach of keeping a separate left and right side for simplicity in
    calculating angles.

2D POINTS
~~~~~~~~~
Closest Pair of Points:

    CLR 908


LOCALITY SENSITIVE HASHING
~~~~~~~~~~~~~~~~~~~~~~~~~~
Probabilistic assignment of similar items to the same buckets.  It is
primarily used in high-dimensional data; it is asserted that by 10-20
dimensions, k-d trees and friends are searching over a significant fraction
of the space (Gionis, Indyk, Motwani 1999)

http://www.mit.edu/~andoni/LSH/

BSP (BINARY SPACE PARTITION TREE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Overview:

    Interior nodes represent splitting lines (or plans, hyperplanes, etc in
    higher dimensions), and also contain objects which are entirely within
    the splitting plane.  Leaf nodes are convex and contain 0 or 1 objects or
    object fragments (since objects can be split).  A BSP which is built only
    using splitting lines which are extensions of input segments (or planes
    which are extensions of input polygons) is called "auto-partition (BCKO
    262).

    This can be used with the painter's algorithm to select which things to
    paint (BCKO 263).  Other canonical uses include shadow generation,
    set operations on polyhedra, cell decomposition methods for motion
    planning, and as an index in GIS (BCKO 278 has references for each).
    Both K-D trees and quad/octrees are a special case of BSP trees.

In Practice:

    The size of the tree is O(n+2n*ln(n)) in 2D and O(n^2) in 3D space on
    disjoint triangles.  In practice, however, they perform well on low
    density data.  Thus, define the idea of density - the smallest lambda
    such that any ball B intersectis at most lambda objects who have a diameter
    greater than or equal to that of B (i.e. small objects are not counted).
    (BCKO 271).  However, the randomized auto-partition algorithm does not
    perform well in the low-density case.

    Define a "guard" as the corner of an axis-aligned square containing an
    object.  Split initially as if in a quad tree, but then if one quadrant
    contains too many guards, shrink towards a corner using a heuristic k.
    The details are given in BCKO.  For disjoint line segments in the plane,
    there is a tree of size O( log(lambda)), where lambda is the density.
    For triangles in 3D, this is O(n lambda), which varies from n to n^2
    as lambda varies from 1 to n.

K-D Trees
~~~~~~~~~
K-D trees are a special case of BSP trees.  Each node is a point in
K-dimensional space, and each cutting hyperplane is in one of the K dimensions.

Basic Case:

    Assume that no two points exactly match in x or y.  Split first on x,
    then split each of these on y, then on x, etc.  Split on the median
    point in the relevant dimension.  Finding the median is the most
    expensive part; presort the points in both x and y to facilitate this.

    To query for points within a rectangle, recurse down the tree,
    following only children which intersect the rectangle, checking the
    points in regions which are not entirely within the rectangle, and taking
    the union of their results.  For m answers, the time to do the
    query is O(sqrt(n) + m) (BCKO 104).

QUADTREES
~~~~~~~~~
Quadtrees are also a special case of BSP trees.

Basic Version for Points:

    Split until each square contains at most one point.  Make an arbitrary
    decision about <= vs. >=.  It is possible to store the size of each
    square at the nodes, or recompute it while walking the tree. (BCKO 309)

    The tree can be quite unbalanced, and the height is not a trivial function
    of the number of points.  However, it is at most log(s/c) + 3/2, where c
    is the smallest distance between any two points and s is the side length
    of the initial square (BCKO 311)

Finding Neighbors:

    Let's say we want to find the north neighbor.  As long as we are
    NOT one of the south neighbors of our parent, move up.  Then report
    the northern neighbor from us.  If we want a leaf, we have to them
    walk back down, always choosing the souther neighbor.  If we hit the
    root while going up, we are at an edge and there is no neighbor in
    the desired direction. (BCKO 313)

Balanced Quadtree:

    Take the quadtree as balanced if any two neighboring squares differ by
    at most 2x in size.  To balance a tree, put all the leaves into a list.
    For each leaf, find neighbors in all four directions and see if they
    are to small.  If so, the leaf has to be split.  This might cause
    its neighbors that now need to be split, so insert them into the list if
    so.  The balanced version does not have asymptotically more nodes, and
    can be constructed in O((d+1)m) time (BCKO 315).

More Information:

    BCKO 315 talks about mesh generation from quadtrees.
    S. Aluru, "Quadtrees and Octrees" discusses compression.
    Eppstein et al "The skip quadtree" combines with skiplists for 
	efficient insertion and removal.


R-TREES
~~~~~~~
Source for this section is original paper:
http://www.cs.fiu.edu/~vagelis/classes/COP6727/publications/gutman-rtree.pdf
"R-Trees: A Dynamic Index Structure for Sptial Searching" by Antonin Guttman


Overview:

    The R stands for Rectangle, which is used as a bounding box.  R-trees are
    designed for paged storage, and resemble B-trees in this regard.  Each node
    has a maximum number of entries.  Each entry in a non-leaf node stores
    pointers to children and the bounding box of each child.  Individual
    entries can have nonzero size - they are rectangles, not points.

    Each node has a maximum of M entries and a minimum of M/2 or less.  The
    root is exempt from the minimum.

Search:

    The input to search is a query box.  Start at the root, and move down
    into those children which overlap with the query box.

    Because more than one subtree may need to be visited, it is not possible
    to guarantee good worst-case performance.

Insertion:

    Start at the root.  Choose the rectangle which requires the least
    enlargement to hold the new entry.  At the leaf, insert if the leaf
    is not full, or split otherwise.

Adjust:
    
    Start at a leaf N.  If the leaf was just split, save the second one.  Move
    up to the parent and adjust the coverage rectangle to reflect  If there
    is a second node NN saved from the split, try to add it to the parent as
    well.  If the parent is now overfull, split the parent as well and repeat.

Splitting:

    The M+1 items must be divided into 2 new nodes.  It is desirable to
    divide them such that the overall area is lowest, but it is prohibitive
    to exhaustively check all combinations.

    Several algorithms are described in the paper, both linear and quadratic
    with respect to M; the conclusions say the linear algorithm is good
    enough, and the slightly worse quality of the splits is not significant.
    For this reason, I will describe only the linear algorithm.

    Along each dimension, find the rectangles with the highest low side and
    the lowest high side; record the separation.  Normalize by the size of
    the entire set in that dimension.  Choose the most extreme pair in any
    dimension.  These are the two "seeds".  Then, choose any of the remaining
    entries, and add it to the set which is enlarged the least to include it.
    If you get to a point where all entries are done, or where one set must
    have the remainder of the available entries assigned to it to hit the
    minimum, do so and stop.

==============================================================================
			     DYNAMIC PROGRAMMING
==============================================================================

Overview:

    Dynamic programming is a recursive technique in which a subproblem can be
    solved and then used to solve the larger problem.  It is applicable to
    problems like single pair shortest path (HL 425) or ordering matrix
    multiplication (CLR 302).  Traditionally, the dynamic programming problem
    is computed bottom-up instead of actually recursing.  In the matrix case,
    compute all the subsequences of length 2, then use those to compute all the
    length 3, and so on.

    HL suggests that dynamic programming is applicable where the problem has
    multiple stages with a state and a policy decision, the costs of each stage
    of the problem is independent of the others, and there is a recursive
    relationship to find the cost of stage n+1 using the cost of stage n (HL
    430).  CLR puts the same thing a bit differently, referring to optimal
    substructure and overlapping subproblems (CLR 309).  The latter corresponds
    to a recursive solution needing to solve the same subproblems repeatedly.

    "Memoization" - basically caching the results of the subproblems - provides
    much of the benefit while allowing for the traditional top-down approach
    instead of bottom-up.

Other Examples:

    CLR 314: longest common subsequence.  Define the subproblems in terms
    of sequence prefixes, of which there are only m*n for strings of length
    m and n.

    CLR 320: optimal polygon triangulation.

    HL 453: probabilistic dynamic programming with expected values
    
==============================================================================
			       LINEAR PROGRAMS
==============================================================================

CLR 539 (short).  Strang chapter 8.  HL in depth.

SIMPLEX METHOD
~~~~~~~~~~~~~~
Geometric Interpretation (HL 82):

    Start at origin
    Move to a better adject corner point feasible solution
	(select the direction based on greatest slope)
    When no adjacent CPF solutions are better, stop

Slack Variables (HL 87):

    Introduced in order to make all constraints an equality
    New problem is called the augmented form
    Negative slack variable indicates infeasible solution

    Basic solution: Augmented corner point solution
    Basic feasible solution: Augmented corner point feasible (CPF) solution

Algorithmic Interpretation:

    Number of basic variables is number of equations
    Non-basic variables are the others
    Set nonbasic variables to 0
    Solve for basic variables
    To look at next solution, change on basic variable

    Algorithm HL92; tabular form HL95.  Also Strang

==============================================================================
			      NONLINEAR PROGRAMS
==============================================================================

==============================================================================
			       INTEGER PROGRAMS
==============================================================================

==============================================================================
			      MATRIX ALGORITHMS
==============================================================================

LU DECOMPOSITION
~~~~~~~~~~~~~~~~

Use Gaussian elimination to turn original matrix A into an upper diagonal
matrix U.  The coefficients of the rows which were subtracted go into
L.  This gives A=LU.  

Symmetric matrices can be decomposed further into A=LDL' (' is transpose here).
Create D by factoring out a diagonal matrix from U such that the pivots are all
1.  What's left will be L'.

POSITIVE DEFINITE MATRICES
~~~~~~~~~~~~~~~~~~~~~~~~~~

Positive definite matrices indicate the presence of a minimum.

1.  A matrix with positive pivots - which require a positive diagonal
and positive determinant - is a positive definite matrix.

2.  A is positive definite if x'Ax=0 for all nonzero x.  Strang 18

==============================================================================
			      AUCTION ALGORITHMS
==============================================================================

... section pending ...

==============================================================================

Well, that was my file of notes on algorithms.  Parts of it may not be finished
yet.  It may not be very useful to you, because it is so abbreviated, but it is
useful to *me* because it is abbreviated in a way that corresponds well with
the way things are already stored in my memory.

Notation:  O() is used for an upper bound, Omega for a lower bound,
and Theta for both.  lg is a base-2 logarithm.

CLR:	Cormen, Leiserson, Rivest.  Introduction to Algorithms.  1st edition
HL:	Hillier and Lieberman.  Introduction to Operations Research.  6th ed
BCKO:	de Berg, Cheong, van Kreveld, Overmars, Computational Geometry, 3rd Ed
Strang:	Gilbert Strang, Introduction to Applied Mathematics

To print this file nicely:
    enscript -B -o alg.ps -1 algorithms.ref 
    psnup -c -4 -b-.25in -m.25in alg.ps alg2.ps
    lpr alg2.ps

 vim: ai