Wednesday, June 5, 2013

Binary Tree Cycle Detection

    Upon graduation a about a month ago, I began (not really, but began to me more intense about it) to search for a job in software development. Just today, during an interview at a local company, I received a technical question that stumped for a significantly more amount of time than I expected:

Are there any cycles in this tree?

     Now I looked up this question by chance before the interview and the Internet in general and and Stackoverflow in specific wasn't very helpful; either being obstinate in stating that a tree by definition cannot have a cycle or some destructive or overly complicated solution.

   Now when my interviewers referring to a binary-tree with a cycle, they were talking about something similar to figure 1 below. This tree have the normal 2 branch binary-tree structure, with 3 (possibly incorrect; I don't really care) branches that extend to previous nodes in the tree. 
Figure 1
    There are two one main problems that I had when thinking through how to solve this problem:
  1. We only have access to the root node of the tree and thus can't find the strongly connected components or some other algorithm that would find cycles in a graph.
    The simplest tree that would have a cycle of it is just the root with a self-pointing branch (figure 2), so that is where I began.

Figure2


   If we mark each node as we go along, when we reach that node again we believe there is a loop. This works for the very simple case above as well as any of the cycles shown below in figure 3.


Figure3
    The problem with this approach is the connections that aren't necessary cylces, but are possible with edges that point to non-descendant nodes (A simple example shown in Figure 4). At the current working state the method halts as show in Figure 5.
Figure 4



Figure 5
   In order to account for this type of branch, we need to unmark a node when we finish exploring all of its children, which at that point we know there isn't a cycle within that portion of the subtree we traversed, returning false only when we have explored every child (Figure 6 showing the updated method).

Figure 6
    The other way of completing this algorithm that doesn't require changing the actual tree structure itself is to keep a stack of visiting nodes. A node is popped off the stack when all it's children are visited (as above) and we know there is a cycle if there is ever a node that is already in the stack.

    Finally is a method of solving this function in a tortoise and hare style solution that I was getting stuck on trying to create during the interview.
To modify it to work with a tree/graph structure, the two runners will be a list of positions in the tree rather than a single position like what would occur in a linked-list. When there exists a single position that is in both the slow and fast list, there is a cycle. I have found that this solution doesn't work for several cases that the others do.

    For the below example how it should work. I will mark the tortoise as green and the hare as blue.

    An example of a tree where this method won't work. When the first iteration is complete node 2 will be in both the fast and slow set, but the tree has no cycle.


Below is the code for both algorithms that I wrote in C++:

Stack Based:

typedef struct Node {
 Node* left;
 Node* right;
 Node() {
  left = 0;
  right = 0;
 }
 ~Node() {
  left->~Node();
  right->~Node();
 }
} Node;

vector<Node*> v;

static bool has_cycle(Node* root) {
 if( !root ) {
  return false;
 }
 if( find(v.begin(),v.end(),root) != v.end() ) {
  return true;
 }
 v.push_back(root);
 if( root->left == 0 && root->right == 0 ) {
  v.pop_back();
 }
 if( v.empty() ) {
  return false;
 }
 return has_cycle(root->left) || has_cycle(root->right);
}


Marking based:

typedef struct Node {
 Node* left;
 Node* right;
 bool visited;
 Node() {
  left = 0;
  right = 0;
  visited = false;
 }
 ~Node() {
  left->~Node();
  right->~Node();
 }
} Node;

static bool has_cycle(Node* root) {
 if( root == 0 ) {
  return false;
 }
 if( root->visited ) {
  return true;
 }
 root->visited = true;
 if( !has_cycle(root->left) && !has_cycle(root->right) ) {
  root->visited = false;
  return false;
 }
 return true;
}


No comments:

Post a Comment