How do you find all common parents for a set of nodes on a tree?

Medium
9 years ago

Given a tree data structure, and a set of nodes within that tree, how do you find all of their common parents? For example, consider a tree where node A is the root, and it has children B and C. Node B has children D and E, and node C has a child F. If the input set of nodes is {D, E, F}, then the common parents would be {B, C, A}. If the input set of nodes is {D, E}, then the common parents would be {B, A}. Design an algorithm to efficiently find these common parents, considering potential optimizations for different tree structures and sizes.

Sample Answer

Finding Common Parents in a Tree

Problem Description

Given a tree data structure and a set of nodes within that tree, the task is to find all common parents of those nodes. A common parent is any node that is an ancestor of all nodes in the given set.

Example

Consider a tree where node A is the root, and it has children B and C. Node B has children D and E, and node C has a child F.

  • If the input set of nodes is {D, E, F}, then the common parents would be {B, C, A}.
  • If the input set of nodes is {D, E}, then the common parents would be {B, A}.

Naive Solution

The most straightforward approach is to traverse up from each node in the input set to the root, recording all ancestors for each node. Then, find the intersection of all these ancestor sets. This ensures that only nodes that are ancestors of every node in the input set are included.

Algorithm

  1. For each node in the input set:
    • Traverse up to the root, recording all ancestors in a set.
  2. Find the intersection of all ancestor sets.
  3. Return the resulting set of common parents.

Python Code

class Node:
    def __init__(self, data):
        self.data = data
        self.parent = None
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        child.parent = self


def find_common_parents_naive(nodes):
    ancestor_sets = []
    for node in nodes:
        ancestors = set()
        curr = node
        while curr:
            ancestors.add(curr.data)
            curr = curr.parent
        ancestor_sets.append(ancestors)

    if not ancestor_sets:
        return set()

    common_parents = ancestor_sets[0].intersection(*ancestor_sets[1:])
    return common_parents

# Example Usage:
root = Node('A')
node_b = Node('B')
node_c = Node('C')
node_d = Node('D')
node_e = Node('E')
node_f = Node('F')

root.add_child(node_b)
root.add_child(node_c)
node_b.add_child(node_d)
node_b.add_child(node_e)
node_c.add_child(node_f)

nodes1 = [node_d, node_e, node_f]
print(f"Common parents of {[node.data for node in nodes1]}: {find_common_parents_naive(nodes1)}")

nodes2 = [node_d, node_e]
print(f"Common parents of {[node.data for node in nodes2]}: {find_common_parents_naive(nodes2)}")

Complexity Analysis (Naive Solution)

  • Time Complexity: O(n * h), where n is the number of input nodes and h is the maximum height of the tree. For each node, we traverse up to the root (in the worst case), which takes O(h) time. Since we do this for each of the n nodes, the total time complexity is O(n * h). In a balanced tree, h = log(N), where N is the number of nodes in the tree, so it becomes O(n * log(N)). In a skewed tree, h = N, thus the time complexity is O(n * N).
  • Space Complexity: O(n * h), where n is the number of input nodes and h is the height of the tree. This is because, in the worst case, each node can have up to h ancestors, and we are storing the ancestors for each of the n input nodes.

Optimal Solution

A more efficient solution involves finding the Lowest Common Ancestor (LCA) of the input nodes. Once the LCA is found, all ancestors of the LCA are also common parents. This optimization reduces redundant traversals.

Algorithm

  1. Find the Lowest Common Ancestor (LCA) of the input nodes.
  2. Traverse up from the LCA to the root, recording all ancestors.
  3. Return the resulting set of common parents.

Python Code

def find_lca(nodes):
    if not nodes:
        return None

    def get_depth(node):
        depth = 0
        while node:
            depth += 1
            node = node.parent
        return depth

    def move_up(node, diff):
        while diff > 0:
            node = node.parent
            diff -= 1
        return node

    # Find the deepest node
    deepest_node = max(nodes, key=get_depth)
    
    # Align all nodes to the depth of the deepest node
    for i in range(len(nodes)):
        depth_diff = get_depth(deepest_node) - get_depth(nodes[i])
        nodes[i] = move_up(nodes[i], depth_diff)
    
    # Move all nodes up simultaneously until they meet at LCA
    while True:
        first_parent = nodes[0].parent
        if first_parent is None:
            return nodes[0]  # Root node
        
        all_same = True
        for node in nodes[1:]:
            if node.parent != first_parent:
                all_same = False
                break
        
        if all_same:
            return first_parent
        else:
            for i in range(len(nodes)):
                nodes[i] = nodes[i].parent

def find_common_parents_optimal(nodes):
    lca = find_lca(nodes)
    if not lca:
        return set()

    common_parents = set()
    curr = lca
    while curr:
        common_parents.add(curr.data)
        curr = curr.parent

    return common_parents

# Example Usage:
root = Node('A')
node_b = Node('B')
node_c = Node('C')
node_d = Node('D')
node_e = Node('E')
node_f = Node('F')

root.add_child(node_b)
root.add_child(node_c)
node_b.add_child(node_d)
node_b.add_child(node_e)
node_c.add_child(node_f)

nodes1 = [node_d, node_e, node_f]
print(f"Common parents of {[node.data for node in nodes1]}: {find_common_parents_optimal(nodes1)}")

nodes2 = [node_d, node_e]
print(f"Common parents of {[node.data for node in nodes2]}: {find_common_parents_optimal(nodes2)}")

Complexity Analysis (Optimal Solution)

  • Time Complexity: O(m + h), where m is the number of input nodes, and h is the height of the tree. Finding the LCA requires aligning the depths of the nodes, which takes O(m + h) time. Finding all ancestors of the LCA takes O(h) time. Therefore, the overall time complexity is O(m + h).
  • Space Complexity: O(1). The algorithm uses a constant amount of extra space.

Edge Cases

  1. Empty Input Set: If the input set of nodes is empty, the set of common parents is also empty.
  2. One Node in the Input Set: If there is only one node, then all of its ancestors, including itself, are common parents.
  3. Disconnected Tree/Forest: If the nodes are part of different disconnected trees (a forest), then there are no common parents (other than potentially the nodes themselves if they are specified as the input).
  4. Nodes Not in the Tree: If any of the input nodes are not actually in the tree, the algorithm needs to handle this gracefully, potentially by ignoring the node or raising an error.

Optimizations and Considerations

  1. Tree Structure: The height of the tree greatly impacts performance. Balanced trees result in O(log N) height, whereas skewed trees can result in O(N) height.
  2. Caching: Caching ancestor information can be useful if the same nodes are queried repeatedly.
  3. Parallelism: If finding ancestors is computationally intensive, parallelizing the process for each input node can improve performance.
  4. Large Trees: For extremely large trees that do not fit in memory, disk-based or distributed processing techniques may be necessary.