Find the length of the longest consecutive sequence in an unsorted array of integers. Optimize for O(n) time complexity if possible.

Medium
3 years ago

You are given an unsorted array of integers.

  1. Describe an algorithm to find the length of the longest consecutive sequence of elements in the array. The consecutive elements do not need to be adjacent in the array.
  2. What is the time complexity of your solution?
  3. Can you optimize your solution to achieve O(n) time complexity? Explain your approach and provide the optimized code.

Example:

Given the array [100, 4, 200, 1, 3, 2], the longest consecutive sequence is [1, 2, 3, 4]. The length of this sequence is 4. Your algorithm should return 4.

Constraints:

  • The input array can contain positive and negative integers.
  • The input array may contain duplicate values.
  • The algorithm should handle edge cases such as an empty array.

Explain your approach, provide pseudocode if necessary, and then implement your solution in a language of your choice. A brute force solution is acceptable but should be followed by the optimized solution.

Sample Answer

Longest Consecutive Sequence

Problem Description

You are given an unsorted array of integers. The objective is to find the length of the longest consecutive sequence of elements in the array. The consecutive elements do not need to be adjacent in the array.

Example:

Given the array [100, 4, 200, 1, 3, 2], the longest consecutive sequence is [1, 2, 3, 4]. The length of this sequence is 4. Your algorithm should return 4.

Constraints:

  • The input array can contain positive and negative integers.
  • The input array may contain duplicate values.
  • The algorithm should handle edge cases such as an empty array.

Brute Force Approach

The brute force approach involves iterating through each element in the array and checking for consecutive elements. For each element, we attempt to build a sequence by checking if element + 1, element + 2, etc., exist in the array. We keep track of the longest sequence found.

Pseudocode

function longestConsecutiveSequenceBruteForce(array):
  if array is empty:
    return 0

  max_length = 0

  for each element in array:
    current_length = 1
    current_element = element

    while array contains current_element + 1:
      current_length = current_length + 1
      current_element = current_element + 1

    max_length = max(max_length, current_length)

  return max_length

Python Implementation

def longest_consecutive_brute_force(nums):
    if not nums:
        return 0

    max_length = 0
    for num in nums:
        current_length = 1
        current_num = num

        while current_num + 1 in nums:
            current_length += 1
            current_num += 1

        max_length = max(max_length, current_length)

    return max_length

Time Complexity of Brute Force Approach

The time complexity of the brute force approach is O(n^2) in the worst case. For each element in the array (outer loop), we potentially iterate through the array again to check for consecutive elements (inner loop).

Space Complexity of Brute Force Approach

The space complexity of the brute force approach is O(1) because we are not using any additional data structures that scale with the input size.

Optimized Approach

The optimized approach uses a set to store the elements of the array. This allows us to check for the existence of an element in O(1) time. We iterate through the array, and for each element, we check if it's the start of a sequence (i.e., element - 1 is not in the set). If it is, we then find the length of the sequence by checking for element + 1, element + 2, etc., in the set.

Pseudocode

function longestConsecutiveSequenceOptimized(array):
  if array is empty:
    return 0

  number_set = new Set(array)
  max_length = 0

  for each element in array:
    if number_set contains element - 1:
      continue  // This is not the start of a sequence

    current_length = 1
    current_element = element

    while number_set contains current_element + 1:
      current_length = current_length + 1
      current_element = current_element + 1

    max_length = max(max_length, current_length)

  return max_length

Python Implementation

def longest_consecutive_optimized(nums):
    if not nums:
        return 0

    num_set = set(nums)
    max_length = 0

    for num in nums:
        if num - 1 in num_set:
            continue  # This is not the start of a sequence

        current_length = 1
        current_num = num

        while current_num + 1 in num_set:
            current_length += 1
            current_num += 1

        max_length = max(max_length, current_length)

    return max_length

Time Complexity of Optimized Approach

The time complexity of the optimized approach is O(n). Although there is a nested while loop, each element in the array is visited at most twice: once when adding it to the set, and at most once when it is part of a consecutive sequence. The outer loop iterates through each number n in the input array, and the inner while loop iterates through the consecutive sequence starting from n. However, the inner loop only runs when n is the starting number of a sequence (i.e., n - 1 is not in the set). Therefore, the overall time complexity is O(n).

Space Complexity of Optimized Approach

The space complexity of the optimized approach is O(n) because we are using a set to store the elements of the array. In the worst case, the set will contain all the elements of the array.

Edge Cases

  • Empty Array: The algorithm handles an empty array by returning 0.
  • Duplicate Values: The algorithm correctly handles duplicate values because the set data structure automatically removes duplicates.
  • Positive and Negative Integers: The algorithm works correctly for both positive and negative integers.

Testing

# Test cases
print(longest_consecutive_optimized([100, 4, 200, 1, 3, 2]))  # Output: 4
print(longest_consecutive_optimized([0, 3, 7, 2, 5, 8, 4, 6, 0, 1]))  # Output: 9
print(longest_consecutive_optimized([]))  # Output: 0
print(longest_consecutive_optimized([1, 2, 0, 1])) # Output: 3