#187

Repeated DNA Sequences

Medium
Hash TableStringBit ManipulationSliding WindowRolling HashHash FunctionHash MapSliding Window
LeetCode ↗

Approaches

Brute ForceOptimal
Complexity Comparison
Brute ForceOptimal Solution
Time
O(n²)
O(n)
Space
O(n)
O(n)
💡

Intuition

Time O(n)Space O(n)

The optimal solution uses a sliding window approach with a hash set to track seen sequences efficiently. This reduces the time complexity significantly.

⚙️

Algorithm

3 steps
  1. 1Step 1: Initialize two sets: one for seen sequences and one for output.
  2. 2Step 2: Loop through the string, extracting each 10-letter-long substring.
  3. 3Step 3: If the substring is already in the seen set, add it to the output set; otherwise, add it to the seen set.
solution.py9 lines
1def findRepeatedDnaSequences(s):
2    seen, output = set(), set()
3    for i in range(len(s) - 9):
4        seq = s[i:i + 10]
5        if seq in seen:
6            output.add(seq)
7        else:
8            seen.add(seq)
9    return list(output)

Complexity note: The time complexity is O(n) because we only traverse the string once. The space complexity is O(n) for storing the sequences in the sets.

  • 1Using a set allows for O(1) average time complexity for insertions and lookups.
  • 2The problem can be visualized as a sliding window of fixed size (10) moving through the string.

Solutions and explanations are original Tejav content. Problem titles © LeetCode — use the LeetCode button above for the full problem statement.