Saltar al contenido principal

Longest Common Subsequence

This question is asked by Google. Given two strings, s and t, return the length of their longest subsequence.

Note: A subsequence is different from a substring. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguous. For example, "ace" is a subsequence of "abcde".

Example(s)

Example 1:

Input: s = "xyz", t = "xyz"
Output: 3
Explanation:
The longest common subsequence is "xyz" with length 3.

Example 2:

Input: s = "abca", t = "acea"
Output: 3
Explanation:
The longest common subsequence is "aca" or "aea" with length 3.
Possible LCS:
- "aca": appears in both strings
- "aea": appears in both strings

Example 3:

Input: s = "abc", t = "def"
Output: 0
Explanation:
There is no common subsequence between "abc" and "def".

Example 4:

Input: s = "abcde", t = "ace"
Output: 3
Explanation:
The longest common subsequence is "ace" with length 3.

Solution

The solution uses Dynamic Programming:

  1. DP state: dp[i][j] = length of LCS of s[0...i-1] and t[0...j-1]
  2. Base cases: dp[0][j] = 0, dp[i][0] = 0 (empty string has no common subsequence)
  3. Recurrence:
    • If characters match: dp[i][j] = dp[i-1][j-1] + 1
    • Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
  4. Result: dp[m][n] where m and n are lengths of s and t

JavaScript Solution - Dynamic Programming

/**
* Find length of longest common subsequence
* @param {string} s - First string
* @param {string} t - Second string
* @return {number} - Length of LCS
*/
function longestCommonSubsequence(s, t) {
const m = s.length;
const n = t.length;

// dp[i][j] = length of LCS of s[0...i-1] and t[0...j-1]
const dp = Array(m + 1).fill().map(() => Array(n + 1).fill(0));

// Base cases: dp[0][j] = 0, dp[i][0] = 0 (already initialized)

// Fill DP table
for (let i = 1; i <= m; i++) {
  for (let j = 1; j <= n; j++) {
    if (s[i - 1] === t[j - 1]) {
      // Characters match, extend LCS
      dp[i][j] = dp[i - 1][j - 1] + 1;
    } else {
      // Characters don't match, take maximum of:
      // 1. Skip s[i-1]: dp[i-1][j]
      // 2. Skip t[j-1]: dp[i][j-1]
      dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
    }
  }
}

return dp[m][n];
}

// Test cases
console.log('Example 1:', longestCommonSubsequence("xyz", "xyz")); 
// 3

console.log('Example 2:', longestCommonSubsequence("abca", "acea")); 
// 3

console.log('Example 3:', longestCommonSubsequence("abc", "def")); 
// 0

console.log('Example 4:', longestCommonSubsequence("abcde", "ace")); 
// 3
Output:
Click "Run Code" to execute the code and see the results.

Alternative Solution (Space-Optimized)

Here's a space-optimized version using only O(min(m, n)) space:

/**
* Space-optimized: O(min(m, n)) space instead of O(m × n)
*/
function longestCommonSubsequenceOptimized(s, t) {
const m = s.length;
const n = t.length;

// Use shorter string for DP array
if (m < n) {
return longestCommonSubsequenceOptimized(t, s);
}

// dp[j] = length of LCS of s[0...i-1] and t[0...j-1]
let prev = Array(n + 1).fill(0);

for (let i = 1; i <= m; i++) {
const curr = [0]; // dp[i][0] = 0

for (let j = 1; j <= n; j++) {
if (s[i - 1] === t[j - 1]) {
curr[j] = prev[j - 1] + 1;
} else {
curr[j] = Math.max(prev[j], curr[j - 1]);
}
}
prev = curr;
}

return prev[n];
}

Complexity

  • Time Complexity: O(m × n) - Where m and n are the lengths of strings s and t. We fill a DP table of size m × n.
  • Space Complexity:
    • Standard DP: O(m × n) - For the DP table
    • Optimized: O(min(m, n)) - Using only one row

Approach

The solution uses Dynamic Programming:

  1. DP state:

    • dp[i][j] = length of LCS of s[0...i-1] and t[0...j-1]
    • We consider prefixes of both strings
  2. Base cases:

    • dp[0][j] = 0 - Empty string has no common subsequence with any string
    • dp[i][0] = 0 - Any string has no common subsequence with empty string
  3. Recurrence:

    • If s[i-1] === t[j-1]: Characters match, extend LCS
      • dp[i][j] = dp[i-1][j-1] + 1
    • Else: Characters don't match, take maximum of:
      • Skip s[i-1]: dp[i-1][j] (LCS of s[0...i-2] and t[0...j-1])
      • Skip t[j-1]: dp[i][j-1] (LCS of s[0...i-1] and t[0...j-2])
  4. Result:

    • dp[m][n] = length of LCS of entire s and entire t

Key Insights

  • Dynamic Programming: Optimal substructure - optimal solution uses optimal subproblems
  • Subsequence vs Substring: Subsequence doesn't need to be contiguous
  • Character matching: If characters match, extend LCS; else skip one character
  • O(m×n) time: Must consider all pairs of prefixes
  • Space optimization: Can reduce to O(min(m, n)) space

Step-by-Step Example

Let's trace through Example 2: s = "abca", t = "acea"

s = "abca" (m=4)
t = "acea" (n=4)

DP table (dp[i][j]):
"" a c e a
"" 0 0 0 0 0
a 0 1 1 1 1
b 0 1 1 1 1
c 0 1 2 2 2
a 0 1 2 2 3

Fill table:
i=1, j=1: s[0]='a', t[0]='a' (match!)
dp[1][1] = dp[0][0] + 1 = 0 + 1 = 1

i=1, j=2: s[0]='a', t[1]='c' (no match)
dp[1][2] = max(dp[0][2]=0, dp[1][1]=1) = 1

i=1, j=3: s[0]='a', t[2]='e' (no match)
dp[1][3] = max(dp[0][3]=0, dp[1][2]=1) = 1

i=1, j=4: s[0]='a', t[3]='a' (match!)
dp[1][4] = dp[0][3] + 1 = 0 + 1 = 1

i=2, j=1: s[1]='b', t[0]='a' (no match)
dp[2][1] = max(dp[1][1]=1, dp[2][0]=0) = 1

i=2, j=2: s[1]='b', t[1]='c' (no match)
dp[2][2] = max(dp[1][2]=1, dp[2][1]=1) = 1

i=3, j=2: s[2]='c', t[1]='c' (match!)
dp[3][2] = dp[2][1] + 1 = 1 + 1 = 2

i=4, j=4: s[3]='a', t[3]='a' (match!)
dp[4][4] = dp[3][3] + 1 = 2 + 1 = 3

Result: dp[4][4] = 3

LCS: "aca" or "aea" (both have length 3)

Visual Representation

Example 2: s = "abca", t = "acea"

DP table:
"" a c e a
"" 0 0 0 0 0
a 0 1 1 1 1
b 0 1 1 1 1
c 0 1 2 2 2
a 0 1 2 2 3

LCS: "aca"
s: a b c a
↑ ↑ ↑
t: a c e a
↑ ↑ ↑

Or LCS: "aea"
s: a b c a
↑ ↑
t: a c e a
↑ ↑ ↑

Result: 3

Edge Cases

  • Empty strings:
    • s = "", t = "abc" → 0
    • s = "abc", t = "" → 0
    • s = "", t = "" → 0
  • Identical strings: Return length of string
  • No common characters: Return 0
  • One string is subsequence of other: Return length of shorter string

Important Notes

  • Subsequence vs Substring:
    • Subsequence: maintains relative order, not necessarily contiguous
    • Substring: must be contiguous
  • Character matching: If characters match, extend LCS; else skip one
  • O(m×n) time: Must fill entire DP table
  • Space optimization: Can reduce to O(min(m, n)) space
  • Symmetric: LCS of s and t equals LCS of t and s

Why Dynamic Programming Works

Optimal Substructure:

  • To find LCS of s[0...i-1] and t[0...j-1]:
    • If s[i-1] === t[j-1]: LCS includes this character, then find LCS of s[0...i-2] and t[0...j-2]
    • Else: LCS is max of (LCS of s[0...i-2] and t[0...j-1], LCS of s[0...i-1] and t[0...j-2])
  • The optimal solution uses optimal solutions to subproblems

Overlapping Subproblems:

  • Multiple paths may lead to the same state
  • DP stores and reuses these optimal values

Difference: Subsequence vs Substring

Subsequence:

  • Maintains relative order
  • Not necessarily contiguous
  • Example: "ace" is subsequence of "abcde"

Substring:

  • Must be contiguous
  • Example: "bcd" is substring of "abcde", but "ace" is not
  • Longest Common Substring: Contiguous common sequence
  • Edit Distance: Related DP problem
  • Longest Increasing Subsequence: Similar DP pattern
  • Delete Operation for Two Strings: Uses LCS

Takeaways

  • Dynamic Programming solves LCS efficiently
  • Subsequence doesn't need to be contiguous
  • Character matching extends LCS, else skip one
  • O(m×n) time to fill DP table
  • Space can be optimized to O(min(m, n))
  • Classic DP problem with many applications (DNA alignment, version control, etc.)