Saltar al contenido principal

Edit Distance (Levenshtein Distance)

This question is asked by Google. Given two strings s and t, return the minimum number of operations needed to convert s into t where a single operation consists of inserting a character, deleting a character, or replacing a character.

Example(s)

Example 1:

Input: s = "cat", t = "bat"
Output: 1
Explanation:
Replace 'c' with 'b': "cat" → "bat"
One operation needed.

Example 2:

Input: s = "beach", t = "batch"
Output: 2
Explanation:
Option 1:
Delete 'e': "beach" → "bach"
Insert 't': "bach" → "batch"
Total: 2 operations

Option 2:
Replace 'e' with 't': "beach" → "btach"
Replace 'a' with 'a': "btach" → "batch" (no change needed)
Wait, that's not right...

Actually:
Delete 'e': "beach" → "bach"
Insert 't' after 'b': "bach" → "batch"
Total: 2 operations

Example 3:

Input: s = "horse", t = "ros"
Output: 3
Explanation:
Delete 'h': "horse" → "orse"
Delete 'r': "orse" → "ose"
Replace 'e' with 's': "ose" → "ros"
Total: 3 operations

Example 4:

Input: s = "intention", t = "execution"
Output: 5
Explanation:
Multiple operations needed to transform one word to another.

Solution

The solution uses Dynamic Programming:

  1. DP state: dp[i][j] = minimum operations to convert s[0...i-1] to t[0...j-1]
  2. Base cases:
    • dp[0][j] = j (insert j characters)
    • dp[i][0] = i (delete i characters)
  3. Recurrence:
    • If characters match: dp[i][j] = dp[i-1][j-1]
    • Else: dp[i][j] = 1 + min(delete, insert, replace)
  4. Result: dp[m][n] where m and n are lengths of s and t

JavaScript Solution - Dynamic Programming

/**
* Find minimum edit distance between two strings
* @param {string} s - Source string
* @param {string} t - Target string
* @return {number} - Minimum number of operations
*/
function minDistance(s, t) {
const m = s.length;
const n = t.length;

// dp[i][j] = minimum operations to convert s[0...i-1] to t[0...j-1]
const dp = Array(m + 1).fill().map(() => Array(n + 1).fill(0));

// Base cases
// dp[0][j] = j (insert j characters to empty string)
for (let j = 0; j <= n; j++) {
  dp[0][j] = j;
}

// dp[i][0] = i (delete i characters from s)
for (let i = 0; i <= m; i++) {
  dp[i][0] = i;
}

// Fill DP table
for (let i = 1; i <= m; i++) {
  for (let j = 1; j <= n; j++) {
    if (s[i - 1] === t[j - 1]) {
      // Characters match, no operation needed
      dp[i][j] = dp[i - 1][j - 1];
    } else {
      // Three operations:
      // 1. Delete: dp[i-1][j] (delete s[i-1])
      // 2. Insert: dp[i][j-1] (insert t[j-1] into s)
      // 3. Replace: dp[i-1][j-1] (replace s[i-1] with t[j-1])
      dp[i][j] = 1 + Math.min(
        dp[i - 1][j],     // delete
        dp[i][j - 1],     // insert
        dp[i - 1][j - 1]  // replace
      );
    }
  }
}

return dp[m][n];
}

// Test cases
console.log('Example 1:', minDistance("cat", "bat")); 
// 1

console.log('Example 2:', minDistance("beach", "batch")); 
// 2

console.log('Example 3:', minDistance("horse", "ros")); 
// 3

console.log('Example 4:', minDistance("intention", "execution")); 
// 5
Output:
Click "Run Code" to execute the code and see the results.

Alternative Solution (Space-Optimized)

Here's a space-optimized version using only O(min(m, n)) space:

/**
* Space-optimized: O(min(m, n)) space instead of O(m × n)
*/
function minDistanceOptimized(s, t) {
const m = s.length;
const n = t.length;

// Use shorter string for DP array
if (m < n) {
return minDistanceOptimized(t, s); // Swap to use shorter string
}

// dp[j] = minimum operations to convert s[0...i-1] to t[0...j-1]
let prev = Array(n + 1).fill(0).map((_, i) => i);

for (let i = 1; i <= m; i++) {
const curr = [i]; // dp[i][0] = i

for (let j = 1; j <= n; j++) {
if (s[i - 1] === t[j - 1]) {
curr[j] = prev[j - 1];
} else {
curr[j] = 1 + Math.min(
prev[j], // delete
curr[j - 1], // insert
prev[j - 1] // replace
);
}
}
prev = curr;
}

return prev[n];
}

Complexity

  • Time Complexity: O(m × n) - Where m and n are the lengths of strings s and t. We fill a DP table of size m × n.
  • Space Complexity:
    • Standard DP: O(m × n) - For the DP table
    • Optimized: O(min(m, n)) - Using only one row

Approach

The solution uses Dynamic Programming:

  1. DP state:

    • dp[i][j] = minimum operations to convert s[0...i-1] to t[0...j-1]
    • We consider prefixes of both strings
  2. Base cases:

    • dp[0][j] = j - Convert empty string to t[0...j-1] by inserting j characters
    • dp[i][0] = i - Convert s[0...i-1] to empty string by deleting i characters
  3. Recurrence:

    • If s[i-1] === t[j-1]: Characters match, no operation needed
      • dp[i][j] = dp[i-1][j-1]
    • Else: Choose minimum of three operations:
      • Delete: dp[i-1][j] - Delete s[i-1], then convert s[0...i-2] to t[0...j-1]
      • Insert: dp[i][j-1] - Convert s[0...i-1] to t[0...j-2], then insert t[j-1]
      • Replace: dp[i-1][j-1] - Replace s[i-1] with t[j-1], then convert s[0...i-2] to t[0...j-2]
  4. Result:

    • dp[m][n] = minimum operations to convert entire s to entire t

Key Insights

  • Dynamic Programming: Optimal substructure - optimal solution uses optimal subproblems
  • Three operations: Insert, delete, or replace a character
  • Character matching: If characters match, no operation needed
  • O(m×n) time: Must consider all pairs of prefixes
  • Space optimization: Can reduce to O(min(m, n)) space

Step-by-Step Example

Let's trace through Example 1: s = "cat", t = "bat"

s = "cat" (m=3)
t = "bat" (n=3)

DP table (dp[i][j]):
"" b a t
"" 0 1 2 3
c 1 1 2 3
a 2 2 1 2
t 3 3 2 1

Base cases:
dp[0][0] = 0 (empty to empty)
dp[0][1] = 1 ("" → "b": insert 'b')
dp[0][2] = 2 ("" → "ba": insert 'b', 'a')
dp[0][3] = 3 ("" → "bat": insert 'b', 'a', 't')
dp[1][0] = 1 ("c" → "": delete 'c')
dp[2][0] = 2 ("ca" → "": delete 'c', 'a')
dp[3][0] = 3 ("cat" → "": delete 'c', 'a', 't')

Fill table:
i=1, j=1: s[0]='c', t[0]='b' (different)
dp[1][1] = 1 + min(dp[0][1]=1, dp[1][0]=1, dp[0][0]=0) = 1 + 0 = 1

i=1, j=2: s[0]='c', t[1]='a' (different)
dp[1][2] = 1 + min(dp[0][2]=2, dp[1][1]=1, dp[0][1]=1) = 1 + 1 = 2

i=1, j=3: s[0]='c', t[2]='t' (different)
dp[1][3] = 1 + min(dp[0][3]=3, dp[1][2]=2, dp[0][2]=2) = 1 + 2 = 3

i=2, j=1: s[1]='a', t[0]='b' (different)
dp[2][1] = 1 + min(dp[1][1]=1, dp[2][0]=2, dp[1][0]=1) = 1 + 1 = 2

i=2, j=2: s[1]='a', t[1]='a' (match!)
dp[2][2] = dp[1][1] = 1

i=2, j=3: s[1]='a', t[2]='t' (different)
dp[2][3] = 1 + min(dp[1][3]=3, dp[2][2]=1, dp[1][2]=2) = 1 + 1 = 2

i=3, j=1: s[2]='t', t[0]='b' (different)
dp[3][1] = 1 + min(dp[2][1]=2, dp[3][0]=3, dp[2][0]=2) = 1 + 2 = 3

i=3, j=2: s[2]='t', t[1]='a' (different)
dp[3][2] = 1 + min(dp[2][2]=1, dp[3][1]=3, dp[2][1]=2) = 1 + 1 = 2

i=3, j=3: s[2]='t', t[2]='t' (match!)
dp[3][3] = dp[2][2] = 1

Result: dp[3][3] = 1

Operation: Replace 'c' with 'b': "cat" → "bat"

Visual Representation

Example 1: s = "cat", t = "bat"

DP table:
"" b a t
"" 0 1 2 3
c 1 1 2 3
a 2 2 1 2
t 3 3 2 1

Operations:
Replace 'c' with 'b': "cat" → "bat"
Total: 1 operation

Example 2: s = "beach", t = "batch"

DP table (simplified):
"" b a t c h
"" 0 1 2 3 4 5
b 1 0 1 2 3 4
e 2 1 1 2 3 4
a 3 2 1 2 3 4
c 4 3 2 2 2 3
h 5 4 3 3 3 2

Operations:
Delete 'e': "beach" → "bach"
Insert 't' after 'b': "bach" → "batch"
Total: 2 operations

Edge Cases

  • Empty strings:
    • s = "", t = "abc" → 3 (insert 3 characters)
    • s = "abc", t = "" → 3 (delete 3 characters)
    • s = "", t = "" → 0 (no operations)
  • Identical strings: Return 0 (no operations needed)
  • One character different: Return 1 (one replace)
  • Completely different: Return max(m, n) operations

Important Notes

  • Three operations: Insert, delete, or replace
  • Character matching: If characters match, no operation needed
  • O(m×n) time: Must fill entire DP table
  • Space optimization: Can reduce to O(min(m, n)) space
  • Symmetric: Edit distance from s to t equals t to s

Why Dynamic Programming Works

Optimal Substructure:

  • To convert s[0...i-1] to t[0...j-1], we need to:
    • Convert s[0...i-2] to t[0...j-2] (if last chars match)
    • Or perform one operation and use optimal solution for smaller subproblems
  • The optimal solution uses optimal solutions to subproblems

Overlapping Subproblems:

  • Multiple paths may lead to the same state
  • DP stores and reuses these optimal values
  • One Edit Distance: Check if edit distance is exactly 1
  • Delete Operation for Two Strings: Only deletion allowed
  • Minimum ASCII Delete Sum: Weighted deletion
  • Longest Common Subsequence: Related DP problem

Takeaways

  • Dynamic Programming solves edit distance efficiently
  • Three operations per character: insert, delete, replace
  • O(m×n) time to fill DP table
  • Space can be optimized to O(min(m, n))
  • Classic DP problem with many applications (spell check, DNA alignment, etc.)