MinimumASCIIDeleteSumForTwoStrings [source code]

public class MinimumASCIIDeleteSumForTwoStrings {
static
/******************************************************************************/
class Solution {
    public int minimumDeleteSum(String s1, String s2) {
        char[] chs1 = s1.toCharArray (), chs2 = s2.toCharArray ();
        int R = chs1.length, C = chs2.length; 
        int[][] dp = new int[R + 1][C + 1];
        for (int i = 0; i < R + 1; i++) {
            for (int j = 0; j < C + 1; j++) {
                if (i == 0 && j == 0)
                    dp[i][j] = 0;
                // Pay attention to the base cases: not so trivial after all
                else if (i == 0)
                    dp[i][j] = chs2[j - 1] + dp[i][j - 1];
                else if (j == 0)
                    dp[i][j] = chs1[i - 1] + dp[i - 1][j];
                else if (chs1[i - 1] == chs2[j - 1])
                    dp[i][j] = dp[i - 1][j - 1];
                else
                    dp[i][j] = Math.min (chs1[i - 1] + dp[i - 1][j], chs2[j - 1] + dp[i][j - 1]);
            }
        }
        return dp[R][C];
    }
}
/******************************************************************************/

    public static void main(String[] args) {
        MinimumASCIIDeleteSumForTwoStrings.Solution tester = new MinimumASCIIDeleteSumForTwoStrings.Solution();
    }
}

题目意图本身应该还是一个比较直接的DP题目, 不过最后怎么写可能要稍微讲究一下;

DP题目以前讲过, 切入点是考虑:

  • 什么样的表格, 几个维度, 每一个维度代表什么, entry里面存放的是什么样的recursive value;
  • 怎样展现一个DP过程, 也就是怎样寻找一个downsize的过程;

但是这样的切入点很多时候是想要直接写Bottom-Up的算法; 但是如果是一个不熟悉的题目, Bottom-Up未必知道怎么写的时候, 或许可以先用Top-Down的方式来思考一下题目的意思, 就跟我刚开始学习edit distance的时候, 也是先用Top-Down的方式来帮助理解之后才开始可以熟练的直接写Bottom-Up的;

当我们用Top-Down的方式的思考的时候, 往往思考的是我们需要哪几个subproblem, 类似的, 当我们在Bottom-Up填表的时候, 其实也是站在一个格子的位置, 然后思考我们需要看向其他的哪几个格子;

最后还是有惊无险的做出来了, 速度是35ms (91%), 不错;

这个题目最后在做的过程当中, 还是收获了不少教训;

首先, 就是一个类比已知的能力. 这个题目明显是一个和edit distance非常类似的题目, 最后的解法思路也是非常的类似, 虽然稍微有点差别;

其次, 在这个类比的过程中, 我们也自动触发了另外一个技能, 就是add/delete equivalence, 这题里面就真的是用到了这个; 虽然这个题目看起来要你求的是两边都是delete; 但是实际上只要你理解了add和delete的对应性, 这题求的就是edit distance, 只不过distance的计算稍微换了个表达方式;

另外, 在解这个题目的过程当中, 我也是对ED算法加深了理解; 这题的一个要求是一个equal, 当时我半天没有理解这个equal到底怎么解决; 其实你仔细回忆一下edit distance问题, 计算的其实equal! 不然这个distance本身都无法定义;

当然了, 就算是谈到了edit distance, 也不代表我就非常熟悉; 比如ED的表格的画法, 我其实掌握的也不是那么对; 这里最好的参考资料还是Jason的NLP课件上面的画法, 也就是表格的index, 对应的其实是dot的位置, 而不是单独的char; 这样也是呼应了之前的letter corresponds to edge的思路;

在计算表格的时候, 要精确思考index之间的对应性, 一个比较好的visual帮助就是:

   0   1   2            // index of string  
   s   e   a            // string  
0   1   2   3           // index of ED table

有这个的话, 计算index的边界关系应该不是很难了;


Editorial

Approach #1: Dynamic Programming [Accepted]

Intuition and Algorithm

Let dp[i][j] be the answer to the problem for the strings s1[i:], s2[j:].

When one of the input strings is empty, the answer is the ASCII-sum of the other string. We can calculate this cumulatively using code like dp[i][s2.length()] = dp[i+1][s2.length()] + s1.codePointAt(i).

When s1[i] == s2[j], we have dp[i][j] = dp[i+1][j+1] as we can ignore these two characters.

When s1[i] != s2[j], we will have to delete at least one of them. We'll have dp[i][j] as the minimum of the answers after both deletion options.

The solutions presented will use bottom-up dynamic programming.

class Solution {  
    public int minimumDeleteSum(String s1, String s2) {  
        int[][] dp = new int[s1.length() + 1][s2.length() + 1];  

        for (int i = s1.length() - 1; i >= 0; i--) {  
            dp[i][s2.length()] = dp[i+1][s2.length()] + s1.codePointAt(i);  
        }  
        for (int j = s2.length() - 1; j >= 0; j--) {  
            dp[s1.length()][j] = dp[s1.length()][j+1] + s2.codePointAt(j);  
        }  
        for (int i = s1.length() - 1; i >= 0; i--) {  
            for (int j = s2.length() - 1; j >= 0; j--) {  
                if (s1.charAt(i) == s2.charAt(j)) {  
                    dp[i][j] = dp[i+1][j+1];  
                } else {  
                    dp[i][j] = Math.min(dp[i+1][j] + s1.codePointAt(i),  
                                        dp[i][j+1] + s2.codePointAt(j));  
                }  
            }  
        }  
        return dp[0][0];  
    }  
}

跟我的思路应该是一样的; 这里awice还是讲的比较快的, 比如他定义entry里面存放的是answer for, 这个实际上是一个很模糊的定义; 当然, 如果你熟悉ED, 你肯定知道, 说的其实就是distance;


discussion大部分的解法都是类似的;

一个倒着写的解法:

class Solution {  
    public int minimumDeleteSum(String s1, String s2) {  
        int m = s1.length(), n = s2.length(), MAX = Integer.MAX_VALUE;  
        char[] a = s1.toCharArray(), b = s2.toCharArray();  
        int[][] dp = new int[m + 1][n + 1];  
        for (int i = m; i >= 0; i--) {  
            for (int j = n; j >= 0; j--) {  
                if (i < m || j < n)  
                    dp[i][j] = i < m && j < n && a[i] == b[j] ?  
                        dp[i + 1][j + 1] : Math.min((i < m ? a[i] + dp[i + 1][j] : MAX), (j < n ? b[j] + dp[i][j + 1] : MAX));  
            }  
        }  
        return dp[0][0];  
    }  
}

本质上还是一样的;

注意, ED算法都可以继续优化一下, 让空间cost减少到O(N), 不过这个反正就这么一回事儿了;

另外一个基于LCS的算法:

@elastico said in C++ O(nm) based on longest common subsequence:

We calculate the LCS of two strings, except that now each character as a weight. The algorithm is essentially the same. The delete sum is then the ASCII sum of the two strings minus the LCS value.

class Solution {  
public:  
    int minimumDeleteSum(string s1, string s2) {  
        int n1 = s1.size(), n2 = s2.size();  
        int dp[n1+1][n2+1] = {};  
        for (int i1 = 1; i1 <=n1; ++i1) for (int i2 = 1; i2 <=n2; ++i2)  {  
            int ans = 0;  
            ans = max(dp[i1][i2-1], dp[i1-1][i2]);  
            if (s1[i1-1]==s2[i2-1]) ans = max(ans, (int)s1[i1-1] + dp[i1-1][i2-1]);  
            dp[i1][i2] = ans;  
        }  
        int ret = 0;  
        for (auto c:s1) ret +=c;  
        for (auto c:s2) ret +=c;  
        ret -= 2*dp[n1][n2];  
        return ret;  
    }  
};

总体上本质其实是一样的, LCS算法本身其实就是ED的特例;


submission最优解: 23(100):

class Solution {  
    public int minimumDeleteSum(String s1, String s2) {  
        char[] c1 = s1.toCharArray();  
        char[] c2 = s2.toCharArray();  
        int[] pre = new int[c2.length + 1];  
        int[] cur = new int[pre.length], tmp = null;  
        for (int i = 0; i < c2.length; i++) {  
            pre[i + 1] = pre[i] + c2[i];  
        }  
        for (int i = 0; i < c1.length; i++) {  
            cur[0] = pre[0] + c1[i];  
            for (int j = 0; j < c2.length; j++) {  
                if (c1[i] == c2[j]) cur[j + 1] = pre[j];  
                else {  
                    cur[j + 1] = Math.min(pre[j + 1] + c1[i], cur[j] + c2[j]);  
                }  
            }  
            tmp = pre;  
            pre = cur;  
            cur = tmp;  
        }  
        return pre[pre.length - 1];  
    }  
}

看起来其实就是空间优化了一下的ED, 不知道为什么这么快;


Problem Description

Given two strings s1, s2, find the lowest ASCII sum of deleted characters to make two strings equal.

Example 1:

Input: s1 = "sea", s2 = "eat"  
Output: 231  
Explanation: Deleting "s" from "sea" adds the ASCII value of "s" (115) to the sum.  
Deleting "t" from "eat" adds 116 to the sum.  
At the end, both strings are equal, and 115 + 116 = 231 is the minimum sum possible to achieve this.

Example 2:

Input: s1 = "delete", s2 = "leet"  
Output: 403  
Explanation: Deleting "dee" from "delete" to turn the string into "let",  
adds 100[d]+101[e]+101[e] to the sum.  Deleting "e" from "leet" adds 101[e] to the sum.  
At the end, both strings are equal to "let", and the answer is 100+101+101+101 = 403.  
If instead we turned both strings into "lee" or "eet", we would get answers of 433 or 417, which are higher.

Note:

  • 0 < s1.length, s2.length <= 1000.
  • All elements of each string will have an ASCII value in [97, 122].

Difficulty:Medium
Total Accepted:4.8K
Total Submissions:9.2K
Contributor:m_deepakraja
Companies
triplebyte
Related Topics
dynamic programming
Similar Questions
Edit DistanceLongest Increasing SubsequenceDelete Operation for Two Strings

Hint 1
Let dp(i, j) be the answer for inputs s1[i:] and s2[j:].

results matching ""

    No results matching ""