LongestWordInDictionaryThroughDeleting [source code]


public class LongestWordInDictionaryThroughDeleting {
static
/******************************************************************************/
class Solution {
    public String findLongestWord(String s, List<String> d) {
        String longest_word = "";
        for (String word : d) {
            if (subSeq (s, word) && (word.length () > longest_word.length () || word.length () == longest_word.length () && word.compareTo (longest_word) < 0))
                longest_word = word;
        }
        return longest_word;
    }

    boolean subSeq (String haystack, String needle) {
        if (haystack.length () < needle.length () || haystack.length () == needle.length () && !haystack.equals (needle))
            return false;
        char[] chh = haystack.toCharArray (), chn = needle.toCharArray ();
        int i = 0, j = 0;
        while (j < chn.length) {
            while (i < chh.length && chh[i] != chn[j])
                i++;
            if (i == chh.length)
                return false;
            i++;        // don't forget this!
            j++;
        }
        return true;
    }
}
/******************************************************************************/

    public static void main(String[] args) {
        LongestWordInDictionaryThroughDeleting.Solution tester = new LongestWordInDictionaryThroughDeleting.Solution();
        String[][] inputs = {
            {"abpcplea"}, {"ale","apple","monkey","plea"}, {"apple"},
            {"abpcplea"}, {"a","b","c"}, {"a"},
            {"bab"}, {"ba","ab","a","b"}, {"ab"},
            {"aewfafwafjlwajflwajflwafj"}, {"apple","ewaf","awefawfwaf","awef","awefe","ewafeffewafewf"}, {"ewaf"},
            {"apple"}, {"zxc","vbn"}, {""},
            {"wordgoodgoodgoodbestword"}, {"word","good","best","good"}, {"best"}, 
        };
        for (int i = 0; i < inputs.length / 3; i++) {
            String s = inputs[3 * i][0], ans = inputs[3 * i + 2][0];
            List<String> d = Arrays.asList (inputs[3 * i + 1]);
            System.out.println (Printer.separator ());
            String output = tester.findLongestWord (s, d);
            System.out.printf ("[%s] and %s -> %s, expected: %s\n",
                Printer.array (inputs[3 * i + 1]), s, Printer.wrapColor (output, output.equals (ans) ? "green" : "red"), ans
            );
        }
    }

    static String printWord (char[] chs, int i, int j) {
        StringBuilder sb = new StringBuilder ();
        for (int k = 0; k < chs.length; k++) {
            if (k == i && k == j)
                sb.append (Printer.wrapColor (String.valueOf (chs[k]), "yellow"));
            else if (k == i)
                sb.append (Printer.wrapColor (String.valueOf (chs[i]), "magenta"));
            else if (k == j)
                sb.append (Printer.wrapColor (String.valueOf (chs[j]), "cyan"));
            else
                sb.append (chs[k]);
        }
        return sb.toString ();
    }
}

虽然题目给的tag没有给trie, 不过我还是想用trie来尝试一下;

最后花了很多时间, trie的做法还是没有做出来, 被最后一个case给break掉了, 代码:

class Solution {  
    String longest_word;  
    int min_delete;  

    public String findLongestWord(String s, List<String> d) {  
        // Good practice to initialize the instance variables in the entry function, rather than in constructor or statically.  
        //The latter way won't matter to OJ, but will hurt local results.  
        longest_word = "";  
        min_delete = Integer.MAX_VALUE;  
        TrieNode root = new TrieNode ();  
        root.word = "-";  
        for (String word : d)  
            root.insert (word);  
        dfs (root, s.toCharArray (), 0, 0);  
        return longest_word.equals ("-") ? "" : longest_word;  
    }  

    void dfs (TrieNode node, char[] chs, int index, int skip) {  
        if (skip > min_delete)  
            return;  
        if (index >= chs.length) {  
            if (node.word.length () == 0)  
                return;  
            if (skip < min_delete) {  
                min_delete = skip;  
                longest_word = node.word;  
            } else if (skip == min_delete && node.word.compareTo (longest_word) < 0) {  
                longest_word = node.word;  
            }  
            return;  
        }  
        if (node.link_count > 0) {  
            for (int i = 0; i < 26; i++) {  
                if (node.links[i] != null) {  
                    if (i + 'a' == chs[index])  
                        dfs (node.links[i], chs, index + 1, skip);  
                    else  
                        dfs (node, chs, index + 1, skip + 1);  
                }  
            }  
        } else {  
            if (node.equals (""))  
                return;  
            skip += chs.length - index;  
            if (skip < min_delete) {  
                min_delete = skip;  
                longest_word = node.word;  
                return;  
            } else if (skip == min_delete && node.word.compareTo (longest_word) < 0) {  
                longest_word = node.word;  
            }  
        }  
    }  

    static class TrieNode {  
        String word = "";  
        TrieNode[] links = new TrieNode[26];  
        int link_count = 0;  

        void insert (String s) {  
            char[] chs = s.toCharArray ();  
            TrieNode cur = this;  
            // Easy mistake: use `cur.links`, not `links` in the loop below!!!  
            for (char c : chs) {  
                if (cur.links[c - 'a'] == null) {  
                    cur.links[c - 'a'] = new TrieNode ();  
                    cur.link_count++;  
                }  
                cur = cur.links[c - 'a'];  
            }  
            cur.word = s;  
        }  

        // easy testing utility to see the content of the trie  
        void test () {  
            if (this.word.length () > 0)  
                System.out.println (Printer.wrapColor (this.word, "magenta"));  
            for (int i = 0; i < 26; i++) if (this.links[i] != null)  
                this.links[i].test ();  
        }  
    }  
}

注意, 之所以这里我在找min_delete, 是因为因为给的s的长度是一定的, 所以最后找到的如果是最长的一个word, 那么他对应的delete肯定也是最少的;

然后想用2pointer写, 写出来的是这个, 但是不对, 被最后几个超级大的case给break掉了, 毕竟Google;

class Solution {  
    public String findLongestWord(String s, List<String> d) {  
        String longest_word = "";  
        int min_delete = s.length ();  
        char[] chs = s.toCharArray ();  
        for (String word : d) {  
            char[] chd = word.toCharArray ();  
            int delete = 0;  
            if (s.contains (word))  
                delete = chs.length - chd.length;  
            else {  
                int s_lo = 0, s_hi = chs.length - 1, d_lo = 0, d_hi = chd.length - 1;  
                while (s_lo < s_hi && d_lo < d_hi) {  
                    if (chs[s_lo] == chd[d_lo])  
                        d_lo++;  
                    else  
                        delete++;  
                    s_lo++;  
                    if (chs[s_hi] == chd[d_hi])   
                        d_hi--;  
                    else  
                        delete++;  
                    s_hi--;  
                }  
                if (d_lo < d_hi)  
                    continue;  
                if (s_lo < s_hi)  
                    delete += s_hi - s_lo + 1;  
                if (d_lo == d_hi) {  
                    for (int i = s_lo; i <= s_hi; i++) {  
                        if (chs[i] == chd[d_lo]) {  
                            delete--;  
                            break;  
                        }  
                    }  
                }  
            }  
            if (delete < min_delete) {  
                min_delete = delete;  
                longest_word = word;  
            } else if (delete == min_delete && word.compareTo (longest_word) < 0) {  
                longest_word = word;  
            }  
        }  
        return longest_word;  
    }  
}

最后上面的解法是看了editorial之后写的一个算法, 速度是29ms (93%). 感觉我的subSequence写的还是没有他的简练;

另外, 既然讨论到了subSequence问题, 其实subSequence问题还有更笨的办法, 就是text和pattern都进行一个count, 然后对每一个letter进行比较; 这个方法就太笨重了, 应该知道去避免它.


editorial

看完了这个editorial之后, 真的是非常的懊恼, 怎么这么简单的题目都没有做出来, 这个题目包括editorial3给出来的做法, 简直也就是easy题的水平.

后来反思了一下, 其实是我对subSeq这个题型太不敏感了. 给你两个string, 然后确定是否一个是另外一个的subSeq, 哪还有比这个更应该敏感的题型了.

Approach #1 Brute Force [Time Limit Exceeded]

public class Solution {  
    public String findLongestWord(String s, List < String > d) {  
        HashSet < String > set = new HashSet < > (d);  
        List < String > l = new ArrayList < > ();  
        generate(s, "", 0, l);  
        String max_str = "";  
        for (String str: l) {  
            if (set.contains(str))  
                if (str.length() > max_str.length() || (str.length() == max_str.length() && str.compareTo(max_str) < 0))  
                    max_str = str;  
        }  
        return max_str;  
    }  
    public void generate(String s, String str, int i, List < String > l) {  
        if (i == s.length())  
            l.add(str);  
        else {  
            generate(s, str + s.charAt(i), i + 1, l);  
            generate(s, str, i + 1, l);  
        }  
    }  
}

其实背后的思路还是很简单的, 一个很直接的DFS找到所有的可能的subseq然后最后跟d比较就行了; 算是比较直接暴力的做法;

Approach #2 Iterative Brute Force [Time Limit Exceeded]

他们认为这里用了一个叫做binary number generation的思路, 这个好像以前什么题目里面见到过;

A problem with this method is that the maximum length of the string can be 32 only, since we make use of an integer and perform the shift operations on it to generate the binary numbers.

public class Solution {  
    public String findLongestWord(String s, List < String > d) {  
        HashSet < String > set = new HashSet < > (d);  
        List < String > l = new ArrayList < > ();  
        for (int i = 0; i < (1 << s.length()); i++) {  
            String t = "";  
            for (int j = 0; j < s.length(); j++) {  
                if (((i >> j) & 1) != 0)  
                    t += s.charAt(j);  
            }  
            l.add(t);  
        }  
        String max_str = "";  
        for (String str: l) {  
            if (set.contains(str))  
                if (str.length() > max_str.length() || (str.length() == max_str.length() && str.compareTo(max_str) < 0))  
                    max_str = str;  
        }  
        return max_str;  
    }  
}

我自己尝试了一下, 把他的string操作换成了StringBuilder操作, 但是还是照样TLE;

上面两个方法都是指数复杂度;

Approach #3 Sorting and checking Subsequence [Accepted]

public class Solution {  
    public boolean isSubsequence(String x, String y) {  
        int j = 0;  
        for (int i = 0; i < y.length() && j < x.length(); i++)  
            if (x.charAt(j) == y.charAt(i))  
                j++;  
        return j == x.length();  
    }  
    public String findLongestWord(String s, List < String > d) {  
        Collections.sort(d, new Comparator < String > () {  
            public int compare(String s1, String s2) {  
                return s2.length() != s1.length() ? s2.length() - s1.length() : s1.compareTo(s2);  
            }  
        });  
        for (String str: d) {  
            if (isSubsequence(str, s))  
                return str;  
        }  
        return "";  
    }  
}

熟悉一下这里这个sort API的使用, 我好像并没有见过这样的用处? 我以前见过自己定义comparator的好像只有要么是PriorityQueue, 要么是...好像见过? 我自己忘记了而已; 以前有一个sort interval的题目好像碰到过这个的应用;

Approach #4 Without Sorting [Accepted]:

public class Solution {  
    public boolean isSubsequence(String x, String y) {  
        int j = 0;  
        for (int i = 0; i < y.length() && j < x.length(); i++)  
            if (x.charAt(j) == y.charAt(i))  
                j++;  
        return j == x.length();  
    }  
    public String findLongestWord(String s, List < String > d) {  
        String max_str = "";  
        for (String str: d) {  
            if (isSubsequence(str, s)) {  
                if (str.length() > max_str.length() || (str.length() == max_str.length() && str.compareTo(max_str) < 0))  
                    max_str = str;  
            }  
        }  
        return max_str;  
    }  
}

首先,这个算法真的不算是很难想到; 只能说我自己做的不足 我觉得一个可能的原因可能就是我对常见题型不熟悉; subSequence问题其实是一个很经典很常见的问题, 但是我并没有立刻和这个题目联想起来;

我看到这个题目, 尤其是这种dictionary题目, 立刻就自作聪明的认为trie可以做, 结果也是没有做出来; 后来自己想的这个2pointer因为设计的太复杂最后也是没有做出来, 很丢人;

他的这个subSequence比我后来自己写的那个版本要简练一些; 我的循环写法是用的fast forward的思路, 最后效果并不如这里直接一个iteration一个iteration的conditional increment的写法;


discussion一个解法, 类似的:

@konglingfan said in Fast java solution 19ms beats 97% using indexOf:

My solution is very simple. Compare s with every words in dictionary using String.indexOf(). If matched, then compare size and lexicographical order with the candidate.

public class Solution {  
    public String findLongestWord(String s, List<String> d) {  
        String longest=null;  
        Iterator<String> itr=d.iterator();  
        while(itr.hasNext()){  
            String dd=itr.next();  
            int start=-1;  
            boolean flag=true;  
            for(int i=0;i<dd.length();i++){  
                start=s.indexOf(dd.charAt(i),start+1);  
                if(start<0){  
                    flag=false;  
                    break;  
                }  
            }  
            if(!flag)   continue;  
            if(longest==null)   longest=dd;  
            else{  
                if(dd.length()>longest.length())    longest=dd;  
                if(dd.length()==longest.length()&&dd.compareTo(longest)<0)   longest=dd;  
            }  
        }  
        return longest==null?"":longest;  
    }  
}

被Stefan吐槽写的太乱:

@StefanPochmann said in Fast java solution 19ms beats 97% using indexOf:

For "very simple" you're making a few things rather complicated. Especially that iterator and that "flag" thing. The result default can also be simpler. And it might be faster (at least with certain test cases) to do the length checks first.

Just a little rewrite:

    public String findLongestWord(String s, List<String> d) {  
        String longest = "";  
    words:  
        for (String dd : d) {  
            if (dd.length() < longest.length() ||  
                dd.length() == longest.length() && dd.compareTo(longest) > 0)  
                continue;  
            int start = -1;  
            for (int i = 0; i < dd.length(); i++){  
                start = s.indexOf(dd.charAt(i), start + 1);  
                if (start < 0)  
                    continue words;  
            }  
            longest = dd;  
        }  
        return longest;  
    }

Bigger rewrite:

    public String findLongestWord(String s, List<String> d) {  
        String longest = "";  
        for (String word : d)  
            if (isBetter(word, longest) && isSubsequence(word, s))  
                longest = word;  
        return longest;  
    }  

    private boolean isBetter(String a, String b) {  
        return a.length() > b.length() ||  
               a.length() == b.length() && a.compareTo(b) < 0;  
    }  

    private boolean isSubsequence(String a, String b) {  
        int start = -1;  
        for (int i = 0; i < a.length(); i++){  
            start = b.indexOf(a.charAt(i), start + 1);  
            if (start < 0)  
                return false;  
        }  
        return true;  
    }

这里想要你学的一个东西就是你看, java里面, 合理的factor出函数关系是很重要的(包括我之前的用factor function来解决multiple break问题的技巧), 不然的话, 你很有可能要依赖于类似Stefan的第一个方法那样的goto style的编程方法, 虽然他这里的这个goto并不是非常的严重;

@StefanPochmann said in Fast java solution 19ms beats 97% using indexOf:

@konglingfan Really not "goto style". But I almost never use it, either. In Python (my main language) there are better ways (like all(...) or for ... else ...), and in any language it's cleaner to just use such an isSubsequence helper function.

The "handmade" ones are probably mostly slower because indexOf has direct access to the characters whereas charAt is an indirection that costs time.

另外一个奇怪的事情:

@odin said in Fast java solution 19ms beats 97% using indexOf:

@StefanPochmann

I have read the original code for String.indexOf. However I still cannot find a clue why that runs much faster than the highest voted solution so far. Could you please explain it more? Thank you.

FYI, using indexOf is much fast than doing it this way:

    private boolean isSubseq(String str, String s) {  
        int m = str.length(), n = s.length(), strPtr = 0;  
        for (int i = 0; i < n; i++) {  
            if (str.charAt(strPtr) == s.charAt(i)) {  
                strPtr++;  
                if (strPtr == m) {  
                    return true;  
                }  
            }  
        }  
        return false;  
    }

这个就很奇怪了, 意思是用indexOf反而更快? 而不对, 可能是因为他这里用了charAt的原因;

另外下面还有一些无关讨论:

@StefanPochmann said in Fast java solution 19ms beats 97% using indexOf:

@yixuanwang.start said in Fast java solution 19ms beats 97% using indexOf:

about iterator remove part, this is what I mean

Ah, right, of course. Yeah, itr.remove(); works.

in java8 we don't have to explicitly call iterator to do that:

collect.removeIf(str -> str.startsWith("a"))

if we're using arraylist, the java8 way can mimimize the time complexity from O(n^2) to O(n)

Nice. Didn't know that method yet, though then again, I'm not a Java guy :-)

in the last part I was trying to compare the normal for loop vs. iterator(including enhanced for loop and forEach)

Ah, you meant "normal loop". I thought you were comparing "iterator-loop" with "for-each-loop", as those were the ones we were discussing.

enhanced for loop and forEach

Aren't they the same thing?

这些东西有时间看看不错, 尤其是用iterator完成remove, 这个我以前还真不知道;


discussion最优解:

@compton_scatter said in Short Java Solutions - Sorting Dictionary and Without Sorting:

We sort the input dictionary by longest length and lexicography. Then, we iterate through the dictionary exactly once. In the process, the first dictionary word in the sorted dictionary which appears as a subsequence in the input string s must be the desired solution.

public String findLongestWord(String s, List<String> d) {  
    Collections.sort(d, (a,b) -> a.length() != b.length() ? -Integer.compare(a.length(), b.length()) :  a.compareTo(b));  
    for (String dictWord : d) {  
        int i = 0;  
        for (char c : s.toCharArray())   
            if (i < dictWord.length() && c == dictWord.charAt(i)) i++;  
        if (i == dictWord.length()) return dictWord;  
    }  
    return "";  
}

An alternate, more efficient solution which avoids sorting the dictionary:

public String findLongestWord(String s, List<String> d) {  
    String longest = "";  
    for (String dictWord : d) {  
        int i = 0;  
        for (char c : s.toCharArray())   
            if (i < dictWord.length() && c == dictWord.charAt(i)) i++;  

        if (i == dictWord.length() && dictWord.length() >= longest.length())   
            if (dictWord.length() > longest.length() || dictWord.compareTo(longest) < 0)  
                longest = dictWord;  
    }  
    return longest;  
}

Time Complexity: O(nk), where n is the length of string s and k is the number of words in the dictionary.

总体思路是类似的;

下面一个比较奇怪的回复:

@WN_Amos said in Short Java Solutions - Sorting Dictionary and Without Sorting:

I was asked this question during intern phone interview with Google, and gave the similar answer after a few optimization, but sadly the interviewer are not satisfied at all, he wants optimization to improve time complexity...

看了一下时间, 这个时间是针对上面完整的没有sort的版本的; 意思是这个算法真的还可以继续提速?


discussion上面看到的另外一个trie的做法, 虽然也TLE了:

String max;  
public String findLongestWord(String s, List<String> d) {  
    max = new String();;  
    TrieNode root = new TrieNode();  
    buildTrie(root,d);  
    helper(s,0,root);  
    return max;  
}  

private void helper(String s, int i, TrieNode curr){  
    if(curr.word!=null){  
        if(curr.word.length()>max.length()){  
            max = curr.word;  
        }  
        else if(curr.word.length()==max.length()){  
            if(curr.word.compareTo(max)<0){  
                max = curr.word;  
            }  
        }  
    }  
    if(i==s.length()||curr.maxlen<max.length()) return;  
    char c = s.charAt(i);  
    if(curr.children[c-'a']!=null){  
        helper(s,i+1,curr.children[c-'a']);  
    }  
    helper(s,i+1,curr);  
}  

private void buildTrie(TrieNode root, List<String> d){  
    for(String s: d){  
        TrieNode curr = root;  
        curr.maxlen = Math.max(curr.maxlen,s.length());  
        for(int i = 0;i<s.length();i++){  
            char c = s.charAt(i);  
            if(curr.children[c-'a']==null){  
                curr.children[c-'a'] = new TrieNode();  
            }  
            curr = curr.children[c-'a'];  
            curr.maxlen = Math.max(curr.maxlen,s.length());  
        }  
        curr.word = s;  
    }  
}  


class TrieNode{  
    String word;  
    TrieNode[] children;  
    int maxlen;//even adding this var, trying to make the search end earlier, still TLE.  
    public TrieNode(){  
        word = null;  
        maxlen = 0;  
        children = new TrieNode[26];  
    }  
}

awice大神的post:

@awice said in Python Simple (Two pointer):

Let's check whether each word is a subsequence of S individually by "best" order (largest size, then lexicographically smallest.) Then if we find a match, we know the word being considered must be the best possible answer, since better answers were already considered beforehand.

Let's figure out how to check if a needle (word) is a subsequence of a haystack (S). This is a classic problem with the following solution: walk through S, keeping track of the position (i) of the needle that indicates that word[i:] still remains to be matched to S at this point in time. Whenever word[i] matches the current character in S, we only have to match word[i+1:], so we increment i. At the end of this process, i == len(word) if and only if we've matched every character in word to some character in S in order of our walk.

def findLongestWord(self, S, D):  
    D.sort(key = lambda x: (-len(x), x))  
    for word in D:  
        i = 0  
        for c in S:  
            if i < len(word) and word[i] == c:  
                i += 1  
        if i == len(word):  
            return word  
    return ""

discussion下面基本就没有更好的解法了;


submission最优解: 11(99):

class Solution {  
    public String findLongestWord(String s, List<String> d) {  
        int maxlen=0;  
        String res="";  
        for(String str: d)  
            if( containing(s, str) ){  
                if(str.length() >res.length() )  
                    res=str;  
                else if( str.length()== res.length() && str.compareTo(res)<0 )  
                    res=str;  
            }  

        return res;  

    }  

    private boolean containing(String s, String t){  
        if(t.length()>s.length())return false;  
        int pos=0;  
        for( char c: t.toCharArray() ){  
            pos = s.indexOf( c, pos);  
            if(pos==-1)return false;  
            pos++;  

        }  
        return true;  
    }  

}

大概思路跟之前discussion看到的一个帖子里面的思路是一样的, 反正就是用indexOf来加速; 奇怪的是, 这个算法的思路居然比我的char array的思路还要快, 不知道为什么; 而且是快很多;


Problem Description

Given a string and a string dictionary, find the longest string in the dictionary that can be formed by deleting some characters of the given string. If there are more than one possible results, return the longest word with the smallest lexicographical order. If there is no possible result, return the empty string.

Example 1:

Input:  
s = "abpcplea", d = ["ale","apple","monkey","plea"]  

Output:   
"apple"

Example 2:

Input:  
s = "abpcplea", d = ["a","b","c"]  

Output:   
"a"

Note:

  1. All the strings in the input will only contain lower-case letters.
  2. The size of the dictionary won't exceed 1,000.
  3. The length of all the strings in the input won't exceed 1,000.

Difficulty:Medium
Total Accepted:18K
Total Submissions:41.3K
Contributor:今が最高
Companies
google
Related Topics
two pointerssort
Similar Questions
Longest Word in Dictionary

results matching ""

    No results matching ""