ReplaceWords [source code]


public class ReplaceWords {
static
/******************************************************************************/
public class Solution {
    public static class Node {
        String word = null;
        Node[] ar = new Node[26];
    }

    public String replaceWords(List<String> dict, String sentence) {
        if (dict.size() == 0 || sentence.length() == 0)
            return "";
        StringBuilder sb = new StringBuilder();
        String[] words = sentence.split(" ");
        Node root = new Node();
        for (String s : dict) {
            insert(root, s.toCharArray(), 0);
        }
        for (int i = 0; i < words.length; i++) {
            sb.append(get(root, words[i].toCharArray(), 0));
            if (i < words.length - 1)
                sb.append(" ");
        }
        return sb.toString();
    }

    public void insert(Node node, char[] chs, int index) {
        if (node.word != null)
            return;
        if (index == chs.length) {
            node.word = String.valueOf(chs);
            return;
        }
        char ch = chs[index];
        if (node.ar[ch - 'a'] == null) {
            node.ar[ch - 'a'] = new Node();
        }
        insert(node.ar[ch - 'a'], chs, index + 1);
    }

    public String get(Node node, char[] chs, int index) {
        if (node.word != null)
            return node.word;
        if (index >= chs.length)
            return String.valueOf(chs);
        char ch = chs[index];
        if (node.ar[ch - 'a'] == null)
            return String.valueOf(chs);
        return get(node.ar[ch - 'a'], chs, index + 1);
    }
}
/******************************************************************************/

    public static void main(String[] args) {
        ReplaceWords.Solution tester = new ReplaceWords.Solution();
        String[][] inputs = {
            {"cat", "bat", "rat"}, {"the cattle was rattled by the battery"}, {"the cat was rat by the bat"},//1
            {"e","k","c","harqp","h","gsafc","vn","lqp","soy","mr","x","iitgm","sb","oo","spj","gwmly","iu","z","f","ha","vds","v","vpx","fir","t","xo","apifm","tlznm","kkv","nxyud","j","qp","omn","zoxp","mutu","i","nxth","dwuer","sadl","pv","w","mding","mubem","xsmwc","vl","farov","twfmq","ljhmr","q","bbzs","kd","kwc","a","buq","sm","yi","nypa","xwz","si","amqx","iy","eb","qvgt","twy","rf","dc","utt","mxjfu","hm","trz","lzh","lref","qbx","fmemr","gil","go","qggh","uud","trnhf","gels","dfdq","qzkx","qxw"}, {"ikkbp miszkays wqjferqoxjwvbieyk gvcfldkiavww vhokchxz dvypwyb bxahfzcfanteibiltins ueebf lqhflvwxksi dco kddxmckhvqifbuzkhstp wc ytzzlm gximjuhzfdjuamhsu gdkbmhpnvy ifvifheoxqlbosfww mengfdydekwttkhbzenk wjhmmyltmeufqvcpcxg hthcuovils ldipovluo aiprogn nusquzpmnogtjkklfhta klxvvlvyh nxzgnrveghc mpppfhzjkbucv cqcft uwmahhqradjtf iaaasabqqzmbcig zcpvpyypsmodtoiif qjuiqtfhzcpnmtk yzfragcextvx ivnvgkaqs iplazv jurtsyh gzixfeugj rnukjgtjpim hscyhgoru aledyrmzwhsz xbahcwfwm hzd ygelddphxnbh rvjxtlqfnlmwdoezh zawfkko iwhkcddxgpqtdrjrcv bbfj mhs nenrqfkbf spfpazr wrkjiwyf cw dtd cqibzmuuhukwylrnld dtaxhddidfwqs bgnnoxgyynol hg dijhrrpnwjlju muzzrrsypzgwvblf zbugltrnyzbg hktdviastoireyiqf qvufxgcixvhrjqtna ipfzhuvgo daee r nlipyfszvxlwqw yoq dewpgtcrzausqwhh qzsaobsghgm ichlpsjlsrwzhbyfhm ksenb bqprarpgnyemzwifqzz oai pnqottd nygesjtlpala qmxixtooxtbrzyorn gyvukjpc s mxhlkdaycskj uvwmerplaibeknltuvd ocnn frotscysdyclrc ckcttaceuuxzcghw pxbd oklwhcppuziixpvihihp"}, {"i miszkays w gvcfldkiavww v dvypwyb bxahfzcfanteibiltins ueebf lqhflvwxksi dc k w ytzzlm gximjuhzfdjuamhsu gdkbmhpnvy i mengfdydekwttkhbzenk w h ldipovluo a nusquzpmnogtjkklfhta k nxzgnrveghc mpppfhzjkbucv c uwmahhqradjtf i z q yzfragcextvx i i j gzixfeugj rnukjgtjpim h a x h ygelddphxnbh rvjxtlqfnlmwdoezh z i bbfj mhs nenrqfkbf spfpazr w c dtd c dtaxhddidfwqs bgnnoxgyynol h dijhrrpnwjlju muzzrrsypzgwvblf z h q i daee r nlipyfszvxlwqw yoq dewpgtcrzausqwhh q i k bqprarpgnyemzwifqzz oai pnqottd nygesjtlpala q gyvukjpc s mxhlkdaycskj uvwmerplaibeknltuvd ocnn f c pxbd oklwhcppuziixpvihihp"},//2
        };
        for (int i = 0; i < inputs.length / 3; i++) {
            List<String> dict = Arrays.asList(inputs[3 * i]);
            String sentence = inputs[1 + 3 * i][0];
            String ans = inputs[2 + 3 * i][0];
            System.out.println(Printer.seperator());
            String output = tester.replaceWords(dict, sentence);
            System.out.println(
                Printer.wrapColor(Printer.array(inputs[3 * i]) + " AND " + sentence, "magenta") +
                "\n -> " + output +
                Printer.wrapColor(", expected: " + ans, ans.equals(output) ? "green" : "red")
                );
        }
    }
}

注意这里题目描述上的表达方式: 这里说given a dictionary, 实际上给的就是一个list; 要对这种设定有所熟悉;

感觉这个题目就是一个标准的查词典问题? 一个trie好像就可以解决? 瞄了一眼discussion, 果然是有trie的;

尝试自己写写看, 写不出来再看答案, 思考了一下, trie好像我其实不是特别陌生;

关于null base case的处理, 常见的两种思路, 之前总结过, 一种就是直接到了leaf再处理, 还有一种就是在leaf的parent处处理. 这个题目感觉用第二种思路更好处理一些;

第一次写trie算法, 自己摸索这回忆当时Sedgwick书上的做法, 最后写出来几乎是很快就AC, 而且速度是24ms (99%), submission最优解, 很开心.

两个核心操作就是insert和get; 写的时候注意对三个方面进行判断就行了:

index. 当index正好达到length的时候是termination点, 这个时候插入key到当前的node. 这个是Sedgwick上面的做法, 比较方便;
当前node的word: get的时候这个很好理解, 当前word有值的话, 那么一定是最短的root, 也就是我们需要的结果, 直接return就行了; 那么为什么insert的时候也是直接就return呢? 因为如果你insert到一个node, 结果发现当前node的word有值, 那么肯定你当前的这个key是node.word的冗余版本, 比如你手上的key是"ratt", 而node.word是"rat". 那么你这个时候还是直接abort就行了, 因为后面get的时候, 永远都不可能查到"ratt", 你存了也白存, 直接扔掉就可以了;
还是上面讲过的一个问题, Null base case什么时候判断. 这个题目我认为在leaf的上层判断比较好, 一个原因就是你insert的时候如果判断是Null, 还需要站在上层的位置来new一个出来. 如果你这个时候已经跑到leaf的位置了, 那么就有点来不及了;

我get和insert里面都有一个

        char ch = chs[index];

的操作, 注意这里因为有index access, 而我们这个题目里面index的取值范围其实是可以超界的, 所以你用这个array access之前记得要确定合法才行. 比如premature exit due to index == length一定要在这一行之前, 这样才能保证index合法(连续if给后续代码留下的隐藏条件);

这个是discussion的最优解:

public String replaceWords(List<String> dict, String sentence) {  
    Trie trie = new Trie(256);  
    dict.forEach(s -> trie.insert(s));  
    List<String> res = new ArrayList<>();  
    Arrays.stream(sentence.split(" ")).forEach(str -> res.add(trie.getShortestPrefix(str)));  
    return res.stream().collect(Collectors.joining(" "));  
}  


class Trie {  
    private int R;  
    private TrieNode root;  

    public Trie(int R) {  
        this.R = R;  
        root = new TrieNode();  
    }  

    // Returns the shortest prefix of the word that is there in the trie  
    // If no such prefix exists, return the original word  
    public String getShortestPrefix(String word) {  
        int len =  getShortestPrefix(root, word, -1);  
        return (len < 1) ? word : word.substring(0, len);  
    }  

    private int getShortestPrefix(TrieNode root, String word, int res) {  
        if(root == null || word.isEmpty()) return 0;  
        if(root.isWord) return res + 1;  
        return getShortestPrefix(root.next[word.charAt(0)], word.substring(1), res+1);  
    }  

    // Inserts a word into the trie.  
    public void insert(String word) {  
        insert(root, word);  
    }  

    private void insert(TrieNode root, String word) {  
        if (word.isEmpty()) { root.isWord = true; return; }  
        if (root.next[word.charAt(0)] == null) root.next[word.charAt(0)] = new TrieNode();  
        insert(root.next[word.charAt(0)], word.substring(1));  
    }  

    private class TrieNode {  
        private TrieNode[] next = new TrieNode[R];  
        private boolean isWord;  
    }  
}

虽然我其实并不认为这个算法有我的算法好, 不过这里也是提供了一个新的思路; 这里提供的主要的思路就是node怎么设计的. 我把node的val设计成是一个word本身, 而他们则是设计成一个boolean, 表示当前位置有一个root存在. 然后在搜索的过程中, recursion的返回值直接就是最后实际到达的深度. 如果你到达了a node at a certain depth, 并且成功返回, 那么最后你实际上就知道了你找到的root是什么: 从0到这个depth(从左到右的prefix)其实就是你被match成功的部分;

另外他这个代码并没有加我insert那里的那个优化, 不过加起来也不麻烦;

这个人还认为, 不应该用我这种直接把word存在node里面的做法, 因为这样最后就:

one possible issue in your implementation of Trie is, you are storing the entire word in the TrieNode that ends the word, this can be avoided as it defeats the space saving advantages of a Trie.

我反正不是很赞同这个说法, Sedgwick毕竟都是这么写的. 而且也没有人trie的主要目的是为了save space吧. 在特定问题比如这个问题里面, 实际上的目的就是为了save time;

这个是另外一个:

    public String replaceWords(List<String> dict, String sentence) {  
        String[] tokens = sentence.split(" ");  
        TrieNode trie = buildTrie(dict);  
        return replaceWords(tokens, trie);  
    }  

    private String replaceWords(String[] tokens, TrieNode root) {  
        StringBuilder stringBuilder = new StringBuilder();  
        for (String token : tokens) {  
            stringBuilder.append(getShortestReplacement(token, root));  
            stringBuilder.append(" ");  
        }  
        return stringBuilder.substring(0, stringBuilder.length()-1);  
    }  

    private String getShortestReplacement(String token, final TrieNode root) {  
        TrieNode temp = root;  
        StringBuilder stringBuilder = new StringBuilder();  
        for (char c : token.toCharArray()) {  
            stringBuilder.append(c);  
            if (temp.children[c - 'a'] != null) {  
                if (temp.children[c - 'a'].isWord) {  
                    return stringBuilder.toString();  
                }  
                temp = temp.children[c - 'a'];  
            } else {  
                return token;  
            }  
        }  
        return token;  
    }  

    private TrieNode buildTrie(List<String> dict) {  
        TrieNode root = new TrieNode(' ');  
        for (String word : dict) {  
            TrieNode temp = root;  
            for (char c : word.toCharArray()) {  
                if (temp.children[c - 'a'] == null) {  
                    temp.children[c - 'a'] = new TrieNode(c);  
                }  
                temp = temp.children[c - 'a'];  
            }  
            temp.isWord = true;  
        }  
        return root;  
    }  

    public class TrieNode {  
        char val;  
        TrieNode[] children;  
        boolean isWord;  

        public TrieNode(char val) {  
            this.val = val;  
            this.children = new TrieNode[26];  
            this.isWord = false;  
        }  
    }

这个比较奇葩的地方是他trie的操作全都是用iteration而不是recursion来做的. 这个不知道他是什么目的, 不过可以大概看一看, 虽然我觉得也不是太难. trie这个东西, 类似tree一样, 感觉recursion本能上应该还是方便一些;

不过好像node里面存boolean的做法实际上是主流?

discussion里面还有人认为这个题目实际上可以用暴力做:

public class Solution {  
    public String replaceWords(List<String> dict, String sentence) {  
        if (dict == null || dict.size() == 0) return sentence;  

        Set<String> set = new HashSet<>();  
        for (String s : dict) set.add(s);  

        StringBuilder sb = new StringBuilder();  
        String[] words = sentence.split("\\s+");  

        for (String word : words) {  
            String prefix = "";  
            for (int i = 1; i <= word.length(); i++) {  
                prefix = word.substring(0, i);  
                if (set.contains(prefix)) break;  
            }  
            sb.append(" " + prefix);  
        }  

        return sb.deleteCharAt(0).toString();  
    }  
}

因为题目条件里面就说了, 长度其实很短, 都. 不过我感觉还是学着用先进算法写更好;

另外这个暴力解笨办法其实你会写吗? 我感觉未必哦;

Problem Description

In English, we have a concept called root, which can be followed by some other words to form another longer word - let's call this word successor. For example, the root an, followed by other, which can form another word another.

Now, given a dictionary consisting of many roots and a sentence. You need to replace all the successor in the sentence with the root forming it. If a successor has many roots can form it, replace it with the root with the shortest length.

You need to output the sentence after the replacement.

Example 1:  
Input: dict = ["cat", "bat", "rat"]  
sentence = "the cattle was rattled by the battery"  
Output: "the cat was rat by the bat"

Note:

The input will only have lower-case letters.
1 <= dict words number <= 1000
1 <= sentence words number <= 1000
1 <= root length <= 100
1 <= sentence words length <= 1000

Difficulty:Medium
Total Accepted:4.2K
Total Submissions:8.6K
Contributor: maxsteel
Companies
uber
Related Topics
hash table trie
Similar Questions
Implement Trie (Prefix Tree)

Problem Description

这个是我在discussion发的帖子, 新题目, 好像没有多少特别好的解法, 所以发一个:

This is a typical trie problem and the algorithm required to solve this one is also very typical. You only need to implement insert so that you can build a trie, and then get so that you can get the root for every successor.

My approach of handling the trie is inspired by the approach taken by Robert Sedgwick in his book of Algorithms

Each node has 26 pointers downward. Also each node stands for a particular character on a particular index.
Each node can store a word. It is the accumulation of all the characters on the path from root to itself. It is only stored here, when we walk down from the root, and reaches the end of the key on this node. So a lot of nodes actually have their word attribute as null.

This is not a hard trie problem, and the code is quite clear. One little trick: in get, when you found that the current node has a word, then you return since this is the shortest root found, this is trivially obvious; but in insert, you should actually do the same, because for example: if you have the key as ratt and is trying to insert it, and you arrived at a node whose word is rat, then you should just abort here: there is no need to store ratt in some node downward anyway, since it will never be retrieved during the get loop.

The performance is 24ms (99%) @ 2017-08-04 20:10:47.

The complexity IMO is O(size_of_dict) (in the unit of character, so it's like sum of all root_lengths) for building the trie, and O(number_of_words_in_sentence * length_of_longest_root_in_dict) for the lookup.

LC648 ReplaceWords

ReplaceWords [source code]

Problem Description

Problem Description

results matching ""

No results matching ""