ShortEncodingOfWords [source code]
public class ShortEncodingOfWords {
static
/******************************************************************************/
class Solution {
public int minimumLengthEncoding(String[] words) {
Set<String> encoded = new HashSet<> ();
int total_len = 0;
Arrays.sort (words, (a, b) -> b.length () - a.length ());
loop_words: for (String word : words) {
if (!encoded.contains (word)) {
for (String encoded_word : encoded) {
if (encoded_word.endsWith (word)) {
continue loop_words;
}
}
encoded.add (word);
total_len += word.length () + 1;
}
}
return total_len;
}
}
/******************************************************************************/
public static void main(String[] args) {
ShortEncodingOfWords.Solution tester = new ShortEncodingOfWords.Solution();
String[][] inputs = {
{"time", "me", "bell"}, {"10"},
{"me", "time"}, {"5"},
};
for (int i = 0; i < inputs.length / 2; i++) {
System.out.println (Printer.separator ());
String[] words = inputs[2 * i];
int ans = Integer.parseInt (inputs[2 * i + 1][0]), output = tester.minimumLengthEncoding (words);
System.out.printf (
"%s\n%s\n%d\n",
Printer.array (words), Printer.wrap (output + "", output == ans ? 92 : 91), ans
);
}
}
}
这题目有点意思哈; 尤其是为什么需要这个分界呢: 这样就不用保存每一个word的length, 只要保存start就行了;
这个分界也是问题的关键;
如果是用trie的思路来做, 这里实际上是有两个trie不是吗? 这个感觉会相对复杂了一些;
没思路, 笨办法: 每一个word直接给一个自己的分段, 比如: bell#time#me#. 当然, 这个肯定是很笨的; 注意可以看出来, order does not matter for the min encoding length;
能不能针对上面的笨办法找到一个压缩的思路?
注意啊, 加入我们有一个单词, tim, 这个是不能包含在time#里面的, 因为我们是从左向右的阅读; 这个观察能够带来什么改变?
一个稍微可以压缩空间的方法, 每次要加一个新的单词, 就检查所有已经在encoding里面的单词, 是否endsWith他, 如果是, 就不用加, 否则就加一个单独的单词? 感觉这个速度会很坑爹; 写一下看看; 感觉还是能用trie来做的; 代码量有点大; 先试试笨办法;
笨办法居然AC了; 好吧; 注意这个排序; 这里有点贪心在里面的, 要从大的开始插入;
UNFINISHED
uwi:
class Solution {
public int minimumLengthEncoding(String[] words) {
int n = words.length;
boolean[] sup = new boolean[n];
long[] hs = new long[n];
for(int i = 0;i < n;i++){
for(int j = 0;j < words[i].length();j++){
hs[i] = hs[i]<<5|words[i].charAt(j)-'a'+1;
}
}
for(int i = 0;i < n;i++){
for(int j = 0;j < n;j++){
if(i != j){
if(hs[i] == hs[j]){
if(i < j){
sup[j] = true;
}
}else{
if(words[i].length() > words[j].length()
&& (hs[i]&(1L<<words[j].length()*5)-1) == hs[j]){
sup[j] = true;
}
}
}
}
}
int ret = 0;
for(int i = 0;i < n;i++){
if(!sup[i]){
ret += words[i].length() + 1;
}
}
return ret;
}
}
写的有点乱, 不想看;
Problem Description
Given a list of words, we may encode it by writing a reference string S and a list of indexes A.
For example, if the list of words is ["time", "me", "bell"], we can write it as S = "time#bell#" and indexes = [0, 2, 5].
Then for each index, we will recover the word by reading from the reference string from that index until we reach a "#" character.
What is the length of the shortest reference string S possible that encodes the given words?
Example:
Input: words = ["time", "me", "bell"]
Output: 10
Explanation: S = "time#bell#" and indexes = [0, 2, 5].
Note:
- 1 <= words.length <= 2000.
- 1 <= words[i].length <= 7.
- Each word has only lowercase letters.
Difficulty:Medium
Total Accepted:738
Total Submissions:2.5K
Contributor:awice