BoldWordsInString [source code]
public class BoldWordsInString {
static
/******************************************************************************/
class Solution {
public String boldWords(String[] words, String S) {
boolean[] bold = new boolean[S.length ()];
for (int i = 0; i < words.length; i++) {
List<Integer> indices = kmp (S, words[i]);
if (!indices.isEmpty ()) {
for (int idx : indices) {
for (int j = 0; j < words[i].length (); j++)
bold[idx + j] = true;
}
}
}
char[] chs = S.toCharArray ();
StringBuilder sb = new StringBuilder ();
if (bold[0])
sb.append ("<b>");
for (int i = 0; i < chs.length; i++) {
if (bold[i] && i > 0 && !bold[i - 1])
sb.append ("<b>");
sb.append (String.valueOf (chs[i]));
if (bold[i] && i < chs.length - 1 && !bold[i + 1])
sb.append ("</b>");
}
if (bold[chs.length - 1])
sb.append ("</b>");
return sb.toString ();
}
List<Integer> kmp (String A, String B) {
List<Integer> res = new ArrayList<> ();
char[] cha = A.toCharArray (), chb = B.toCharArray ();
int[] lps = new int[chb.length];
for (int len = 0, i = 1; i < chb.length; ) {
if (chb[len] == chb[i]) lps[i++] = ++len;
else if (len > 0) len = lps[len - 1];
else i++;
}
for (int i = 0, j = 0; i < cha.length; i += j - lps[j - 1], j = lps[j - 1]) {
while (j < chb.length && i + j < cha.length && cha[i + j] == chb[j]) j++;
if (j == chb.length) {
res.add (i);
continue;
}
if (i + j >= cha.length) break;
if (j == 0) j++;
}
return res;
}
}
/******************************************************************************/
public static void main(String[] args) {
BoldWordsInString.Solution tester = new BoldWordsInString.Solution();
String[][] inputs = {
{"ab", "bc"}, {"aabcd"}, {"a<b>abc</b>d"},
{"ccb","b","d","cba","dc"}, {"eeaadadadc"}, {"eeaa<b>d</b>a<b>d</b>a<b>dc</b>"},
{"b","dee","a","ee","c"}, {"cebcecceab"}, {"<b>c</b>e<b>bc</b>e<b>cc</b>e<b>ab</b>"},
};
for (int i = 0; i < inputs.length / 3; i++) {
String[] words = inputs[3 * i];
String S = inputs[3 * i + 1][0], ans = inputs[3 * i + 2][0];
System.out.println (Printer.separator ());
String output = tester.boldWords (words, S);
System.out.printf ("[%s] and %s -> %s, expected: %s\n",
Printer.array (words), S, Printer.wrapColor (output, output.equals (ans) ? "green" : "red"), ans
);
}
}
}
题目意思很直接, 但是感觉会有很多的Corner Case, 加上又是Google, 几乎可以确定; 当然, 这种情况, 与其坐在这里担心uncertainty, 不如先笔算笨办法, 看看从哪里切入;
主要是想, 题目给的这个反例, 要怎么解决? 当然, 最好的肯定是你的算法能够直接就处理掉; 但是如果写不出来这样的算法, 那么一个办法是可以加一个POST processing? 这个用一个Stack就可以解决, 不过感觉还是有点heavyweight了; 当然, 如果真的要用这个方法, 那么一个需要想到的点是, 不要每次有一个keyword匹配成功就立刻加tag, 最好的办法我认为应该是记录你需要加tag的index, 然后最后统一最后一步再加tag. 这个index记录的方式要注意统一, 比如统一记录需要加的tag之后的那个char的index. 但是这样行吗? 统一记录index然后最后再加tag? 要知道最后这一步统一tag好像并不是那么trivial, 因为一旦你加了任何一个tag, 其他记录的index全部都失效了; 感觉这个想法有问题;
想不出来好办法, 先随便写一个笨办法;
最后发现笨办法其实也没有那么容易就写出来; 最后还是闲着蛋疼用KMP做了一遍; 做完之后自己回头思考, 如果我有把握的说能够很自在的5分钟甚至更短的时间内毫无错误的写出来KMP, 那么这个题目确实是不难的;
撇开KMP的部分, 这个题目本身也是有很多教育意义的; 比如这里的一些数据处理, 我只想到的唯一的办法就是把所有的字母全都拆分开来; 然后一个一个的根据tag的flag来进行修改;
另外, 毕竟是Google的题目, 最后两个test把我一开始的算法给break掉了; 原因其实也是很简单, 因为最后加tag这一步其实是一个分段算法, 我没有考虑到收尾问题; 事实上这一题不仅要考虑收尾问题, 还要考虑开头问题;
最后的速度是33ms (NA), 不知道算是什么样的;
另外, 为什么这题我这个思路需要用KMP? 因为string本身的API只能返回leftmost或者rightmost的index, 这个不满足我这里的要求, 我要所有的occurrence. 当然, 就算你最后决定不用KMP, 可能还是要用naive的方法实现一个find all occurrences
的过程; 如果我能够做到随手写出KMP, 那么就没有必要写naive的了;
editorial
Approach #1: Brute Force [Accepted]
Intuition
Let's try to learn which letters end up bold, since the resulting answer will just be the canonical one - we put bold tags around each group of bold letters.
To do this, we'll check for all occurrences of each word and mark the corresponding letters bold.
Algorithm
Let's work on first setting mask[i] = true if and only if the i-th letter is bold. For each starting position i in S, for each word, if S[i] starts with word, we'll set the appropriate letters bold.
Now armed with the correct mask, let's try to output the answer. A letter in position i is the first bold letter of the group if mask[i] && (i == 0 || !mask[i-1]), and is the last bold letter if mask[i] && (i == N-1 || !mask[i+1]). Alternatively, we could use itertools.groupby in Python.
Once we know which letters are the first and last bold letters of a group, we know where to put the "" and "" tags.
class Solution {
public String boldWords(String[] words, String S) {
int N = S.length();
boolean[] mask = new boolean[N];
for (int i = 0; i < N; ++i)
for (String word: words) search: {
for (int k = 0; k < word.length(); ++k)
if (k+i >= S.length() || S.charAt(k+i) != word.charAt(k))
break search;
for (int j = i; j < i+word.length(); ++j)
mask[j] = true;
}
StringBuilder ans = new StringBuilder();
int anchor = 0;
for (int i = 0; i < N; ++i) {
if (mask[i] && (i == 0 || !mask[i-1]))
ans.append("<b>");
ans.append(S.charAt(i));
if (mask[i] && (i == N-1 || !mask[i+1]))
ans.append("</b>");
}
return ans.toString();
}
public boolean match(String S, int i, int j, String T) {
for (int k = i; k < j; ++k)
if (k >= S.length() || S.charAt(k) != T.charAt(k-i))
return false;
return true;
}
}
速度比我的稍微慢一点: 37ms;
discussion最优解, 几乎是一样的思路:
@lily4ever said in Clean Java solution using only boolean array and StringBuilder:
The main idea is to use a boolean array to mark the words at the corresponding positions in S. Then build the final string based on the marked positions.
public String boldWords(String[] words, String S) {
if (words == null || words.length == 0) return "";
boolean[] marked = new boolean[S.length()];
for (String word : words) {
markWords(S, word, marked);
}
StringBuilder sb = new StringBuilder();
for (int i = 0; i < S.length(); i++) {
if (marked[i] && (i == 0 || !marked[i - 1])) {
sb.append("<b>");
}
sb.append(S.charAt(i));
if (marked[i] && (i == S.length() - 1 || !marked[i + 1])) {
sb.append("</b>");
}
}
return sb.toString();
}
void markWords(String S, String word, boolean[] marked) {
for (int i = 0; i <= S.length() - word.length(); i++) {
String substr = S.substring(i, i + word.length());
if (substr.equals(word)) {
for (int j = i; j < i + word.length(); j++) {
marked[j] = true;
}
}
}
}
基本就这些了, submission也看不到;
Problem Description
Given a set of keywords words and a string S, make all appearances of all keywords in S bold. Any letters between and tags become bold.
The returned string should use the least number of tags possible, and of course the tags should form a valid combination.
For example, given that words = ["ab", "bc"] and S = "aabcd", we should return "aabcd". Note that returning "aabc</b>d" would use more tags, so it is incorrect.
Note:
- words has length in range [0, 50].
- words[i] has length in range [1, 10].
- S has length in range [0, 500].
- All characters in words[i] and S are lowercase letters.
Difficulty:Easy
Total Accepted:1.5K
Total Submissions:4.5K
Contributor:tinylic
Companies
google
Related Topics
string
Hint 1
First, determine which letters are bold and store that information in mask[i] = if i-th character is bold. Then, insert the tags at the beginning and end of groups. The start of a group is if and only if (mask[i] and (i == 0 or not mask[i-1])), and the end of a group is similar.