RepeatedSubstringPatternOPT [source code]

public class RepeatedSubstringPatternOPT {
static
/******************************************************************************/
public class Solution {
    public boolean repeatedSubstringPattern(String s) {
        if (s == null || s.length() <= 1) return false;

        int len = s.length();
        int mid = len / 2;
        if (s.substring(0, mid).equals(s.substring(mid))) return true;

        int one_third = len / 3;
        String sub = s.substring(0, one_third);
        if (sub.equals(s.substring(one_third, one_third * 2))
                && sub.equals(s.substring(one_third * 2))) return true;

        if (len % 2 == 1) {
            char c = s.charAt(0);
            for (int i = 1; i < len; i++) {
                if (s.charAt(i) != c) return false;
            }
            return true;
        }

        return false;
    }
}
/******************************************************************************/

    public static void main(String[] args) {
        RepeatedSubstringPatternOPT.Solution tester = new RepeatedSubstringPatternOPT.Solution();
        String[] inputs = {"abab", "aba", "abcabcabcabc", "abcabcabcabcabc"};
        for (String s : inputs) System.out.printf("%s -> %s%n", s, tester.repeatedSubstringPattern(s));
    }
}

这个是 submission 最优解: 9(99.94);

不是特别看的懂这个算法;

看了一圈 discussion, 还是没有看到对这个算法的解释. 另外也感觉到这个算法的速度实在是太快了;

不过这个算法的核心思路还是很清晰的, 因为我们这个问题是一个return boolean的算法, 所以你首先应该决定的是你判断的核心点应该是 TRUE 还是 FALSE; 也就是你find true or find false;
他这里的做法就是find true: 大概意思就是, 如果是满足 repeated, 那么肯定怎样怎样;

比如他这里的逻辑是, 如果是满足 repeated, 并且 repeated 的次数(也就是 interval 的数量)是2的倍数, 那么我们可以判断左右两段相等;
如果repeat 次数不是2的倍数, 那么如果是3的倍数, 我们再怎么比较;
然后如果既不是2也不是3的倍数, 那么只可能是长度为奇数, 而且全部都相等;
all possible cases of true found;
否则就不可能是repeated, default就直接 return FALSE 就行了;

至于为什么这个逻辑, 还是有点不清楚;

这个是一个错误的算法, 直接试了一下, "abcabcabcabcabc"这样重复次数是5的倍数的就直接 break 了; 恶心人啊卧槽. 不过还是能学到一点东西的;

重温一下substring 的用法:

java> "012345".substring(0,3)  
java.lang.String res0 = "012"

第二个参数是右边界, 但是是不包含的;
另外, 这个是只有一个 int arg 的时候的用法:

java> "012345".substring(2)  
java.lang.String res1 = "2345"

这个唯一的 int 是起点; 而且是包括的;

这个是 discussion 上面一个很聪明的解法:

public boolean repeatedSubstringPattern(String str) {  
    String s = str + str;  
    return s.substring(1, s.length() - 1).contains(str);  
}

这个是作者的解释:
Basic idea:

First char of input string is first char of repeated substring
Last char of input string is last char of repeated substring
Let S1 = S + S (where S in input string)
Remove 1 and last char of S1. Let this be S2
If S exists in S2 then return true else false

这个是下面评论里面给出的一个证明:
Let's say T = S + S.
"S is Repeated => From T[1:-1] we can find S" is obvious.

If from T[1:-1] we found S at index p-1, which is index p in T and S.
let s1 = S[:p], S can be represented as s1s2...sn, where si stands for substring rather than character.
then we know T[p:len(S) + p] = s2s3...sn-1sns1 = S = s1s2...sn-2sn-1sn.
So s1 = s2, s2 = s3, ..., sn-1 = sn, sn = s1,Which means S is Repeated.

edit:
I should fix this because len(si) == len(sj) has not be proved. The procedure becomes complicated :-(
Let s1 = S[:p], S can be represented as s1X1.
then S = X1s1 = s1X1 (A).

len(X1) < len(s1) is false because if so we should find S in T[1:-1] at index len(X1) rather than len(s1) = p.
len(X1) == len(s1) and (A) => X1 = s1 => S is Repeated.
len(X1) > len(s1) and (A) => X1 has prefix s1 so can be represented as s1X2.
X1 = s1X2 and (A) => s1X2s1= s1s1X2 =>X2s1 = s1X2 (B).
len(X2) < len(s1) is false because if so (A) and (B) => S = X2s1s1 = s1s1X2 and we should find S at index len(X2) rather than p.

As long as len(Xi-1) > len(s1), we can get Xis1 = s1Xi through the prefix step and prove len(Xi) < len(s1) is false through the index step. Eventually we can get a Xn, len(Xn) == len(s1) and Xns1 = s1Xn => Xn = s1 => S = s1X1 = s1s1X2 = ... = s1s1...s1Xn-1 = s1s1...s1s1Xn => S is Repeated.

他 edit 后面的部分证明的其实很啰嗦, 其实如果证明了S=s1X1=X1s1, 就可以直接证明 S is a multiple of s1. suppose len(s1) < len(X1), 画画图很容易证明X1[ka, (k+1)a].equals(s1) for all k as long as not out of boundary. 这个证明我也贴上去了.

Problem Description

Given a non-empty string check if it can be constructed by taking a substring of it and appending multiple copies of the substring together. You may assume the given string consists of lowercase English letters only and its length will not exceed 10000.

Example 1:
Input: "abab"

Output: True

Explanation: It's the substring "ab" twice.
Example 2:
Input: "aba"

Output: False
Example 3:
Input: "abcabcabcabc"

Output: True

Explanation: It's the substring "abc" four times. (And the substring "abcabc" twice.)
Difficulty:Easy
Category:Algorithms
Acceptance:38.17%
Contributor: YuhanXu
Companies
amazon google
Related Topics
string
Similar Questions
Implement strStr()

LC459 RepeatedSubstringPatternOPT

RepeatedSubstringPatternOPT [source code]

Problem Description

results matching ""

No results matching ""