ArrayNesting [source code]

public class ArrayNesting {
static
/******************************************************************************/
public class Solution {
    public int arrayNesting(int[] nums) {
        int n = nums.length, max = 0;
        for (int i = 0; i < n; i++) {
            int index = i, count = 0;
            while (nums[index] < n && nums[index] != i) {
                count++;
                nums[index] += n;
                index = nums[index] - n;
            }
            if (++count > max)
                max = count;
            if (max >= (n + 1) / 2)
                return max;
        }
        return max;
    }
}
/******************************************************************************/

    public static void main(String[] args) {
        ArrayNesting.Solution tester = new ArrayNesting.Solution();
        int[][] inputs = {
            {5,4,0,3,1,6,2}, {4},
        };
        for (int i = 0; i < inputs.length / 2; i++) {
            int[] nums = inputs[2 * i];
            String numsStr = Printer.array(nums);
            int ans = inputs[1 + 2 * i][0];
            System.out.println(Printer.seperator());
            int output = tester.arrayNesting(nums);
            System.out.println(
                Printer.wrapColor(numsStr, "magenta") +
                " -> " + output + 
                Printer.wrapColor(", expected: " + ans, ans == output ? "green" : "red") +
                "\nAfter: " + Printer.array(nums)
                );
        }
    }
}

举例子, 笨办法, 人逻辑

这个题举了例子之后是比较好理解的, S[K]对于不同的K之间是disjoint的, 也就是不互相重叠. 考虑到这个问题还是一个index domain equals value domain的老问题了, 可以直接在input array上面做flag, 所以这个算法本身应该是不难的;

传统的/2, with truncation, 计算出来的其实是mid, or its floor, 比如7/2=3, 8/2=4;

假如我们想要计算mid, or its ceiling, 怎么计算? 这个问题其实困扰过我好久. 突然想到做法其实很简单: (n + 1) / 2就行了; 你的ceiling就是楼上的人的floor;

这里之所以计算这个是因为要有一个premature exit的作用.

很奇怪的一个事情, 这个算法最后居然非常的慢? 卧槽忘记加flag优化了; 加了这个防止重复的优化之后, 速度就好了: 40ms (64%); 虽然其实也不算是特别快的;

另外这个小算法真正写的时候也不是这么轻松的, 你写这个while循环的时候, 还是要有一个主要的想法: when to stop? 这里用的是一种探路的思路来判断termination; 感觉直接用变量来判断termination写边界和更新比较麻烦; 好像之前很多我写header写的很麻烦的while循环最后别人的大部分好的做法都是直接就探路来terminate做的;

这个题目剩下来的就是一系列的证明的: 不要以为不重要, 这样一个小算法, 你咔咔写完了, 面试官肯定会问你证明的;

真正需要证明的其实是从任何一个K出发, 一定会最后回到K. 这个你想想怎么证明;

这里总结一个思考这种证明问题的思路:

  • 如果感觉有uncertainty, 不要就直接放弃, 如果宏观咕噜了一会儿感觉没有什么可以高层次无视这个uncertainty的办法, 那么就
  • 分情况讨论: 这个提到过很多次了;
  • 当你分情况讨论的时候, 相当于你有了一个if elif. 你的这个if其实就是你的一个多出来的extra given assumption, 额外的条件, 所以问题难度实际上会有所简化;

我这里一个uncertainty就是, 到底有没有可能有一个A[k] = k? 也就是说有没有可能有自己Map到自己的?

首先假设没有嘛, 那么很自然的, 就很好证. 每一个K都Map到另外一个数字. 总共就N个数字, 不cycle是不可能的;

如果有k -> k的Map呢? 其实这个也简单, 我们就考虑其他的N-1个就行了: 如果我们从这个k自己出发, 那肯定就是一个长度为0的cycle, 对应的set大小就是0, 而其他的N-1个数字也不可能Map到这个k, 因为没有重复;

这样的考虑得到的逻辑就是, 我们相当于把problem缩小了: N缩小到了N-1.

这样基本就证明了;

还有怎么证明所有的cycle都是disjoint? formalize一下: 如果你从一个k出发, 结果碰到了一个visited number t, 那么k和t一定在同一个cycle里面; (学会把你要证明的statement转化为数学或者计算机上好表述和推敲的statement)

这个map移动的感觉, 就跟一个FSM一样, 状态机, 是definite的. 这个感觉证明起来也挺trivial的? t已经在一个cycle里面了, 那么k能够到t, t最后肯定也可以到k; 感觉这种很trivial的东西FSM理论里面估计有个啥定理? 猜的.

好像其实没有那么trivial? 你要证明的是如果有一个t在一个cycle里面, 你一个k能够走到他, 那么k就一定在cycle. 这个好像并不是那么obvious的;

Contradiction: 假设k不在这个cycle里面, 但是k能走到t, 那么k肯定有一个进入cycle的入口, 假如这个入口是t', 那么我们可以证明t'不存在: t'在自己的cycle里面有一个predecessor, 而k也是它的predecessor, 那么意思就是有两个mapping到同一个数, 这个按照题目的设计是不可能的(这个就是利用这里每一个数字只可能有一个parent, 也只可能有一个child的性质);


editorial里面也没有给出什么特别好的解法, 一个是O(N)的, 其实就是用一个shadow flag的技术, 来保存一个visited就行了; 另外他也做了一个InPlace的, 不过用的方法不太好, 是无法还原的: 直接设置成infinity, 这个就很尴尬了;


discussion有一个人给的说法很好:

From the statement "The array of size N contains number [0, N-1]", we can know that in-degree of all nodes are exactly one (n different edges for n nodes).

Therefore the graph should consist of several cycles and the cycles have no "tails". That's why we can skip the visited nodes, where to begin in a visited cycle doesn't matter in this circumstance.

这个就很专业了;


InPlace flagging的话, 想到三个点:

  • negate
  • module
  • sorting

他这里用的就是个sorting的解法:

public class Solution   
{  
    public int arrayNesting(int[] nums)   
    {  
        if (nums.Length == 0) return 0;  

        var global = 1;  
        for (int i = 0; i < nums.Length; i++)  
        {  
            var local = 1;  
            while(nums[i] != i)  
            {  
                var temp = nums[i];  
                nums[i] = nums[temp];  
                nums[temp] = temp;      //头尾相接的三行, 明显的swap  

                local++;  
            }  

            global = Math.Max(local, global);  
        }  

        return global;  
    }  
}

大概了解一下还是有意思的. 他这个是没有visited优化的, 所以最后的速度其实不行. 而且虽然做到了O(1)空间, 但是无法recover.

在value as index的前提下, 这个swap好像有点不是特别好理解: 考虑这个: i -> nums[i] (temp) -> nums[temp];

我想了一下, 他好像其实并没有完成一个用sorting做InPlace flag的东西? 就是自己swap的玩的?

看了一下, 这个算法实际上好蠢. 这个严格来说不是一个swap, 而像是一个ListNode的操作, 把nums[a] = b理解成 a -> b就行了. 他这个做法就是让i的箭头的末端不停向前移动, 直到走到头. 真是不嫌麻烦好吧. 不过这个思路还是挺有意思的;

这个人写的代码好像不是JAVA, 大小写不太对头;


看了一下, submission特别快的那些基本都是什么visit之后直接改成-1的pollution做法;


discussion里面还有个大神直接用UF来做了, 可以学习一下:

public class Solution {  
    class UnionFind {  
        private int count = 0;  
        private int[] parent, rank;  

        public UnionFind(int n) {  
            count = n;  
            parent = new int[n];  
            rank = new int[n];  
            for (int i = 0; i < n; i++) {  
                parent[i] = i;  
            }  
        }  

        public int find(int p) {  
            int q = parent[p];  
            while (q != parent[q]) {  
                q = parent[q];  
            }  
            parent[p] = q;  
            return q;  
        }  

        public void union(int p, int q) {  
            int rootP = find(p);  
            int rootQ = find(q);  
            if (rootP == rootQ) return;  
            if (rank[rootQ] > rank[rootP]) {  
                parent[rootP] = rootQ;  
            }  
            else {  
                parent[rootQ] = rootP;  
                if (rank[rootP] == rank[rootQ]) {  
                    rank[rootP]++;  
                }  
            }  
            count--;  
        }  

        public int count() {  
            return count;  
        }  

        public int getMaxUnion() {  
            Map<Integer, Integer> map = new HashMap<>();  
            int max = 1;  
            for (int i = 0; i < parent.length; i++) {  
                int p = find(i);  
                map.put(p, map.getOrDefault(p, 0) + 1);  
                max = Math.max(max, map.get(p));  
            }  
            return max;  
        }  
    }  

    public int arrayNesting(int[] nums) {  
        int n = nums.length;  
        UnionFind uf = new UnionFind(n);  
        for (int i = 0; i < n; i++) {  
            uf.union(i, nums[i]);  
        }  
        return uf.getMaxUnion();  
    }  
}

暂时好像没有时间看了, 搁置一下. 下次做题碰到一次UF之后再来看这个.


Problem Description

A zero-indexed array A consisting of N different integers is given. The array contains all integers in the range [0, N - 1].

Sets S[K] for 0 <= K < N are defined as follows:

S[K] = { A[K], A[A[K]], A[A[A[K]]], ... }.

Sets S[K] are finite for each K and should NOT contain duplicates.

Write a function that given an array A consisting of N integers, return the size of the largest set S[K] for this array.

Example 1:

Input: A = [5,4,0,3,1,6,2]  
Output: 4  
Explanation:   
A[0] = 5, A[1] = 4, A[2] = 0, A[3] = 3, A[4] = 1, A[5] = 6, A[6] = 2.  

One of the longest S[K]:  
S[0] = {A[0], A[5], A[6], A[2]} = {5, 6, 2, 0}

Note:

  1. N is an integer within the range [1, 20,000].
  2. The elements of A are all distinct.
  3. Each element of array A is an integer within the range [0, N-1].

Difficulty:Medium
Total Accepted:7.9K
Total Submissions:15.9K
Contributor: fallcreek
Companies
apple
Related Topics
array
Similar Questions
Nested List Weight Sum Flatten Nested List Iterator Nested List Weight Sum II


Problem Description

这个是我发在discussion的帖子:

The given condition that the index domain is the same as the value domain is frequently encountered in Leetcode problems. When we have this condition, I tend to think of using a technique I call inplace flag. Instead of using an extra visited array, we can just mutate the input array to reflect the flag information we want.

The usual inplace flagging techniques include:

  • negating: negate to reflect one visit, use abs to get the original value. This only works to detect odd numbers of previous occurrence. It is not appropriate for this problem also because the value can actually be 0 in this problem, so negating 0 does not really flag anything.
  • module (which I used here): add one n to reflect a visit. Mod n with the entry to get the original value.
  • sorting: put each number into its corresponding index. This is not recoverable with O(1) space, so not considered here.

Regarding the problem of Input Pollution, it is actually easily fixable when using inplace flagging, albeit in a way some may argue imperfect. We can just recover the original input array before returning at the end. For negating, just do abs; for module method, just mod n at the end. This method fixes the problem in some situation but may still cause problem in a multithreading context.

Proof of this method

First, I personally like this explanation:

@szlghl1 said in [C++] [Java] Clean Code - O(N):

Just some supplement.

From the statement "The array of size N contains number [0, N-1]", we can know that in-degree of all nodes are exactly one (n different edges for n nodes).

Therefore the graph should consist of several cycles and the cycles have no "tails". That's why we can skip the visited nodes, where to begin in a visited cycle doesn't matter in this circumstance.

It's concise and good enough for this problem. Also, I have not taken any automata course or learned anything about FSM theory. I have a feeling there are some theorems there to make the proofs here trivially easy. But again, I don't know any.

But in case you are into a more verbose proof, here goes.

Lemma 1: when we start from any k, we will eventually go back to k.

Proof:
First, let's understand each entry of A[i] = e as a mapping i -> e. Let's simplify. Suppose that there is no k -> k mapping for any k. That is no k is its own value. Then we ca easily see that there will be a cycle and when we start from k we will get back: there are only n distinct numbers in total. We get a different number each iteration, and then we are guaranteed to have a cycle somewhere.
One caveat: what if there was a cycle when we start from k, but the cycle does not actually contain k itself? like k -> a -> b -> c -> a? This is also impossible, because in such a situation you will see that a actually belongs to two different mappings where it's a descendant: k -> a and c -> a. No number can have two different parents because that would mean in the original array, two different indices have the same value, that is just illegal.

Now, what if there are actually one k -> k mappings? Just think of the subproblem formed by the N-1 numbers: the lemma is valid in this subproblem. Regarding the sole k -> k itself, it's a cycle already, and does not undermine the proof of the lemma.
What if there are more such k -> k mappings? Just think inductively. They don't matter.
End Of Proof

Now, lemma 1 proved why we do the duplicate detection as the loop header. But what about the visited flag* part? Why is it that as long as we found a visited number, we can leave it there because it's already covered by some cycle from a previous i?

Lemma 2: All cycles are disjoint.

Proof: to make the lemma easier to prove, we rephrase:
If we start from k and reached a visited t, then k must belong to the same cycle as t.

This is actually easily provable using the same thought regarding the k -> a -> b -> c -> a above: now that t is visited, we know that t is in a cycle. And now that k can reach t, then if we start from k, we must enter this very cycle at some point. Say this point is t' (t' can be the same as t):

  • if t' is not t: then we know that we must have a mapping k -> t'. Suppose k is not in this cycle for the purpose of contradiction. Then there must be a node t'' in the cycle which is not k but which also provides t'' -> t'. Now t' has two distinct parents, and that is illegal. Thus contradiction.
  • if t' equals t: so we have k -> t. Do something similar to the above case and it's also easily provable.
    End of Proof

I think these two lemmas suffice for understanding the problem and the algorithm now. Hope this helps.

results matching ""

    No results matching ""