MakingALargeIsland [source code]


public class MakingALargeIsland {
static
/****************************************************************************/
class Solution {
    static int[] dirs = {-1, 0, 1, 0, -1};

    public int largestIsland(int[][] grid) {
        UnionFind uf = new UnionFind (grid);
        int R = grid.length, C = grid[0].length;
        for (int i = 0; i < R; i++) for (int j = 0; j < C; j++) if (grid[i][j] == 1) {
            for (int k = 0; k < 4; k++) {
                int nei = i + dirs[k], nej = j + dirs[k + 1];
                if (nei >= 0 && nej >= 0 && nei < R && nej < C && grid[nei][nej] == 1)
                    uf.union (i * C + j, nei * C + nej);
            }
        }
        int res = -1;
        for (int i = 0; i < R; i++) for (int j = 0; j < C; j++) if (grid[i][j] == 0) {
            int[] neighbors = new int[4];
            int count = 0;
            for (int k = 0; k < 4; k++) {
                int nei = i + dirs[k], nej = j + dirs[k + 1];
                if (nei >= 0 && nej >= 0 && nei < R && nej < C && grid[nei][nej] == 1) {
                    neighbors[count++] = nei * C + nej;
                }
            }
            res = Math.max (res, uf.collectSize (neighbors, count));
        }
        return Math.max (uf.maxSize (), res + 1);
    }

    class UnionFind {
        int[] parent;
        int[] size;
        int[] rank;
        int count;
        int max_size;
        int C;

        UnionFind (int[][] grid) {
            int R = grid.length;
            C = grid[0].length;
            parent = new int[R * C];
            size = new int[R * C];
            rank = new int[R * C];
            for (int r = 0; r < R; r++) {
                for (int c = 0; c < C; c++) {
                    if (grid[r][c] == 1) {
                        parent[r * C + c] = r * C + c;
                        count++;
                    } else {
                        parent[r * C + c] = -1;
                    }
                    size[r * C + c] = 1;
                    rank[r * C + c] = 0;
                }
            }
            max_size = 1;
        }

        int find (int x) {
            if (parent[x] != x)
                parent[x] = find (parent[x]);
            return parent[x];
        }

        boolean union (int a, int b) {
            int ra = find (a), rb = find (b);
            if (ra != rb) {
                if (rank[ra] >= rank[rb]) {
                    parent[rb] = ra; size[ra] += size[rb];
                    max_size = Math.max (max_size, size[ra]);
                    if (rank[ra] == rank[rb])
                        rank[ra]++;
                } else {
                    parent[ra] = rb; size[rb] += size[ra];
                    max_size = Math.max (max_size, size[rb]);
                }
                count--;
            }
            return ra != rb;
        }

        int collectSize (int[] nums, int count) {
            Set<Integer> ids = new HashSet<> ();
            int res = 0;
            for (int i = 0; i < count; i++) {
                int id = find (nums[i]);
                if (id >= 0 && ids.add (id)) {
                    res += size [id];
                }
            }
            return res;
        }

        int maxSize () {
            return max_size;
        }

        public String toString () {
            return String.format ("parent:\n%s\nrank\n%s\nsize:\n%s\nmax_size:%d", 
                matrix (parent), matrix (rank), matrix (size), max_size
            );
        }

        String matrix (int[] grid) {
            StringBuilder sb = new StringBuilder ();
            for (int i = 0; i < parent.length; i++) {
                sb.append (grid[i] + "\t");
                if (i % C == C - 1)     // proper row end newline
                    sb.append ("\n");
            }
            return sb.toString ();
        }
    }
}
/****************************************************************************/

    public static void main(String[] args) {
        MakingALargeIsland.Solution tester = new MakingALargeIsland.Solution();
        int[][][] inputs = {
            {{1, 0}, {0, 1}}, {{3}},
            {{1, 1}, {1, 0}}, {{4}},
            {{1, 1}, {1, 1}}, {{4}},
            {{0, 0}, {0, 0}}, {{1}},
        };
        for (int i = 0; i < inputs.length / 2; i++) {
            System.out.println (Printer.separator ());
            int[][] grid = inputs[2 * i];
            int ans = inputs[2 * i + 1][0][0], output = tester.largestIsland (grid);
            System.out.printf ("%s%s\n%d\n", 
                Matrix.printMatrix (grid), Printer.wrap (output + "", output == ans ? 92 : 91), ans
            );
        }
    }
}

看起来应该是用UF来写的, 不过具体怎么实现不是很有思路; 尤其是最后这个change的点, 在哪里呢;

本来以为这题属于难题, 没想到看了一下, 最后contest结束的时候AC rate居然有40%. 只能说是我自己菜吧.

看完editorial才发现这题根本就不算难题; 用DFS做CC可以很轻松的解决; 为了复习UF, 我决定这题自己用UF来实现看看; 虽然这题用UF实际上根本没有什么优势; DFS应该是更快的我感觉: 每一个cell只需要被遍历一次;

为了加强复习, 这次的UF完全手写, 不复制粘贴;

另外, 这次仍然是一个二维问题, 我依然尝试手写二维的版本; 这个事情好像之前做过一次, 这里重新做一次; 写二维的一个优势是client代码(也就是这里题目的主函数的代码)可以更加简化, 虽然其实差别不是很大;

缺点就是二维的实现稍微麻烦一些, 需要熟练; 关键点就是group id, group size, group rank这些核心的数据, 全都要用1D来表示; 所以中间一些转换是少不了的; 算了, 还是用1D的算了, 突然发现bear里面用来做LeetCode200的那个UF, 实际上也是一个本身需求是2D的题目, 但是最后还是完全用1D的接口也能搞定; 这个转换就交给client自己搞算了;

reminder: rank的初始值应该是0; 因为2^0 = 1 (size初始值);

这题我不维护cc count了, 不过一般也有很多是维护这个count的;

bear上面那个给LeetCode200的UF代码里面, 有一个优化, 就是UF的constructor里面, 只给1的格子独立的ID, 否则如果是0, 也就是water的格子, 直接就是用默认的ID是0, 这里事实上类似于一个最开始的默认union操作: 所有的water cell默认一开始就是union起来的;
我这里有没有必要写这个优化呢? 说实话虽然我自称熟悉UF, 但是这个优化一开始我还真的没有想到;

好像有必要实现这个优化, 为什么呢, 加入我最后flip的时候, 我站在一个0, 看自己的四周: 如果我不实现这个优化, 那么就有可能虽然我这个0周围也全都是0, 但是他们的ID全都是不一样的; 那我可能就还觉得他们其实都是不同的island, 但是实际上他们都是属于water而已, 从来没有来得及给他们union起来;

这个优化为什么我漏掉了? 如果想要避免漏掉, 后面应该怎么总结这个? 暂时还真的想不到;
另外, bear上面这个LeetCode200代码, 给那些1cell的初始ID就是i C + j这样的东西, 那么这个能保证和water区分开吗? 假如最左上角的这个cell是一个1, 但是他这个i C + j算出来就是0, 那么他最后不就被认为是和water一样了吗? 当时这题这样写为什么没有出问题?

我怎么感觉这个是一个bug啊? 当时没有被检测出来吗; 比如

1 1 0  
0 0 0

这样一个grid;

那么一开始UF给分的就是(转换成为2D表示):

0 1 0  
0 0 0

结果就认为(0, 0)这个点跟其他的0cell都连起来了; 确认过代码, 真的是这个实现;

反正这题肯定不能这么写;

union water first

另外关于怎么想到这个initial water union的方法, 一个简单的就是, 这些题其实都是island类的题目, 有明确的boundary, 这种情况下在初始化的时候, 就想到在constructor里面把其中的一个类别(water or land)用不分配独立ID的方式先implicit的union在一起;

这题我就把这些water默认给-1算了, 为了跟有效ID区间进行区分;

注意这个water判断只需要应用到parent的初始化上面就行了; rank和size其实不用管: water里面的格子反正后面也不会给他们union了, 所以rank和size就都给默认值(0和1)就行了, 无所谓;

最后调了半天也是AC了, 速度是27(NA);

在调的过程中发现LeetCode200那个写法, 如果说water全都给ID 0, 其实是无所谓的, 因为我们最后永远不会对water cell进行query; 所以即使是出现这样的parent数组:

0 1 0  
0 0 0

实际上我们最后唯一的一个union就是(0, 0)和(0, 1)之间的, (0, 0)位置存储的rank和size其实也是正确的;

当然, 还是像我这里这样明确的用-1来区分有效ID范围我感觉还是更安全一些;

不过类似LeetCode200那个写法, 我这里还是要注意, 不要对water cell进行find之类的操作; 这个一开始也是主要的一个bug;

写这个UF的时候, 我还是非常希望不保留一个封装, 比如, 我不希望UF返回一个cell对应的id, 因为以前讲过, 这个id应该是internal的;
所以我就这样设计了这么一个collect函数, 这个函数的参数表用数组+size这样的经典组合来实现一个可变长度的接口, 因为一个0cell的有效neighbor可能是4个, 也可能不到4个(出界);

最后看来用1D的UF internal来实现2D的接口, 最后并不会让client代码复杂多少; 相反, 回忆上一次实现完整的2D接口的UF, 倒是非常麻烦; 所以这个办法实际上是更好的;

这里的这个UF是一个相当完整的实现, 大部分有用的功能都提供了;

不过代码确实是有点长; 真正面试的时候除非飞速把UF的boilerplate代码写出来, 不然估计是写不完的;

count维护方式稍微注意一下, 包括constructor里面的初始化; 我知道不难, 不过这个变量我实现的比较少, 就怕手生.

UNFINISHED

editorial

Approach #1: (Naive) Depth First Search [Time Limit Exceeded]

Intuition

For each 0 in the grid, let's temporarily change it to a 1, then count the size of the group from that square.

Algorithm

For each 0, change it to a 1, then do a depth first search to find the size of that component. The answer is the maximum size component found.

Of course, if there is no 0 in the grid, then the answer is the size of the whole grid.

class Solution {  
    int[] dr = new int[]{-1, 0, 1, 0};  
    int[] dc = new int[]{0, -1, 0, 1};  

    public int largestIsland(int[][] grid) {  
        int N = grid.length;  

        int ans = 0;  
        boolean hasZero = false;  
        for (int r = 0; r < N; ++r)  
            for (int c = 0; c < N; ++c)  
                if (grid[r][c] == 0) {  
                    hasZero = true;  
                    grid[r][c] = 1;  
                    ans = Math.max(ans, check(grid, r, c));  
                    grid[r][c] = 0;  
                }  

        return hasZero ? ans : N*N;  
    }  

    public int check(int[][] grid, int r0, int c0) {  
        int N = grid.length;  
        Stack<Integer> stack = new Stack();  
        Set<Integer> seen = new HashSet();  
        stack.push(r0 * N + c0);  
        seen.add(r0 * N + c0);  

        while (!stack.isEmpty()) {  
            int code = stack.pop();  
            int r = code / N, c = code % N;  
            for (int k = 0; k < 4; ++k) {  
                int nr = r + dr[k], nc = c + dc[k];  
                if (!seen.contains(nr * N + nc) && 0 <= nr && nr < N &&  
                        0 <= nc && nc < N && grid[nr][nc] == 1) {  
                    stack.push(nr * N + nc);  
                    seen.add(nr * N + nc);  
                }  
            }  
        }  

        return seen.size();  
    }  
}

没有太多滑头, 基本是一个穷搜的过程; Stack写DFS, 也是awice喜欢干的事情;

Approach #2: Component Sizes [Accepted]

Intuition

As in the previous solution, we check every 0. However, we also store the size of each group, so that we do not have to use depth-first search to repeatedly calculate the same size.

However, this idea fails when the 0 touches the same group. For example, consider grid = [[0,1],[1,1]]. The answer is 4, not 1 + 3 + 3, since the right neighbor and the bottom neighbor of the 0 belong to the same group.

We can remedy this problem by keeping track of a group id (or index), that is unique for each group. Then, we'll only add areas of neighboring groups with different ids.

这个题目其实我自己有稍微思考一下, 很简单的, 先不考虑flip, 先观察给你的这个grid, 有很多的CC, 然后你可以先统计这个信息: 每一个CC给一个ID, 然后知道他的size;

然后考虑flip;
again, 这个问题我当时没有想到很好的思路的一个原因还是我根本没有get my hands dirty.
不要有点uncertainty就直接开始立刻放弃当前的方向, 去尝试另一个方向寻找高层次的思路;
有uncertainty就枚举啊; 排除一些可能性啊;

这里谈到这个flip, 其实有两种可能性, 一种是这个flip, 会将两个不同的CC进行连接, 这个时候就是一个加法;
另外一种可能心就是flip之后, 并没有联通两个之前的CC; 这种情况下只能是当前CC周围距离的+1;

我这里实际上还是认为, 我flip的位置应该限制在所有CC的边缘; 但是实际上这个边缘不是特别好找? 不知道他是怎么做的?

回过头来说, 如果不管边缘, 只要是0, 就尝试flip, 难道不行吗?
我这题上来就想要局限在只flip边缘这个思考范围内, 其实是有一点Premature Optmization的嫌疑的;

如果是真正面试的时候, 有这种情况怎么办? 有一个可能可以做的优化, 比如这里的只flip边缘, 但是也不好直接就丢弃啊?
一个办法就是在旁边空白处先加一个TODO; 后面再来考虑; 如果是平时IDE做题, 那么直接代码里面加TODO就行了;

这个优化awice这个解法做没做我还不知道, 继续往下看;

Algorithm

For each group, fill it with value index and remember it's size as area[index] = dfs(...).

Then for each 0, look at the neighboring group ids seen and add the area of those groups, plus 1 for the 0 we are toggling. This gives us a candidate answer, and we take the maximum.

To solve the issue of having potentially no 0, we take the maximum of the previously calculated areas.

class Solution {  
    int[] dr = new int[]{-1, 0, 1, 0};  
    int[] dc = new int[]{0, -1, 0, 1};  
    int[][] grid;  
    int N;  

    public int largestIsland(int[][] grid) {  
        this.grid = grid;  
        N = grid.length;  

        int index = 2;  
        int[] area = new int[N*N + 2];  
        for (int r = 0; r < N; ++r)  
            for (int c = 0; c < N; ++c)  
                if (grid[r][c] == 1)  
                    area[index] = dfs(r, c, index++);  

        int ans = 0;  
        for (int x: area) ans = Math.max(ans, x);  
        for (int r = 0; r < N; ++r)  
            for (int c = 0; c < N; ++c)  
                if (grid[r][c] == 0) {  
                    Set<Integer> seen = new HashSet();  
                    for (Integer move: neighbors(r, c))  
                        if (grid[move / N][move % N] > 1)  
                            seen.add(grid[move / N][move % N]);  

                    int bns = 1;  
                    for (int i: seen) bns += area[i];  
                    ans = Math.max(ans, bns);  
                }  

        return ans;  
    }  

    public int dfs(int r, int c, int index) {  
        int ans = 1;  
        grid[r][c] = index;  
        for (Integer move: neighbors(r, c)) {  
            if (grid[move / N][move % N] == 1) {  
                grid[move / N][move % N] = index;  
                ans += dfs(move / N, move % N, index);  
            }  
        }  

        return ans;  
    }  

    public List<Integer> neighbors(int r, int c) {  
        List<Integer> ans = new ArrayList();  
        for (int k = 0; k < 4; ++k) {  
            int nr = r + dr[k];  
            int nc = c + dc[k];  
            if (0 <= nr && nr < N && 0 <= nc && nc < N)  
                ans.add(nr * N + nc);  
        }  

        return ans;  
    }  
}

他DFS里面是把所有这个CC的grid都给染色成为这个group的id的; 为了跟一开始的0/1进行区分, 这个index是从2开始的;

DFS的返回值返回的是size, 这个也是不难理解的; 这个DFS应该能够很快的设计出来, 虽然是mutation+returnvalue并存的一个, 但是实际上本身难度并不高;

最后一个循环的inner loop是他的逻辑的核心部分; 找到(r, c)是0, 那么就先找到他的所有的neighbor, 如果他们已经在一个CC里面了, 把所有(r, c)的neighbor的CC的ID用一个set装起来; 最后用bns把所有(r, c) neighbor CC size之和+1给返回出来用来更新ans就行了; 整体的逻辑实际上根本没有很多的弯弯绕;

neighbors这个函数也没有什么特别的, 就是把一个四方向前进和出界检查放在一起了; 一个不是非常重要的封装;

没想到这题直接做其实也是很简单的; 这题说到底还是怪我自己根本就没有去思考;

另外, 看完这个DFS思路之后, 可以确定UF是肯定可以做的, 因为这题最后的需求就是一个非常简单的CC的检测问题; 这个UF是完全可以handle的;

uwi: 5分钟. 他自己的代码本身最后这个DJSet是非常乱的, 估计是粘贴的模板代码;

class Solution {  
    public int largestIsland(int[][] grid) {  
        int n = grid.length, m = grid[0].length;  
        int ret = count(grid);  
        for (int i = 0; i < n; i++) {  
            for (int j = 0; j < m; j++) {  
                if (grid[i][j] == 0) {  
                    grid[i][j] = 1;  
                    ret = Math.max(ret, count(grid));  
                    grid[i][j] = 0;  
                }  
            }  
        }  
        return ret;  
    }  

    int count(int[][] grid) {  
        int n = grid.length, m = grid[0].length;  
        DJSet ds = new DJSet(n * m);  
        for (int i = 0; i < n; i++) {  
            for (int j = 0; j < m; j++) {  
                if (j + 1 < m && grid[i][j] + grid[i][j + 1] == 2) {  
                    ds.union(i * m + j, i * m + j + 1);  
                }  
                if (i + 1 < n && grid[i][j] + grid[i + 1][j] == 2) {  
                    ds.union(i * m + j, (i + 1) * m + j);  
                }  
            }  
        }  
        int ret = 0;  
        for (int i = 0; i < n * m; i++) {  
            ret = Math.max(ret, -ds.upper[i]);  
        }  
        return ret;  
    }  

    class DJSet {  
        public int[] upper;  
        public DJSet(int n) {  
            upper = new int[n];  
            Arrays.fill(upper, -1);  
        }  
        public int root(int x) {  
            return upper[x] < 0 ? x : (upper[x] = root(upper[x]));  
        }  
        public boolean equiv(int x, int y) {  
            return root(x) == root(y);  
        }  
        public boolean union(int x, int y) {  
            x = root(x);  
            y = root(y);  
            if (x != y) {  
                if (upper[y] < upper[x]) {  
                    int d = x;  
                    x = y;  
                    y = d;  
                }  
                upper[x] += upper[y];  
                upper[y] = x;  
            }  
            return x == y;  
        }  
        public int count() {  
            int ct = 0;  
            for (int u: upper)  
                if (u < 0) ct++;  
            return ct;  
        }  
    }  
}

看起来好像有点像UF的一个结构.

应该是UF了, 不过更新方式有点奇怪; 代码量也很少;
不是很看得懂; 不要抽象思考, 想一个小的比如2x2的小矩阵, 然后配合他的这个代码, 看看最后你得到的是什么; 比如就是全都是1cell嘛;

看懂了, 这个就是一个UF, 不过他的实现方式很有意思; 首先, upper也就是parent, 保存的是一个负值, 初始值是-1, 代表的实际上是I am a group of size 1; 对的, 他这里的upper实际上是一个双用途;

当union的时候, x和y这两个, 记得都是1D化的坐标, 是在>= 0的范围的; 这两个去找自己的upper, 这里有两种情况, 一种是, upper不是负数, 这种情况下, 说明x有爸爸, 那么需要进行继续向上查询加上path compression: 他这里写的比较简单;
为什么upper[x] >= 0就说明x有爸爸, 这个要看后面union的实现就理解了;

注意他在union里面直接用upper的绝对值来作为size实现一个ranked union的操作;

比如一个2x2的矩阵, 我们进行两次union, union((0, 0), (0, 1)) and union ((0, 1), (1, 1)). 当然, 实际上传的参数是1D的;
第一个union结束之后, upper:

-2 0  
-1 -1

看到没有, 这里(0, 1)的upper就变成非负数了; 这个就是因为他现在的爸爸被给到了(0, 0);
然后我们进行第二个union; 这个时候首先root ((0, 1)), 这个发现upper[1] = 0, 然后recurse, 进入upper[1] = root (upper[1]) = root (0) = upper[0] = -2; 这样path compression结束之后, upper:

-2 -2  
-1 -1

也就是上一次union结束之后, 其实(0, 0)和(0, 1)的group ID还没有统一; 下一个find之后, 才统一了; 为什么这样设计?
我们传统的UF不是这样设计的: UF如果产生了merge操作, 那么ID这些当时就是立刻更新的;

总体来说算是开眼界了, 不过没有必要去模仿; UF这个东西, 有自己一套成熟能够快速默写的写法就行了; 他这个写法我只能说感觉大概是正确的, 不过这个lazy path compression我真的是无法理解为什么要这样做; 以及到底安不安全;

注意他union里面的一个小技巧: 他不想要判断x和y谁家更大, 而是直接用一个conditional swap来保证x家始终比y家大就行了;

其他部分没有特别仔细的看了, 我就觉得你看他count实际上调用了好几次, 这个不是特别好; union其实只要最开始全盘union一次就行了; 他每一次count都是一次全盘union;

单向union

一个需要学习的小技巧, 你看他count的实现; 一个是他的这个+起来==2的操作, 只能说这个人的脑子是真的快, 当时那么大的时间压力下这个也是很自在的想到;
另外一点, 就是看他的union操作, 规定只向下和向右进行: 这个是完全正确而且是面试可能被优化的一个操作: union实际上是一个undirected connect, 所以在grid问题下面, 两个端点分别向对面进行一次union是完全没有必要的; 虽然最后不影响复杂度, 但是对于实际的运行时间还是有影响的;

Problem Description

In a 2D grid of 0s and 1s, we change at most one 0 to a 1.

After, what is the size of the largest island? (An island is a 4-directionally connected group of 1s).

Example 1:

Input: [[1, 0], [0, 1]]  
Output: 3  
Explanation: Change one 0 to 1 and connect two 1s, then we get an island with area = 3.

Example 2:

Input: [[1, 1], [1, 0]]  
Output: 4  
Explanation: Change the 0 to 1 and make the island bigger, only one island with area = 1.

Example 3:

Input: [[1, 1], [1, 1]]  
Output: 4  
Explanation: Can't change any 0 to 1, only one island with area = 1.

Notes:

1 <= grid.length = grid[0].length <= 50.
0 <= grid[i][j] <= 1.

Difficulty:Hard
Total Accepted:316
Total Submissions:816
Contributor:awice

Companies
uber
Related Topics
depth-first search

LC854 MakingALargeIsland