RaceCar [source code]


public class RaceCar {
static
/******************************************************************************/
class Solution {
    public int racecar(int target) {
        int K = 32 - Integer.numberOfLeadingZeros (target) + 1;
        int barrier = 1 << K;
        int[][] dist = new int[barrier + 1][2];
        for (int i = 0; i < dist.length; i++) for (int j = 0; j < 2; j++)
            dist[i][j] = Integer.MAX_VALUE;
        dist[target][0] = 0;
        PriorityQueue<Guide> pq = new PriorityQueue <> ((a, b) -> (a.cost - b.cost));
        pq.offer (new Guide (target, 0, 0));
        while (!pq.isEmpty ()) {
            Guide cur = pq.poll ();
            if (dist[cur.pos][cur.face] < cur.cost)
                continue;

            for (int i = 0; i <= K; i++) {
                int walk = (1 << i) - 1;
                int next_pos = cur.pos + (cur.face == 0 ? -1 : 1) * walk;
                int next_face = 1 - cur.face;
                // invariant: all pos non-negative
                if (next_pos < 0) {
                    next_pos = -next_pos;
                    next_face = 1 - next_face;
                }
                int next_cost = cur.cost + i + 1;
                if (next_pos <= barrier && dist[next_pos][next_face] > next_cost) {
                    dist[next_pos][next_face] = next_cost;
                    pq.offer (new Guide (next_pos, next_cost, next_face));
                }
            }
        }
        return Math.min (dist[0][0], dist[0][1]) - 1;
    }

    class Guide {
        int pos, cost, face;   // 0: goto origin, 1: go away from origin
        Guide (int p, int c, int d) {pos = p; cost = c; face = d; }
        public String toString () {return String.format ("(%d,%d,%d)", pos, cost, face); }
    }
}
/******************************************************************************/

    public static void main(String[] args) {
        RaceCar.Solution tester = new RaceCar.Solution();
        int[] inputs = {
            6,5,
            3,2,
            4,5,
            5,7,
        };
        for (int i = 0; i < inputs.length / 2; i++) {
            System.out.println (Printer.separator ());
            int target = inputs[2 * i], ans = inputs[2 * i + 1], output = tester.racecar (target);
            System.out.printf (
                "%d\n%s\n%d\n", target, Printer.wrap (output + "", output == ans ? 92 : 91), ans
            );
        }
    }
}

感觉像是个DP问题, 那么一开始通过dimension analysis, table entry analysis, root involvement这些全都用上之后, 还是觉得无法得到合理的算法的时候, 就只能自己穷举几个path来找规律了;

think out loud

做这题的时候大声的在跟自己沟通, 感觉效果很好, 当然也可能只是当天的状态;

另外, 草稿纸可以看出来, 这题实际上是一个greedy的算数问题; 不知道能不能做出来;

找不到一个general的factor思路, 感觉要大胆猜测一下? hard题很多时候就是要多做simplifying observation, 而不是去做最general的思路; 因为太general而难写的hard题实际上是不多的(比如之前的median of two sorted, 实际上是很个性的一个题目);

按照这个思路写出来的greedy算法, 被test3给break掉了;

4为什么是5下呢; 7 - 3这种, 肯定是6下了, 那么有其他的走法?

知道了, AARRA, 这个骚操作啊, 用两个R来完成一个同向速度归一的操作. Google的题目真的是不讲道理;

那我之前那个greedy的思路就错了啊, 我那个思路默认的就是一直是走一个连续的A段, 然后反向走一会儿, 再反向; 但是这里看来, 居然是可以同向清空速度的;

那么这个multiple path的可能性看到之后, 大概也是知道这题最后实际上还是依赖DP操作了; 不过contest看到好像很多纯粹的搜索算法也做的出来; 我当时怎么没有往这个方向想呢;

总体来说还是很难的一个题目, contest结束的时候的情况:

这个rate算是很吓人了;

草稿:

这题总体来说其实我是已经思考了很多东西的了, 不过这题的难度是真的高.

2018-05-08 00:56:11

这个时候来看AC rate已经有1/7左右了; 所以平时题目的AC rate真的是不可靠的; 按照之前contest当时的情况来说, 大概是跟真是rate差两倍的样子; 这个还是没有考虑contest的参与者平均水平实际上还相对高于总体水平.

回到正题;

最终自己失败版本的代码; 配合草稿看, 应该是不难理解的;

class Solution {  
    public int racecar(int target) {  
        int res = 0;  
        if (target < 0) {  
            target = -target;  
            res++;  
        }  
        int x = target;  
        while (x != 0) {  
            int bit = roundUpLog2 (x + 1);  
            x = ((1 << bit) - 1) - x;  
            res += bit;  
            if (x == 0)  
                break;  
            res++;  
        }  
        return res;  
    }  

    int roundUpLog2 (int x) {  
        return (int) Math.ceil (Math.log (x) / Math.log (2));  
    }  
}

看完editorial, 还是不满意, 自己尝试用BFS写一个:

class Solution {  
    public int racecar(int target) {  
        Queue<Guide> queue = new LinkedList<> ();  
        queue.offer (new Guide (0, 0));  
        int res = Integer.MAX_VALUE;  
        while (!queue.isEmpty ()) {  
            int size = queue.size ();  
            for (int i = 0; i < size; i++) {  
                Guide cur = queue.poll ();  
                int dist = Math.abs (target - cur.pos);  
                int bit = 32 - Integer.numberOfLeadingZeros (dist);  
                if ((1 << bit) - 1 == dist) {  
                    int hit_cost = cur.cost + bit;  
                    res = Math.min (res, hit_cost);  
                    continue;  
                }  
                if (cur.pos < target) {  
                    queue.offer (new Guide ((1 << (bit - 1)) - 1 + cur.pos, cur.cost + 2 + bit - 1));    //A(bit-1) RR  
                    queue.offer (new Guide ((1 << bit) - 1 + cur.pos, cur.cost + bit + 1));              // A(bit) R  
                } else {  
                    queue.offer (new Guide (cur.pos - ((1 << bit) - 1), cur.cost + bit + 1));             // A(bit) R  
                    queue.offer (new Guide (cur.pos - (1 << (bit - 1) - 1), cur.cost + bit - 1 + 2));   // A(bit - 1) RR  
                }  
            }  
            if (res < Integer.MAX_VALUE)  
                break;  
        }  
        return res;  
    }  

    class Guide {  
        int pos, cost;  
        Guide (int p, int c) {pos = p; cost = c; }  
        public String toString () {return String.format ("(%d,%d)", pos, cost); }  
    }  
}

被test4给break了; 5居然只要7? 想了半天想不通啊, 打trace也是对的;


cur:(0,0), dist:(5), bit:(3)  
[(3,4), (7,4)]  
level end, queue: [(3,4), (7,4)], res:(2147483647)  

cur:(3,4), dist:(2), bit:(2)  
[(7,4), (4,7), (6,7)]  

cur:(7,4), dist:(2), bit:(2)  
[(4,7), (6,7), (4,7), (6,7)]  
level end, queue: [(4,7), (6,7), (4,7), (6,7)], res:(2147483647)  

cur:(4,7), dist:(1), bit:(1)  
hit with distance: 8  

cur:(6,7), dist:(1), bit:(1)  
hit with distance: 8  

cur:(4,7), dist:(1), bit:(1)  
hit with distance: 8  

cur:(6,7), dist:(1), bit:(1)  
hit with distance: 8  
level end, queue: [], res:(8)  
5  
8  
7

实在想不到怎么得到7, 干脆自己穷搜一个看看;

想不通啊, 为什么是7啊? 怎么都不可能是7啊?

discuss有人讨论这个了:

https://leetcode.com/problems/race-car/discuss/125742/How-to-reach-5-in-7-steps/136613

0-1-3-3-2-2-3-5 (AARARAA)
I was confused by the same case and use a trick "if target == 5 return 7" to pass this case. The next case is "8 steps to reach 9" and I came out with the idea that the last part should be 2-2-3-5.

也就是0->3->2->3, 这个好像我的代码不cover啊, 走到3怎么还回头了啊;
感觉就是用来break greedy的思路的; 我现在的思路就很简单, 到了3, 要么去4或者7; 但是这里看来居然可以一个后撤步;

面试的时候这个要是没想到基本就是重大bug了;

感觉我这个greedy的BFS应该是凉了, 本来还准备发去discuss装一波的, 因为这个方法如果成功, 最后复杂度其实是O(N): 一个高度为logN的binary tree, 有N个node; 可惜了, 因为这个代码其实还挺花了点功夫的, 包括你看我其实是每一个Ak走完之后, 只要还没有到target, 那么就立刻就用R或者RR保持方向(始终朝向target)

基本是放弃治疗了, discuss看完也不知道怎么证明, 不过还是模仿他们的写法实现一遍; 因为DP写法我现在完全无法证明, 所以我就不实现那个了; 实现本身就是帮助理解和记忆, 无法证明的东西先补记忆也罢;

话说回来啊, DJ算法那个, 其实那个barrier我也没有证明啊; 如果我要从6到0, 那么我肯定局限在-16..16这个范围内活动, why?

不主动往远离target的方向走, 是不可靠的; 比如5这个例子, 你在3的时候, 要去5, 不就是先回头远离5了吗;
顺便说一下, 3的时候, 我本来的错误办法是RR A2 R A1, 但是正确的做法是R A1 R A2:

可以看到, 红色部分和紫色这两条路径, 实际上是对称的; discuss1的解释里面好像给了一个类似这个的例子, 我没仔细看; 不过大概意思就是单纯的RR应该被合并到RAmR这样的pattern里面去;
红色路线, 在3这里之所以先要RR一下, 就是因为我们在这里速度不理想(不是1), 所以不能直接A2 R A1 (which is 4, equal to the correct purple path's cost). 正是因为这样一个速度不合理的问题, 导致需要reset, 导致红色路线不对;

回过头来看, 我还是无法证明比如6->0这个input的话, 只要管住-16..16这个范围就行了; awice当时计算这个范围的时候其实也是有点粗枝大叶的;

要求-12..12行不行? 这个就有点类似discuss2里面那个优化, which I can not prove either;

不管了, 先假设可以这么写了, 我先用-12..12写写看, 因为这个没有-16..16这个那么slick; 如果出界到时候再说;

回头想想, awice的framework描述里面最后一段, 说是, 你所有的Ak的k, 不可能超过a+1, a就是比如如果target是6, 那么a就是3这样的; 这个实际上跟后面discuss2的那个0..2target的是类似的: 2^a首先是直接bound了tareget本身的, 然后他多给你+1, 实际上就是告诉你Ak能够走的距离, 最多不超过两倍target: 指数+1就是实际上整个*2.
当然, 他这个bound实际上比discuss2那里讲的更加松一点;

我觉得我这个算是可以解释了? 只要面试官认可这个两倍的论断, 是可以继续写的;
如果不认可, 这个两倍bound我暂时是真的没有办法证明;

写的过程中发现就算是editorial1的DJ算法其实也不好写; 记代码本身是没有出路的; 还是要记trace;

回头自己写的时候, 感觉trace不是很好理解, 比如这个:

K:(4), barrier:(16), dist length:(33)  
after initialization:  
pq: [(0,6)]  
dist: 6=0   

node: (0,6)  
check: dist[6] = 0 ≤ 0  
    k:(0), walk:(0), targ2:(-6), steps2:(1)  
    Update dist for -6 at dist[27]  
        pq: [(1,-6)]  
        dist: 6=0 27=1   
    k:(1), walk:(1), targ2:(-5), steps2:(2)  
    Update dist for -5 at dist[28]  
        pq: [(1,-6), (2,-5)]  
        dist: 6=0 27=1 28=2   
    k:(2), walk:(3), targ2:(-3), steps2:(3)  
    Update dist for -3 at dist[30]  
        pq: [(1,-6), (2,-5), (3,-3)]  
        dist: 6=0 27=1 28=2 30=3   
    k:(3), walk:(7), targ2:(1), steps2:(4)  
    Update dist for 1 at dist[1]  
        pq: [(1,-6), (2,-5), (3,-3), (4,1)]  
        dist: 1=4 6=0 27=1 28=2 30=3   
    k:(4), walk:(15), targ2:(9), steps2:(5)  
    Update dist for 9 at dist[9]  
        pq: [(1,-6), (2,-5), (3,-3), (4,1), (5,9)]  
        dist: 1=4 6=0 9=5 27=1 28=2 30=3

所以可以看到, 为什么我在6, 我走了A1R, 然后我到了-5? 为什么不能说我到了5? 如果我说我到了5, 就有一个问题: 你自己想想这个数轴, 你在6, 一开始朝0走, 然后走了A1=1的距离, 你这个时候是在5, 但是他awice这里的逻辑是, 你一旦这个Ak停止了, 立刻就给我R, 所以这个时候虽然我在5, 但是我是背对着0的;
当然, 正确的做法是从6开始, A1R, 然后RAk之类的, 但是RR是可以看成RA0R的, 所以我们从6, A1R到5, 背对, 然后下一个回合只要k继续选0, 我们就又在5原地转身, 这个情况就涵盖了;

所以我认为他这里的这个正负号并不能被舍弃: 我之前自己看editorial的时候的总结有点问题;
他这里的正负号实际上就是代表方向, 而且是一个相对方向: +代表是面向0(最终目标), 而-代表是远离0;
区分方向很好理解了就: 5和-5当然不一样了啊; 你虽然是站在5, 但是你面向0, 还是背对0, 所对应的走到0需要的cost是完全不一样的;

那么也就能解释为什么他的dist数组必须要涵盖正负对称两段了;
说起来awice自己的解释里面有提到这个观点了;

写到这里其实就很慌了, 这个代码最后按照这个思路设计下去, 就跟awice的思路太类似了; 面试的时候敢写吗;
我还是用两个数组来写? 或许可以; 就这样, 用两个数组来区分方向;
用正负号来把相同距离不同方向的bookkeeping合并到一个数组里面毕竟还是有点太slick了;

现在还不确定我两个数组的方法能够不能够写出来, 不过已经发现的awice的写法的优势已经包括:

统一处理的情况下, k循环里面的各种step什么的更新的计算式简单很多; 虽然并不是说就一定很容易想到;
终点0也好处理很多: 不用管你是哪个朝向的: 因为用正负号代表朝向, 对0来说都无所谓, 这也是为什么他的dist的长度可以是一个奇数;

有惊无险的稍微debug了一下之后, AC了, 速度波动几次在210-250(NA)水平, 还行吧, 知足了; 关键是用一个相对钝化过的方法来实现了这个方法;
当然, again, bound的正确性的解释还是有待和面试官讨价还价;

最后发现每一个i循环里面的这个更新公式并不是特别的trivial; 我最后干脆用的枚举来讨论: 考虑不同的face的情况下, 对应的pos, cost, 下一个face这些变量应该怎么变化;

别怕枚举

讲了很多遍的一个观念了, 我就不啰嗦了;

用很直白的一个方式自己AC了一下, 还是很开心的, 这个题目可以过了; 后面再来看大家对于正确性证明有没有什么进展;

不过我上面的写法有一个问题就是, 我尝试一旦relax 终点0成功, 就立刻return, 结果好像是不行的; 为什么呢? 上次有一道用DJ算法的题, 就是可以的, 为什么这题就不可以了?

刚开始之前那个BFS版本, 实际上应该是可以的, 那个就是当前一个level结束之后如果发现碰到终点了, 就结束就行了;
但是这里DJ算法跟BFS不一样, 没有什么level的关键, 优先级完全依靠cost本身, 所以那个思路并不适用;

其实这题帖子看的也不少了, 主要还是目前没有很好的正确性证明, 所以后面还是等discuss成熟一些再来看; 这个题就先补标"未完成"了;
TODO
看discuss有没有新的答案;

UNFINISHED
算了, 还是标一下; 有些东西的证明不学会, 就算是我自己的dumb搜索算法, 我感觉真正面试的时候都不太好说;

editorial

Approach Framework

先搞了一个framework, 不知道是什么意思;

前三段这个分析不错, 建立了这个sequence的模式: not just any sequence, 而是有一个pattern; 这样问题就简化了很多, 这个也是hard问题的一个套路; 事实上, 很多hard问题的初衷就是希望你能够展现出这种解剖和简化问题的能力;

注意他第三段这个结论跟我自己写我自己的解法的时候的分析是一样的; 我当时还是费了不少周折才想到这一步的, 不过这里看看人家, 好像很自然的样子;
当然, 我其实并没有跟他这样证明don't start with R. 这个如果是面试的时候可能会被打断并且问到;

第四段其实是跟我的想法类似的一个分析, 就是第一段Ak尽可能的多走; 就跟整数power2拆分一样, 一开始尽可能的多拆一点, 然后后面慢慢收缩;
当然他这里实际上也没有给出真正的证明, 一个WLOG就打发了;

第五段这个是一样的; 事实上我自己上面草稿好像对于这个有所分析; 既然找的是最短的, 这种明显冗余的可能性就可以尽快排除掉;

最后一段完全就是我的round up log2的计算了; 可以稍微走过头一点, 但是不要走过头太多, 这个就是他这里的这个bound的意思;
这个bound其实应该是可以证明的, 比如(2^m - 1) - (2^n - 1) = target, 那么我们可以声明m应该取这个target的log2 ceiling; 怎么证明? 直观理解, 假设m多走一点, 那么后面n就要多抵消回来一点, 最后m和n都不必要的变大了; 整个sequence长度也不必要的变大了;
但是严格证明还不好说;

所以我的方向几乎是没什么错的;
然后他就开始用DJ算法了...到目前为止完全没有clue; 为什么突然就想到用DJ算法?

Approach #1: Dijkstra's [Accepted]

Intuition

With some target, we have different moves we can perform (such as k1=0,1,2,⋯, using the notation from our Approach Framework), with different costs.

This is an ideal setup for Dijkstra's algorithm, which can help us find the shortest cost path in a weighted graph.

这个观察很敏锐了, 他分析出来我们最后要找的是[A...AR]+这样一个模式之后, 每一个Ak段他认为是一个edge; edge的cost就是这个段的长度; 所以他认为这个实际上是一个最短路径问题, 这个我就真的没有意识到;

当然一个原因是感觉当时掉到greedy或者说DP的坑里没爬出来;

Algorithm

Dijkstra's algorithm uses a priority queue to continually searches the path with the lowest cost to destination, so that when we reach the target, we know it must have been through the lowest cost path. Refer to this link for more detail.

Back to the problem, as described above, we have some barrier where we are guaranteed to never cross. We will also handle negative targets; in total we will have 2 * barrier + 1 nodes.

After, we could move walk = 2**k - 1 steps for a cost of k + 1 (the 1 is to reverse). If we reach our destination exactly, we don't need the R, so it is just k steps.

一段长度为k的Ak可以走的距离是2**k -1, 这个也是分析过的了;
对于最后的这个R, 有一个专门的特殊分析;
看起来描述倒是很简单, 看一下算法具体怎么实现的;

class Solution {  
    public int racecar(int target) {  
        int K = 33 - Integer.numberOfLeadingZeros(target - 1);  
        int barrier = 1 << K;  
        int[] dist = new int[2 * barrier + 1];  
        Arrays.fill(dist, Integer.MAX_VALUE);  
        dist[target] = 0;  

        PriorityQueue<Node> pq = new PriorityQueue<Node>(  
            (a, b) -> a.steps - b.steps);  
        pq.offer(new Node(0, target));  

        while (!pq.isEmpty()) {  
            Node node = pq.poll();  
            int steps = node.steps, targ1 = node.target;  
            if (dist[Math.floorMod(targ1, dist.length)] > steps) continue;  

            for (int k = 0; k <= K; ++k) {  
                int walk = (1 << k) - 1;  
                int targ2 = walk - targ1;  
                int steps2 = steps + k + (targ2 != 0 ? 1 : 0);  

                if (Math.abs(targ2) <= barrier && steps2 < dist[Math.floorMod(targ2, dist.length)]) {  
                    pq.offer(new Node(steps2, targ2));  
                    dist[Math.floorMod(targ2, dist.length)] = steps2;  
                }  
            }  
        }  

        return dist[0];  
    }  
}  

class Node {  
    int steps, target;  
    Node(int s, int t) {  
        steps = s;  
        target = t;  
    }  
}

这个barrier好像就是... 如果target是6, 那么6-1=5有29个leading zero, 那么K就等于3 + 1, 实际上也就是5的binary表示的有效长度+1;
然后算出来的barrier是16, 给dist的长度是33. dist长度好像就是因为要考虑barrier可能是往正负两个方向;

暂时还是不知道这个barrier限制的到底是什么; dist的长度是33, 所以应该是有33个node的意思;

public static int floorMod(int x,  
                           int y)
Returns the floor modulus of the int arguments.
The floor modulus is x - (floorDiv(x, y) * y), has the same sign as the divisor y, and is in the range of -abs(y) < r < +abs(y).

The relationship between floorDiv and floorMod is such that:

floorDiv(x, y) * y + floorMod(x, y) == x
The difference in values between floorMod and the % operator is due to the difference between floorDiv that returns the integer less than or equal to the quotient and the / operator that returns the integer closest to zero.

Examples:

If the signs of the arguments are the same, the results of floorMod and the % operator are the same.

floorMod(4, 3) == 1; and (4 % 3) == 1

If the signs of the arguments are different, the results differ from the % operator.

floorMod(+4, -3) == -2; and (+4 % -3) == +1

floorMod(-4, +3) == +2; and (-4 % +3) == -1

floorMod(-4, -3) == -1; and (-4 % -3) == -1
If the signs of arguments are unknown and a positive modulus is needed it can be computed as (floorMod(x, y) + abs(y)) % abs(y).

暂时还是看不懂到底是在干什么; 先打trace看看算了; 一点头绪都没有的算法, 不想先手写trace, 干脆直接先打一个trace出来看看到底是在干什么; 然后后面需要具体分析逻辑细节的时候, 再去手算什么的;

可以看一下他这里用这个floorMod完成的这个映射操作:

java.lang.Integer res0 = 29  
java> Math.floorMod (-6, 33)  
java.lang.Integer res1 = 27  
java> Math.floorDiv (-6, 33)  
java.lang.Integer res2 = -1  
java> for (int i = -16; i <= 16; i++) {  
    | System.out.printf ("%d -> %d\n", i, Math.floorMod (i, 33));  
    | }  
-16 -> 17  
...  
-1 -> 32  
0 -> 0  
1 -> 1  
...  
16 -> 16

0..16这个范围还是正常的, 然后-16..-1这个就是映射到了后半段; 为什么把负数放到后面? 这样可能是避免0..16这个范围的都要shift一个offset他觉得很麻烦; 不过其实自己offset也不是什么很坏的办法; 看个人习惯吧; 这个floorMod我现在还不是很有把握;

注意看这两段内的相对顺序并没有变; -1这种在负数段的最右边;

如果是input是1的情况下, 这个是头两个while iteration的trace:

K:(4), barrier:(16), dist length:(33)  
after initialization:  
pq: [(0,6)]  
dist: 6=0   

node: (0,6)  
check: dist[6] = 0 ≤ 0  
    k:(0), walk:(0), targ2:(-6), steps2:(1)  
    Update dist for -6 at dist[27]  
        pq: [(1,-6)]  
        dist: 6=0 27=1   
    k:(1), walk:(1), targ2:(-5), steps2:(2)  
    Update dist for -5 at dist[28]  
        pq: [(1,-6), (2,-5)]  
        dist: 6=0 27=1 28=2   
    k:(2), walk:(3), targ2:(-3), steps2:(3)  
    Update dist for -3 at dist[30]  
        pq: [(1,-6), (2,-5), (3,-3)]  
        dist: 6=0 27=1 28=2 30=3   
    k:(3), walk:(7), targ2:(1), steps2:(4)  
    Update dist for 1 at dist[1]  
        pq: [(1,-6), (2,-5), (3,-3), (4,1)]  
        dist: 1=4 6=0 27=1 28=2 30=3   
    k:(4), walk:(15), targ2:(9), steps2:(5)  
    Update dist for 9 at dist[9]  
        pq: [(1,-6), (2,-5), (3,-3), (4,1), (5,9)]  
        dist: 1=4 6=0 9=5 27=1 28=2 30=3   

node: (1,-6)  
check: dist[27] = 1 ≤ 1  
    k:(0), walk:(0), targ2:(6), steps2:(2)  
        pq: [(2,-5), (4,1), (3,-3), (5,9)]  
        dist: 1=4 6=0 9=5 27=1 28=2 30=3   
    k:(1), walk:(1), targ2:(7), steps2:(3)  
    Update dist for 7 at dist[7]  
        pq: [(2,-5), (3,7), (3,-3), (5,9), (4,1)]  
        dist: 1=4 6=0 7=3 9=5 27=1 28=2 30=3   
    k:(2), walk:(3), targ2:(9), steps2:(4)  
    Update dist for 9 at dist[9]  
        pq: [(2,-5), (3,7), (3,-3), (5,9), (4,1), (4,9)]  
        dist: 1=4 6=0 7=3 9=4 27=1 28=2 30=3   
    k:(3), walk:(7), targ2:(13), steps2:(5)  
    Update dist for 13 at dist[13]  
        pq: [(2,-5), (3,7), (3,-3), (5,9), (4,1), (4,9), (5,13)]  
        dist: 1=4 6=0 7=3 9=4 13=5 27=1 28=2 30=3   
    k:(4), walk:(15), targ2:(21), steps2:(6)  
        pq: [(2,-5), (3,7), (3,-3), (5,9), (4,1), (4,9), (5,13)]  
        dist: 1=4 6=0 7=3 9=4 13=5 27=1 28=2 30=3

感觉他有一个隐性的优化, 就是他这个最短路径, 实际上是从6出发, 然后找0; 你看他最后代码比较的最终sink也是0; 当然两者之间的等价还是很好理解的;

if (dist[Math.floorMod(targ1, dist.length)] > steps) continue;

这一句应该就是简单的一个lazy deletion. 用java本身的PriorityQueue来写DJ算法的话, 是经常会用到这个方法; 不算第一次见到了;
话说回来, 让你写你知道怎么写吗?
大概原理就是dist里面像现在对于这个target(从6开始, 希望走到0)保留的最好结果(从6开始走到这里需要的最少的steps), 所以如果你queue里面Poll出来一个居然还比dist结果还要差的steps结果for this target, 那么就可以直接丢弃了;

到目前为止, 我还是不理解他这个K和barrier到底是怎么计算出来; 偏巧这两个东西在循环代码里面还属于是比较重要的;

比如我们现在站在6这里, 然后我们有k=0..4种选项, 分别代表A0..A4. 你看他实际代码的计算, 对应这个k的walk距离算出来之后, 是用walk来减这个6的, 什么意思呢, 比如k=4, 算出来walk是15, 那么我们从6开始, 应该走到-9(原点反方向, reminder: 原点是我们的目标); 如果我们在-9, 那么应该是到6. 我们最终目标都是走到0, 所以中间会有一个类似在原点周围振荡的过程; 这个是一个很好的逻辑; 配合他之前的把0和6对调(6走向0而不是0走向6), 最后这个震荡更容易理解一些;

那么这个从6走15应该走到-9, 从-6走15应该走到9的操作, 用他的一个walk - target就简单的表示出来了; 不对诶, 他从6开始, 下一步是走到9了? 我分析错了?

换个角度, 你看他说, (0, 6)能走到(1, -6), 这个是为什么? 如果只是一个R, 你只是换方向了, 并没有任何的距离改动才对啊?

如果我理解, 下一个位置的计算应该是target - walk, 也就是我在6, 我就像6的反方向走walk=15的距离到-9;
不过话说回来, 这个正负号不处理其实也没有什么关系; 你在6了, 你要走AAAAR了, 那么你下一步在(5, 9), 还是(5, -9), 有差别吗?
而且因为最后要找的结果是0, 所以正负号对于最后的target没有区别, 你是从-9走到0, 还是从9走到0, 也无所谓;

不过我还是觉得不是很严谨;

反过来回头想, 如果真的是本着符号无所谓的态度, 干脆target2直接取绝对值不行吗? 这样dist数组也可以省下来一半长度; 不过这些还都属于Premature Optmization, 后面再说;

这个算法大概的意思是理解了; target, 包括从6走到0, 代表graph里面的node; 因为都是数字, 正常的用一个数组可以搞定;
在每一个node, 你有集中选择, 走一个AkR的步骤; 到达下一个node(target);
这个AkR这一个edge的cost就是k+1, 除非最后走到了0了已经, 那就是k; 最后的R不要了;

if (Math.abs(targ2) <= barrier && steps2 < dist[Math.floorMod(targ2, dist.length)]) 这个应该也是不难理解的; targ2不能太远(否则就出界dist了), 然后steps2不能太差: 要比dist[targ2]现在的这个dist要好, 要能relax targ2当前的距离, 那么我们就进行更新: 新的(targ2, steps2)扔到PQ里面去, 然后dist更新一下;

这题难主要还是难在这个graph model非常的隐晦; 我反正是基本把他的framework完全分析出来了, 但是没有想到他这个转换到最短路径来解决的思路;

我回头来看他这个: if (dist[Math.floorMod(targ1, dist.length)] > steps) continue;, 这个写的对吗? 如果我Poll出来一个steps, 比我现在dist里面对这个targ1存的dist要小, 结果还直接discard? 这个感觉就是一个完全没有执行的吧? Poll出来的steps要么比当前存放的这个dist要相等(因为上一个循环enqueue他的同时, 是更新了dist的), 要么就是比dist这个entry要大, 怎么会小呢?

看他的k, 是从0开始的, 所以我自己的解法的致命伤: Ak1RRAk2(同向归零)这种操作, 他这个是完全可以实现的;

我搜索了一下对于前三个test的trace, 这个pruning一次也没有触发; 我感觉他就是写错了;

我把这里改成<之后居然也AC了; 但是速度变慢了: 280变成了320. 波动了一下又变成了200了; 所以我认为我还是对的;

leetcode_deleted_user 17 Apr 15, 2018, 7:22 AMReport
In the approach 1, the sentence 'if dist[targ] > steps: continue' is redundant since the condition is never satisfied.

还是有人认同我的观点的; 不过说redundant还是有点过分了; 这个人应该是没有理解这个pruning的真实意图. 但是算法本身的flow他是理解了的;

那么这个做法到目前为止唯一不明白的地方就是K和barrier是怎么算出来的了; 有些搜索问题, 这个有效上限的计算是相当有讲究的;

跑步回来大概是理解了; 首先, 我们还是先确定, what question am I asking?

不要用特别抽象的方法去思考这个问题, 这里我们就更加具体的(也就是结合一个实际的例子, 或者说input): 为什么input是6的时候, 也就是我们要从6走到0的时候, K只能是4? 更进一步; 为什么我们的目标是6的时候, 站在每一个点上面的时候, 我最多只能连续走4个A(光是问, 为什么K最多是4, 这个还是太抽象了: K本身到底是什么? 什么含义?)?
这样可能就好理解很多了: 这个为什么在最上面framework的时候其实就已经解释过了; 所以这个基本就很好理解了;
然后因为最多可以搞一个A4出来, 所以最远可以走到16(其实不到, 但是他放宽了一点);

另外就是他这个计算这个K的方法, leading zero之类的东西, 用来计算这个也是不错的思路;

然后floorMod这个来handle一个负数往array index的一个mapping, 也是一个不错的思路吧;

总体来说我觉得这个算法是一个需要重点掌握的算法, 非常的有意思, 而且也很考察扎实的基本功;

Approach #2: Dynamic Programming [Accepted]

这解释的乱七八糟的, 完全没有看懂;

还是先用6这个具体的例子来看看他代码在干什么;

class Solution {  
    public int racecar(int target) {  
        int[] dp = new int[target + 3];  
        Arrays.fill(dp, Integer.MAX_VALUE);  
        dp[0] = 0; dp[1] = 1; dp[2] = 4;  

        for (int t = 3; t <= target; ++t) {  
            int k = 32 - Integer.numberOfLeadingZeros(t);  
            if (t == (1<<k) - 1) {  
                dp[t] = k;  
                continue;  
            }  
            for (int j = 0; j < k-1; ++j)  
                dp[t] = Math.min(dp[t], dp[t - (1<<(k-1)) + (1<<j)] + k-1 + j + 2);  
            if ((1<<k) - 1 - t < t)  
                dp[t] = Math.min(dp[t], dp[(1<<k) - 1 - t] + k + 1);  
        }  

        return dp[target];    
    }  
}

你看他这里base case也是给的很多的; 所以以前也是说过的, DP多给几个base case, 没毛病;

t=3的时候, 直接是进入第一个if, 因为3 = 2^2 - 1;
t=4的时候, k=3, 2^3-1=7≠4=t; 第一个if被跳过了; j=0..2: j=0, dp[4]=min(dp[4], dp[4-2^2 + 2^0] + 3-1 + 0+2), 这个是什么意思?
这样, 我们还是先思考我们自己会怎么解决这个问题: 我现在在4, 我在思考, 我怎么来的呢;

算了打trace:

t:(3), k:(2)  
dp:0=0 1=1 2=4 3=2   

t:(4), k:(3)  
    j:(0): [4 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5   
    j:(1): [4 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5   
dp:0=0 1=1 2=4 3=2 4=5   

t:(5), k:(3)  
    j:(0): [5 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=8   
    j:(1): [5 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7   
dp:0=0 1=1 2=4 3=2 4=5 5=7   

t:(6), k:(3)  
    j:(0): [6 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=6   
    j:(1): [6 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=6   
dp:0=0 1=1 2=4 3=2 4=5 5=7 6=5   
6  
5, expected: 5

大概是能理解一个大概的思路, 用站在4找爸爸不太好, 我们站在5找爸爸; 在5这里, 我们先看的是5 - 4 + 1 = 2, 意思也就是2 - 1 + 4 = 5这个path; 注意, awice计算的时候有一个优化, 实际上这个path真正的组成应该是: 2 - (1 - 1) + (4 - 1) = 5, 第一个小括号是反向A0, 然后同向(跟2同向)一个A2, 就到了5了, 总共的代价是0 + 2 + 2, 也是符合上面的trace的; 所以现在至少知道他在干什么事情了;

那么为什么这样算? 尤其是我现在站在5, 为什么知道有一段是用A(k-1) = A2? 以及反向A0然后同向A2, 代表的是什么意思? 我现在在2, 我R, 然后我A0, 然后我又R, 然后我A2, 到了5; 哦, 就是RRAk这个模式, 也就是我自己的greedy没有解决的这个问题; 这里这个逻辑就很自然的handle了; 类似的, 你看他比如5的下一个case: 3, R, A1, R, A2, 走到5, 看到没有, 这两个逻辑完全统一起来了; 这个很6;
当然, 这个也是DP区别于greedy的一个优势所在了: 就是可以统一不同的可能性; 不过这个也是说起来轻巧, 真正写的时候哪有这么简单;

回头看看好像awice上面的讲解也没什么问题;
那么下面就是, 比如我3往5走, 怎么知道是R A1 R A2这样的组合? 尤其是, 5之前的最后一段, 这里是固定是A(k-1) = A2的; 为什么有这个决定?
awice这里实际上就是告诉你了, 应该这么走了; 他这里实际上是把原来的这个比如target往0走(或者0往target走)的问题, generalize到更大的一个: 任何一个i往任何一个j走; 比如这里我们要研究的是, 2怎么走到5? 3怎么走到5?
awice这里是把这个决策方式已经告诉你了: how to justify?

尤其是, 你自己描述一下他这个path选择的heuristic: 他这里的意思, 就是目标(大的那个数字)的parent link一定是A(k-1), 然后总共肯定只有两个link走过来, 那么第一个link肯定就是剩下的, 反正好算;
这样一个思路有什么根据?

他没有justify;

首先, 假如是站在3这里, 或者7这里, 按照if那一条, 直接我们就一个从0开始的Ak就能走过来; 没错, 不用考虑其他任何的爸爸; 因为这个是一个指数增长的过程: 你想想你在0, 然后你知道只要你指数级别增长速度, 你最后肯定会走到7, 不会走过头, 那还能有比这个更快的吗? 直接走就行了;

我们再来看他怎么找6的:

3, R, A0, R, A2
4, R, A1, R, A2

你看他对于6这个终点, 两个循环(j=0..1)尝试的分别是什么?

尝试的其实好像是不同的起点, 对吧?
那么对应于这里不同的起点, 什么是相同的? 最后一个link是相同的;
不知道怎么justify, 那么举一个反例看看呢? 为什么不能从2走到6? 如果从2开始, 我们应该是要: 2 A1 R R A2, 好像跟4也没有区别?
能不能从1来? 好像不能了; 可以自己尝试一下;

反过来看他的代码, 他代码干的事情实际上是这样的, 我现在要给6找爸爸: 首先我一个向原点方向(这里没有用上面DJ算法的反向震荡的思路, 所以大的目标是从0找到6)一个反向伸出去一个A2; 然后这个A2的起点, 进行循环: 要么就是A0, 意思就是咱们就从这里出发的, 要么就是一个正向A1走到4;
那么这个循环有什么意义? 为什么用这个思路搜索的爸爸是最有力的candidate parents?

我看他倒数二三两段, 感觉好像看懂了一点; 第一个for循环肯定就是这个don't drive cross的意思, 所以才是这里固定6的parent link是A2的原因; 当然, 这个并不能解释所有的代码, 比如, 为什么这个path只有两个link? 为什么这个path的第一个link一定要跟第二个link反向? 等等;
然后for后面的这个最后的if, 就是处理, 假如我们实际上是从比如...上面的trace忘记打这个if了, 打一下;

t:(3), k:(2)  
dp:0=0 1=1 2=4 3=2   

t:(4), k:(3)  
    j:(0): [4 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5   
    j:(1): [4 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5   
    [7 - 4] + 3 + 1, dp: 0=0 1=1 2=4 3=2 4=5   
dp:0=0 1=1 2=4 3=2 4=5   

t:(5), k:(3)  
    j:(0): [5 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=8   
    j:(1): [5 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7   
    [7 - 5] + 3 + 1, dp: 0=0 1=1 2=4 3=2 4=5 5=7   
dp:0=0 1=1 2=4 3=2 4=5 5=7   

t:(6), k:(3)  
    j:(0): [6 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=6   
    j:(1): [6 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=6   
    [7 - 6] + 3 + 1, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=5   
dp:0=0 1=1 2=4 3=2 4=5 5=7 6=5   
6  
5, expected: 5

还是有被触发的;

比如6这里, 我们是找的是7 - 6 = 1, 意思就是1, A3, 到了6; 不对, 7 - 6 = 1, 6 = 7 - 1, 这个看起来像是0, A3, R, A1这样的操作, 但是这个不对, 因为这个不是从dp[1]出发了, 而且跟后面的3 + 1的cost也不对应: 多了1;
我唯一能想到的理解方式, 就是我们实际上是从-1过来的: -1, A3, 也不对啊, 这样的cost就应该是3了, 不是3+1;
他给的解释是, AkR, 什么意思? 从哪里A3R能到6?

我突然想到一个问题, 如果面试碰到这个问题, 你怎么表达这个问题? 你直接用Ak这个notation吗? 那么如果面试官看过这个LeetCode上面的题, 肯定知道我是看答案来的? 自己换一个notation?
不过Google不是自己本身就鼓励准备的时候刷LeetCode吗? 也不知道怎么个说法;

回到正题;
睡觉的时候又多想了一下, 还是无法理解这个思路; 在下面问了问题:

I am still trying to understand the DP approach, for an input of 6, the iteration where t is 6 would have calculations like this:
t:(6), k:(3)  
    j:(0): [6 - 4 + 1] + 2 + 0 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=6   
    j:(1): [6 - 4 + 2] + 2 + 1 + 2, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=6   
    [7 - 6] + 3 + 1, dp: 0=0 1=1 2=4 3=2 4=5 5=7 6=5   
dp:0=0 1=1 2=4 3=2 4=5 5=7 6=5
The first two j lines corresponds to the for loop. The last line corresponds to the last if line.

So if I understand this correctly, that means our result for dp[6] can be calculated in three ways:

start at 3, R, A0, R, A2, reach 6

start at 4, R, A1, R, A2, reach 6

start at -1, R, A3, reach 6

Assuming left means going in the negative direction in the numerical axis, and right means goes in the positive direction, like 0 to 3. I think your dp array is using some symmetry here, in that dp[i] actually stores the result for both dp[-i] and dp[i]: -i and i do have different default direction: the default direction at -i is left, and i is right.

I am not sure I am understanding the calculations correctly, but tentatively I have a few questions:

in the last calculation, which you describe as crossing the target, when we start at -1, I have to R first, because I am at the negative direction and my dp[-1] result currently corresponds to a default direction of left. So if I want to reach dp[6], whose default direction is right, I should first R at -1, then go A3 directly to 6.

then the question is, in the first calculation, when I start at dp[3], why do I have to do 2 R first? I am at dp[3], my default direction is right because I am in the positive half segment, why can't I just directly A2 to 6 with cost of 2? UPDATE: I think I understand this part now, my speed at dp[3] is not 1, so I should RR to reset my speed first.

In the first two calculations, according to your actual code lines, I have to make sure that my last link before reaching 6 should always be A2: so why is that? I am not wondering about the calculation about k=3, I am just asking, how do you know I should only reach 6 in this case with only 2 links? how do you know that the last one has to be A2? if I start at dp[2], then I A1 R R A2, then I actually have the same result as starting from dp[4], how does your logic handle duplicate paths like this?

如果我自己写这个DP, 我会这样写, DP数组用一个类似比如target x 2这样写, 第二个维度对应, 我当前的方向是正向还是负向. 感觉这个更正常, 后面可以考虑自己实现一下;

不过暂时先看一下discuss; 这个是没办法了; 这题后面如果面试碰到, DP肯定会问; awice这个我根本没有理解的DP我是肯定没有办法说的;

editorial2这个DP实在实在是看不懂了, 直接去discuss1看一看, 也是一个DP;

https://leetcode.com/problems/race-car/discuss/123834/C++JavaPython-DP-solution-average-O(logN)

n is the length of target in binary and we have 2 ^ (n - 1) <= target < 2 ^ n

We have 2 strategies here:

Go pass our target , stop and turn back
We take n instructions of A.
1 + 2 + 4 + ... + 2 ^ (n-1) = 2 ^ n - 1
Then we turn back by one R instruction.
In the end, we get closer by n + 1 instructions.

Go as far as possible before pass target, stop and turn back
We take N - 1 instruction of A and one R.
Then we take m instructions of A, where m < n

class Solution {  
    int[] dp = new int[10001];  

    public int racecar(int t) {  
        if (dp[t] > 0) return dp[t];  
        int n = (int)(Math.log(t) / Math.log(2)) + 1;  
        if (1 << n == t + 1) dp[t] = n;  
        else {  
            dp[t] = racecar((1 << n) - 1 - t) + n + 1;  
            for (int m = 0; m < n - 1; ++m)  
                dp[t] = Math.min(dp[t], racecar(t - (1 << (n - 1)) + (1 << m)) + n + m + 1);  
        }  
        return dp[t];  
    }  
}

说是DP, 其实是一个recursion+memo; dp的index是t, t也是recursion的参数, 然后dp的长度给的就是target的范围, 所以每一个t(node)对应的就是一个坐标轴位置应该; 传统定义;

n的算法倒是跟我一样, 而不是用awice那个办法;

好像跟editorial2的差不多啊, 回到那里用的6这个例子, 你看他dp[6]先初始化到dp[7-6]+3+1, 跟editorial2当时的例子一模一样的; 我说awice怎么editorial2讲的那么烂, 原来又是直接在discuss看来然后自己改写的; 无非是人家原作者用的是recursion, 他改成了Bottom-Up的iteration填表;

LuckyPants 116 Apr 15, 2018, 12:13 AMReport
Nice dp solution. But not very clear explanation. Here is my explanation:
Consider two general cases for number i with bit_length n.

i==2^n-1, this case, n is the best way

2^(n-1)-1<i<2^n-1, there are two ways to arrive at i:

Use n A to arrive at 2^n-1 first, then use R to change the direction, by here there are n+1 operations (n A and one R), then the remaining is same as dp[2^n-1-i]

这个是解释了我的一个主要疑惑, 其实是利用一个对称性; 我觉得这个其实不是特别好想, 尤其是如果是面试高压力的情况下, 能够这么跳脱的想到这个对称性, 估计Guoye那个脑子可以;

Use n-1 A to arrive at 2^(n-1)-1 first, then R to change the direction, use m A to go backward, then use R to change the direction again to move forward, by here there are n-1+2+m=n+m+1 operations (n-1 A, two R, m A) , current position is 2^(n-1)-1-(2^m-1)=2^(n-1)-2^m, the remaining operations is same as dp[i-(2^(n-1)-1)+(2^m-1)]=dp[i-2^(n-1)+2^m)].

这个解释的很好很好; 而且这里为什么要把n和m分开, 也就是类似awice那里说, 为什么最后一环应该是A(n-1)这个一样; 他这里给出的解释很具体; 而且这个其实解释了input是5的时候, 怎么走7那个问题: 要后撤步的走法;

那么怎么证明只有这两种走法呢? 对吧, 其实他这个也没有证明, 第二种情况下, 应该先来一个A(n-1);

Why dp in this way?

I first think the dp way should be:
dp[i] = min(n+1+dp[2**n-1-i], n-1+2+dp[i-2**(n-1)+1])

这个好像跟我后来写的这个greedy BFS有点类似;

But it's wrong, look at the (n-1) A case, we do A for (n-1) times, then do two R, then the situation is the same as dp[i-2(n-1)+1]. This can be larger than the actual min operations since, there may be redundant R operations, we can combine RR operation with the remaining (2(n-1)-1) to i path. So we use m to go backward between the two R operations and count the remaining (2^(n-1)-2^m) to i path to include the combining situation.

他下面还特别针对5这个例子解释了一下;
以及他这里总结的意思是, 我那个greedy的思路, 就是直接简单粗暴的RR减速, 实际上是不够general的, 两个R之间可能还有东西;

For example:

target = 5

The right answer should be AARARAA, positions: 0, 1, 3, 3, 2, 2, 3, 5
But if we use the above formula, the answer is AA (0->3) RR (make speed at 1 again) AARA (3->5)

We can move the last A to the middle of RR, so that we save an operation. That's where the combine can happen.
So we do dp by adding m A between the RR and add the # operations for remaining distance.
If you have better explanation, please let me know. Thanks.

这个解释得到的vote比他原作者答案还要高, 了解一下;

最后看完, 实际上也不是非常完整, 尤其是他说的:

there are two ways to arrive at i:

这个其实他并没有给出证明, 为什么其他的不行;

larrywang2014 540 Apr 15, 2018, 5:35 PMReport
The key thing that is not convincing to me is that how you prove your solution will result in the minimum number of instructions. Basically, you just gave 2 possible ways to reach target. These 2 possible ways may or may not result in the minimum number of instructions.

完全没有错; 他和awice其实都没有给出证明;

weishun 2 Apr 15, 2018, 4:41 AMReport
Nice solution. However I don't understand how you know these two strategies covers all possibilities?
For, "1. Go pass our target , stop and turn back",
Is it possible that the minimum solution is that go to 2^(n+k) with k>0 and then go back?
Thank you very much!

好像意识到这个证明不完整性的很多都是中国人;

作者回复:

lee215 1574 Apr 15, 2018, 5:17 AMReport
This was an intuition at first. I might add some explanations for this part later.

然而半个多月过去了;

kamfung2017 2 Apr 16, 2018, 11:31 PMReport
I think it is not hard to show that k=0 is the best strategy in cases when we decide to go pass the target and come back. What is harder for me to see is if we decide not to go pass the target, why would the best stagey be to go n-1 steps and stop to go back, why would it not be the best if we go k steps, where 1< k < n -1 and stop to go back.

那这题就凉了啊, 先摆着吧. 平时带着想一想, 暂时补死磕了; 发了个邮件给剑神, 不知道有没有想法;

TODO
无法证明

https://leetcode.com/problems/race-car/discuss/123884/Accepted-Java-solution-with-BFS

class Solution {  
    class CarInfo{  
        int pos, speed;  
        public CarInfo(int p, int s) {  
            pos = p;  
            speed = s;  
        }  
    }  
    public int racecar(int target) {  
        Set<String> visited = new HashSet<>();  
        String begin = 0 + "/" + 1;  
        visited.add(begin);  
        Queue<CarInfo> queue = new LinkedList<>();  
        queue.add(new CarInfo(0,1));  
        int level = 0;  
        while (!queue.isEmpty()) {  
            int size = queue.size();  
            for(int i = 0; i < size; i++) {  
                CarInfo cur = queue.poll();  
                if (cur.pos == target) return level;  
                String s1 = (cur.pos + cur.speed) + "/" + (cur.speed * 2);  
                String s2 = cur.pos + "/" + (cur.speed > 0 ? -1 : 1);  
                if (Math.abs(cur.pos + cur.speed - target) < target && !visited.contains(s1)) {  
                    visited.add(s1);  
                    queue.add(new CarInfo(cur.pos + cur.speed, cur.speed * 2));  
                }  
                if (Math.abs(cur.pos - target) < target && !visited.contains(s2)) {  
                    visited.add(s2);  
                    queue.add(new CarInfo(cur.pos, cur.speed > 0 ? -1 : 1));  
                }  
            }  

            level++;  
        }  
        return -1;  
    }  

}

简单的BFS, 好像discuss好多这样写的BFS, 就是很直接的, 一个字母一个字母的构建这个sequence, 维护speed之类的东西, 好像下面contest也有一个是这么写的;

暂时没有仔细看了, 这个应该不难吧?

好像不是那么简单哦:

想回到球场 8 Apr 15, 2018, 2:37 AMReport
I have same idea with yours. But you get an amazing pruning part. Which cut down the 75% the time. But can you prove it?
(Math.abs(cur.pos + cur.speed - target) < target

然后up自己回的:

AdaZhang 53 Apr 15, 2018, 5:12 PMReport
Unfortunately, I do not know how to prove it. Actually this is also the part I struggled in the contest. If you find out the method to prove it, please let me know. Thanks!

larrywang2014 540 Apr 15, 2018, 5:40 PMReport
@AdaZhang guess you will be rejected if you code something you yourself do not even understand during an interview.

这个也是实话实说, 一点头绪都没有的解法, 真的不要在面试当中乱说, 还不如写一个suboptimal的解法;

这题下面评论主要就是集中在这个, 我先看一下这个代码;

好像不是特别难; 封装的是position和speed, 这个还好理解; 然后循环里面, s1和s2分别对应的在当前这个cur的基础上, 我们选择A还是选择R;
最后如果hit, 返回值是level; 这个是因为我们每一个level实际上选择的就是单独的一个letter; 注意他这里的BFS写法, 是里面有size这个隔断, 而不是直接不管level分界线的;
距离的更新记录在下一个CarInfo的position里面, 这个不难理解;

所以最后基本就归结于这个优化上面了; 顺便, 这个算法确实是一个需要掌握的算法: 面试的时候想不出来DP, 这么一个简单的搜索是必须要能想到的;
注意看他这里visited的实现方式, 没有给CarInfo增加一个hash函数(好像改写hash一定要同时改写equals, 有点麻烦, 不对, 是反了吧? 改equals一定要改hash), 而是直接手动写一个tostring一样的东西, 然后记录这个的hash; 面试来说, 也不失为一个办法;

看这个优化, 这个优化的意思是, 首先, speed这个时候是没有更新的, 还是刚刚Poll出来的那个speed;
然后第一个if: 我们能不能enqueue s1, 也就是能不能在这里A? 他的判断是, 我们先当前位置按照原速度走一步, 然后如果在target左边不超过原点, 或者target右边不超过target一倍的位置, 我们才可以在这里A; 这个是怎么解读?

整个不是一个正确性要求, 而是一个pruning要求; 所以如果在[0, 2target]以外, 我们照样enqueue, 最后不影响代码的正确性, 就是慢一些; 那么这里要证明的目标应该是, 假设真的是那么远开外, 那么从那个位置走得到的sequence, 肯定不是最优的(最短的);

我现在在pos, 我有一个speed, 我想我下一步干什么; 我能不能A呢? 如果我A的话, 我马上的位置就更新为pos + speed; 然后speed会加倍, 不过这个对当前iteration没有影响;
所以这里判断的是如果我们采纳目前这个类似前++得到的speed, 那么我们会不会走太远了?
感觉这个也挺intuitive的, 能够严格证明吗?
比如我目标是6, 我在8, 我速度是+5, 那么我如果这个时候继续A, 我就走到13去了, 我要证明我在(8, +5)这里, 必须立刻R;

我怎么证明这个时候往右跳到13不明智呢? (注意我这里的思考方法, 就是不停的把你要解决的难题, 平淡语言化);

我感觉更难证明的其实是这个: Math.abs(cur.pos - target) < target; 意思是, 如果我当前决定不A, 也就是不采纳这个被上一个iteration给前++出来的speed的话, 我的位置还在[0, 2target]内的话, 我就不应该R, 意思是, 我应该A? 凭啥?
比如target是4, 那么我们先走到3, 这个时候还在0..6里面, 然后他这里的意思就是不应该R, 但是我们下一步实际上是RR了啊, 然后在A1的;
哦不对, 我理解错了: 如果不采纳之前的speed, 可以维持在0..6, 那么就可以R, 那么这个就没问题了;

而之前那个if的意思是, 如果采纳之前提供的那个speed, 还能维持在0..6, 那么就可以A; ;
这个逻辑上是没有问题的;
那等于意思就是, 用类似扫地机器人那种思路, 你一个car, Poll出来之后, 就只看你当前的信息, 然后决定你能不能A, 能不能R: 注意, 这两者不互斥的, 这也是为什么人家这里没有用else;
无论你希望A还是R, 只要最后不超过0..2t这个区间, 就没问题; 那么, 怎么证明这个;

这个好像没那么好证明啊;

一个简单的论断是, 你在8的时候有两个选择, 一个是R到(8, -1), 一个是A到(13, 10), 那么问题来了, 你A那么远出去了, 你肯定要回来吧, 至少要有一个R, 假设你A完了之后就立刻R, 那么你变成了(13, -1);
我们把这个(8, -1)(直接R得到的)和(13, -1)(AR得到的)进行对比; 是不是可以说前者肯定能更快到6呢? (就不考虑R本身就比AR节省了1了);
好像也不是那么好证明啊; 就比如是13, 一个A3立刻就回到6了, 你8怎么到6还不好说呢;
所以这个证明可能更复杂一些, 比如要证明不可能到8啊, 或者不可能A出界了还到了13这种这么巧等于6+7的位置啊, 等等;
这个就很难了; 不知道;

ososso 2 Apr 15, 2018, 11:51 AMReport
Agreed. The dp solution is not something that I can come up with within 30mins on whiteboard, even though the running time of this is 259 ms and that of dp is 7ms.

所以如果没有这个优化, 估计还真的不能AC, 因为前面有人说过了, 加了这个优化是直降75%的时间;

TODO
暂时无法证明这个;

好像他们很多竞赛选手还真的就是这样, 竞赛限时的时候, 想到一个感觉对的优化, 立刻就用上; 我看了一下, 底下的contest解法里面也有人用类似的这个pruning;

https://leetcode.com/problems/race-car/discuss/124326/Summary-of-the-BFS-and-DP-solutions-with-intuitive-explanation/136712

I -- BFS solution

Well, the BFS solution is straightforward: we can keep track of all the possible positions of the racecar after n instructions (n = 0, 1, 2, 3, 4, ...) and return the smallest n such that the target position is reached. Naive BFS will run at O(2^n) since for each position we have two choices: either accelerate or reverse. Further observations reveal that there may be overlapping among intermediate states so we need to memorize visited states (each state is characterized by two integers: car position and car speed). However, the total number of unique states still blows up for large target positions (because the position and speed can grow unbounded), so we need further pruning of the search space.

II -- DP solution

DP solution works by noting that after each reverse, the car's speed returns to 1 (the sign can be interpreted as the direction of the speed). So we can redefine the problem in terms of the position of the car while leave out the speed: let T(i) be the length of the shortest instructions to move the car from position 0 to position i, with initail speed of 1 and its direction pointing towards position i. Then our original problem will be T(target), and the base case is T(0) = 0. Next we need to figure out the recurrence relations for T(i).

Note that to apply the definition of T(i) to subproblems, the car has to start with speed of 1, which implies we can only apply T(i) right after the reverse instruction. Also we need to make sure the direction of the initial speed when applying T(i) is pointing towards the final target position.

However, we don't really know how many accelerate instructions there should be before the reverse instruction, so theoretically we need to try all possible cases: zero A, one A, two A, three A, ... and so on. For each case, we can obtain the position of the car right before the reverse instruction, which will be denoted as j = 2^m - 1, with m the number of A's. Then depending on the relation between i and j, there will be three cases:

j < i: the reverse instruction is issued before the car reaches i. In this case, we cannot apply the definition of T(i) to the subproblems directly, because even though the speed of the car returns to 1, its direction is pointing away from the target position (in this case position i). So we have to wait until the second reverse instruction is issued. Again, we don't really know how many accelerate instructions there should be in between these two reverse instructions, so we will try each of the cases: zero A, one A, two A, three A, ..., etc. Assume the number of A is q, then the car will end up at position j - p right before the second reverse instruction, where p = 2^q - 1. Then after the second reverse instruction, our car will start from position j - p with speed of 1 and its direction pointing towards our target position i. Since we want the length of the total instruction sequence to be minimized, we certainly wish to use minimum number of instructions to move the car from j - p to i, which by definition will be given by T(i-(j-p)) (note that in the definition of T(i), we move the car from position 0 to position i. If the start position is not 0, we need to shift both the start and target positions so that the start position is aligned with 0). So in summary, for this case, the total length of the instruction will be given by: m + 1 + q + 1 + T(i-(j-p)), where m is the number of A before the first R, q is the number of A before the second R, the two 1's correspond to the two R's, and lastly T(i-(j-p)) is the length of instructions moving the car from position j - p to the target position i.

j == i: the target position is reached without any reverse instructions. For this case, the total length of the instruction will be given by m.

j > i: the reverse instruction is issued after the car goes beyond i. In this case, we don't need to wait for a second reverse instruction, because after the first reverse instruction, the car's speed returns to 1 and its direction will be pointing towards our target position i. So we can apply the definition of T(i) directly to the subproblem, which will be T(j-i). Note that not only do we need to shift the start and target positions, but also need to swap them as well as the directions. So for this case, the total length of the instructions will be given by m + 1 + T(j-i).

Our final answer for T(i) will be the minimum of the above three cases.

III -- Intuitive explanation of the optimizations

As I mentioned in section I, we need further optimizations for the BFS solution to work efficiently. This turns out also to be the case for the DP solution. To see why, recall that in the first case of the DP solution, we don't really impose any upper limit on the value of q (we do have limit for the value of m though: j = 2^m-1 < i), while in the third case, we don't really have any upper limit for the value of m. Apparently we cannot explore every possible values of m and q (there are infinitely many).

to be updated...

IV -- Solutions

Here is a list of solutions: one BFS, one top-down DP and one bottom-up DP.

The BFS runs at O(target log(target)) in the worst case, with O(target log(target)) space. The reasoning is as follows: in the worst case, all positions in the range [-target, target] will be visited and for each position there can be as many as 2 * log(target) different speeds.

Both the top-down DP and bottom-up DP run at O(target * (log(target))^2) with O(target) space. However, the top-down DP may be slightly more efficient as it may skip some of the intermediate cases that must be computed explicitly for the bottom-up DP. Though the nominal time complexity are the same, both DP solutions will be much more efficient in practice compared to the BFS solution, which has to deal with (position, speed) pairs and their keys for hashing, etc.

BFS solution:

public int racecar(int target) {  
    Deque<int[]> queue = new LinkedList<>();  
    queue.offerLast(new int[] {0, 1}); // starts from position 0 with speed 1  

    Set<String> visited = new HashSet<>();  
    visited.add(0 + " " + 1);  

    for (int level = 0; !queue.isEmpty(); level++) {  
        for(int k = queue.size(); k > 0; k--) {  
            int[] cur = queue.pollFirst();  // cur[0] is position; cur[1] is speed  

            if (cur[0] == target) {  
                return level;  
            }  

            int[] nxt = new int[] {cur[0] + cur[1], cur[1] << 1};  // accelerate instruction  
            String key = (nxt[0] + " " + nxt[1]);  

            if (!visited.contains(key) && 0 < nxt[0] && nxt[0] < (target << 1)) {  
                queue.offerLast(nxt);  
                visited.add(key);  
            }  

            nxt = new int[] {cur[0], cur[1] > 0 ? -1 : 1};  // reverse instruction  
            key = (nxt[0] + " " + nxt[1]);  

            if (!visited.contains(key) && 0 < nxt[0] && nxt[0] < (target << 1)) {  
                queue.offerLast(nxt);  
                visited.add(key);  
            }  
        }  
    }  

    return -1;  
}

Top-down DP:

public int racecar(int target) {  
    int[] dp = new int[target + 1];  
    Arrays.fill(dp, 1, dp.length, -1);  
    return racecar(target, dp);  
}  

private int racecar(int i, int[] dp) {  
    if (dp[i] >= 0) {  
        return dp[i];  
    }  

    dp[i] = Integer.MAX_VALUE;  

    int m = 1, j = 1;  

    for (; j < i; j = (1 << ++m) - 1) {  
        for (int q = 0, p = 0; p < j; p = (1 << ++q) - 1) {  
            dp[i] = Math.min(dp[i],  m + 1 + q + 1 + racecar(i - (j - p), dp));  
        }  
    }  

    dp[i] = Math.min(dp[i], m + (i == j ? 0 : 1 + racecar(j - i, dp)));  

    return dp[i];  
}  

Bottom-up DP:  

```java

public int racecar(int target) {  
    int[] dp = new int[target + 1];  

    for (int i = 1; i <= target; i++) {  
        dp[i] = Integer.MAX_VALUE;  

        int m = 1, j = 1;  

        for (; j < i; j = (1 << ++m) - 1) {  
            for (int q = 0, p = 0; p < j; p = (1 << ++q) - 1) {  
                dp[i] = Math.min(dp[i], m + 1 + q + 1 + dp[i - (j - p)]);  
            }  
        }  

        dp[i] = Math.min(dp[i], m + (i == j ? 0 : 1 + dp[j - i]));  
    }  

    return dp[target];  
}

没有特别仔细的看了, 主要是看了一下他对DP的解释, 我认为还是不够完整;

when j < i:
0 -(Am)-> j -(R)-> j -(Ap)-> j-q -(R)-> j-q then subproblem T(i-(j-p)).
How do you prove that two R is optimal in this case?

You DP does assume less than the top-voted DP solution in that you actually loop on both m and q, which I prefer. But I think there are still missing pieces.

首先最起码他是在展示思考过程了; 后面再看看他有没有更多的说法, 这个人我记得是比较强的, 或许他后面会有新的想法;

TODO
关注更新;

他这里好像也提到了BFS的bound实际上并不好证明;

回复我了:

fun4LeetCode 3760 3 minutes agoReport
Hello @[email protected]. For the j < i case, the option of taking two "R" has to be optimal per the definition of the problem T(i). Because after the first two "R", no matter what actions the driver take, it cannot be shorter than T(i-(j-p)). The real questions are how to prove that:

for j < i, after the first "R", we don't need go past the beginning.

for j > i, we don't need to keep "Accelerating" once we past the target.
These two questions are related to each other (the first one can be converted into the second one), but I was not able to prove vigorously that the second statement is true.

讲的是有道理的;

contest:


class Solution {  
public:  
    int racecar(int target) {  
        map<pair<int, int>, int> used;  
        used[make_pair(0, 1)] = 0;  
        queue<pair<int, int>> wait_list;  
        wait_list.push(make_pair(0, 1));  
        while (!wait_list.empty()) {  
            auto cur = wait_list.front();  
            wait_list.pop();  
            int pos = cur.first, speed = cur.second, step = used[cur];  
            if (abs(speed) > target * 2) { continue; }  
            pos += speed;  
            speed += speed;  
            if (pos == target) { return step + 1; }  
            if (used.find(make_pair(pos, speed)) == used.end()) {  
                used[make_pair(pos, speed)] = step + 1;  
                wait_list.push(make_pair(pos, speed));  
            }  
            pos = cur.first;  
            speed = (cur.second > 0) ? -1 : 1;  
            if (used.find(make_pair(pos, speed)) == used.end()) {  
                used[make_pair(pos, speed)] = step + 1;  
                wait_list.push(make_pair(pos, speed));  
            }  
        }  
    }  
};

好像居然可以用穷搜来做?

uwi:


class Solution {  
    int o, z, u;  

    int code(int x, int y)  
    {  
        return (x+z)*u+(y+o);  
    }  

    // 8128083783063552028  
    public int racecar(int target) {  
        o = 500;  
        z = 14;  
        int no = 10000;  
        int[][] dp = new int[z*2+1][o+no+1];  
        for(int i = 0;i < z*2+1;i++)Arrays.fill(dp[i], Integer.MAX_VALUE / 3);  
        u = o+no+1;  

        Queue<Integer> q = new ArrayDeque<>();  
        q.add(code(1, 0));  
        dp[1+z][0+o] = 0;  
        while(!q.isEmpty()){  
            int cur = q.poll();  
            int x = cur / u - z, y = cur % u - o;  
            int dx = x > 0 ? 1<<x-1 : -(1<<-x-1);  
            {  
                int ny = y + dx;  
                if(ny >= -o && ny <= no){  
                    int nx = x > 0 ? x + 1 : x - 1;  
                    if(nx >= -z && nx <= z){  
                        if(dp[nx+z][ny+o] > dp[x+z][y+o] + 1){  
                            dp[nx+z][ny+o] = dp[x+z][y+o] + 1;  
                            q.add(code(nx, ny));  
                        }  
                    }  
                }  
            }  
            {  
                int nx = x > 0 ? -1 : 1;  
                int ny = y;  
                if(dp[nx+z][ny+o] > dp[x+z][y+o] + 1){  
                    dp[nx+z][ny+o] = dp[x+z][y+o] + 1;  
                    q.add(code(nx, ny));  
                }  
            }  
        }  
        int ret = Integer.MAX_VALUE;  
        for(int i = 0;i < z*2+1;i++){  
            ret = Math.min(ret, dp[i][target+o]);  
        }  
        return ret;  
    }  
}

这个代码基本上就是没法看了; 而且看起来也是一个搜索算法; 如果discuss里面看到其他有解释的搜索算法之后, 再回头来看这个思路; 或许这个搜索算法就是对应的tag里面的heap;

Blind_Lockhart:


class Solution(object):  
    def racecar(self, target):  
        """  
        :type target: int  
        :rtype: int  
        """  
        curset=set([])  
        curset.add((0,1))  
        allset=set([])  
        allset.add((0,1))  

        t=0  
        while curset:  
            t+=1  
            nextset=set([])  
            for (p,v) in curset:  
                np=p+v  
                if abs(np)<=target*3:  
                    if np==target: return t  
                    nv=v*2  
                    if (np,nv) in allset: continue  
                    allset.add((np,nv))  
                    nextset.add((np,nv))  

                np=p  
                if v>0: nv=-1  
                else: nv=1  
                if (np,nv) in allset: continue  
                allset.add((np,nv))  
                nextset.add((np,nv))  

            curset=nextset  
        return -1

看起来还不太乱的一个代码;

hanwang:


class Solution:  
    def racecar(self, target):  
        """  
        :type target: int  
        :rtype: int  
        """  
        dic = {}  
        q = [(0, 1, 0)]  
        dic[(0, 1)] = True  
        while q:  
            cur = q.pop(0)  
            if cur[0] == target:  
                return cur[2]  
            rs = -1 if cur[1] > 0 else 1  
            if (cur[0], rs) not in dic:  
                q.append((cur[0], rs, cur[2] + 1))  
                dic[(cur[0], rs)] = True  

            np = cur[0] + cur[1]  
            ns = cur[1] * 2  
            if np > 0 and np < 20000 and (np, ns) not in dic:  
                q.append((np, ns, cur[2] + 1))  
                dic[(np, ns)] = True

不算太乱的一个代码;

Problem Description

Your car starts at position 0 and speed +1 on an infinite number line. (Your car can go into negative positions.)

Your car drives automatically according to a sequence of instructions A (accelerate) and R (reverse).

When you get an instruction "A", your car does the following: position += speed, speed *= 2.

When you get an instruction "R", your car does the following: if your speed is positive then speed = -1 , otherwise speed = 1. (Your position stays the same.)

For example, after commands "AAR", your car goes to positions 0->1->3->3, and your speed goes to 1->2->4->-1.

Now for some target position, say the length of the shortest sequence of instructions to get there.


Example 1:  
Input:   
target = 3  
Output: 2  
Explanation:   
The shortest instruction sequence is "AA".  
Your position goes from 0->1->3.


Example 2:  
Input:   
target = 6  
Output: 5  
Explanation:   
The shortest instruction sequence is "AAARA".  
Your position goes from 0->1->3->7->7->6.

Note:

1 <= target <= 10000.

Difficulty:Hard
Total Accepted:93
Total Submissions:1.3K
Contributor:awice
Companies
google
Related Topics
dynamic programmingheap

LC836 RaceCar

RaceCar [source code]

think out loud

别怕枚举

editorial

Approach Framework

Approach #1: Dijkstra's [Accepted]

Approach #2: Dynamic Programming [Accepted]

Why dp in this way?

I -- BFS solution

II -- DP solution

III -- Intuitive explanation of the optimizations

IV -- Solutions

Problem Description

results matching ""

No results matching ""