NumberOfSubarraysWithBoundedMaximum [source code]
public class NumberOfSubarraysWithBoundedMaximum {
static
/******************************************************************************/
class Solution {
public int numSubarrayBoundedMax(int[] A, int L, int R) {
return count (A, R + 1) - count (A, L);
}
int count (int[] nums, int K) {
int res = 0, count = 0;
for (int num : nums) {
if (num < K) {
count++;
} else {
count = 0;
}
res += count;
}
return res;
}
}
/******************************************************************************/
public static void main(String[] args) {
NumberOfSubarraysWithBoundedMaximum.Solution tester = new NumberOfSubarraysWithBoundedMaximum.Solution();
int[][] inputs = {
{2, 1, 4, 3}, {2,3,3},
{2,9,2,5,6}, {2,8,7},
};
for (int i = 0; i < inputs.length / 2; i++) {
System.out.println (Printer.separator ());
int[] A = inputs[2 * i];
int L = inputs[2 * i + 1][0], R = inputs[2 * i + 1][1], ans = inputs[2 * i + 1][2], output = tester.numSubarrayBoundedMax (A, L, R);
System.out.printf ("[%s] and %d <= %d -> %s, expected: %d\n",
Printer.array (A), L, R, Printer.wrap (output + "", output == ans ? 92 : 91), ans
);
}
}
}
一开始的想法, 认为就是一个分段算法的逻辑, 写了半天, 不对:
class Solution {
public int numSubarrayBoundedMax(int[] A, int L, int R) {
int anchor = -1, res = 0, subarray_max = -1;
for (int i = 0; i < A.length; i++) {
System.out.printf ("\nL, R = %d, %d, A[%d] = %d\n", L, R, i, A[i]);
if (anchor < 0) {
if (A[i] >= L && A[i] <= R) {
anchor = i;
subarray_max = A[i];
}
} else {
if (A[i] <= subarray_max) {
}
else {
if (A[i] >= L && A[i] <= R)
subarray_max = A[i];
else {
res += choose (i - anchor + 1, 2);
anchor = -1;
subarray_max = -1;
}
}
}
System.out.printf ("i:(%d), anchor:(%d), res:(%d)\n", i, anchor, res);
}
if (anchor >= 0)
res += choose (A.length - anchor + 1, 2);
return res;
}
int choose (int N, int K) {
double res = 1.0;
for (int i = 1; i <= K; i++)
res *= (double) (N - K + i) / i;
return (int) res;
}
}
又尝试去改, 最后发现逻辑相当复杂; 比如2 2
这种全部in bound 的, 那么直接用N choose K的排列组合来中找就行, 但是如果是后面又有向下出界的, 那么算起来就麻烦, 不能直接简单的排列组合拉倒;
比如如果是有2 2
, 那么你是排列组合, 假如后面又来了1 1 1
, 你怎么办? 那么是要加上一个2 * 3; 假如后面又来了2 2
, 也就是又进去了? 你怎么办? 这样搞起来逻辑就很复杂, 感觉我还是没有看到问题的本质;
另外, 这种类型的问题以前总结过一般有两种思路, 一种就是简单的triggered update, 一种是continuous update, 我感觉这题后者好像合理一些, 因为感觉这题维护一种run好像不够;
上面代码是模仿uwi写的, 速度是6ms (NA). discuss里面其实很多1pass的做法, 估计是比这个稍微快一点;
editorial
Approach #1: Counting [Accepted]
Intuition
We can make the following logical steps to arrive at the solution:
As we only care about whether each element is less than, between, or greater than the interval [L, R], let's pretend each element is either 0 if it is less than L; 1 if it is between L and R; or 2 if it is greater than R.
Then, we want the number of subarrays with no 2 and at least one 1. The 2s split the array into groups of arrays with only 0s and 1s. For example, if our array is [0, 0, 1, 2, 2, 1, 0, 2, 0], then the answer is just the answer for [0, 0, 1] plus the answer for [1, 0] plus the answer for [0].
这个思考方式还是有点意思的, 不过这个分析应该是还没有结束, 比如你怎么解决0 0 1
和0 0 1 1 0
有多少个subarray?
For each such group (arrays of only 0s or 1s), to count the number of subarrays that have at least one 1, let's count all the subarrays in the group, minus the subarrays that only have zeroes.
For example, [0, 0, 1] has 6 subarrays, 3 of which are zero-only, which adds 3 to the answer; [1, 0] has 3 subarrays, 1 of which is zero-only, which adds 2 to the answer; and [0] has 1 subarray, 1 of which is zero-only, which adds 0 to the answer. The final answer to the previous original example of A = [0, 0, 1, 2, 2, 1, 0, 2, 0] is 3 + 2 + 0 = 5.
最后他下面的算法其实还是uwi的那个算法, 但是跟他上面的这个分析其实并没有什么卵关系;
Algorithm
Say count(B) is the number of subarrays that have all elements less than or equal to B. From the above reasoning, the answer is count(R) - count(L-1).
How do we compute count(B)? We remember cur, the number of contiguous, previous elements less than or equal to B. When we see a new element less than or equal to B, the number of valid subarrays ending at this position is cur + 1. When we see a new element greater than B, the number of valid subarrays ending at this position is 0.
这个就懂了, 这里就理解了为什么uwi用那样的加法来计算subarray count了, 看来还是有一点DP的思想在里面的(root level involvement).
class Solution {
public int numSubarrayBoundedMax(int[] A, int L, int R) {
return count(A, R) - count(A, L-1);
}
public int count(int[] A, int bound) {
int ans = 0, cur = 0;
for (int x: A) {
cur = x <= bound ? cur + 1 : 0;
ans += cur;
}
return ans;
}
}
算法基本是一样的;
这题geeks4geeks貌似也有: https://www.geeksforgeeks.org/number-subarrays-maximum-value-given-range/
看来这个网站真的是全面, 这种新题都已经有了;
class Solution {
public int numSubarrayBoundedMax(int[] A, int L, int R) {
int j=0,count=0,res=0;
for(int i=0;i<A.length;i++){
if(A[i]>=L && A[i]<=R){
res+=i-j+1;count=i-j+1;
}
else if(A[i]<L){
res+=count;
}
else{
j=i+1;
count=0;
}
}
return res;
}
}
没想到真的有奇葩把1pass的做法写出来了, 对于:
[0,3,1,4,5,2,1,5,10,6]
3
6
这个例子:
核心就是用j来记录当前valid的*最长8substring的起点(如果不valid, 也就是超过R了, 你看他直接就把j移动到下一个i的位置), 然后count跟上面editorial的定义实际上是一样的, 就是number of subarray ending at [i];
不对, count好像不是这么定义的;
The condition A[i]>=L && A[i]<=R, that means is a valid A[j:i] subarray and thus we can have (i-j+1) valid subarrays, count is the valid subarrays between j to i at this point
The condition A[i]=L and <=R) which is within A[j:i], thus adding last valid number of subarrays which is count.
Else just move the back pointer forward
这个解释好像还可以;
难点是, 比如上面图上, 中间的2和1这里, 应该怎么操作? 为什么这里可以直接res += count?
所以count实际上维护的是最长的valid subarray (maximum in range), with ending number also in range的长度;
那么在2这里, 实际上就是从2向前面这5个number, 各自发出一条延长线, 就是这里多加出来的5的含义, 看这个图, 是对于1的:
所以j维护的是最长valid subarray的起点, 然后count维护的是从这个起点到最后一个in range的点的距离;
我的解释:
Definition:
valid: describes a subarray that has its maximum in range L..R;
j: start index of the longest valid subarray up till i
;
count: length of subarray A[j..x]
where A[x]
is the last number (within A[j..i]
) that is still in range L..R;
If A[i]
is in range L..R, then we count subarrays it can contribute: those subarrays ending at A[i]
. This counting avoids duplicate. Say we are at A[4] = 5
, any point within A[j..4-1]
can be the start point.
The tricky part is to think, say when i
is 5 or 6, what do we do with A[5] = 2
or A[6] = 1
?
Take i == 6
for example: why do we add 5 here to res
?
The number of subarrays that A[6]=1
can contribute are marked with purple. Anything in the range of A[0..5]
can be the start point of the subarray that ends at A[6]=1
, which are subarrays contributed by A[6]=1
.
The idea is to keep track of 3 things while iterating: the number of valid subarrays (res), the number of valid subarray starting points (heads), and the number of not-yet valid starting points (tails).
- Values between L and R are heads. Every contiguous value afterwards that is less than R afterwards can extend them, creating new subarrays.
- Values less than L are tails. If they connect to a head later in the array, they become a head for valid subarrays.
- Values greater than R are combo breakers. They stop all heads and tails from forming from subarrays.
Therefore, we keep a rolling count of valid subarrays as we iterate through A, the main array.- If a head is encountered, it joins the existing heads to form subarrays at each iteration. All tails are promoted to heads. All existing heads create a new valid subarray.
- The new head creates subarray of a single element ([head])
- Each promoted head creates subarrays from its tail index to current index (e.g. [tail1, tail2, head, …], encountering head promotes tail1 and tail2 to heads and creates [tail1, tail2, head] and [tail2, head])
- If a tail is encountered, all existing heads can create another subarray with it. The tail remains useless until it encounters a head (see above).
- If a combo breaker is met, all existing heads and tails become useless, and are reset to 0.
Counts of new subarrays (i.e. head count) are added to res at each iteration, if valid.
他这个逻辑跟上一个算法的核心应该是类似的, 不过用不同的方式表达出来了; 注意到他这里他其实是清楚的意识到了, 要把in range的(heads)和less的(tails)分开区别对待, 这个想法我自己当时contest的时候也是想到了的, 不过后面没有想到他这里这么复杂的逻辑;
class Solution {
public:
int numSubarrayBoundedMax(vector<int>& A, int L, int R) {
int res = 0, heads = 0, tails = 0;
for (int val : A) {
if (L <= val && val <= R) {
// val is a head. All tails promoted to heads
heads+= tails + 1;
tails = 0;
res += heads;
}
else if (val < L) {
// val is a tail, can extend existing subarrays
tails++;
res += heads;
}
else {
// combo breaker
heads = 0;
tails = 0;
}
}
return res;
}
};
promotion这个操作很有意思;
class Solution {
public:
int numSubarrayBoundedMax(vector<int>& A, int L, int R) {
int result=0, left=-1, right=-1;
for (int i=0; i<A.size(); i++) {
if (A[i]>R) left=i;
if (A[i]>=L) right=i;
result+=right-left;
}
return result;
}
};
i:(0) >>> left:(-1), right:(-1), res:(0)
i:(1) >>> left:(-1), right:(1), res:(2)
i:(2) >>> left:(-1), right:(1), res:(4)
i:(3) >>> left:(-1), right:(3), res:(8)
i:(4) >>> left:(-1), right:(4), res:(13)
i:(5) >>> left:(-1), right:(4), res:(18)
i:(6) >>> left:(-1), right:(4), res:(23)
i:(7) >>> left:(-1), right:(7), res:(31)
i:(8) >>> left:(8), right:(8), res:(31)
i:(9) >>> left:(8), right:(9), res:(32)
看trace来说跟第一个做法实际上是差不多的, 只不过代码更加简练, 但是背后的逻辑实际上是一样的; 注意看他这里的更新方式, 永远都能保证left <= right: 一个[i]可能可以更新right(只要geq L), 但是未必能够更新left(需要> R).
Explanation:left+1..right
is the longest subarray so far that has an ending number (A[right]
) in range L..R
.
When A[i] > R
, the update ensures this interval becomes empty again so the update to result
can be handled uniformly in all situations.
The initial case: we don't have some A[right]
in range L..R
yet, so it's also empty.
I literally submitted this in the last 10 seconds left in the contest and I’m surprised I didn’t miss any corner cases, lol…
Imagine you have a state machine. You really only have the following two states:
1 - Since seeing the last >R element, you have not yet seen an [L,R]-element. (call “haveYet = false”)
2 - Since seeing the last >R element, you have seen at least one or more [L,R]-element. (call “haveYet = true”)We initialize like this: we pretend A has a >R element at index -1 (imaginary index) and we are walking first at index 0, and we are in state (1).
If we see a new >R element, we transition to state (1) no matter whether we are currently in state (1) or state (2). Save the position of this as ‘LastGR’ (stands for last index greater than R).
If we see a new [L,R]-element, we transition from (1)–>(2) and mark this as the ‘lastViable’.Now, when we are on index i, we want to see if we can use this as an end index for ALL subarrays that could possibly end at this position. We are only able to do this if we are in state (2) AND the subarray encapsulates at least the last [L,R]-element we have seen. So, for a start index, everything since the position lastGR+1 up to and including the lastViable (last [L,R]-element we saw) is a possible starting position. We add those to our global running sum.
class Solution {
public int numSubarrayBoundedMax(int[] A, int L, int R) {
int lastGR = -1;
int sum = 0;
boolean haveYet = false;
int lastViable = -1;
for (int i = 0; i < A.length; i++) {
if (A[i] > R) { lastGR = i; haveYet = false; continue; }
if (A[i] >= L && A[i] <= R) { lastViable = i; haveYet = true; }
if (haveYet) {
sum += lastViable - lastGR;
}
}
return sum;
}
}
跟上面两个做法实际上是类似的, 维护两个index; 不过逻辑没有上面那个人那么简洁: 多维护了一个boolean, 而且对sum的update加了一个conditional;
不过说实话, 短时间内想要理清这种逻辑还是挺难的; 这题最直观的方法最后实际上还是uwi的那种做法;
uwi做法:
class Solution {
public int numSubarrayBoundedMax(int[] a, int L, int R) {
int n = a.length;
return count(a, R) - count(a, L-1);
}
int count(int[] a, int R)
{
int my = 0, ret = 0;
for(int v : a){
if(v <= R){
my++;
}else{
my = 0;
}
ret += my;
}
return ret;
}
}
看了一下他主函数, 好像他这个直觉真的很强, 因为是一个找interval的问题, 他立刻就反应过来用差值的思路来做, 这个真的是素质非常的强才能立刻想到的做法;
无话可说, 这个真的就是我智商不够用, 没有想到这个;
仔细思考了一下, 大概理解了他这个算法, 只能说他这里这个count的定义真的是非常的犀利; count返回的就是number of subarrays such that its maximum is less or equal to R; 这样一个定义, 就可以让差值代表要求的这个内容; 想到这个思路的一个方向, 就是脑子里有一个很好的visual, 因为这个subarray是可以有任何的形状的, 就好像是一个在bound下面游动的小蛇一样, 你关心的只是一个最高点, 在一个水平面下面就行了;
下面一个不理解的地方就是, 为什么在算count的时候, 直接只用my++, 比如我们有2 2 2 2
这样一个, my最后贡献给ret的就是1 + 2 + 3 + 4. 为什么不算subarray? 诶, 不对啊, 这个最后的结果是跟N choose 2一样的啊? 不信你看等差数列求和, 跟N choose 2实际上是一样的;
那么uwi为什么知道这里这么写? 难道真的只是巧合? 还是故意炫技?
另外, 这里也是间接学到了, 如果是C(N, 2), 那么实现起来的线性算法反而更加简单;
或者你直接尝试理解? 为什么C(N, 2), 跟这个等差求和是一样的结果, 背后有没有深层次的道理?
哇, 看uwi的代码真的是学到很多东西;
大概能理解为什么首先这两个值相等, 比如N等于5, 那么你+2的时候, 加的实际上就是两个长度为3的subarray; 感觉uwi应该是熟悉这种subarray的count的方法, 毕竟这个方法还是比较neat的, 只要知道总长度, 然后一个1..N的求和就行了, 这种特别适合N还在变化的场合的求和;
一个图示: 脑子里可以有这个水面下扭动的蛇的动态;
Problem Description
We are given an array A of positive integers, and two positive integers L and R (L <= R).
Return the number of (contiguous, non-empty) subarrays such that the value of the maximum array element in that subarray is at least L and at most R.
Example :
Input:
A = [2, 1, 4, 3]
L = 2
R = 3
Output: 3
Explanation: There are three subarrays that meet the requirements: [2], [2, 1], [3].
Note:
- L, R and A[i] will be an integer in the range [0, 10^9].
- The length of A will be in the range of [1, 50000].
Difficulty:Medium
Total Accepted:470
Total Submissions:1.8K
Contributor:ngoc_lam
Companies
adobe
Related Topics
array