MinimumMovesToEqualArrayElementsII [source code]
public class MinimumMovesToEqualArrayElementsII {
static
/******************************************************************************/
public class Solution {
public int minMoves2(int[] nums) {
Arrays.sort(nums);
int i = 0, j = nums.length - 1;
int sum = 0;
while (i < j)
sum += nums[j--] - nums[i++];
return sum;
}
}
/******************************************************************************/
public static void main(String[] args) {
MinimumMovesToEqualArrayElementsII.Solution tester = new MinimumMovesToEqualArrayElementsII.Solution();
}
}
这个题目一点想法也没有, 直接看答案了;
首先这个问题的笨办法你要想一想的, 就是从min到max一个iterate, 然后每一个iteration计算一下距离之和就行了;
这个就是editorial1的做法:
public class Solution {
public int minMoves2(int[] nums) {
long ans = Long.MAX_VALUE;
int minval = Integer.MAX_VALUE;
int maxval = Integer.MIN_VALUE;
for (int num : nums) {
minval = Math.min(minval, num);
maxval = Math.max(maxval, num);
}
for (int i = minval; i <= maxval; i++) {
long sum = 0;
for (int num : nums) {
sum += Math.abs(num - i);
}
ans = Math.min(ans, sum);
}
return (int) ans;
}
}
注意这个的复杂度是比N^2还要差的, 因为diff很可能比N要大;
editorial2对上面优化到N^2:
最后的这个目标value只有可能是在这N个value当中的一个, 具体的证明有点麻烦, 不过可以证明, 无论你最后的目标value是在哪两个nums的value中间, 你移动目标value到这两个entry当中的一个, 最后的optimality是不会被影响的; 这个可以自己大概画一个图(代表nums的曲线记得是sorted的, 这样方便理解)就可以看出来了;
editorial上面给出的一个类似的证明是反过来的: 他先假设移动到最终的target value最近的entry, 然后解释从这个entry移动到optimal value的过程中, 至少不会降低距离之和(moves之和. 维持不变, 是在target value上下的entry数量一样的情况下(sorted前提), 其他情况下, 肯定会增加).
public class Solution {
public int minMoves2(int[] nums) {
long min = Integer.MAX_VALUE;
for (int num : nums) {
long sum = 0;
for (int n : nums) {
sum += Math.abs(n - num);
}
min = Math.min(min, sum);
}
return (int) min;
}
}
感觉还是有点不甘心, 按理来说这个穷搜不是想不出来的; 我们以前也总结过了, math问题, 不会math的解法不怪你, 但是你搜索的方法至少要能写一个AC的出来;
editorial3在上面的基础上优化到NlgN:
public class Solution {
public int minMoves2(int[] nums) {
Arrays.sort(nums);
long min = Long.MAX_VALUE, sum = 0, total = 0;
for (int num : nums) {
total += num;
}
for (int i = 0; i < nums.length; i++) {
long ans = ((long) nums[i] * i - sum) + ((total - sum) - (long) nums[i] * (nums.length - i));
System.out.println(nums[i] + " " + ans);
min = Math.min(min, ans);
sum += nums[i];
}
return (int) min;
}
}
这里利用的一个核心原理是: 如果你的target是k, at index_k(因为k肯定是nums的一个entry), 那么我们要算的moves之和就是(在sorting之后):
(k countBefore_k - sumBefore_k) + (sumAfter_k - k countAfter_k);
这两个sum在iterate的过程当中是非常好计算的, 左边的吃, 右边的吐, 就行了;
editorial4的时候就是math了: 这里他开始认为这个问题就是给你一组点, 要你找到跟所有的点的距离之和最近的一个y值, level;
The problem of finding the number k to which all the other numbers eventually settle can also be viewed as: Given a set of points in 1-d. Find a point k such that the cumulative sum of distances between k and the rest of the points is minimum. This is a very common mathematical problem whose answer is known. The point k is the median of the given points.
也不知道哪里common了, 反正这个答案我还真是不知道的; 这个math概念我是想到了的, 但是就是不知道怎么做. 自己随便才一个average或者是median肯定是不行的, 真正面试的时候一定会让你证明的: 这种代码这么简单的问题, 如果你讲不清楚背后的原理, 到时候代码一写两个人大眼瞪小眼吗;
具体的证明还有点复杂, 就是上面editorial3的时候的那个等式, 两边全部对k取导数. 左边导数为0的时候就是最小值;
其实我想到一个问题, 如果按照之前证明editorial2的优化的时候的思路, 自己在脑子里对sort之后的曲线进行一个动态移动是可以想出来median是最小的: 无论你从median位置向上还是向下移动(by one entry at a time), 比如你就有五个点的这种; 本来你在[2]的位置, 你移动到[3], 那么你左边的sum就会增加三个distance = nums[3] - nums[2], 右边的sum则会减少两个这个distance, 最后的总的moves肯定是增加的了的. 也就是从median位置往两头移动都会导致moves增加. 脑子里过一遍动画效果应该是能想通的;
感觉这个题目, 能够想到k全部取到entry还是听关键的;
这个是代码:
public class Solution {
public int minMoves2(int[] nums) {
Arrays.sort(nums);
int sum = 0;
for (int num : nums) {
sum += Math.abs(nums[nums.length / 2] - num);
}
return sum;
}
}
discussion有一个类似的:
public class Solution {
public int minMoves2(int[] nums) {
Arrays.sort(nums);
int i = 0, j = nums.length-1;
int count = 0;
while(i < j){
count += nums[j]-nums[i];
i++;
j--;
}
return count;
}
}
这个代码正好也是editorial5的优化, 就是省略了找median的过程;
这个解仍然是NlgN;
editorial下面的两个解都是意识到: 这里我们需要的其实不是sorting(虽然sorting对于很多array问题都是降低难度的第一步), 我们实际上需要的只是median; 所以下面就改进找median的办法, 整个算法的复杂度就可以达到O(N)了;
两个优化用的方法分别是Quick select和median of medians. 所以这个其实也未必是坏事, 这两个都是从来没有实现过的坑, 正好填一下;
editorial6用的是quick select:
public class Solution {
public void swap(int[] list, int i, int j) {
int temp = list[i];
list[i] = list[j];
list[j] = temp;
}
public int partition(int[] list, int left, int right) {
int pivotValue = list[right];
int i = left;
for (int j = left; j <= right; j++) {
if (list[j] < pivotValue) {
swap(list, i, j);
i++;
}
}
swap(list, right, i);
return i;
}
int select(int[] list, int left, int right, int k) {
if (left == right) {
return list[left];
}
int pivotIndex = partition(list, left, right);
if (k == pivotIndex) {
return list[k];
} else if (k < pivotIndex) {
return select(list, left, pivotIndex - 1, k);
} else {
return select(list, pivotIndex + 1, right, k);
}
}
public int minMoves2(int[] nums) {
int sum = 0;
int median = select(nums, 0, nums.length - 1, nums.length / 2);
for (int num : nums) {
sum += Math.abs(median - num);
}
return sum;
}
}
这个算法的O(N)是AverageCase, 如果是WorstCase, 其实还是O(N^2); 这个是editorial上面给出的答案, 我好像记得当时上算法课的时候给的结果不是这个?
注意这里, pivotIndex是用来判断median的标准, 它的标称值始终是length / 2, 这个结论可以稍微记一下, 也就是说median始终是在length / 2这个位置的, 不用管长度的奇偶性;
幸亏当时上课的时候学过一次
这个InPlace的partition当时也是纠结了挺久才理解的一个算法, 也属于一个快慢指针的算法: j至少比i快;
所以这个算法其实也不是特别的复杂, 核心函数就是select, 其结构也很简单, 就是先base case, 然后partition, 这个相当于root operation, 其实是每一个recursion都有的; 然后最后就是recursive call;
当然细节上的处理还是记忆一下, 比如select的三个参数分别是什么, 是index类还是length类; 以及partition的处理, j的范围, 右边界right是否include(这个是跟底下的list[j] < pivotValue是否取等号要对应的).
其他的话, 就没什么, 这个算法本身只能说不熟悉, 但是不能说难到看不懂;
editorial7采用的就是median or medians, 这个算法基本就是完全不记得怎么回事了;
这个算法的WorstCase是O(N). 而quick select的话, 如果pivot的选择太arbitrary, 最后是有可能速度很差的;
这个作者认为median of medians其实就是对quick select算法的一个小优化: choose the pivot more cleverly;
public class Solution {
public void swap(int[] list, int i, int j) {
int temp = list[i];
list[i] = list[j];
list[j] = temp;
}
public int partition(int[] list, int left, int right, int val) {
int i;
for (i = left; i < right; i++) {
if (list[i] == val) {
break;
}
}
swap(list, i, right);
int pivotValue = list[right];
int storeIndex = left;
for (i = left; i <= right; i++) {
if (list[i] < pivotValue) {
swap(list, storeIndex, i);
storeIndex++;
}
}
swap(list, right, storeIndex);
return storeIndex;
}
int findMedian(int arr[], int l, int len) {
Arrays.sort(arr, l, l + len);
return arr[l + len / 2];
}
int kthSmallest(int arr[], int l, int r, int k) {
if (k > 0 && k <= r - l + 1) {
int n = r - l + 1, i;
int median[] = new int[(n + 4) / 5];
for (i = 0; i < n / 5; i++) {
median[i] = findMedian(arr, l + i * 5, 5);
}
if (i * 5 < n) {
median[i] = findMedian(arr, l + i * 5, n % 5);
i++;
}
int medOfMed = (i == 1) ? median[i - 1]: kthSmallest(median, 0, i - 1, i / 2);
int pos = partition(arr, l, r, medOfMed);
if (pos - l == k - 1) {
return arr[pos];
}
if (pos - l > k - 1) {
return kthSmallest(arr, l, pos - 1, k);
}
return kthSmallest(arr, pos + 1, r, k - pos + l - 1);
}
return Integer.MAX_VALUE;
}
public int minMoves2(int[] nums) {
int sum = 0;
int median = kthSmallest(nums, 0, nums.length - 1, nums.length / 2 + 1);
for (int num : nums) {
sum += Math.abs(median - num);
}
return sum;
}
}
Using the above method ensures that the chosen pivot, in the worst case, has atmost 70% elements which are larger/smaller than the pivot.
quick select的一个问题就在于你怎么选择pivot, 使得recursion的level数量最少, 也就是partition执行的次数最少(CodeLine思路的复杂度计算告诉我们, 这个才是最costly的部分: recursive call本身是可以忽略cost的); 如果我们的pivot的选择非常的差, 那么一个level只能减少1的长度, 最后level的数量可能是N. 那么最后N个partition, 就是N^2的复杂度了;
这个算法还是有点复杂的, 很多小细节的处理, 比如:
int median[] = new int[(n + 4) / 5];
这里这个size的处理, 这样就保证如果有多余的个数, 也能够有地方放;
包括最后一个interval怎么处理;
注意这里这个k的定义; 就是ascending的时候的第k个, 不过是0-based还是1-based? 举个例子看了一下, 好像是1-based;
当然这个算法我觉得最需要学习的还是变量和参数的设置: 到底是用index类还是length类, 以及index类的时候, 右端点是否include;
这个算法有点复杂, 留给以后有时间自己实现一遍吧, 不过就算是看过一遍, 感觉这个算法一小时内都不一定写的出来; 关键还是很多习惯, 尤其是变量设置习惯的养成;
这个算法而且我感觉有点奇怪, 他这个是不是就直接找到median了啊? 为什么还要用kthSmallest的quick select来做呢? 直接这个medOfMed就是真正的median了吧?
好像就是用来验证一下?
算了后面还是等把CLRS相关内容看一下之后再来回头看这个算法是什么意思; editorial下面倒是给了一个复杂度的证明, 不过又臭又长懒得看了, 不可能比CLRS要好的.
这个是discussion上面一个对于median的证明: https://math.stackexchange.com/questions/113270/the-median-minimizes-the-sum-of-absolute-deviations
editorial里面的文章好像就是抄的这个人的解:
Hi there! I am sharing my full thinking process under time pressure and solution. The resultant array elements must be all the same. Let's say some integer k, so final array looks like [k,k,k,...,k]. The number of moves to get that array can be calculated by, moves = |k - a1| + |k-a2| + ...+|k-ai|+...+|k-an|. Note that some elements are greater than k and some elements are less. Thus, it worth to know the number of elements that are less than k, and the number of elements that are greater that k. If know them we can reformulate the equation as moves = Nlessk - sumLess +sumGreater-Ngreaterk. Now the equation gets more "programmable", and we have to search for such k, that minimizes value of moves. It is obvious that k must be selected among the original array elements, because it reduces the number of operations at least by one. Thus, we came up with an idea. For each element in array calculate moves, then select minimum among them. Now, we see another problem, how to calculate them number of elements that are less than k and their sum? The same problem occurs for greater elements. Naive solution is to calculate number of smalles elements and greater elements along with their sum for each element of the array seperately. That would give us solution with O(n^2) time complexity.
Can we do better? Yes we can, if we sort out the original array first, we can immediately know both the number of elements that are less than the current element and the number of elements that are greater that the current element. In addition we have to know total sum of elements and sum of elements before each element. Finally we came up with O(nlog(n)) time and O(1) space solution! Below is the code for that solution:
public class Solution {
public int minMoves2(int[] nums) {
if(nums==null|| nums.length==0) return 0;
long moves = Integer.MAX_VALUE;
Arrays.sort(nums);
long totalSum = 0L;
long sum = 0L;
for(int i =0;i<nums.length;i++){
totalSum += (long)nums[i];
}
for(int i =0;i<nums.length;i++){
long m = (long)(i-(nums.length-i-1)-1)*(long)nums[i]-sum+(totalSum-sum);
moves = Math.min(m, moves);
sum+=nums[i];
}
return (int)moves;
}
}
Can we do better? Yes, we can! Let's think about k. Intuitively that value must be the median element in the original sorted array. Proof of that fact is simple, just look at our equation moves = numLessk-sumLess + sumGreater - numGreaterk. Let's take derivative from that function by k. dmoves/dk = numLess-numGreater. The moves becomes minimum when the derivative is zero, so numLess-numGreater = 0 or numsLess = numGreater. Which is possible only when the element is the median element. We can find k in O(1) time. Instead of brute forcing all elements in the array we just consider the middle element in the sorted array. Despite generally the time complexity remains being O(nlog(n)), due to sorting, it improves performance very much. Further it can be optimized up to O(n), by using Quick select algorithm, and you can find this solution here.
public class Solution {
public int minMoves2(int[] nums) {
if(nums==null|| nums.length==0) return 0;
long moves = Integer.MAX_VALUE;
Arrays.sort(nums);
long totalSum = 0L;
long sum = 0L;
for(int i =0;i<nums.length;i++){
totalSum += (long)nums[i];
if(i<nums.length/2) sum+=(long)nums[i];
}
int k = nums.length/2;
moves = (long)(k-(nums.length-k-1)-1)*(long)nums[k]-sum+(totalSum-sum);
return (int)moves;
}
}
P.S: I have demonstrated the calculation of moves to be understandable, that is why it looks poor. Also be careful with shortening equation for moves in the first code. Because it may exceed the integer range if not to be careful with multiplications. Sorry for poor code.
这个是他对于k一定位于entry的解释:
Hi! Let's consider some number k that does not belong to the array (Final array [k,k,k,k...,k]). Then it would be necessary to change all elements to achieve the final array. If we select k that belongs to the array, then at least one element remains without change.
很聪明好吧; 不对, 这个解释好像是错的. 并没有解释核心问题; 还是要用我自己的解释来想这个问题;
仔细想想好像之前我那个证明k on entry的思路并不严谨? 而且如果是even的长度, 那么你根本无法证明k on entry的; 感觉那个证明最后只能直接和median的证明结合起来考虑;
不对, k on entry好像还是对的, 就算是even长度的情况, median本身是不存在的, 但是median两头的这两个entry, 其实都是可以达到和median本身一样的optimality的;
不过好像这个东西其实还是无法证明的, 除非你提及median, 脱离median来证明这个东西根本无法解决(sum的差值是跟k左右两边的count的差值有关的, 你要证明sum的差值怎样, 你就要证明count差值怎样, 那么这个就是要证明median了); 所以最后估计还是要用求导的算法来做;
discussion上面也有一个人做了一个quick sort:
This solution relies on the fact that if we increment/decrement each element to the median of all the elements, the optimal number of moves is necessary. The median of all elements can be found in expected O(n) time using QuickSelect (or deterministic O(n) time using Median of Medians).
public int minMoves2(int[] A) {
int sum = 0, median = quickselect(A, A.length/2+1, 0, A.length-1);
for (int i=0;i<A.length;i++) sum += Math.abs(A[i] - median);
return sum;
}
public int quickselect(int[] A, int k, int start, int end) {
int l = start, r = end, pivot = A[(l+r)/2];
while (l<=r) {
while (A[l] < pivot) l++;
while (A[r] > pivot) r--;
if (l>=r) break;
swap(A, l++, r--);
}
if (l-start+1 > k) return quickselect(A, k, start, l-1);
if (l-start+1 == k && l==r) return A[l];
return quickselect(A, k-r+start-1, r+1, end);
}
public void swap(int[] A, int i, int j) {
int temp = A[i];
A[i] = A[j];
A[j] = temp;
}
他这个还是submission最优解好像.
Problem Description
Given a non-empty integer array, find the minimum number of moves required to make all array elements equal, where a move is incrementing a selected element by 1 or decrementing a selected element by 1.
You may assume the array's length is at most 10,000.
Example:
Input:
[1,2,3]
Output:
2
Explanation:
Only two moves are needed (remember each move increments or decrements one element):
[1,2,3] => [2,2,3] => [2,2,2]
Difficulty:Medium
Total Accepted:16.4K
Total Submissions:31.8K
Contributor: andrew56
Related Topics
math
Similar Questions
Best Meeting Point Minimum Moves to Equal Array Elements