Understanding the Median-Based Optimization
Introduction
When solving array transformation problems, a common challenge is minimizing the number of operations required to make elements equal. This problem asks us to determine, for every prefix of a sorted array, the minimum number of increment or decrement operations needed to make all elements in that prefix equal.
At first glance, the problem may seem like a brute-force optimization task. However, a key mathematical observation allows us to solve it efficiently in linear time.
Consider the array:
[1, 6, 9, 12]
For every index i, we examine the prefix from 0 to i.
For i = 2, the prefix becomes:
[1, 6, 9]
We need to make all elements equal while minimizing the total number of operations.
An operation consists of either increasing or decreasing an element by 1.
If we choose a target value of 6:
|1 - 6| + |6 - 6| + |9 - 6|
= 5 + 0 + 3
= 8
The total cost is 8.
The question is: how do we know that 6 is the best value to choose?
The answer comes from an important mathematical property.
For a set of numbers, the value that minimizes the sum of absolute differences is the median.
For example:
[1, 6, 9]
Median = 6
Cost at 6:
5 + 0 + 3 = 8
Cost at 5:
4 + 1 + 4 = 9
Cost at 7:
6 + 1 + 2 = 9
The median produces the minimum possible cost.
Since the given array is already sorted, finding the median of any prefix becomes extremely easy.
For a prefix ending at index i:
mid = i / 2
The median is simply:
arr[mid]
A beginner might now think that we can compute the answer by iterating through every element in every prefix and calculating distances from the median.
While correct, that approach would require:
O(1 + 2 + 3 + ... + n)
= O(n²)
With n = 100000, this would be far too slow.
We need a faster way.
Breaking Down the Cost
Let's examine how the cost is formed.
Assume:
Prefix = [1, 6, 9, 12]
Median = 6
The total cost is:
(6 - 1) + (6 - 6) + (9 - 6) + (12 - 6)
Since the array is sorted, all elements before the median are less than or equal to the median.
Similarly, all elements after the median are greater than or equal to the median.
This allows us to split the cost into two parts.
Left Side Contribution
Median * CountLeft - SumLeft
Right Side Contribution
SumRight - Median * CountRight
Combining them gives:
Cost =
(Median * CountLeft - SumLeft)
+
(SumRight - Median * CountRight)
Using Prefix Sums
Now the challenge becomes obtaining SumLeft and SumRight efficiently.
This is where prefix sums help.
Create an array:
pref[i]
where:
pref[i] = arr[0] + arr[1] + ... + arr[i]
Example:
Array: [1, 6, 9, 12]
Prefix Sum: [1, 7, 16, 28]
Using prefix sums:
leftSum = pref[mid]
and
rightSum = pref[i] - pref[mid]
Both calculations take constant time.
For every prefix:
Find the median index.
Get left and right sums from the prefix-sum array.
Compute left cost.
Compute right cost.
Add them together.
Each prefix is processed in O(1) time.
Since there are n prefixes:
Total Time Complexity = O(n)
Cost Formula
The complete formula becomes:
leftCost =
1L * arr[mid] * (mid + 1) - leftSum;
rightCost =
rightSum - 1L * arr[mid] * (i - mid);
answer =
leftCost + rightCost;
Example Verification
Let's verify it on:
[1, 6, 9]
Median index:
mid = 1
median = 6
Left sum:
1 + 6 = 7
Right sum:
9
Left cost:
6 * 2 - 7
= 5
Right cost:
9 - 6 * 1
= 3
Total:
5 + 3
= 8
which matches the expected answer.
Key Insight
The key takeaway from this problem is recognizing the role of the median in minimizing absolute differences. Once that observation is made, prefix sums transform an otherwise quadratic solution into an efficient linear-time algorithm.
This combination of mathematical insight and preprocessing is a common pattern in coding interviews and competitive programming.
Since the array is already sorted, the minimum number of operations needed to make all elements in a prefix equal is obtained by making them equal to the median of that prefix.
For a prefix arr[0...i]:
The cost can be computed in O(1):
cost =
(arr[mid] * (mid + 1) - leftSum)
+
(rightSum - arr[mid] * (i - mid))
where:
Using prefix sums, each prefix answer is computed in constant time, giving O(n) overall.
Java Solution
class Solution {
public ArrayList<Integer> optimalArray(int[] arr) {
int n = arr.length;
ArrayList<Integer> ans = new ArrayList<>();
long[] pref = new long[n];
pref[0] = arr[0];
for (int i = 1; i < n; i++) {
pref[i] = pref[i - 1] + arr[i];
}
for (int i = 0; i < n; i++) {
int mid = i / 2;
long leftSum = pref[mid];
long rightSum = pref[i] - pref[mid];
long leftCost = 1L * arr[mid] * (mid + 1) - leftSum;
long rightCost = rightSum - 1L * arr[mid] * (i - mid);
long cost = leftCost + rightCost;
ans.add((int) cost);
}
return ans;
}
}