Data Structures and Algorithms (DSA)  

Equalize All Prefix Sums

Understanding the Median-Based Optimization

Introduction

When solving array transformation problems, a common challenge is minimizing the number of operations required to make elements equal. This problem asks us to determine, for every prefix of a sorted array, the minimum number of increment or decrement operations needed to make all elements in that prefix equal.

At first glance, the problem may seem like a brute-force optimization task. However, a key mathematical observation allows us to solve it efficiently in linear time.

Consider the array:

[1, 6, 9, 12]

For every index i, we examine the prefix from 0 to i.

For i = 2, the prefix becomes:

[1, 6, 9]

We need to make all elements equal while minimizing the total number of operations.

An operation consists of either increasing or decreasing an element by 1.

If we choose a target value of 6:

|1 - 6| + |6 - 6| + |9 - 6|
= 5 + 0 + 3
= 8

The total cost is 8.

The question is: how do we know that 6 is the best value to choose?

The answer comes from an important mathematical property.

For a set of numbers, the value that minimizes the sum of absolute differences is the median.

For example:

[1, 6, 9]

Median = 6

Cost at 6:

5 + 0 + 3 = 8

Cost at 5:

4 + 1 + 4 = 9

Cost at 7:

6 + 1 + 2 = 9

The median produces the minimum possible cost.

Since the given array is already sorted, finding the median of any prefix becomes extremely easy.

For a prefix ending at index i:

mid = i / 2

The median is simply:

arr[mid]

A beginner might now think that we can compute the answer by iterating through every element in every prefix and calculating distances from the median.

While correct, that approach would require:

O(1 + 2 + 3 + ... + n)
= O(n²)

With n = 100000, this would be far too slow.

We need a faster way.

Breaking Down the Cost

Let's examine how the cost is formed.

Assume:

Prefix = [1, 6, 9, 12]
Median = 6

The total cost is:

(6 - 1) + (6 - 6) + (9 - 6) + (12 - 6)

Since the array is sorted, all elements before the median are less than or equal to the median.

Similarly, all elements after the median are greater than or equal to the median.

This allows us to split the cost into two parts.

Left Side Contribution

Median * CountLeft - SumLeft

Right Side Contribution

SumRight - Median * CountRight

Combining them gives:

Cost =
(Median * CountLeft - SumLeft)
+
(SumRight - Median * CountRight)

Using Prefix Sums

Now the challenge becomes obtaining SumLeft and SumRight efficiently.

This is where prefix sums help.

Create an array:

pref[i]

where:

pref[i] = arr[0] + arr[1] + ... + arr[i]

Example:

Array:      [1, 6, 9, 12]

Prefix Sum: [1, 7, 16, 28]

Using prefix sums:

leftSum = pref[mid]

and

rightSum = pref[i] - pref[mid]

Both calculations take constant time.

For every prefix:

  1. Find the median index.

  2. Get left and right sums from the prefix-sum array.

  3. Compute left cost.

  4. Compute right cost.

  5. Add them together.

Each prefix is processed in O(1) time.

Since there are n prefixes:

Total Time Complexity = O(n)

Cost Formula

The complete formula becomes:

leftCost =
1L * arr[mid] * (mid + 1) - leftSum;

rightCost =
rightSum - 1L * arr[mid] * (i - mid);

answer =
leftCost + rightCost;

Example Verification

Let's verify it on:

[1, 6, 9]

Median index:

mid = 1
median = 6

Left sum:

1 + 6 = 7

Right sum:

9

Left cost:

6 * 2 - 7
= 5

Right cost:

9 - 6 * 1
= 3

Total:

5 + 3
= 8

which matches the expected answer.

Key Insight

The key takeaway from this problem is recognizing the role of the median in minimizing absolute differences. Once that observation is made, prefix sums transform an otherwise quadratic solution into an efficient linear-time algorithm.

This combination of mathematical insight and preprocessing is a common pattern in coding interviews and competitive programming.

Since the array is already sorted, the minimum number of operations needed to make all elements in a prefix equal is obtained by making them equal to the median of that prefix.

For a prefix arr[0...i]:

  • Median index: mid = i / 2

  • Let prefixSum[j] be the sum of elements from 0 to j

The cost can be computed in O(1):

cost =
(arr[mid] * (mid + 1) - leftSum)
+
(rightSum - arr[mid] * (i - mid))

where:

  • leftSum = sum(0...mid)

  • rightSum = sum(mid+1...i)

Using prefix sums, each prefix answer is computed in constant time, giving O(n) overall.

Java Solution

class Solution {
    public ArrayList<Integer> optimalArray(int[] arr) {
        int n = arr.length;
        ArrayList<Integer> ans = new ArrayList<>();

        long[] pref = new long[n];
        pref[0] = arr[0];

        for (int i = 1; i < n; i++) {
            pref[i] = pref[i - 1] + arr[i];
        }

        for (int i = 0; i < n; i++) {
            int mid = i / 2;

            long leftSum = pref[mid];
            long rightSum = pref[i] - pref[mid];

            long leftCost = 1L * arr[mid] * (mid + 1) - leftSum;
            long rightCost = rightSum - 1L * arr[mid] * (i - mid);

            long cost = leftCost + rightCost;

            ans.add((int) cost);
        }

        return ans;
    }
}