How to calculate median of two sorted arrays?

How to calculate median of two sorted arrays?

This problem has been coming up in a lot of practice sheets, and SDE sheets. In this article, we will be going over this problem in depth, first understanding the question, and all the approaches and then solving it in C++. This question can be found on Leetcode: 4. Median of Two Sorted Arrays – https://leetcode.com/problems/median-of-two-sorted-arrays/

Understanding The Question

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

  • Inside an ordered array, the median of the array would be the item that is present in the middle of the array. If the length of the middle is even, then the median would be the average of the middle two items.
  • Hence, for an even-sized array: [1, 2, 3, 4, 5, 6]; Median is (3 + 4) / 2 = 3.5.
  • And for an odd-sized array: [1, 2, 3, 4, 5]; Median is 3.

Hence, the two arrays being already sorted have an advantage but there is a catch here. The two arrays do not necessarily have the elements in such an order that both arrays can directly be attached. Hence merging the two lists will take more time than required. Let’s discuss these solutions below.

Solution – With Code In C++

Brute Force Approach – Merging And Sorting

We can create a new array with all the elements from both arrays and then sort the array.

Time Complexity:

  1. Copying the elements into the new array: O(m) + O(n).
  2. Sorting the elements inside the new array: O((m+n) log(m+n)).

Space Complexity:

  • O(m+n) for creating the new array.

Hence, total T. C. becomes O(N logN) and S. C. is O(N) where N is the total size of the new array.

Solution Code in C++

double bruteSolution(vector<int>& nums1, vector<int>& nums2) {
    vector<int> all_nums;
    for(auto n : nums1) all_nums.push_back(n);
    for(auto n : nums2) all_nums.push_back(n);

    int N = all_nums.size();
    sort(all_nums.begin(), all_nums.end());

    // N/2 == (N-1)/2 for odd N, but different for even N,
    double median = (all_nums[N/2] + all_nums[(N-1)/2])/2.0;
    return median;
}
Code language: C++ (cpp)

Beginner Approach – Merging As In Merge Sort

Since the array is already sorted, we can take this advantage and merge both the arrays as done in the Merge Sort algorithm. Because of this, the new array produced is already sorted.

Time Complexity:

  • Copying the elements into the new array: O(m) + O(n).

Space Complexity:

  • O(m+n) for creating the new array.

Hence, total T. C. becomes O(N) and S. C. is O(N) where N is the total size of the new array.

Solution Code in C++

double okaySolution(vector<int>& nums1, vector<int>& nums2) {
    vector<int> all_nums;
    int m = nums1.size(), n = nums2.size(), N = m + n;
    int p1 = 0, p2 = 0;

    // Adding the smallest element first, hence merging both list effectively
    while(p1 < m && p2 < n) {
        if(nums1[p1] < nums2[p2]) {
            all_nums.push_back(nums1[p1]);
            p1++;
        } else if(nums1[p1] > nums2[p2]) {
            all_nums.push_back(nums2[p2]);
            p2++;
        } else {
            all_nums.push_back(nums1[p1]);
            p1++;
            all_nums.push_back(nums2[p2]);
            p2++;
        }
    }

    // Adding the remaining elements if either was exhausted first
    while(p1 < m) {
        all_nums.push_back(nums1[p1]);
        p1++;
    }
    while(p2 < n) {
        all_nums.push_back(nums2[p2]);
        p2++;
    }

    // For odd arrays, N/2 == (N-1)/2.
    double median = (all_nums[N/2] + all_nums[(N-1)/2])/2.0;
    return median;
}
Code language: C++ (cpp)

Better Approach – Two Pointer Approach

In this question, we do not require merging the array explicitly. We can just take variables to keep track of the current element. Following the same style as the Beginner’s approach, we have two pointers showing the elements already taken into account. With each iteration, we check whether the considered elements, if merged, also consider both the medians. Then as the medians are extracted, the result is returned.

Time Complexity:

  • Iterating over the elements from both arrays: O(m) + O(n).

Space Complexity:

  • No extra memory was required other than a few variables, S.C. is O(1).

Hence, total T. C. becomes O(N) and S. C. is O(1) where N is the total size of both arrays.

Solution Code in C++


double betterSolution(vector<int>& nums1, vector<int>& nums2) {
    int m = nums1.size(), n = nums2.size(), N = m+n;
    int p1 = 0, p2 = 0;
    int median1 = -1, median2 = -1, curr = -1;

    while(median1 == -1 || median2 == -1) {
        if(p1 < m && p2 < n) {
            curr = nums1[p1] <= nums2[p2] ? nums1[p1++] : nums2[p2++];
        } else if(p1 < m) {
            curr = nums1[p1];
            p1++;
        } else {
            curr = nums2[p2];
            p2++;
        }

        // '(p1+p2) - 1' to remove overshoot after curr was assigned and the pointer was incremented
        if(p1 + p2 - 1 == (N-1)/2)
            median1 = curr;
        if(p1 + p2 - 1 == N/2)
            median2 = curr;
    }

    // For odd arrays, median1 == median2 as N/2 == (N-1)/2
    return (median1 + median2)/2.0;
}

Code language: C++ (cpp)

Best/Optimal Approach – Binary Search For Medians

  • Let’s assume the two initial arrays as A and B which merge together to form M. We divide this merged array M into two halves M1 and M2.
  • It is clearly seen that M1 has some elements of A and some of B such that if ‘x’ elements are present from A, (M1 – ‘x’) elements are present from B. The elements of part of A that were left out along with the leftovers of B became M2.
  • If we place a divider on A and B we can easily pick out the elements that belong to M1 on the left-hand side and for M2 on the right-hand side. Maximum of Aleft and Bleft are the two elements that are the maximum elements from M1 while minimum of Aright and Bright, are two elements that are the smallest from M2. [M1max = max(Aleft-max, Bleft-max)] and [M2min = min(Aright-min, Bright-min)]. To find the median, only these 4 elements are needed: Aleft-max, Bleft-max, Aright-min, and Bright-min.
  • It can also be seen that no matter the position of the dividers, position of elements never change. Hence as array A is already sorted, Aleft-max <= Aright-min. Similarly, left of B will always be smaller or equal to the right of B Bleft-max <= Bright-min.
  • As M1max <= M2min, therefore Aleft-max <= Aright-min & Aleft-max <= Bright-min, and similarly, Bleft-max <= Bright-min & Bleft-max <= Aright-min. So for checking the proper divider position, only two main checks are needed: Aleft-max <= Bright-min & Bleft-max <= Aright-min. And from previous points, only divider A poisition is to be found out, as (M1 – ‘x’) will be the divider position for B.
  • For safety, we keep A as the smallest array between A and B.

Time Complexity:

  • Iterating over the elements from both arrays: O(log(m)) where m is the minimum size of the array.

Space Complexity:

  • No extra memory was required other than few variables, S.C. is O(1).

Solution Code in C++

double findMedianSortedArrays(vector<int>& A, vector<int>& B) {
    int n1 = A.size(), n2 = B.size();

    // Keeping A as the smallest sized array
    if(n1 > n2) return findMedianSortedArrays(B, A);

    int s1 = 0, e1 = n1;
    
    // Performing Binary Search on A to find the best position of the divider
    while(s1 <= e1) {
        int part1 = s1 + (e1 - s1) / 2;         // Divider A
        int part2 = (n1 + n2 + 1)/2 - part1;    // Divider B

        int l1 = (part1 - 1 >= 0 ? A[part1 - 1] : INT_MIN);     // Maximum of A_left
        int r1 = (part1 < n1 ? A[part1] : INT_MAX);             // Minimum of A_right

        int l2 = (part2 - 1 >= 0 ? B[part2 - 1] : INT_MIN);     // Maximum of B_left
        int r2 = (part2 < n2 ? B[part2] : INT_MAX);             // Minimum of B_right

        
        if(l1 <= r2 && l2 <= r1) {
            int median;
            if((n1+n2)%2 == 1) {
                // Combined Odd Size, Only One Median
                median = max(l1, l2);
            } else {
                // Two medians, return average
                median = (max(l1, l2) + min(r1, r2)) / 2.0;
            }
            return median;
        } else if(l1 > r2) {
            // A_left > B_right, shift divider position to left
            e1 = part1 - 1;
        } else if(l2 > r1){
            // A_right < B_left, shift divider position to right
            s1 = part1 + 1;
        }

    }

    return 0.0;
}
Code language: C++ (cpp)

Sharing is caring

Did you like what Aman Ahmed Siddiqui wrote? Thank them for their work by sharing it on social media.

0/10000

No comments so far