Merge Sort
Code
C language code:
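#include <stdio.h>
#include <stdlib.h>

/* Merge the sorted runs arr[l..m] and arr[m+1..r] into one sorted run. */
void merge(int arr[], int l, int m, int r) {
    int i, j, k;
    int n1 = m - l + 1;  /* length of the left run */
    int n2 = r - m;      /* length of the right run */

    /* Temporary copies of the two runs */
    int* L = (int*) malloc(n1 * sizeof(int));
    int* R = (int*) malloc(n2 * sizeof(int));
    for (i = 0; i < n1; i++)
        L[i] = arr[l + i];
    for (j = 0; j < n2; j++)
        R[j] = arr[m + 1 + j];

    i = 0;  /* next unread element of L */
    j = 0;  /* next unread element of R */
    k = l;  /* next position to write in arr */

    /* Write the smaller front element back; <= takes from L on ties,
       which keeps equal elements in their original order. */
    while (i < n1 && j < n2) {
        if (L[i] <= R[j]) {
            arr[k] = L[i];
            i++;
        } else {
            arr[k] = R[j];
            j++;
        }
        k++;
    }

    /* Copy whatever is left in L */
    while (i < n1) {
        arr[k] = L[i];
        i++;
        k++;
    }

    /* Copy whatever is left in R */
    while (j < n2) {
        arr[k] = R[j];
        j++;
        k++;
    }

    free(L);
    free(R);
}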
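
/* Bottom-up (iterative) merge sort of arr[0..n-1]. */
void merge_sort(int arr[], int n) {
    int curr_size;   /* length of the runs merged in the current pass */
    int left_start;  /* start index of the left run */
    for (curr_size = 1; curr_size <= n-1; curr_size = 2*curr_size) {
        for (left_start = 0; left_start < n-1; left_start += 2*curr_size) {
            int mid = left_start + curr_size - 1;
            /* The left run already reaches the end of the array, so there is
               no right run to merge; without this check mid can exceed n-1
               (for example n = 7, curr_size = 4) and merge reads out of bounds. */
            if (mid >= n-1)
                break;
            int right_end = (left_start + 2*curr_size - 1 < n-1) ? left_start + 2*curr_size - 1 : n-1;
            merge(arr, left_start, mid, right_end);
        }
    }
}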

int main(void) {
    int arr[] = {12, 11, 13, 5, 6, 7};
    int arr_size = sizeof(arr) / sizeof(arr[0]);

    printf("Given array is \n");
    for (int i = 0; i < arr_size; i++)
        printf("%d ", arr[i]);
    printf("\n");

    merge_sort(arr, arr_size);

    printf("\nSorted array is \n");
    for (int i = 0; i < arr_size; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}
Code Explanation
- Initialization: arr is the array to be sorted, and n is its length.
- Merge function:
  - The merge function merges two adjacent sorted sections of the array, arr[l..m] and arr[m+1..r], into one sorted section.
  - It creates two temporary arrays, L and R, to hold copies of the left and right sub-arrays.
  - It compares the front elements of L and R one pair at a time and writes the smaller one back into the original array.
  - Once one temporary array is exhausted, it copies the remaining elements of the other back into the original array.
- Merge sort function:
  - merge_sort uses a bottom-up iterative approach: it treats every element as a sorted sub-array of length 1 and repeatedly merges adjacent sub-arrays.
  - The outer loop variable curr_size is the sub-array length; it starts at 1 and doubles on each pass for as long as it is at most n-1.
  - The inner loop variable left_start advances by 2*curr_size positions each time.
  - Each iteration calculates the midpoint and end point of the pair of sub-arrays to merge, then calls the merge function.
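The index arithmetic described above can be checked in isolation. The following is a minimal sketch rather than part of the original program: merge_bounds and print_merge_plan are hypothetical helper names, and the sketch assumes that a left sub-array with no right-hand neighbour is simply skipped.

```c
#include <stdio.h>

/* Hypothetical helper: computes the bounds of the merge that starts at
   left_start for sub-array length curr_size in an array of n elements.
   Returns 1 if there is a non-empty right sub-array to merge, 0 otherwise. */
int merge_bounds(int n, int curr_size, int left_start, int *mid, int *right_end) {
    *mid = left_start + curr_size - 1;
    if (*mid >= n - 1)
        return 0;  /* the left sub-array reaches the end: nothing to merge */
    *right_end = left_start + 2 * curr_size - 1;
    if (*right_end > n - 1)
        *right_end = n - 1;  /* the last right sub-array may be shorter */
    return 1;
}

/* Hypothetical helper: prints every merge the bottom-up loops would
   perform for an array of n elements, as [left..mid|mid+1..right]. */
void print_merge_plan(int n) {
    for (int curr_size = 1; curr_size <= n - 1; curr_size *= 2) {
        printf("curr_size=%d:", curr_size);
        for (int left_start = 0; left_start < n - 1; left_start += 2 * curr_size) {
            int mid, right_end;
            if (merge_bounds(n, curr_size, left_start, &mid, &right_end))
                printf(" [%d..%d|%d..%d]", left_start, mid, mid + 1, right_end);
        }
        printf("\n");
    }
}
```

Calling print_merge_plan(6) should list the merges [0..0|1..1], [2..2|3..3] and [4..4|5..5] for curr_size=1, then [0..1|2..3] for curr_size=2, and finally [0..3|4..5] for curr_size=4.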
Example Run
Assuming the input array is {12, 11, 13, 5, 6, 7}:
- Initial state: {12, 11, 13, 5, 6, 7}
- First merge (curr_size=1):
  - Merge {12, 11} to get {11, 12}
  - Merge {13, 5} to get {5, 13}
  - Merge {6, 7} to get {6, 7}
  - State: {11, 12, 5, 13, 6, 7}
- Second merge (curr_size=2):
  - Merge {11, 12} and {5, 13} to get {5, 11, 12, 13}
  - {6, 7} has no neighbouring sub-array to merge with and stays unchanged
  - State: {5, 11, 12, 13, 6, 7}
- Third merge (curr_size=4):
  - Merge {5, 11, 12, 13} and {6, 7} to get {5, 6, 7, 11, 12, 13}
  - Final state: {5, 6, 7, 11, 12, 13}
Time Complexity Analysis
The time complexity of merge sort mainly depends on the splitting and merging processes.
- Best-case time complexity: O(n log n), because the sub-array length doubles on every pass (equivalently, the array is halved at every level of a recursive version), which gives about log2(n) passes, and each pass merges the whole array in linear time.
- Worst-case time complexity: O(n log n); regardless of the initial order of the input array, the number of steps executed is essentially the same.
- Average time complexity: O(n log n); merge sort's running time hardly varies from one input to another.
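To make the pass count concrete, the sketch below replays the loop structure of merge_sort without touching any data. The helper names count_passes and count_merged_elements are hypothetical, and the sketch assumes that merges with an empty right sub-array are skipped.

```c
/* Hypothetical helper: number of bottom-up passes for n elements.
   The sub-array length doubles until it covers the whole array. */
int count_passes(int n) {
    int passes = 0;
    for (long curr_size = 1; curr_size <= n - 1; curr_size *= 2)
        passes++;
    return passes;
}

/* Hypothetical helper: total number of elements that take part in a
   merge, summed over all passes. Each pass contributes at most n. */
long count_merged_elements(int n) {
    long total = 0;
    for (long curr_size = 1; curr_size <= n - 1; curr_size *= 2) {
        for (long left_start = 0; left_start < n - 1; left_start += 2 * curr_size) {
            long mid = left_start + curr_size - 1;
            if (mid >= n - 1)
                break;  /* no right sub-array left to merge */
            long right_end = left_start + 2 * curr_size - 1;
            if (right_end > n - 1)
                right_end = n - 1;
            total += right_end - left_start + 1;
        }
    }
    return total;
}
```

For n = 8 this gives 3 passes and 24 merged elements (8 per pass); for n = 6 it gives 3 passes and 16 merged elements. In both cases the total stays within n elements per pass, which is where the O(n log n) bound comes from.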
Space Complexity Analysis
Merge sort requires additional auxiliary space to store temporary sub-arrays. The two temporary arrays used by a single merge together hold at most n elements, so the space complexity is O(n).
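One way to see the O(n) bound is to allocate the auxiliary storage once instead of inside every merge. The following is a sketch of that variant, not the implementation used above: merge_runs and merge_sort_one_buffer are hypothetical names, and reporting allocation failure through a return value is an assumption of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted runs src[l..m] and src[m+1..r] into dst[l..r]. */
static void merge_runs(const int src[], int dst[], int l, int m, int r) {
    int i = l, j = m + 1, k = l;
    while (i <= m && j <= r) {
        if (src[i] <= src[j])
            dst[k++] = src[i++];
        else
            dst[k++] = src[j++];
    }
    while (i <= m)
        dst[k++] = src[i++];
    while (j <= r)
        dst[k++] = src[j++];
}

/* Hypothetical variant: bottom-up merge sort with a single n-element
   buffer. Returns 0 on success, -1 if the buffer cannot be allocated. */
int merge_sort_one_buffer(int arr[], int n) {
    if (n < 2)
        return 0;
    int *buf = malloc((size_t)n * sizeof *buf);
    if (buf == NULL)
        return -1;
    for (int curr_size = 1; curr_size <= n - 1; curr_size *= 2) {
        for (int left_start = 0; left_start < n - 1; left_start += 2 * curr_size) {
            int mid = left_start + curr_size - 1;
            if (mid >= n - 1)
                break;  /* no right run left to merge */
            int right_end = left_start + 2 * curr_size - 1;
            if (right_end > n - 1)
                right_end = n - 1;
            /* Copy only the range being merged, then merge it back into arr. */
            memcpy(buf + left_start, arr + left_start,
                   (size_t)(right_end - left_start + 1) * sizeof *buf);
            merge_runs(buf, arr, left_start, mid, right_end);
        }
    }
    free(buf);
    return 0;
}
```

The buffer holds exactly n integers for the whole sort, so the auxiliary space is still O(n), but the number of allocations drops to one.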
Advantages and Disadvantages
Advantages:
- Predictable running time: O(n log n) in the best, average, and worst cases, so it is suitable for sorting large datasets.
- Stable sorting algorithm; does not change the relative order of equal elements.
- The divide-and-conquer strategy is easy to parallelize.
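Stability is easiest to observe on records that carry something besides the sort key. The sketch below is an illustration only: the Record type, its tag field, and the function names merge_records and sort_records are hypothetical. The property it relies on is the <= comparison, which takes the element from the left sub-array when two keys are equal.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type used only for this illustration. */
typedef struct {
    int key;   /* value the records are ordered by */
    char tag;  /* label that identifies the original input position */
} Record;

/* Merges the sorted runs src[l..m] and src[m+1..r] into dst[l..r]. */
static void merge_records(const Record src[], Record dst[], int l, int m, int r) {
    int i = l, j = m + 1, k = l;
    while (i <= m && j <= r) {
        /* <= prefers the left run on equal keys, preserving input order. */
        if (src[i].key <= src[j].key)
            dst[k++] = src[i++];
        else
            dst[k++] = src[j++];
    }
    while (i <= m)
        dst[k++] = src[i++];
    while (j <= r)
        dst[k++] = src[j++];
}

/* Bottom-up merge sort by key. Returns 0 on success, -1 on allocation failure. */
int sort_records(Record recs[], int n) {
    if (n < 2)
        return 0;
    Record *buf = malloc((size_t)n * sizeof *buf);
    if (buf == NULL)
        return -1;
    for (int curr_size = 1; curr_size <= n - 1; curr_size *= 2) {
        for (int left_start = 0; left_start < n - 1; left_start += 2 * curr_size) {
            int mid = left_start + curr_size - 1;
            if (mid >= n - 1)
                break;  /* no right run left to merge */
            int right_end = left_start + 2 * curr_size - 1;
            if (right_end > n - 1)
                right_end = n - 1;
            memcpy(buf + left_start, recs + left_start,
                   (size_t)(right_end - left_start + 1) * sizeof *buf);
            merge_records(buf, recs, left_start, mid, right_end);
        }
    }
    free(buf);
    return 0;
}
```

Sorting the records (2,a), (1,b), (2,c), (1,d), (2,e) with sort_records yields (1,b), (1,d), (2,a), (2,c), (2,e): records with equal keys keep their input order. Changing <= to < in merge_records would break this guarantee.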
Disadvantages:
- Requires O(n) additional auxiliary space, so its space complexity is higher than that of in-place sorting algorithms.
- More complex to understand and implement than simple sorting algorithms.
Applicable Scenarios
Merge sort is suitable for the following scenarios:
- Sorting large datasets.
- When a stable and efficient sorting algorithm is needed.
- In parallel computing environments, merge sort can make good use of multiple processors.
Summary
Merge sort is an efficient sorting algorithm based on a divide-and-conquer strategy: the array is split into smaller sub-arrays, which are sorted separately and then merged. It is more complex to implement than simple sorting algorithms and requires O(n) additional space, but it is stable and runs in O(n log n) time on any input, which makes it a good fit for large-scale data.