Divide and Conquer

General Understanding

对于分治,需要了解的是

  1. 为什么分治能优化
  2. 什么时候可以用分治
  3. 分治的实现细节

Question 1: Why D&C can do better?

对于第一个问题,可以从数学形式(方程)和内在原因入手。

一般而言,爆搜或者朴素做法的时间复杂度多为二次、三次以上,那么此时将原问题分解的话,利用系数的变化不难发现,解决子问题比解决原问题要快得多,如果子问题合并代价不多的情况下,其实就可以按照这个思路一直分下去,代价会一直减小,也就是一直到递归基础,然后再逐层合并,这时通过master定理,在系数选取合适,归并代价合适的情况下,确实可以发现分解成子问题能够比爆搜要快,得到优化。内在的原因是通过分解子问题,其实是避免了一些重复的运算。

For the first question, the intuition behind divide-and-conquer is that in many problems the amount of work that has to be done is based on some combinatorial property of the input that scales more than linearly.

For example, in the closest pair of points problem, the runtime of the brute-force answer is determined by the fact that you have to look at all possible pairs of points.

If you take something that grows quadratically and cut it into two pieces, each of which is half the size as before, it takes one quarter of the initial time to solve the problem in each half, so solving the problem in both halves takes time roughly one half the time required for the brute force solution. Cutting it into four pieces would take one fourth of the time, cutting it into eight pieces one eighth the time, etc.

The recursive version ends up being faster in this case because at each step, we avoid doing a lot of work from dealing with pairs of elements by ensuring that there aren’t too many pairs that we actually need to check. Most algorithms that have a divide and conquer solution end up being faster for a similar reason.

Question 2: In which case do we use D&C?

但是并不是说分治就一定能得到优化,一定要结合问题特征使用,下面是反例:

For your second question, no, divide and conquer algorithms are not necessarily faster than brute-force algorithms. Consider the problem of finding the maximum value in an array. The brute-force algorithm takes O(n) time and uses O(1) space as it does a linear scan over the data. The divide-and-conquer algorithm is given here:

  • If the array has just one element, that’s the maximum.
  • Otherwise:
    • Cut the array in half.
    • Find the maximum in each half.
    • Compute the maximum of these two values.

This takes time O(n) as well, but uses O(log n) memory for the stack space. It’s actually worse than the simple linear algorithm.

所以还是要清楚,在什么情况下才去用or考虑分治。

下面来大概总结一下:

首先看原问题爆搜时间复杂度以及是否有子结构,之类的就可以考虑,这种可能是优化系数之类的。
然后是结合原问题具体地去考虑分治的实现,一般而言有两种:朴素divide+优化conquer,or 优化divide+朴素conquer。

值得注意的是,二分其实可以看成是一种特殊的分治,子问题系数是1,很高效的分治。

Question 3: What’re the realization details when using D&C?

The Paradigm of D&C:


分治算法的应用

排序问题

merge sort & quick sort

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
//merge sort
void merge_sort(int q[], int l, int r){

if(l >= r) return;

int mid = (l + r) >> 1;
merge_sort(q,l,mid), merge_sort(q, mid + 1, r);

int k = 0, i = l, j = mid + 1 ;
while(i <= mid && j <= r)
if(q[i] <= q[j]) tmp[k++] = q[i++];
else tmp[k++] = q[j++];
//扫尾
while(i <= mid) tmp[k++] = q[i++];
while(j <= r) tmp[k++] = q[j++];
//物归原主
for(int i = l, j = 0 ; i <= r ; i++, j++) q[i] = tmp[j];

}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
//quick sort
void quick_sort(int q[], int l, int r)
{
//递归的终止情况
if(l >= r) return;
//第一步:分成子问题
int i = l - 1, j = r + 1, x = q[l + r >> 1];
while(i < j)
{
do i++; while(q[i] < x);
do j--; while(q[j] > x);
if(i < j) swap(q[i], q[j]);
}
//第二步:递归处理子问题
quick_sort(q, l, j), quick_sort(q, j + 1, r);
//第三步:子问题合并.快排这一步不需要操作,但归并排序的核心在这一步骤
}

Max & Min

这是对于O(n)系数改进的分治算法。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
PII max_min(int l, int r){
if(r - l == 1)
if(a[l] < a[r]) return {a[l], a[r]};
else return {a[r], a[l]};

int mid = l + r >> 1;
int x1, y1, x2, y2;
{x1, y1} = max_min(l, mid);
{x2, y2} = max_min(mid + 1, r);
int x = min(x1, x2);
int y = max(y1, y2);

return {x, y};
}

Convex Hull

Peak Finding

optimization of O(n) problem.

1-dimension version

2-dimension version

中位数问题

有效二分