Блог пользователя ramentaneja

Автор ramentaneja, история, 12 месяцев назад, По-английски

Hello everyone,

I have been using Binary Search for a while, but it still confuses me as to when I should use <= and when I should use < and how I should be updating my low and high variables accordingly.

I am not sure if this has already been answered, I was unable to find an answer that solved my doubt regarding how to update low and high variables.

It's just something I am struggling to wrap my head around, any help regarding this would be highly appreciated. Cheers!

  • Проголосовать: нравится
  • +10
  • Проголосовать: не нравится

»
12 месяцев назад, # |
Rev. 2   Проголосовать: нравится +17 Проголосовать: не нравится

I'm writing binary search in this way: if you want to find the lowest $$$x$$$ satisfying $$$f(x)$$$:

int l=lowest_possible_value-1,r=highest_possible_value,m;
while(r-l>1)
{
    m=l+(r-l)/2;if(f(m))r=m;else l=m;
}
return r;

and vice versa if you want to find the highest $$$x$$$ satisfying $$$g(x)$$$:

int l=lowest_possible_value,r=highest_possible_value+1,m;
while(r-l>1)
{
    m=l+(r-l)/2;if(g(m))l=m;else r=m;
}
return l;
  • »
    »
    12 месяцев назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Does this always work? or do you modify according to problems as well?

    • »
      »
      »
      12 месяцев назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Works almost always (l = m , r = m). You just have to find the best initial values for l and r for this.

      • »
        »
        »
        »
        11 месяцев назад, # ^ |
        Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

        The key idea here is keeping invariants, this is precisely what's happening in the code. In the case of finding the smallest $$$x$$$ satisfying $$$f(x)$$$ we initially choose $$$l$$$ and $$$r$$$ such that $$$f(l)$$$ does not hold and $$$f(r)$$$ holds. Throughout the whole binary search, we always keep $$$l$$$ and $$$r$$$ that way. In the case of maximizing $$$x$$$, we keep the invariant that $$$f(l)$$$ holds and $$$f(r)$$$ does not hold (or in the code, $$$g(l)$$$ and $$$g(r)$$$).

        • »
          »
          »
          »
          »
          11 месяцев назад, # ^ |
          Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

          The way I think of it is this: at any point, we have $$$l, r$$$ such that $$$f$$$ is true on $$$(l_0, l]$$$ and false on $$$[r, r_0)$$$, and we want to bring one of these "borders" towards the other. With this, the base case $$$(l, r) = (l_0, r_0)$$$ is trivially true. The termination condition says $$$f(l_0, x]$$$ is true and $$$f[x + 1, r_0)$$$ is false, so either $$$x = l_0$$$ or $$$x$$$ is such that the non-empty subarray $$$[l_0 + 1, x]$$$ is true, and either $$$x + 1 = r_0$$$ or $$$x$$$ is such that the non-empty subarray $$$[x + 1, r_0 - 1]$$$ is true. Thus we can simply set $$$l_0$$$ and $$$r_0$$$ to be one before and one after the search array endpoints (or the places where we definitely know that $$$f$$$ is true and false respectively).

    • »
      »
      »
      11 месяцев назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      To be honest, I haven't seen a binary search problem where the codes don't work. Tell me if you find any!

»
12 месяцев назад, # |
Rev. 2   Проголосовать: нравится -14 Проголосовать: не нравится

.

»
11 месяцев назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

Try to formally declare your state, what $$$l$$$ and $$$r$$$ means.

Let's declare some notation:

$$$a[i...j]$$$ = Subarray $$$(a[i], a[i + 1], a[i + 2], ..., a[j])$$$

$$$a[i...]$$$ = Subarray $$$(a[i], a[i + 1], a[i + 2], ...,$$$ till end of array $$$)$$$

$$$a[...j]$$$ = Subarray $$$($$$ From start of array $$$..., a[j - 1], a[j])$$$

$$$a[i...i]$$$ = Subarray having single term $$$(a[i])$$$

Empty Subarray: $$$a[0...-1]$$$ or $$$a[1...0]$$$ or ... (Basically when $$$i > j$$$, it denotes an empty subarray)

Now let's find the logic for the lower bound of an array sorted in increasing order.

Function $$$lowerBound(a, x)$$$: Returns the smallest index having element greater than or equal to $$$x$$$.

The function will look something like this:

int lowerBound(int[] a, int x) {
	int n = a.length;
	int l = _;
	int r = _;

	while(_) {
		int m = (l + r) / 2; // Or maybe ceil(l + r) / 2
		if(a[m] >= x) {
			l = _;
			// OR
			r = _;
		} else {
			l = _;
			// OR
			r = _;
		}
	}

	return _;
}

Let $$$l$$$ and $$$r$$$ be defined as:

$$$a[...(l - 1)]$$$: All elements are smaller than $$$x$$$

$$$a[l...(r - 1)]$$$: Unchecked portion of array

$$$a[r...]$$$: All elements are greater than or equal to $$$x$$$

First, think about the initial value of $$$l$$$ and $$$r$$$.

Initialization

Now, let's see how we should update the values $$$l$$$ and $$$r$$$ at every iteration.

Maintenance

How long the loop should run?

Termination

If at any point, I ask you the current best answer you have, what would it be?

Answer Index

The final code will be:

int lowerBound(int[] a, int x) {
	int n = a.length;
	int l = 0;
	int r = n;

	while (l < r) {
		int m = (l + r) / 2;
		if (a[m] >= x) {
			r = m;
		} else {
			l = m + 1;
		}
	}

	return r;
}

This is one of the possible codes on lowerBound. You can have different codes on considering different states.

The same idea can be applied on any binary search problem.

When solving problems using "Binary search on answer", you'll get one of the two situations:

...TTTTTFFFFF...

OR

...FFFFFTTTTT...

You can make a binary search algorithm for the last true or first false in the first case, or the last false or first true in the second case.

»
11 месяцев назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Binary search is a search algorithm. As long as you don't lose whatever you're searching for, then you win.

In binary search, the search domain (i.e. the answer you're looking for) is in the [low, high] region. You check the answer for mid = (low + high)/2. Is mid a valid answer? Would you want to discard mid? If yes for the first question and no for the second, don't lose it. Otherwise, feel free to drop it.

Key point is, again, don't lose sight of what you're "searching" for. Only discard what's guaranteed to not be a possible answer, and don't ever let the answer escape the [low, high] zone.

»
11 месяцев назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

I prefer checking low <= high. If the value of high is initially good then after doing binary search return low, otherwise return high

»
11 месяцев назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

The diffrence between <= and < is when using <=, the binary search range is $$$[L,R]$$$, and when using <, the range is $$$[L,R)$$$.

I perfer using <= and use another variable res to store the possible answer.

int L, R, res;
while(L <= R) {
    int mid = (L + R) / 2;
    if(check(mid)) res = mid, L = mid + 1;
    else R = mid - 1;
}
»
11 месяцев назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

I wrote a blog a while ago which has a few sections that talk about the ways people implement binary search, in particular, about implementations that use $$$<$$$ and implementations that use $$$\le$$$.