Binary search on real values

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3611
4	jiangly	3583
5	strapple	3515
6	tourist	3470
7	Radewoosh	3415
8	Um_nik	3376
9	maroonrk	3361
10	XVIII	3345

#	User	Contrib.
1	Qingyu	162
2	adamant	148
3	Um_nik	146
4	Dominater069	143
5	errorgorn	141
6	cry	138
7	Proof_by_QED	136
8	YuukiS	135
9	chromate00	134
10	soullless	133

What does a typical binary search in real values look like? Like this:

float l = 0.0f, r = 1e14f;
for (int i = 0; i < iteration_count && l + eps < r; ++i)
{
    float mid = 0.5f * (l + r);
    if (check(mid))
        r = mid;
    else
        l = mid;
}

Here l and r are the boundaries that depends on the condition, check (x) is a function that returns true starting from some x and false before x and we need to find this x. As a rule, l or 0.5 * (l + r) is the answer.

But there are several disadvantages:

The number of iterations must be choosed balancing between the precision and the execution time.
Answer with an error.

What can be done with this?

First, let's take a look at how real values are represented. You can read in detail, for example, here and here. We use the fact that if we perform the operation Y = real(int(X) + 1), where X is a real value, int(X) takes a real value and returns an integer whose bitwise representation is the same as X (returns the same X interpreted as an integer), real(X) does the opposite — interprets the integer as real, then Y will be equal to the next representable real value after X. Roughly this is similar to Y = X + eps, where eps is the minimum possible and X >= 0 andY = X - eps, if X <= 0 (in both cases, zero is included, because in real values there are 0 and -0). Roughly — because not all X has the same eps. I should note that there is NaN,inf, the maximum positive or negative value, but this is not what I want to talk about. It can be seen that for real L and R, where L < R, the condition int(L) < int(L) + 1 < int(L) + 2 < ... < int(R) is satisfied, similarly as L < L + eps < L + 2 * eps < ... < R. So we can do a binary search with the boundaries int(L), int(R). Now we can do this:

float l_real = 0.0f, r_real = 1e14f;
uint32_t l = reinterpret_cast<uint32_t&>(l_real), r = reinterpret_cast<uint32_t&>(r_real);
while (l < r)
{
    uint32_t mid = l + (r - l) / 2;
    if (check(reinterpret_cast<float&>(mid)))
        r = mid;
    else
        l = mid + 1;
}

The same works for double anduint64_t.int forfloat and long long fordouble can be used. I should note that the standard does not guarantee that float andint are 32 bits each, and double andlong long are 64 bits each but I didn’t see that this was not the case.

What are the advantages?

Maximum possible precision.
A small number of iterations (up to 32 for float and up to 64 fordouble).

What is important to remember?

l,r and mid are some non-meaningful values, all that is important is maintaining order in both interpretations.
l + r may overflow, therefore l + (r - l) / 2 instead of (l + r) / 2.
There is an inconvenience with boundaries with different signs. For negative values we have to convert the value from two's complement representation when converted from real to integer to keep order. You can also make a check in zero, so that the boundaries have the same sign, or add a constant to the values. In my experience problem where boundaries with different signs is a rare case.

It would not be superfluous to take a look on pictures of floating point distribution. For example, I’ll note that in the interval from 0 to std::numeric_limints<float>::max() the median (which is also the first mid in binary search) will be equal to 1.5. That is, from 0 to 1.5 can be represented as many values as from 1.5 to the most representable value. For float in the usual approach, we need 129 iterations only to make the right boundary less than 1.5, if the answer is, for example, 0 (don't forget that usually the initial right boundary is much less). That is, in the usual approach, representable values are discarded extremely badly.

An example of solving a problem using this approach: 45343644

The same approach applies to ternary search (who wants to make a cool example? ^ _ ^).

double d = 0.1;
std::int64_t n;
static_assert(sizeof n == sizeof d);
// n = *reinterpret_cast<std::int64_t*>(&d); // Undefined behavior
std::memcpy(&n, &d, sizeof d); // OK
n = std::bit_cast<std::int64_t>(d); // also OK

Comments (15)

Show archived | Write comment?

Iightcode

8 years ago, hide # |

← Rev. 2 →

+73

It's very interesting, but here's the way I see it.

What is better:

for (int whatever = 0; whatever < 200; whatever++)

+ simple

+ easy to understand

+ fast enough 99% of the time

- not fast enough 1% of the time

+ 200 is good enough for any problem

OR:

uint32_t l = reinterpret_cast<uint32_t&>(l_real), r = reinterpret_cast<uint32_t&>(r_real);, blahblahblah

- C++ black magic

- who knows when it will break because of some compiler issue from 2012

+ save a couple iterations

- weird negative edge cases

→ Reply

tantam75

8 years ago, hide # ^ |

+ optimize the precision

Maxim

Thanks for sharing your point of view. Let me share the way I see it then.

Cycle from l to r is also simple and easy to understand. Approach that in the article is also simple and easy to understand imho.

32 is good enough for not less amount of problems. Let me give an example: what if we have binary search in binary search and a heavy check function? Then this approach is ~40 times (200 * 200 / (32 * 32)) faster.

Why did you mark it as disadvantage? Come to the dark side ;-)

I do believe that the probability of that is not more than the break of simple cycle.

Not weird imho.

nekit2305

6 years ago, hide # ^ |

oh my god, it really works

thanks a lot

10 WA/TL attempts with iterative binary search bring me here

kraskevich

← Rev. 3 →

C++ black magic
who knows when it will break because of some compiler issue from 2012

It's not even black magic, it's undefined behavior. It's already broken.

From https://en.cppreference.com/w/cpp/language/reinterpret_cast:

+10

it's undefined behavior

It's true because of type aliasing rule. But from the same link:

Note that many C++ compilers relax this rule

Also I've mentioned that there is no guarantee that float and int are 32-bit and double and long long are 64-bit. So it's already UB. But it doesn't really matter because it's almost impossible to find the compiler that misbehaves. It's individual choice between hypothetical misbehaving and more precisely/faster solution. Also it's still more probable to fail writing simple cycle as I've mentioned earlier. Actually, there is no compiler that follows the standard at 100%. If there is some discomfort — it is possible to use memset from your example.

It's already broken

Strange hyperbolization.

brdy

Ternary Search - Min Function

double r_lo = 0;    ll lo = reinterpret_cast<ll&>(r_lo);
double r_hi = MAXV; ll hi = reinterpret_cast<ll&>(r_hi);
while (hi > lo)
{
    ll lm = lo + (hi - lo)/3;
    ll rm = hi - (hi - lo)/3;
    if (calc(reinterpret_cast<double&>(lm)) < calc(reinterpret_cast<double&>(rm))) hi = rm-1;
    else lo = lm+1;
}

For finding the max of a unimodal function just reverse the sign.

AnandOza

6 years ago, hide # |

-8

For the regular floating-point search, instead of an iteration count, can you alternatively specify r-l > 1e-7? (Or whatever tolerance you need.)

farmersrice

+18

No, it doesn't work well.

Can you let me know why? It seems identical in functionality to a fixed iteration count, but easier to understand.

+13

I'm not really sure why, I think it has something to do with floating point precision, but it does not work well. I remember trying to solve a couple problems like that and I was always getting WA no matter what tolerance I had, then I changed to a fixed iteration count and boom ez AC.

dalex

r — l > eps works badly on functions like y = 1000000000000x and you need to find f(x)

f(r) — f(l) > eps works badly on functions like y = 0.00000000000001x and you need to find x

snacache

I tried to solve this problem during the 2015 ICPC Latin American regionals using r-l>EPS without any success.

After the contest I tried using iteration count and it was accepted :(

Errichto

If you are binary searching the answer and will just print $$$(L + R) / 2$$$ after the binary search is done, then you can use something like this:

while(R - L > eps * max(1.0, L)) { ... }

This checks for a relative or absolute error. It assumes that $$$0 \leq L \leq R$$$ though. A general version is quite tricky and I'm not even sure that I will get it right here:

#define SMALLEST (L < 0 && 0 < R ? 0.0 : min(abs(L), abs(R)))
while(R - L > eps * max(1.0, SMALLEST)) { ... }

The idea is to get the minimum possible value of the answer — the minimum value of something in interval $$$[L, R]$$$.

My advice is: just iterate a constant number of times.

Bonus

roll_no_1

I have also seen people do something like this:

If every test file has a single test case and say, the time limit is around 2 seconds, then you may also write the binary search as:

while((double)clock() / CLOCKS_PER_SEC < 1.8) {
     //Obtain mid, compute stuff and update lo and hi accordingly
}

It seems more of a magic to me, but if you can choose the constant in the while loop in a way that your execution time is within the time limit, you will get a very good precision.

Maxim's blog

What can be done with this?