Hi all,
I've read a lot about binary search and how notoriously tricky it is to implement correctly. One thing I've learned over time is that thinking in terms of loop invariants helps with proving your code correct. I recently solved a binary search problem and got AC, but I fail to see why my code works.
link: http://mirror.codeforces.com/problemset/problem/676/C
For this problem, we have to find the longest substring that consists of a single letter (either 'a' or 'b'), given that we may change 'a's to 'b's or vice versa at most K times.
I solved the problem separately for 'a's and 'b's, then returned the maximum of the two.
Let me elaborate on what I did for the letter 'a' (it's identical for the other character).
If I had a magic function that answers "is there a candidate answer of length L?", then for a given array A of length N its answers for increasing L would look like
YES, YES, YES, ...., YES, NO, NO, NO, ..., NO
I need to find the last index (or longest length) that says YES.
To find such an index using binary search, I maintain the following loop invariants:
1) l < h, and initially l = 0, h = n.
2) Call the magic function P. Then P(l) = True and P(h) = False always hold; put another way, the answer lies in the interval [l, h).
3) The while loop continues as long as "l + 1 < h". So when it terminates, l + 1 >= h, which combined with 1) gives l + 1 = h.
4) Because of 2), the answer must be l.
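The four invariants above can be sketched in Python as follows. This is only an illustration, not the code from the post: the name `last_true` and the sample predicate are assumptions, and `hi` is treated as a sentinel that is assumed to fail and is never evaluated.

```python
def last_true(pred, lo, hi):
    """Return the largest x in [lo, hi) for which pred(x) is True.

    Assumes pred is monotone (True ... True False ... False),
    pred(lo) is True, and hi is a sentinel that is never evaluated.
    """
    assert lo < hi                 # invariant 1
    while lo + 1 < hi:             # invariant 3: stop when lo + 1 == hi
        mid = (lo + hi) // 2       # lo < mid < hi, so the interval shrinks
        if pred(mid):
            lo = mid               # invariant 2: pred(lo) stays True
        else:
            hi = mid               # invariant 2: pred(hi) stays False
    return lo                      # invariant 4: the answer is lo


# Hypothetical predicate: lengths 0..6 are feasible, longer ones are not.
print(last_true(lambda length: length <= 6, 0, 11))  # prints 6
```

Note that the result can never equal the initial `hi`, so if the largest candidate can itself be feasible, the upper bound has to start one past it.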
Translated to pseudo-code,
But the answer this code gives is always one less than the actual answer. When I return h instead of l (because I'm off by one), I get AC. But following my line of reasoning, it must be l that is the answer. Can you help me find the flaw in my logic?
UPD: my previous claim was wrong. Maybe you return the max of h-l instead of h-l+1 in the stated problem?
This is how I write binsearch when there is at least one "YES" and the answer is in [0, n-1]:
Hi, thanks for your input. I have a question
1) Because you write the while loop condition as "h - 1 > l", it terminates when l and h are right next to each other (i.e. h - 1 <= l, or l + 1 >= h; but we know l < h, so eventually l + 1 = h).
2) The way you are halving the interval is equivalent to ensuring "the answer lies in [l, h) "
So my question is: in line 8 of your code, why check from h? 1) + 2) guarantee that "h" cannot be an answer.
I've looked again at your binsearch pseudocode: it looks OK (I overlooked that your right bound is exclusive; my pseudocode used inclusive bounds).
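For comparison, here is one common way to write the inclusive-bounds variant in Python. It is a sketch under assumptions, not necessarily the code from the reply above: the name `last_true_inclusive` is hypothetical, and it assumes a monotone predicate with at least one "YES" in [0, n-1]. It shows why h has to be checked after the loop: with inclusive bounds, h starts as a real candidate and may never have been tested.

```python
def last_true_inclusive(pred, n):
    """Largest x in [0, n-1] with pred(x) True.

    Assumes pred is monotone (True ... True False ... False)
    and that at least one value in [0, n-1] is True.
    """
    lo, hi = 0, n - 1              # both ends are still candidates
    while hi - 1 > lo:
        mid = (lo + hi) // 2
        if pred(mid):
            lo = mid               # mid is feasible, keep it as lower end
        else:
            hi = mid               # mid is infeasible
    # hi may still be the untested initial value n - 1, so try it first.
    return hi if pred(hi) else lo
```

With the exclusive convention (P(h) known to be False), this final check is unnecessary and l is returned directly.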
19061181
if cumsum[idx + candidate] - cumsum[idx - 1] <= k:
checks that the range [idx, idx+candidate] needs at most k changes. But the length of that interval is candidate + 1, not candidate. Also, you rebuild your cumsum array on every binary search.
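A minimal sketch with both points addressed might look like the following. It is not the submitted code: the function names `longest_run`, `feasible` and `solve` are hypothetical, and `cumsum` is assumed here to be a prefix-count array with `cumsum[0] = 0`, so the indexing differs slightly from the quoted line. The prefix array is built once per letter, and a window of length `candidate` starting at `idx` covers `s[idx:idx + candidate]`.

```python
def longest_run(s, k, letter):
    """Longest substring that can become all `letter` with at most k changes."""
    n = len(s)
    # cumsum[i] = number of characters in s[:i] that differ from `letter`.
    # Built once, outside the binary search.
    cumsum = [0] * (n + 1)
    for i, ch in enumerate(s):
        cumsum[i + 1] = cumsum[i] + (ch != letter)

    def feasible(candidate):
        # Window s[idx:idx + candidate] has exactly `candidate` characters;
        # it needs cumsum[idx + candidate] - cumsum[idx] changes.
        return any(
            cumsum[idx + candidate] - cumsum[idx] <= k
            for idx in range(n - candidate + 1)
        )

    lo, hi = 0, n + 1              # feasible(0) is True; n + 1 is a sentinel
    while lo + 1 < hi:
        mid = (lo + hi) // 2       # mid <= n, so the sentinel is never tested
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


def solve(s, k):
    # Solve separately for each letter and take the maximum of the two.
    return max(longest_run(s, k, "a"), longest_run(s, k, "b"))


print(solve("abba", 2))      # prints 4
print(solve("aabaabaa", 1))  # prints 5
```

Starting the upper bound at n + 1 rather than n lets the search return n when the whole string can be made uniform.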