iez's blog

By iez, history, 10 months ago, In English

Binary search does 5 operations

  • Calculate mid: 3 operations (1 subtraction, 1 division by 2, 1 addition)
  • Compare target with mid element: 1 comparison
  • Update boundary (low or high): 1 assignment

Ternary search does 9 operations

  • Calculate mid1 and mid2: 6 operations (2 subtractions, 2 divisions by 3, 2 additions/subtractions)
  • Compare target/function values at mid1 and mid2: 2 comparisons
  • Update boundaries (low, high, or both): 1 assignment

What is ternary search?

Ternary search is a type of search similar to binary search but instead of splitting the vector into 2 and checking which half the answer is in, it divides it into 3 parts and checks which third contains the answer.

So binary search is O(log₂ n) while ternary search is O(log₃ n).

But I'm here to prove ternary search is always slower.

Let's write log₃ using log₂.

(log₃ n) / (log₂ n) = (log n / log 3) / (log n / log 2)

(log n / log 3) / (log n / log 2) = (log 2) / (log 3)

(log 2) / (log 3) ≈ 0.6309297536...

So log₃ is almost 0.63 * log₂.

Since log₃ = 0.63...(infinite decimals) * log₂, we can round and conclude:

log₃ < 0.64 * log₂

Let's put it in our inequality:

9 * log₃ > 5 * log₂ -->

9 * log₂ * 0.64 > 5 * log₂

5.76 * log₂ > 5 * log₂ which is true

Since the growth ratio of log₃ to log₂ is less than 1, log₃ will increase more slowly, so ternary search, despite dividing into 3 parts, performs more operations per iteration and thus is always slower than binary search.

Also an interesting fact is that while the O of one is Log3 and the other is Log2, the log2 is faster than the log3.

  • Vote: I like it
  • +18
  • Vote: I do not like it

»
10 months ago, hide # |
 
Vote: I like it +21 Vote: I do not like it

Really nice proof! You can also look at it from an information‐theoretic angle: the optimal way to split your search space is into e (≈2.718) parts since that minimizes k/ln k. You have to pick an integer k, and 2 is closer to e than 3 so binary search (k=2) is the best integer choice

On top of that dividing by 2 on real hardware is just a bit shift whereas dividing by 3 wich is slower (multiply and shift) so the constant factor gap is even wider.

  • »
    »
    10 months ago, hide # ^ |
     
    Vote: I like it +21 Vote: I do not like it

    why is 2 closer to e than 3?

    • »
      »
      »
      10 months ago, hide # ^ |
       
      Vote: I like it +3 Vote: I do not like it

      Yeah you ar right for f(k)=k/ln k its minimized at k=e=2.718 and for integers f(2)=2.885 while f(3)=2.730. So 3 is actually a bit closer to the ideal split than 2.

      but that only counts comparisons on the hardware level dividing by 2 is a single fast bitshift while dividing by 3 is a slower integer divide (multiply by reciprocal + shift) so binary ends up faster in practice even if 3 is closer.

      thnx for the catch

  • »
    »
    10 months ago, hide # ^ |
     
    Vote: I like it +11 Vote: I do not like it

    Assuming that the check of values at the places you split is the most expensive part(which is almost always the case), the goal is actually to minimize $$$\frac{k-1}{\log(k)}$$$. This function is increasing, so you just want to minimize $$$k$$$.

»
10 months ago, hide # |
 
Vote: I like it +8 Vote: I do not like it

ternary search is $$$\mathcal{O}(\log_{1.5})$$$

  • »
    »
    10 months ago, hide # ^ |
     
    Vote: I like it 0 Vote: I do not like it

    can you explain why ? doesnt it keep 1/3th of the array each time ?

    • »
      »
      »
      10 months ago, hide # ^ |
       
      Vote: I like it 0 Vote: I do not like it

      because, in ternary search $$$R-L$$$ getting smaller by $$$3/2$$$ times after each iteration. look at the code and comments:

      M1 = (2 * L + R) / 3;
      M2 = (L + R * 2) / 3;
      //(2 * L + R) / 3 = 2/3 * L + 1/3 * R = L - 1/3 * L + 1/3 * R = L + (R - L) / 3
      //(L + R * 2) / 3 = 1/3 * L + 2/3 * R = L - 2/3 * L + 2/3 * R = L + (R - L) * 2 / 3
      if (get(M1) >= get(M2)) {
          L = M1; // R - (L + (R - L) / 3) = R - L - (R - L) / 3 =
                  // = (R - L) * 2 / 3 <- smaller by 3/2 times
      } else {
          R = M2; // (L + (R - L) * 2 / 3) - L =
                  // = (R - L) * 2 / 3 <- smaller by 3/2 times
      }
      
      • »
        »
        »
        »
        10 months ago, hide # ^ |
         
        Vote: I like it 0 Vote: I do not like it

        I saw on a post that there are multiple ways to do A ternary search, the one I'm talking about makes 2 mids and checks in which third it is, so that would be log3 i think

        • »
          »
          »
          »
          »
          10 months ago, hide # ^ |
           
          Vote: I like it 0 Vote: I do not like it

          Well you check 2 * log3(N) points so it's at least 2*log3/log2 times slower.

»
10 months ago, hide # |
 
Vote: I like it +1 Vote: I do not like it

Why is ternary search $$$O(\log_3n)$$$? Isn't it $$$O(\log_2n)$$$ or $$$O(\log_{1.5}n)$$$ depending on the implementation? Since we divide the range into three parts and throw one each time.

  • »
    »
    10 months ago, hide # ^ |
     
    Vote: I like it 0 Vote: I do not like it

    i still dont know what the default one is but there is an implementation where you throw at 2 thirds each time

    you create 2 mids and check in which third it is

    if its smaller than both mids its in the first third, if its bigger than one and smaller than the other its in the middle, if its bigger than both its in the third

»
10 months ago, hide # |
 
Vote: I like it 0 Vote: I do not like it

Your definition of ternary search isn't quite correct.

Instead of saying "checks which third contains the answer", it should be "discarding a third that does not contain the answer".

This means that the size of array reduces into 2/3 of its original size, turning the time complexity into O(log1.5 n) instead of O(log3 n)

  • »
    »
    10 months ago, hide # ^ |
     
    Vote: I like it 0 Vote: I do not like it

    Yes but you could also modify it to be log3, also the one i learnyt was the one where you keep 1 third cause it didnt make sense to keep 2thirds when you could keep 1 third. but i guess it has less operations than keeping 1 third

»
10 months ago, hide # |
 
Vote: I like it +7 Vote: I do not like it

Actually 🤓☝️, $$$log_3(N)$$$, $$$log_2(N)$$$ and $$$log_c(N))$$$ all have the same complexity $$$O(ln(N))$$$, where $$$c$$$ is a constant.

That's because $$$log_c(N) = ln(N)/ln(c)$$$, since $$$ln(c)$$$ is constant, $$$O(log_c(N)) = O(ln(N))$$$. (For the same reason $$$O(2*N) = O(c*N) = O(N)$$$.

I.e. every logarithm complexity have the same complexity iff it has a constant base.