Hi, Codeforces! Recently I thought of an interesting problem and would like to hear your thoughts.
You are given some $$$0 \lt \varepsilon \lt \frac{1}{2}$$$ as input and have an unknown biased coin with
You toss it sequentially. After each toss $$$t$$$, you must output an estimate $$$p_t\in[0,1]$$$. An oracle who knows the true $$$p$$$ stops the process at the first time your estimate is $$$\varepsilon$$$-close:
The algorithm does not choose when to stop; it only outputs estimates after each toss. The task is to design an algorithm (possibly randomized) with optimal worst-case expected number of tosses:
Also, what strategy achieves it (and how would you prove the bound)?







