Here's a simple problem for you.
You are given $$$n=200$$$ elements and you have to iterate over all possible quadriplets (4) of them, the order of elements of each quadriplet doesn't matter, and you can take same element more than one time. Give quick approximation of how many iterations will algorithm run.
It is not $$$1,6*10^9$$$
So if you have to iterate over all subsets of length $$$k$$$, it is important to remember to divide running time by $$$k!$$$. In fact, if $$$k=6$$$, you can fit in one second with $$$n<=85$$$, while $$$85^6≈3*10^{11}$$$, but you would iterate only $$$≈5*10^8$$$ subsets, and not knowing that you might not even think of it.
I hope this blog will help, because I was doing same mistake for a long time.
Does this formula only work when the order of the elements does not matter?
Also, is this the difference between iterating an inner
for
from 0 to iterating from the outerfor
variable? Such as:Yes and yes. The first way of doing the loop doesn't give you the speedup by a factor of 24. It is the same time complexity but the latter is significantly faster.
Alternative Title : person discovers number of ways to choose x items from n things without repeats is nCx and not n^x.
(This is a joke)
Disagree, I made this blog not because I discovered this formula, but because I never used it when analysing time complexity, and I saw other people do the same mistake.
upd: ok, r/woosh
Interesting
… and deducts an almost-correct formula for counting them.
well he is close enough, do you have a way to analyse the exact complexity for his task? (with formula) (genuinely curious, maybe im missing something obvious)
F(n, x) = sum(i = 1 to n, F(i, x — 1)) was the best i could get.
(2 other people suggested ways to get it, however i realized a simpler way(kind of similiar to nor's))
consider the difference between the previous element and the next element in the sorted order of the list, and you get a simple equation in variables that you can solve with stars and bars.
You can try stirling's approximation
Here's a physicist's (quick and dirty, not necessarily rigorous) way of doing things:
Assume $$$n$$$ is large enough. Then $$$1$$$ is small enough to be a $$$\mathrm{d} x$$$ (in slightly more rigorous terms, you use the trapezoidal rule).
Then the relation (approximately) becomes $$$F(n, k) = \int_0^n F(x, k - 1) \mathrm{d} x$$$.
By induction, $$$F(n, k) \approx \frac{n^k}{k!}$$$.
im aware of the approximate result, i was looking for an exact one, thanks all the same.
For your recurrence, if you note the Hockeystick identity, you can easily show via induction that the answer is $$$\binom{n + k - 1}{k}$$$, assuming $$$F(n, 1) = n$$$.
right thanks
It should be something like $$$n+k-1$$$ choose $$$k$$$, from stars and bars.
My comment somehow got deleted, reupload:
I claim that
F(n,k)=C(n+k-1,k)
, the proof is as below:If we have an array of length
n
and we want to partition it intok
pieces (each piece must have at least1
member), the number of ways we can do it isf(n,k)=sum(f(n-i,k-1),i=1~n)=sum(f(i,k-1),i=0~n-1)
(ifn<k
,f(n,k)=0
), which means we split off a piece that'si
long and split the rest intok-1
pieces.We can construct a function
g(n,x)=f(n+x,x+1)
,thus:Obviously,
g(n,x)=F(n,x)
, which meansF(n,x)
is equal to the ways you can partition an array of sizen+x
intox+1
parts. Imagine we havex
boards and we insert them into then+x-1
gaps between then+x
array members. For each insertion policy there is one-and only one partition scheme associated with it. And since there isC(n+x-1,k)
insertion schemes, there is alsoC(n+x-1,k)
ways to choosek
repeatable items fromn
different ones without considering order.Q.E.D.
The title is a bit misleading because complexity is still $$$O(n^4)$$$. But your critique is unfounded I think.
In competitive programming, we need to estimate the running time of the code, and the most widely used method is calculating time complexity. But the complexity itself doesn't answer the question "Will it run in time under the given constraints?", which is what we really want. Our heuristic for this is to plug max limits into the calculated complexity and compare the resulting number with some kind of constant, like $$$10^9 \cdot t$$$ (replace $$$10^9$$$ with the number you believe in) where $$$t$$$ is the time limit in seconds. So, if complexity is $$$O(f(n))$$$ and $$$n \le N$$$, we look at how $$$f(N)$$$ and $$$10^9 \cdot t$$$ are compared. More precisely, we usually look at the number $$$\frac{f(N)}{10^9 \cdot t}$$$: if it is significantly (by an order of magnitude, let's say) less than 1, we can be safe; if it is greater than 1, that will be TL for sure; between $$$1/10$$$ and 1 there is a gray area.
That is what I was taught when I was starting, and that is what I would teach people who are starting. I believe this to be the accepted standard in the community (modulo the particular number instead of $$$10^9$$$), correct me if I'm wrong.
And this heuristic actually fails miserably in the case highlighted by this post, and I believe that this case is significant (there are more than a couple of problems falling under this category), and that there are people with more experience and bigger rating compared to the topic-starter who don't know that. It's not that people don't know the right answer, they just don't bother calculating it when they check if the solution will run in time, because they were taught to ignore constant factors when calculating complexity. Heck, there was a person who set the problem where the intended solution was using this kind of thing to prove that it is fast enough, and they did not provide the best constant estimation!
"... when analysing running time" instead of "... when analysing time complexity" would be a better name though. $$$\frac{n^4}{24} + O(n^3)$$$ is the right way to write what you wanted to write.
I agree with you about title being misleading, will change it.
hi, im sorry for the misleading comment, my comment was made mostly in jest (because i believe this to be rather well known), however i do see that this could be easily mistaken for a serious suggestion at something wrong with the blog. I will update my comment to make it clear its a joke.
It's well-know trick in genshin impact since 2019.
Added correct formula for the problem.