Edu Round 93 Problem C accepted with n^2 solution ?? - Codeforces

Codeforces и Polygon могут быть недоступны в период с 6 декабря, 22:00 (МСК) по 7 декабря, 00:00 (МСК) в связи с проведением технических работ. ×

→ Обратите внимание

До соревнования
Codeforces Round 991 (Div. 3)
12:22:14
Зарегистрироваться »

→ Трансляции

Codeforces Round 991 (Div 3) — Solution Discussion

Shayan

До начала 14:42:13

Binary Search - Topic Stream

Shayan

До начала 34:47:13

Всё →

→ Лидеры (рейтинг)

№	Пользователь	Рейтинг
1	tourist	3985
2	jiangly	3741
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3489
7	Radewoosh	3483
8	Kevin114514	3443
9	ecnerwala	3392
9	Um_nik	3392

Страны | Города | Организации

→ Лидеры (вклад)

№	Пользователь	Вклад
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	158
5	-is-this-fft-	158
7	awoo	156
8	djm03178	155
9	TheScrasse	154
10	Dominater069	153

Всё →

→ Найти пользователя

→ Прямой эфир

Детальнее →

Блог пользователя nur_riyad

Edu Round 93 Problem C accepted with n^2 solution ??

Автор nur_riyad, история, 4 года назад, По-английски

По-английски

### O(n^2) solution link

But if I just erase first three lines of that solution, then solution time changes from 4s+ to 1.2s why?

+102

nur_riyad
4 года назад
49

Комментарии

Комментарии (45)

Показать архивные | Написать комментарий?

»

4 года назад, # |

Проголосовать: нравится

0

Проголосовать: не нравится

Just wait for the open hacking phase to finish. You will have your answer.

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+3

Проголосовать: не нравится

I tried with worst-case but it passes through.

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

-39

Проголосовать: не нравится

I think the solution is relying on the third line #pragma GCC optimization ("unroll-loops"). I guess GCC will be able to optimize it so that it's only O(n) (or something similar in run time). I think this actually might be much harder to hack than expected.

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+10

Проголосовать: не нравится

#pragma GCC optimize("Ofast") #pragma GCC target("avx,avx2,fma") These commands are important, you can check that it passes without "unroll-loops".

→ Ответить

»

»

BohdanPastuschak

4 года назад, # ^ |

← Rev. 3 →

Проголосовать: нравится

+25

Проголосовать: не нравится

Obviously, compiler can't optimize this to $$$O(n)$$$ (see runtime — almost second), it is just very fast $$$O(n^2)$$$. And I bet it won't be hacked — you can check by yourself, this code runs less than a second on maxtest.

→ Ответить

»

»

»

XORring-Samurai

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

How does this happen?? Why the solution could pass even the worst test case?

→ Ответить

»

»

»

»

BohdanPastuschak

4 года назад, # ^ |

← Rev. 2 →

Проголосовать: нравится

0

Проголосовать: не нравится

C++ is fast , especially with this type of operations and #pragma GCC optimize("Ofast"). Without any optimizations it runs 3.5 seconds.

→ Ответить

»

»

»

»

»

XORring-Samurai

4 года назад, # ^ |

← Rev. 3 →

Проголосовать: нравится

+1

Проголосовать: не нравится

Quite a few times in CP, we deal with problems where the naive solution gives O(n^2) time complexity. Owing to the constraints which are usually of the order of 10^5 and 2 seconds, we optimize our code to O(n). But I had this question, if we can solve all of these problems using the O(n^2) naive solution and this optimization.

→ Ответить

»

»

»

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

Not all of them.

→ Ответить

»

»

»

»

»

»

»

XORring-Samurai

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

So, is the above mentioned case, an exception?

→ Ответить

»

»

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

+1

Проголосовать: не нравится

I wonder, why this not always work then? what it depends on exactly.

→ Ответить

»

»

»

»

»

»

BohdanPastuschak

4 года назад, # ^ |

Проголосовать: нравится

+1

Проголосовать: не нравится

Depends on many things such as type of operations and whether it can be optimized by compiler.

→ Ответить

»

»

»

»

»

»

»

4 года назад, # ^ |

← Rev. 2 →

Проголосовать: нравится

0

Проголосовать: не нравится

So can this sometimes make the program even slower? Because otherwise I would include it every time and it could probably make the program faster.

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+15

Проголосовать: не нравится

how can it do it in only O(n)? That is better than most people's optimal solution(which is O(nlog(n))?) Oh, and also, the compiler cannot optimize comparisons in that case. Because it knows that it has to continue through the loops to get the answer(there is no pattern found, or no case where it is useless to continue the loop anymore). If what you are saying is true, then we would have a sort that works in O(n) lol. That would be better than the best sort which works in O(nlog(n)) with any numbers/values/datatype.

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

+3

Проголосовать: не нравится

I tried all 6 possible cases with first three line and found that this line "#pragma GCC optimize("Ofast") " must be included for soln to pass

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+1

Проголосовать: не нравится

Only that is not enough, check

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

+2

Проголосовать: не нравится

89966688 It is enough

→ Ответить

»

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

+1

Проголосовать: не нравится

Wow, didn't know that 64-bit was that fast. Thanks.

→ Ответить

»

4 года назад, # |

← Rev. 2 →

Проголосовать: нравится

+5

Проголосовать: не нравится

It is fast because,

Code is branchless(CPU can easily predict what comes next)
AVX enables vector instruction, which in this case gives 4x boost by combining and calculating 4 ints at same time
Ofast also gives significant boost, but why idk

→ Ответить

»

»

4 года назад, # ^ |

← Rev. 2 →

Проголосовать: нравится

0

Проголосовать: не нравится

how can we actively try to write such code,

i found the youtube video about

return condition false * answer + condition true * otherans //zero/false will cancel it

and instead of if (x>4) return true else return false

writing return x>4

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

Most of the time compiler optimizes branches, mostly try to avoid false*ans1+true*ans2, use ?: instead

→ Ответить

»

4 года назад, # |

← Rev. 3 →

Проголосовать: нравится

0

Проголосовать: не нравится

I think the only solution to this is that if O(n) is expected, then make the time limit at most 0.8 second. It's 20 times slower than the intended solution.

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

Then,it will be very unfair for python users

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

-12

Проголосовать: не нравится

Limit C++ only then.

→ Ответить

»

»

»

4 года назад, # ^ |

← Rev. 2 →

Проголосовать: нравится

+5

Проголосовать: не нравится

Making the n up to $$$3*10^5$$$ is enough. Most problems which its intended solution is $$$O(n)$$$ or $$$O(nlog_2(n))$$$ are $$$3*10^5$$$ which would obviously make that $$$O(n^2)$$$ solution to fail. I wonder why this time it is only $$$10^5$$$...

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+8

Проголосовать: не нравится

I think increasing max N is typically better.
Increasing max N 10 times would increase n^2 solution time 100 times.

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

-14

Проголосовать: не нравится

Probably it will fail Python. The thing that there is a time limit equality among all the languages, must be changed, imho.

→ Ответить

»

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

Probably it will fail Python

Less likely than with decreasing time limit

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

+1

Проголосовать: не нравится

I got TLE during the contest(using O(n^2) solution). But using your first 3 lines , it's accepted now.

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

+9

Проголосовать: не нравится

From what I see with this pragma code begins to use SSE instructions, so I assume it basically processes four values at a time

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

+1

Проголосовать: не нравится

if i use those 3 lines every solution,, can i get advantage all time or not??

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+3

Проголосовать: не нравится

Sometimes, it mostly depends on structure of code

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

+1

Проголосовать: не нравится

LoL, quite unexpected, but cool!
Turns out you can do order of 10^10 operations in a second on CodeForces servers.

→ Ответить

»

»

4 года назад, # ^ |

← Rev. 2 →

Проголосовать: нравится

0

Проголосовать: не нравится

Not accurate. $$$10^{10}$$$ operations would theoretically take around 1800 ms or more in that problem with the same conditions. The number of operations is specifically in worst case is $$$n*(n-1)/2$$$ which is around $$$5*10^9$$$ operations.

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

More or less accurate:
The number of loop iterations is 5 * 10^9,
but each loop iteration consists of several operations: ans+=(ll)(a[j]-a[i-1]==j-i+1);
and then there are also operations to do looping: one increment and one comparison per iteration.

→ Ответить

»

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

Oh, yes you are right. So that is two array accesses and one comparison and increment in each operation. You are right!

→ Ответить

»

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

+8

Проголосовать: не нравится

Yes, but in this case compiler uses AVX instruction set to calculate 4 iterations at same time, if you comment #pragma GCC target("avx,avx2,fma") time will jump to 4 seconds.

→ Ответить

»

4 года назад, # |

← Rev. 2 →

Проголосовать: нравится

0

Проголосовать: не нравится

pragma added to default template....lol

Also, I wonder how many people lost score on this because of failing to break this....

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

98 hacking attempts were made.

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

That's massive....I am glad that I do not participate in hacking...

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

You don't lose or gain score in open hacking.

→ Ответить

»

»

»

4 года назад, # ^ |

Проголосовать: нравится

0

Проголосовать: не нравится

I see...my bad then...

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

0

Проголосовать: не нравится

Also see this problem, the brute-force solution also passed using command-lines.

Command-lines do make your solution faster, it might pass $$$10^{10}$$$ operations within seconds, it's almost impossible to hack because it's about the basic-level optimization.

→ Ответить

»

4 года назад, # |

Проголосовать: нравится

-10

Проголосовать: не нравится

i have also used the above concept but was getting tle then i read this blog and used those 3 lines but still i am getting tle

Your code here...
#pragma GCC optimize("Ofast")
#pragma GCC target("avx,avx2,fma")
#pragma GCC optimization ("unroll-loops")
#include<bits/stdc++.h>
using namespace std;
int main()
{
    int i,j,n,t,ans;string s;
    cin>>t;
    while(t--)
    {
        cin>>n;ans=0;cin>>s;
        int a[n+1],p[n+1];
        p[0]=0;
        for(i=1;i<=n;i++)
        {
        a[i]=s[i-1]-'0';
        //if(a[i]==1)
        //ans++;
        p[i]=p[i-1]+a[i];
        }
        for(i=1;i<=n;i++)
        for(j=1;j<=n;j++)
        {
            if(i==j){
            if((p[i]-p[i-1])==i-j+1)
            ans++;}
 
        else if((j-i+1)==(p[j]-p[i-1])&&(i<j))
        ans++;
        }
        cout<<ans<<endl;
    }
}

→ Ответить

»

»

4 года назад, # ^ |

Проголосовать: нравится

+3

Проголосовать: не нравится

It won't work with any $$$O(n^2)$$$ solution, it needs to be coded in very specific way, so compiler can understand and optimize it.

→ Ответить

Codeforces (c) Copyright 2010-2024 Михаил Мирзаянов

Соревнования по программированию 2.0

Время на сервере: 05.12.2024 05:12:47 (j2).

Десктопная версия, переключиться на мобильную.

При поддержке