Hi Codeforces.
Today I came across a legendary problem: 896E - Welcome home, Chtholly. It’s rated $$$3100$$$ and is known for being a nightmare for data structure enthusiasts. The tags say "DSU" and "Data Structures," and the intended solutions involve complex Sqrt-decomposition or Segment Tree Beats.
But as my future handle suggests -- $$$SegTreeIsAMyth$$$. I decided to see if a simple $$$O(N \cdot M)$$$ brute force could pass.
To my surprise, it didn't just pass: it $$$crushed$$$ the tests in $$$1750ms$$$ (with a 3s limit)!
Screenshot:

The Code
The core of my solution is just a simple loop. No complex trees, no heavy pointers.
#include <bits/stdc++.h>
using namespace std;
#define ll long long
#define all(x) x.begin(), x.end()
#define rall(x) x.rbegin(), x.rend()
#define yes cout << "YES\n"
#define no cout << "NO\n"
#define debug cout << "Debug\n"
#pragma GCC optimize("O3")
#pragma GCC optimize("unroll-loops")
#pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
/*
---===ASCII help===---
'0' -> 48 '9' -> 57
'A' -> 65 'Z' -> 90
'a' -> 97 'z' -> 122
*/
signed main() {
ios_base::sync_with_stdio(false); cin.tie(nullptr);
int n, m; cin >> n >> m;
int arr[n];
for (int i = 0; i < n; ++i) cin >> arr[i];
while (m--) {
int t, l, r, x; cin >> t >> l >> r >> x;
--l; --r;
if (t == 1) {
for (int i = l; i <= r; ++i) {
if (arr[i] > x) arr[i] -= x;
}
}
else {
ll ans = 0;
for (int i = l; i <= r; ++i) {
if (arr[i] == x) ++ans;
}
cout << ans << '\n';
}
}
}
Why did it pass?
Many of you might think $$$10^{10}$$$ operations shouldn't pass in 3 seconds. Here is why it did:
GCC 14 & AVX2: The
#pragma GCC target("avx2")allows the compiler to use SIMD instructions. It processes multiple integers in a single CPU cycle.Loop Unrolling:
#pragma GCC optimize("unroll-loops")reduces the overhead of the loop control.Cache Locality: A simple array traversal is the most cache-friendly operation possible. While a Segment Tree jumps around in memory causing cache misses, the CPU prefetcher loves this linear scan.
Modern Judges: Codeforces' upgraded hardware is incredibly fast at executing vectorized code.
Conclusion
Next time you see a 3100-rated problem, don't be afraid. Sometimes, the "dumb" solution with the right pragmas is more powerful than the most sophisticated data structure.







