EarthMessenger's blog

By EarthMessenger, history, 12 days ago, In English

I was solving the problem 364E - Empty Rectangles. The time complexity of my solution is $$$O(n^2k \log n)$$$, where $$$n \leq 2500$$$ and $$$k \leq 6$$$. The time limit for this problem is 12 seconds.

However, my submission 295732996 runs significantly slower on Codeforces compared to AtCoder and my local machine. (btw, I AC this problem using C++17 (GCC 7-32), 11155 ms, a little faster)

Comparisons

To investigate, I embedded a data generator that creates random matrices with uniformly distributed 0s and 1s, then ran the code on different platforms:

  • On Codeforces: 12437ms
    Codeforces Screenshot

  • On AtCoder: 4260ms
    AtCoder Screenshot

  • On My Local Machine (i3-6100, g++ 14.2.1, Arch Linux): 4596ms
    Local Machine Screenshot

The Code

Here is the implementation with the embedded data generator:

#include <algorithm>
#include <array>
#include <cstdint>
#include <iostream>
#include <random>
#include <string>
#include <vector>

using u8 = uint8_t;
using i64 = long long;
using i128 = __int128_t;
using u32 = unsigned int;
using u64 = unsigned long long;
using u128 = __uint128_t;

template <typename T, int N>
struct StaticVector
{
  int p;
  std::array<T, N> v;
  
  StaticVector() : p(0) {}

  T &operator[](int x) { return v[x]; }
  const T &operator[](int x) const { return v[x]; }

  int size() const { return p; }

  bool full() const { return p == (int)v.size(); }

  void push_back(const T &t) 
  { 
    if (full()) return ;
    v[p++] = t; 
  }
};

template <typename T, int K, typename Cmp>
StaticVector<T, K> merge(const StaticVector<T, K> &a, const StaticVector<T, K> &b, Cmp &&cmp)
{
  StaticVector<T, K> c;
  int i = 0, j = 0;
  while (!c.full() && j < b.size()) {
    if (cmp(a[i], b[j])) c.push_back(a[i++]);
    else c.push_back(b[j++]);
  }
  while (!c.full()) c.push_back(a[i++]);
  return c;
}

std::vector<std::vector<u8>> transpose(std::vector<std::vector<u8>> a)
{
  int n = a.size();
  int m = a[0].size();
  std::vector<std::vector<u8>> b(m, std::vector<u8>(n));
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
      b[j][i] = a[i][j];
    }
  }
  return b;
}

template <int K>
i64 solve_details(std::vector<std::vector<u8>> a)
{
  if (a.size() < a[0].size()) a = transpose(std::move(a));

  if (a.size() == 1 && a[0].size() == 1) {
    return a[0][0] == K;
  } else {
    int mid = a.size() / 2;
    int n = a[0].size();

    static std::vector<StaticVector<int, K + 2>> f;
    static std::vector<StaticVector<int, K + 2>> g;

    f.assign(n, {});
    g.assign(n, {});

    for (int i = mid - 1; i >= 0; i--) {
      for (int j = 0; j < n; j++) {
        if (a[i][j] == 1) f[j].push_back(i);
      }
    }

    for (int i = mid; i < (int)a.size(); i++) {
      for (int j = 0; j < n; j++) {
        if (a[i][j] == 1) g[j].push_back(i + 1);
      }
    }

    i64 ans = 0;

    for (int l = 0; l < n; l++) {
      StaticVector<int, K + 2> fp, gp;
      fp.push_back(mid);
      gp.push_back(mid);
      for (int t = 0; t <= K; t++) {
        fp.push_back(-1);
        gp.push_back(a.size() + 1);
      }
      for (int r = l + 1; r <= n; r++) {
        fp = merge(std::move(fp), f[r - 1], std::greater<int>{});
        gp = merge(std::move(gp), g[r - 1], std::less<int>{});
        for (int i = 0; i <= K; i++) {
          int uc = fp[i] - fp[i + 1];
          int dc = gp[K - i + 1] - gp[K - i];
          if (fp[i] == mid) uc--;
          if (gp[K - i] == mid) dc--;
          ans += uc * dc;
        }
      }
    }

    ans += solve_details<K>(std::vector(a.begin(), a.begin() + mid));
    ans += solve_details<K>(std::vector(a.begin() + mid, a.end()));

    return ans;
  }
}

template <int K>
i64 solve(std::vector<std::vector<u8>> a, int k)
{
  if (K < k) return solve<std::min(K + 1, 6)>(std::move(a), k);
  return solve_details<K>(std::move(a));
}

int main()
{
  std::cin.tie(nullptr);
  std::ios::sync_with_stdio(false);

  int n, m, k;
  std::cin >> n >> m >> k;

  std::vector a(n, std::vector<u8>(m));

  std::mt19937 mt;

  for (int i = 0; i < n; i++) {
    for (int j = 0; j < m; j++) {
      a[i][j] = mt() & 1;
    }
  }

  std::cout << solve<0>(a, k) << std::endl;
}

Why does my code run so much slower on Codeforces? Is this due to differences in the hardware, compiler, or some other factor? Any insights would be appreciated.

Full text and comments »

  • Vote: I like it
  • +32
  • Vote: I do not like it

By EarthMessenger, history, 20 months ago, In English

You're given an integer n, answer the number of inversions in all the derangement permutations of length n. For example, if n = 3, there are two derangement premutations, 231 and 312, so the answer is 4.

There is such a sequence in OEIS. But I want to know how this is derived.

It's welcomed if you have other linear solution.

Full text and comments »

  • Vote: I like it
  • +25
  • Vote: I do not like it

By EarthMessenger, history, 21 month(s) ago, In English

Hello guys, I've made a short video editorial about the upcoming april fools day contest.

Video solution on youtube

Full text and comments »

  • Vote: I like it
  • +68
  • Vote: I do not like it

By EarthMessenger, history, 2 years ago, In English

Problem Statment

You're given a matrix of $$$N$$$ rows and $$$M$$$ columns, and $$$Q$$$ queries. $$$(1 <= N, M, Q <= 1000)$$$

For each query $$$(x, y, c)$$$, you need to rotate clockwise the square child matrix whose upper left corner is $$$(x, y)$$$ and lower right corner is $$$(x+c-1, y+c-1)$$$.

Print the final matrix as result.

Sample

input:

4 5 3
9 9 3 4 5
5 0 2 1 3
9 3 6 4 3
5 9 3 9 0
1 3 1
2 1 2
2 2 1

output:

9 9 3 4 5
9 5 2 1 3
3 0 6 4 3
5 9 3 9 0

Obviously there is an $$$O(n^3)$$$ brute force algorithm which is not fast enough. Are there any solution faster than $$$O(n^3)$$$?

I guess the technique of dancing links may be useful.

thanks :-)

Full text and comments »

  • Vote: I like it
  • +10
  • Vote: I do not like it