Educational Round 15 — Problem 702E (Analysis of Pathes on Functional Graph): http://mirror.codeforces.com/problemset/problem/702/E
The official editorial: http://mirror.codeforces.com/blog/entry/46324?locale=en
This article is not about another solution, but how to implement the solution that editorial provides with Haskell, efficiently. After reading the editorial I tried to solve it with Haskell, but I got memory limit exceed on test 28: http://mirror.codeforces.com/contest/702/submission/19550505
First I wondered if my code is wrong, so I ported my Haskell code to C++ and it was accepted (0.7ms, 138MB): http://mirror.codeforces.com/contest/702/submission/20682955
So problem is the nature of Haskell — strong type system and rich expressivity at the cost of excessive memory usage and low performance. Clearly heavy memory usage came from all that state nodes. each state is an array of nodes. There can be at most 10^5 nodes and 34 states (k <= 10^10 and 10^10 is 34 characters in binary representation).
Each node consists of (next node, total weight, minimum weight)
— whose type is Int
, Int64
, and Int64
respectively. In C++, this is (4+8+8 bytes) * 10^5 * 34 = 64.85MB. How many memory is needed for a node in Haskell?
type Node = (Int, Int64, Int64)
Node
has a bunch of pointers and wrappers so one Node
consumes 10 words. (You can see Haskell types' memory footprints here: https://www.youtube.com/watch?v=_pDUq0nNjhI) A word is 8 bytes on a 64-bit machine, so one Node
takes 80 bytes. To hold all states, we need 259MB. This is only for holding states. Other operations need their memory therefore memory limit (512MB) is exceeded.
To save memory I unpacked all fields of Node
:
{-# LANGUAGE BangPatterns #-}
data Node = Node {-# UNPACK #-} !Int {-# UNPACK #-} !Int64 {-# UNPACK #-} !Int64
It removes all pointers and wrappers inside a Node
. Regarding constructor overhead, now a Node
takes 4 words = 32 bytes. Now total states consume 103MB and total memory usage of the program is about 210MB. But it's not over. TLE on test 35 (n = 10^5, k = 10^10) :(
Again what matters is Haskell's powerful feature; lazy evaluation. Our states are memory-heavy and they need a lot of numeric calculations. There is no unnecessary work and we have to utilize all of the evaluated results. This is not a place for lazy evaluation, but for strict evaluation.
To force evaluation of states, I tried evaluate
in the Control.Exception
module: http://mirror.codeforces.com/contest/702/submission/20793048
evaluate :: a -> IO a
But there was no improvement. Stucked here, I was going mad but realized that evaluate
only evaluates to weak head normal form. To fully evaluate something we need force
in Control.DeepSeq
.
To force
a Node
, Node
should be an instance of NFData
.
import Control.DeepSeq
instance NFData Node where
rnf (Node a b c) = rnf a `seq` rnf b `seq` rnf c
Finally it was accepted (1.84s, 242MB): http://mirror.codeforces.com/contest/702/submission/20805671
With all my efforts, My Haskell solution is 2.6 times slower than C++ one which I wrote quite easy. The problem's time constraint is 2 seconds and Haskell version took 1.84s. So tough!
This problem made me give up to use Haskell on contests. Maybe I have to use Haskell only for practice :(