[Idea] Using HLD to reduce memory

#	User	Rating
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

#	User	Contrib.
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

Recently when I was doing [Universal Cup](https://ucup.ac/) Round 5, I got stuck a tree problem [A](https://qoj.ac/contest/1111/problem/5566) as I realised that my solution required way too much memory. However, after the contest, I realised that there was a way that I can reduce a lot of memory using HLD. So here I am with my idea...↵
↵
Structure of Tree DP↵
==================↵
↵
Most tree DP problems follow the following structure.↵
↵
~~~~~↵
struct S {↵
    // return value of DP↵
};↵
S init(int u) {↵
    // initialise the base state of dp[u]↵
}↵
S merge(S left, S right) {↵
    // returns the new dp state where old state is left and transition using right↵
}↵
S dp(int u, int p) {↵
    S res = init(u);↵
    for (int v : adj[u]) {↵
        if (v == p) continue;↵
        res = merge(res, dp(v, u));↵
    }↵
    return res;↵
}↵
int main() {↵
    dp(1, -1);↵
}↵
~~~~~↵
↵
An example of a tree DP using this structure is maximum independent set (MIS).↵
↵
<spoiler summary="Code">↵
~~~~~↵
struct S {↵
    // return value of DP↵
    int take, notTake;↵
};↵
S init(int u) {↵
    // initialise the base state of dp[u]↵
    return {1, 0};↵
}↵
S merge(S left, S right) {↵
    // returns the new dp state where old state is left and transition using right↵
    return {left.take + right.notTake, left.notTake + max(right.take, right.notTake)};↵
}↵
S dp(int u, int p) {↵
    S res = init(u);↵
    for (int v : adj[u]) {↵
        if (v == p) continue;↵
        res = merge(res, dp(v, u));↵
    }↵
    return res;↵
}↵
int main() {↵
    dp(1, -1);↵
}↵
~~~~~↵
</spoiler>↵
↵
Suppose struct $S$ requires $|S|$ bytes and our tree has N vertices. Then this naive implementation of tree dp requires $O(N\cdot |S|)$ memory as `res` of the parent is stored in the recursion stack as we recurse down to the leaves. This is fine for many problems as most of the time, $|S| = O(1)$, however in the case of the above question, $|S| = 25^2\cdot 24$ bytes and $N = 10^5$, which will require around $1.5$ gigabytes of memory, which is too much to pass the memory limit of $256$ megabytes.↵
↵
Optimization↵
==================↵
↵
We try to make use of the idea of HLD and visit the visit the vertex with the largest subtree size first.↵
↵
~~~~~↵
struct S {↵
    // return value of DP↵
};↵
S init(int u) {↵
    // initialise the base state of dp[u]↵
}↵
S merge(S left, S right) {↵
    // returns the new dp state where old state is left and transition using right↵
}↵
int sub[MAXN];↵
void getSub(int u, int p) {↵
    sub[u] = 1;↵
    pair<int, int> heavy = {-1, -1};↵
    for (int i = 0; i < adj[u].size(); i++) {↵
        int v = adj[u][i];↵
        if (v == p) continue;↵
        heavy = max(heavy, {sub[v], i});↵
    }↵
    // make the vertex with the largest subtree size the first↵
    if (heavy.first != -1) {↵
        swap(adj[u][0], adj[u][heavy.second]);↵
    }↵
}↵
S dp(int u, int p) {↵
    // do not initialize yet↵
    S res;↵
    bool hasInit = false;↵
    for (int v : adj[u]) {↵
        if (v == p) continue;↵
        if (!hasInit) {↵
            res = init();↵
            hasInit = true;↵
        }↵
        res = merge(res, dp(v, u));↵
    }↵
    if (!hasInit) {↵
        res = init();↵
        hasInit = true;↵
    }↵
    return res;↵
}↵
int main() {↵
    getSub(1, -1);↵
    dp(1, -1);↵
}↵
~~~~~↵
↵
If we analyse the memory complexity properly, we will realise that it becomes $O(N + |S|\log N)$. The $O(N)$ comes from storing the subtree sizes, and the $O(|S|\log N)$ comes from the DP itself. ↵
↵
Proof↵
------------------↵
↵
The two main changes from our naive DP is that we initialise our `res` only after we visit the first child, and we visit the child with the largest subtree size first. Recalling the definitions used in HLD, we define an edge $p_u \rightarrow u$ to be a heavy edge if $u$ is the child with the largest subtree size among all children of $p_u$. We will call all other non-heavy edges as light edges. It is well known that the path from the root to any vertex of a tree will involve transversing many heavy edges but at most $O(\log N)$ light edges.↵
↵
Making use of this idea, if we visit heavy edges first, `res` of the parent of heavy edges will not be stored in the recursion stack since we only initialise our `res` after visiting the first edge, hence only `res` of the parent of light edges will be stored in the recursion stack. Then since there is at most $O(\log N)$ light edges, the memory complexity is $O(|S|\log N)$.↵
↵
[My solution to the above problem](https://qoj.ac/submission/83502)↵
↵
Conclusion↵
==================↵
↵
I have not seen this idea anywhere before and only came up with it recently during Universal Cup. Please tell me if it is actually a well-known technique or there are some flaws in my analysis. Hope that this will be helpful to everyone!

Rev.	By	When	Δ	Comment
en10	maomao90	2024-04-07 12:31:56	126	Tiny change: '.\n\n[cut]\n\n# Opti' -> '.\n\n[cut] \n\n# Opti'
en9	maomao90	2023-03-04 03:27:10	27	Tiny change: '\n ' -> '\n sub[u] += sub[v];\n '
en8	maomao90	2023-03-03 13:23:19	23	Tiny change: '\n ' -> '\n getSub(v, u);\n '
en7	maomao90	2023-03-03 09:50:46	21
en6	maomao90	2023-03-03 09:32:29	36
en5	maomao90	2023-03-03 07:51:24	1506	(published)
en4	maomao90	2023-03-03 07:12:35	132
en3	maomao90	2023-03-03 07:11:07	3083	Tiny change: 'uires $O(N\|S\|)$ memo' -> 'uires $O(N\cdot \|S\|)$ memo'
en2	maomao90	2023-03-02 19:01:59	153
en1	maomao90	2023-03-02 19:00:27	222	Initial revision (saved to drafts)

History