#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3603
4	jiangly	3583
5	strapple	3515
6	tourist	3470
7	dXqwq	3436
8	Radewoosh	3415
9	Otomachi_Una	3413
10	Um_nik	3376

#	User	Contrib.
1	Qingyu	157
2	adamant	153
3	Um_nik	146
3	Proof_by_QED	146
5	Dominater069	145
6	errorgorn	141
7	cry	139
8	YuukiS	135
9	TheScrasse	134
10	chromate00	133

Space Complexity of a Trie

Most blogs state that the space complexity of a trie is

$$$ O(N \cdot L \cdot R) $$$

where:

$$$N$$$ = number of strings
$$$L$$$ = maximum length of a string
$$$R$$$ = alphabet size (typically $$$26$$$ for lowercase letters)

This is usually justified by:

The trie can have at most $$$L$$$ levels
Each string contributes at most one node per level
Each node stores an array of size $$$R$$$

Hence:

$$$ \text{Space} \le N \cdot L \cdot R $$$

However, this is a very loose bound. Let us analyze this carefully.

Key Observation

At depth $$$d$$$, the number of distinct nodes is at most

$$$ \min(N, R^d) $$$

because:

Each node represents a unique prefix of length $$$d$$$
There are at most $$$R^d$$$ possible prefixes at depth $$$d$$$
And we cannot have more nodes than strings ($$$N$$$)

Example Calculation

Assume:

$$$ N = 10^6, \quad R = 26 $$$

Compute nodes per level:

$$$ \begin{aligned} \text{Level } 0 &: 26^0 = 1 \\ \text{Level } 1 &: 26^1 = 26 \\ \text{Level } 2 &: 26^2 = 676 \\ \text{Level } 3 &: 26^3 = 17576 \\ \text{Level } 4 &: 26^4 = 456976 \\ \text{Level } 5 &: 26^5 = 11881376 \gt N \end{aligned} $$$

From level 5 onward, the number of nodes is capped by $$$N$$$.

Total Number of Nodes

Let us count nodes:

Levels 0 to 4: $$$26^0 + 26^1 + 26^2 + 26^3 + 26^4$$$
Levels 5 to $$$L-1$$$: each can have at most $$$N$$$ nodes

Hence, total nodes:

$$$ 26^0 + 26^1 + 26^2 + 26^3 + 26^4 + (L - 4) \cdot N $$$

(We use $$$L-4$$$ because the first 4 levels are counted separately: 1,2,3,4 → 4 levels, so remaining $$$L-4$$$ levels are “full” levels capped by $$$N$$$.)

Total Space Usage

Each node stores an array of size $$$R = 26$$$, so total memory:

$$$ 26 \cdot \Big( 26^0 + 26^1 + 26^2 + 26^3 + 26^4 + (L-4) \cdot N \Big) $$$

Comparison With Naive Bound

Naive bound:

$$$ 26 \cdot N \cdot L $$$

We compare the prefix contribution:

$$$ 26^0 + 26^1 + 26^2 + 26^3 + 26^4 \lt 4 \cdot N $$$

Compute sum:

$$$ 26^0 + 26^1 + 26^2 + 26^3 + 26^4 = 1 + 26 + 676 + 17576 + 456976 = 475255 $$$

For $$$N = 10^6$$$:

$$$ 475255 \lt 4 \cdot 10^6 $$$

✅ True.

Hence, the naive bound overestimates memory by about 8×, because it assumes $$$N$$$ nodes at each level, while in reality early levels have far fewer nodes.

Tight Space Complexity

In general, the tight upper bound on the number of nodes is:

$$$ \sum_{d=0}^{L-1} \min(N, R^d) $$$

So total space:

$$$ O\Big(R \cdot \sum_{d=0}^{L-1} \min(N, R^d)\Big) $$$

For fixed alphabet size $$$R$$$:

$$$ \boxed{ \text{Space} = O(N \cdot L) \text{ (with a much smaller constant than } O(N \cdot L \cdot R)\text{)} } $$$

Takeaway

$$$O(N \cdot L \cdot R)$$$ is a very loose bound
Early levels of the trie have far fewer nodes than $$$N$$$
Real space usage is often ~8× smaller than the naive calculation
Correct intuition: $$$\min(N, R^d)$$$ nodes at depth $$$d$$$

Full text and comments »