Space Complexity of a Trie
Most blogs state that the space complexity of a trie is

$$$O(N \cdot L \cdot R)$$$

where:
- $$$N$$$ = number of strings
- $$$L$$$ = maximum length of a string
- $$$R$$$ = alphabet size (typically $$$26$$$ for lowercase English letters)
However, this bound is very loose and often overestimates the actual memory usage by a large constant factor.
Where the Naive Bound Comes From
The usual reasoning is:
- The trie can have at most $$$L$$$ levels
- Each string contributes at most one node per level
- Each node stores an array of size $$$R$$$
Hence,

$$$\text{Space} = O(N \cdot L \cdot R)$$$
This is technically correct as an upper bound, but it ignores a crucial structural limitation of tries.
Key Observation
At depth $$$d$$$, the maximum number of distinct nodes is:

$$$\min\left(N, R^d\right)$$$
Why?
- Each node at depth $$$d$$$ represents a unique prefix of length $$$d$$$
- Over an alphabet of size $$$R$$$, there are at most $$$R^d$$$ such prefixes
- The number of nodes cannot exceed the number of inserted strings $$$N$$$
More Accurate Node Count
Assume:

- $$$R = 26$$$
- $$$N = 10^6$$$

Maximum number of nodes at each level:

- Level $$$1$$$: $$$\min(10^6, 26^1) = 26$$$
- Level $$$2$$$: $$$\min(10^6, 26^2) = 676$$$
- Level $$$3$$$: $$$\min(10^6, 26^3) = 17576$$$
- Level $$$4$$$: $$$\min(10^6, 26^4) = 456976$$$
- Level $$$5$$$ and beyond: $$$\min(10^6, 26^5) = 10^6$$$ (since $$$26^5 = 11{,}881{,}376 > 10^6$$$)

From level $$$5$$$ onward, the number of nodes is capped by $$$N$$$.
Total Number of Nodes
Let $$$k$$$ be the smallest depth such that $$$26^k \ge N$$$.
Here, $$$k = 5$$$, since $$$26^4 = 456{,}976 < 10^6$$$ while $$$26^5 = 11{,}881{,}376 \ge 10^6$$$.

Total number of nodes:

$$$\sum_{d=1}^{k-1} 26^d \;+\; (L - k + 1) \cdot N$$$

Substituting $$$k = 5$$$ and $$$N = 10^6$$$:

$$$(26 + 676 + 17576 + 456976) + (L - 4) \cdot 10^6 = 475{,}254 + (L - 4) \cdot 10^6$$$
Total Space Usage
Each node stores an array of size $$$R = 26$$$, so the total space (in array slots) is:

$$$26 \cdot \left(475{,}254 + (L - 4) \cdot 10^6\right)$$$
Comparison with Naive Bound
Naive bound:

$$$26 \cdot N \cdot L = 26 \cdot L \cdot 10^6$$$

Compare the two:

$$$26 \cdot \left(475{,}254 + (L - 4) \cdot 10^6\right) \quad \text{vs.} \quad 26 \cdot L \cdot 10^6$$$

The extra term in the naive bound is:

$$$26 \cdot 4 \cdot N$$$

(it assumes $$$N$$$ nodes at each of the first four levels), while the actual prefix contribution is:

$$$26 \cdot 475{,}254$$$

Dividing both by $$$26$$$:

$$$4N \quad \text{vs.} \quad 475{,}254$$$

Compute the ratio for $$$N = 10^6$$$:

$$$\frac{4 \cdot 10^6}{475{,}254} \approx 8.4$$$

Thus, the naive bound overcounts the first four levels by roughly a factor of $$$8$$$.
Constant Factor Improvement (The 8× Effect)
For large $$$L$$$, the dominant terms are:

- Naive: $$$\;\;26 \cdot N \cdot L$$$
- Actual: $$$\;\;26 \cdot N \cdot (L - 4)$$$

So the ratio of the dominant terms is approximately:

$$$\frac{L}{L - 4}$$$

For typical constraints (e.g. $$$L \approx 40$$$–$$$50$$$):

$$$\frac{L}{L - 4} \approx 1.1$$$

so the dominant terms barely differ. But when counting only necessary nodes instead of assuming $$$N$$$ nodes at every level, the naive bound overcounts the early levels by roughly:

$$$\frac{4N}{475{,}254} \approx 8.4\times$$$

In practice, the first few levels therefore use about $$$8$$$–$$$10\times$$$ less memory than the naive estimate assumes, and prefix sharing among real inputs shrinks the deeper levels even further.
Final Complexity (Tight Bound)
Number of nodes:

$$$O\!\left(\sum_{d=1}^{L} \min\left(N, R^d\right)\right) = O\!\left(N \cdot \left(L - \log_R N + 1\right)\right)$$$

(the geometric part $$$\sum_{d < k} R^d$$$ is dominated by its last term $$$R^{k-1} < N$$$, hence it contributes only $$$O(N)$$$).

Total space:

$$$O\!\left(R \cdot N \cdot \left(L - \log_R N + 1\right)\right)$$$

For fixed $$$R$$$:

$$$O\!\left(N \cdot \left(L - \log_R N + 1\right)\right)$$$
Takeaway
- $$$O(N \cdot L \cdot R)$$$ is a very loose upper bound
- A trie cannot have $$$N$$$ nodes at every level
- The naive estimate overcounts the early levels by roughly $$$8$$$–$$$10\times$$$
- Correct intuition:

$$$\#\{\text{nodes at depth } d\} \le \min\left(N, R^d\right) \quad \Longrightarrow \quad \text{Space} = O\!\left(R \cdot \sum_{d=1}^{L} \min\left(N, R^d\right)\right)$$$