Word Ladder: For 1e5 Queries on a Static Dictionary

Revision en1, by Prakharg_8651, 2025-12-01 23:48:35

Hello,

While learning graphs, I came across the famous and classic Word Ladder problem: given a beginWord and an endWord, find the minimum number of single-character transformations from beginWord to endWord, where every intermediate word must come from a given list and all words have the same length.

This shows up on LeetCode, GFG, interviews, everywhere.

But almost every version of the problem is single-query. You solve it once and move on.

Now I want to discuss a scenario where the dictionary is fixed and you need to answer on the order of 1e5 queries.

N ≤ 5000 words
L ≤ 10 (word length)
Q ≤ 100000 queries

This completely changes the problem.

The typical BFS solution (generate all 26·L neighbours of the current word at each step, keep the ones present in the dictionary, and continue) becomes far too slow.

This article explores that setting in depth and shows efficient techniques depending on the ratio of Q to N. To review the classical single-query problem, see https://www.geeksforgeeks.org/problems/word-ladder/1

Method 1: "Generate 26·L neighbors per BFS" (per-query BFS)

This is the classic solution.

int ladderLength(string beginWord, string endWord, vector<string>& wordList) {
    unordered_set<string> dict(wordList.begin(), wordList.end());
    if (!dict.count(endWord)) return 0;

    queue<pair<string,int>> q;
    q.push({beginWord, 1});

    unordered_set<string> vis;
    vis.insert(beginWord);

    while (!q.empty()) {
        auto [w, d] = q.front(); q.pop();

        if (w == endWord) return d;

        for (int i = 0; i < (int)w.size(); i++) {
            char orig = w[i];
            for (char c = 'a'; c <= 'z'; c++) {
                if (c == orig) continue;
                w[i] = c;
                if (dict.count(w) && !vis.count(w)) {
                    vis.insert(w);
                    q.push({w, d + 1});
                }
            }
            w[i] = orig;
        }
    }
    return 0;
}

Per-query cost:

O(N × 26 × L) ≈ 5000 × 260 ≈ 1.3 million operations (more in practice, since each candidate string also costs O(L) to hash)

For 100k queries:

1.3M × 100k = 1.3e11 operations → TLE, hopeless at any reasonable time limit

Method 2: Pattern indexing (per-query BFS using "*ot", "h*t", "ho*")

int ladder(string beginWord, string endWord, const vector<string>& wordList) {
    int L = beginWord.size();
    unordered_set<string> dict(wordList.begin(), wordList.end());
    if (!dict.count(endWord)) return 0;

    unordered_map<string, vector<string>> buckets;
    for (auto &w : dict) {
        for (int i = 0; i < L; i++) {
            string p = w;
            p[i] = '*';
            buckets[p].push_back(w);
        }
    }

    queue<pair<string,int>> q;
    unordered_set<string> vis;

    q.push({beginWord, 1});
    vis.insert(beginWord);

    while (!q.empty()) {
        auto [w, d] = q.front(); q.pop();
        if (w == endWord) return d;

        for (int i = 0; i < L; i++) {
            string p = w;
            p[i] = '*';

            for (auto &nei : buckets[p]) {
                if (!vis.count(nei)) {
                    vis.insert(nei);
                    q.push({nei, d+1});
                }
            }
            buckets[p].clear();  // single-query optimization: marks this bucket as fully processed
        }
    }
    return 0;
}
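One caveat: the `buckets[p].clear()` trick destroys the index as the BFS consumes it, so it is only valid for a single query. In the multi-query setting the buckets must be built once and left intact, with a fresh visited set per query. A minimal sketch of that split (the names `buildBuckets` and `ladderQuery` are mine, not from the code above):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Build the wildcard-pattern index once; reuse it for every query.
unordered_map<string, vector<string>> buildBuckets(const vector<string>& dict) {
    unordered_map<string, vector<string>> buckets;
    for (const string& w : dict)
        for (size_t i = 0; i < w.size(); i++) {
            string p = w;
            p[i] = '*';              // e.g. "hot" -> "*ot", "h*t", "ho*"
            buckets[p].push_back(w);
        }
    return buckets;
}

int ladderQuery(const string& beginWord, const string& endWord,
                const unordered_set<string>& dict,
                const unordered_map<string, vector<string>>& buckets) {
    if (!dict.count(endWord)) return 0;
    queue<pair<string,int>> q;
    unordered_set<string> vis;       // fresh per query; buckets stay intact
    q.push({beginWord, 1});
    vis.insert(beginWord);
    while (!q.empty()) {
        auto [w, d] = q.front(); q.pop();
        if (w == endWord) return d;
        for (size_t i = 0; i < w.size(); i++) {
            string p = w;
            p[i] = '*';
            auto it = buckets.find(p);
            if (it == buckets.end()) continue;
            for (const string& nei : it->second)
                if (vis.insert(nei).second)   // true only on first visit
                    q.push({nei, d + 1});
        }
    }
    return 0;
}
```

The price of keeping the buckets intact is that every query rescans full buckets instead of emptying them, which is exactly why the per-query cost stays at O(N·L).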

Per-query cost:

O(N × L) = 5000 × 10 = 50k bucket entries scanned (plus O(L) string work per pattern)

For 100k queries:

50k × 100k = 5e9 → borderline, likely TLE even in C++ under a 1-2 s limit

Method 3: Precomputed adjacency (O(N²·L) cost once) + BFS per query

Thinking about this problem, my first idea was to treat each dictionary word as a node and connect two words by an edge whenever they differ in exactly one character. Building this adjacency list costs O(N²·L), which is too expensive for the original single-query problem, but for this multi-query variation it pays off.

bool differ1(const string &a, const string &b) {
    int c = 0;
    for (size_t i = 0; i < a.size(); i++)
        if (a[i] != b[i] && ++c > 1) return false;  // early exit after 2 mismatches
    return c == 1;
}

vector<vector<int>> buildAdj(const vector<string>& dict) {
    int n = dict.size();
    vector<vector<int>> adj(n);

    for (int i = 0; i < n; i++) {
        for (int j = i+1; j < n; j++) {
            if (differ1(dict[i], dict[j])) {
                adj[i].push_back(j);
                adj[j].push_back(i);
            }
        }
    }
    return adj;
}

int bfsAdj(int start, int goal, const vector<vector<int>>& adj) {
    if (start == -1 || goal == -1) return 0;
    int n = adj.size();

    vector<int> dist(n, -1);
    queue<int> q;

    dist[start] = 1;
    q.push(start);

    while (!q.empty()) {
        int u = q.front(); q.pop();
        if (u == goal) return dist[u];

        for (int v : adj[u]) {
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return 0;
}
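To actually answer queries, the begin and end words must first be mapped to node indices (the `start == -1` guard in `bfsAdj` anticipates a failed lookup). A sketch of the query driver: the `LadderSolver` wrapper is my naming, and `differ1`, `buildAdj`, `bfsAdj` are repeated from above so the snippet compiles on its own.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Same pairwise check as above, with an early exit.
bool differ1(const string& a, const string& b) {
    int c = 0;
    for (size_t i = 0; i < a.size(); i++)
        if (a[i] != b[i] && ++c > 1) return false;
    return c == 1;
}

// O(N^2 * L) edge construction, done once.
vector<vector<int>> buildAdj(const vector<string>& dict) {
    int n = dict.size();
    vector<vector<int>> adj(n);
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (differ1(dict[i], dict[j])) {
                adj[i].push_back(j);
                adj[j].push_back(i);
            }
    return adj;
}

// Plain BFS on indices; dist counts words on the ladder.
int bfsAdj(int start, int goal, const vector<vector<int>>& adj) {
    vector<int> dist(adj.size(), -1);
    queue<int> q;
    dist[start] = 1;
    q.push(start);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        if (u == goal) return dist[u];
        for (int v : adj[u])
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
    }
    return 0;
}

// Hypothetical wrapper: pays the O(N^2 * L) cost once in the
// constructor, then answers each query with one cheap BFS.
struct LadderSolver {
    vector<string> dict;
    unordered_map<string, int> id;
    vector<vector<int>> adj;

    explicit LadderSolver(vector<string> words) : dict(move(words)) {
        for (int i = 0; i < (int)dict.size(); i++) id[dict[i]] = i;
        adj = buildAdj(dict);
    }

    // Assumes both endpoints are dictionary words; the classic
    // "beginWord may be absent" variant needs one extra node per query.
    int query(const string& a, const string& b) const {
        auto ia = id.find(a), ib = id.find(b);
        if (ia == id.end() || ib == id.end()) return 0;
        return bfsAdj(ia->second, ib->second, adj);
    }
};
```

With this wrapper, a batch of Q queries is just Q calls to `query` against one shared adjacency list.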

Analysis of the Three Main Approaches

There are three reasonable strategies for solving multiple Word-Ladder queries when the dictionary is fixed: (1) On-the-fly neighbor generation, (2) pattern-indexed BFS, and (3) full adjacency precomputation. Each comes with different trade-offs depending on the number of queries Q compared to the dictionary size N.

  1. Neighbor Generation (26·L BFS) For each BFS step, we try replacing every character of the current word with each of the 26 letters. This gives up to 26·L candidate neighbors and requires no preprocessing. Its complexity per query is roughly O(N·26·L), fast for a single query, but prohibitive when Q is large. With N = 5000 and Q = 100000, the total work crosses 10^11 operations, far beyond standard time limits.

  2. Pattern Indexing ("*ot", "h*t", "ho*") This is the classic editorial solution: preprocess N·L wildcard patterns, each mapping to the list of words matching it. Neighbor lookup becomes cheaper (L patterns plus bucket scans per word), but each query's BFS still touches a large portion of the dictionary. The total cost is about O(N·L + Q·N·L); much better constants than method 1, but still too slow once Q grows into the tens of thousands.

  3. Full Adjacency Precomputation (O(N²·L) once, BFS per query thereafter) When the dictionary is static, the graph structure never changes. We can precompute all edges by comparing each pair of dictionary words and connecting those that differ in exactly one position. The preprocessing takes O(N²·L) time once, but every subsequent query is a plain BFS on a ready-made adjacency list, using only integer operations with no string hashing. The per-query time drops to O(N + E), where E is the edge count (each word has at most 25·L one-character neighbors), and the total becomes O(N²·L + Q·(N + E)), far more efficient than the previous methods when Q is large (for example, when Q ≥ N). This approach leverages amortization: the heavy preprocessing cost is paid once, and repeated queries become cheap.

In short:

For small Q, method (1) or (2) is perfectly fine.

For moderate Q, pattern indexing scales better than neighbor generation.

For large Q, especially when Q ≈ N or higher, full adjacency precomputation is the superior solution, as it minimizes repeated work across queries.

Tags: @graphs, @string
