IamDragonite's blog

By IamDragonite, history, 4 months ago, In English

Frequency Splitting (Binary Decomposition) Knapsack

Problem Overview

You are given a multiset of items: each item has a weight w, and weight w appears cnt times. Your task is to determine which total sums are achievable, i.e., to solve the subset-sum/knapsack feasibility problem.

Constraints

  • Total sum of all weights S = Σ (w · cnt) ≤ 1e6
  • Many items may have identical weights
  • Only reachability matters (we do not need the number of ways)

Naïve Approach

Treat every copy as a separate item and run a standard bitset DP:

bitset<MAXS> dp;
dp[0] = 1;

for (int w : all_items)
    dp |= (dp << w);

Time complexity:

  • Number of items = n
  • One bitset shift costs O(S / W) where W is machine word size (typically 64)

So total complexity is:

O(n · S / W)

Worst case example:

  • n = 1e6, S = 1e6 → ~1.5 × 10^10 word operations → Time Limit Exceeded

Key Observations

  • Many items have identical weights.
  • We only care whether a sum is achievable (feasibility).
  • The total sum S is bounded.

These facts allow a strong optimization.


Frequency Splitting (Binary Decomposition)

Core idea

If weight w appears c times, decompose c into powers of two plus a remainder r (possibly zero):

c = 1 + 2 + 4 + ... + 2^(k-1) + r,  where 0 ≤ r < 2^k

Replace the c identical items by items of weights:

w, 2w, 4w, ..., 2^(k-1)·w, r·w

This reduces c copies to O(log c) items while preserving exactly the same set of reachable sums.


Correctness Proof (sketch)

  • Any integer 0 ≤ t ≤ c can be written as a sum of a subset of the pieces 1, 2, 4, ..., 2^(k-1), r: if t < 2^k, use t's binary expansion; otherwise take the remainder piece r and expand t − r < 2^k in binary. Multiplying by w then yields any t·w.
  • Therefore every total obtainable by choosing up to c original copies of weight w is obtainable by choosing a subset of the decomposed items, and vice versa.

Thus the decomposition preserves exactly the set of reachable sums.


Binary Decomposition Code (core snippet)

for (auto [w, c] : freq) {           // freq maps weight -> multiplicity
    int p = 1;                       // current power of two
    while (c > 0) {
        int take = min(p, c);        // final piece is the remainder
        dp |= (dp << (take * w));    // one shift per piece
        c -= take;
        p <<= 1;
    }
}

Each group contributes O(log c) shifts instead of c shifts.


Dry Run Example (20 Elements, 2 Weights)

Input multiset:

  • Weight 2 → 12 copies
  • Weight 5 → 8 copies

Total elements = 20; total sum:

S = 12·2 + 8·5 = 64

Decomposition:

  • Weight 2 (c = 12): 12 = 1 + 2 + 4 + 5 → items: 2, 4, 8, 10
  • Weight 5 (c = 8): 8 = 1 + 2 + 4 + 1 → items: 5, 10, 20, 5

Result:

  • Original items: 20
  • After splitting: 8
  • The set of reachable sums in [0..64] is exactly the same before and after the split.

√S Optimization

Choose threshold:

B = floor(sqrt(S))

Split items into:

  • Heavy items: w ≥ B
  • Light items: w < B

Process heavy items directly and decompose light items.

Why this helps:

  • Each heavy item has weight at least √S. If there are k heavy items then k·√S ≤ S ⇒ k ≤ √S. So the heavy group has at most √S items.
  • Light weights are small and often have large multiplicities. Binary decomposition compresses each light weight to O(log c) pieces (and c can first be capped at S/w, since more copies than that can never fit in any sum), so the fewer than B distinct light weights contribute O(B · log S) shifts in total, which behaves like O(√S) in practice.

Final Algorithm (summary)

Step 1: initialize:

bitset<MAXS> dp;
dp[0] = 1;
int B = sqrt(S);

Step 2: process heavy weights directly:

for (auto [w, c] : freq)
    if (w >= B)
        while (c--) dp |= (dp << w);

Step 3: process light weights using binary decomposition:

for (auto [w, c] : freq)
    if (w < B) {
        int p = 1;
        while (c > 0) {
            int take = min(p, c);
            dp |= (dp << (take * w));
            c -= take;
            p <<= 1;
        }
    }

Time Complexity (analysis)

  • One bitset shift costs O(S / W).
  • Number of DP transitions:
  • Heavy group: at most √S shifts, because each heavy copy has weight ≥ √S and the copies' total weight is bounded by S.
  • Light group: at most O(B · log S) shifts after decomposition; with B = √S this behaves like O(√S) shifts in practice.

Overall practical complexity:

O(S · sqrt(S) / W)

Example with S = 1e6:

  • sqrt(S) ≈ 1000
  • Rough cost ≈ (1e6 · 1000) / 64 ≈ 1.6 × 10^7 word operations → feasible within contest limits.

A naive approach would be ~1000× slower in that regime.


When to use this trick / when not to use

Use when:

  • The problem is knapsack/subset-sum feasibility.
  • Large multiplicities of equal weights.
  • Total sum S is large (e.g. ≥ 1e5).

Avoid when:

  • You need the number of ways (counts), because the decomposition does not preserve how many ways each sum can be formed.
  • Weights are mostly unique (decomposition then saves nothing).
  • S is very small, where the naive DP is simpler and fast enough.

Code Snippet

const int MAXS = 1000005;
bitset<MAXS> dp;
dp.reset();
dp[0] = 1;
int B = max(1, (int)sqrt(S));

for (auto [w, c] : freq) {
    if (w >= B) {
        while (c--) dp |= (dp << w);
    }
}
for (auto [w, c] : freq) {
    if (w < B) {
        int p = 1;
        while (c > 0) {
            int take = min(p, c);
            dp |= (dp << (take * w));
            c -= take;
            p <<= 1;
        }
    }
}

Final takeaway

Binary decomposition (frequency splitting) + √S heavy/light split + bitset DP turns the infeasible naive knapsack into an efficient solution for large S (e.g. S ≤ 1e6). Use this when reachability is required and there are many duplicate weights.

