0k-'s blog

By 0k-, history, 5 years ago, In English

I'm curious about a hypothetical problem (I don't know if this appears anywhere): input is a string $$$s$$$ where $$$1 \leq |s| \leq 20$$$ and a number $$$n$$$ where $$$1 \leq n \leq |s|!$$$. Output the $$$nth$$$ lexicographical permutation of $$$s$$$. It is guaranteed that $$$n$$$ will not be larger than the total number of permutations of $$$s$$$.

It's easy to solve this when each character in the string is unique (in general, a set). There's plenty of resources about how to generate the nth permutation of a set. Code snippets like this one that generate a permutation of the range $$$[0,n)$$$ that can be used as the indices for the original string, or the factorial/factoradic number system can be used as in here.

This doesn't work for all strings because it doesn't handle repeated characters. For example with finding the third lexicographic permutation of "aab":

n:   Result:      Actual:
1    012 = abb    011 = abb
2    021 = abb    101 = bab
3    102 = bab    110 = bba

The above approaches fail because they treat the two b characters as different elements, meaning that the resulting permutation abb gets double-counted.

The only resources I can find about strings where characters aren't guaranteed to be unique (in other words, multisets) only talk about finding the next permutation, and if they mention finding the nth permutation it's to calculate the next permutation $$$n$$$ times (or something of equivalent time-complexity), which is obviously too slow outside of tiny strings.

Is there any good approach to solving this sort of problem?

  • Vote: I like it
  • +24
  • Vote: I do not like it

»
5 years ago, # |
  Vote: I like it +14 Vote: I do not like it

The number of permutations of a string $$$S$$$ is $$$f(S) = \frac{\mathrm{len}(S)!}{\mathrm{num}_a(S)! \cdot \mathrm{num}_b(S)! \cdots \mathrm{num}_z(S)!}$$$.

To pick the element at the first position, try letter $$$a$$$, then letter $$$b$$$, ..., then letter $$$z$$$. For each letter $$$\gamma$$$, calculate the total number of possible remainders $$$f(S\text{ without }\gamma)$$$ without that letter using the above formula. If the lexicographic number $$$n$$$ is less (strictly if counted from $$$0$$$) than the total number of possible remainders $$$f(S\text{ without }\gamma)$$$, assign $$$S := S\text{ without }\gamma$$$ and proceed to picking the element at the second position. Otherwise, assign $$$n := n - f(S\text{ without }\gamma)$$$ and try the next letter as $$$\gamma$$$.

For the second position and further, the algorithm is the same.