Блог пользователя lawsonh

Автор lawsonh, история, 5 лет назад, По-английски

Hello codeforces,

I have been offered a challenge by a friend, I was wondering if any algorithms could help succeed in this challenge.

Given a fixed set S of ~100000 unique 32 bit numbers, how would you devise an algorithm such that a human can compute on pen and paper whether a given number x is in S?

The algorithm should be feasible to be memorised given a year of practice.

After you have practiced your algorithm, you will be placed in a room with nothing except a blank piece of paper and pencil. There is an examiner in the room who will dictate a number to you, you have 60 seconds to determine whether that number was part of S. The set S of 100000 numbers is not available as a printout in the room.

  • Проголосовать: нравится
  • +18
  • Проголосовать: не нравится

»
5 лет назад, скрыть # |
Rev. 2  
Проголосовать: нравится -32 Проголосовать: не нравится

Find the lagrange polynomial of that integer sequence If you know the sequence beforehand.

»
5 лет назад, скрыть # |
 
Проголосовать: нравится 0 Проголосовать: не нравится

Very interesting. I have ideas for ~10000 that can work with an year of practice, but 100,000 sounds to be pushing the limits for me. I am assuming all the numbers are randomly drawn from a uniform distribution and finding any pattern among the bits is pointless, otherwise there might be ways to compress it.

  • »
    »
    5 лет назад, скрыть # ^ |
     
    Проголосовать: нравится 0 Проголосовать: не нравится

    Well, for ~10000 numbers, could you just directly learn all the numbers using a spaced repetition system like Anki? Though usually Anki works for question answer pairs, not sure how you would construct the flash cards in this case.

    • »
      »
      »
      5 лет назад, скрыть # ^ |
      Rev. 2  
      Проголосовать: нравится 0 Проголосовать: не нравится

      There is major system for encoding the numbers and there is Memory Palace technique for storing them. There is also Ben system which lets you encode 10 binary digits into a single image. Assuming you can do that, you have ~30K images, which can be stored in around 15000 memory locations

      There are some dedicated people who memorize very huge number of digits of Pi for example, with this technique. Last i checked the record was 100K digits (which is equivalent to about 30K images as well).

      But I would only advise someone very serious to go for this approach. I am someone fairly acquainted with the field of memory sports, and while it's not rocket science, it would need lot of effort.

      If you memorize everything linearly in a sorted order, you will have to do a binary search on your memory palace for the element, which won't be trivial but ok with little rehearsal.

  • »
    »
    5 лет назад, скрыть # ^ |
     
    Проголосовать: нравится 0 Проголосовать: не нравится

    And yes, they are from a uniform distribution. Though it is a fixed set of random numbers that you know at the start of the year.

»
5 лет назад, скрыть # |
Rev. 2  
Проголосовать: нравится +10 Проголосовать: не нравится

The record for remembering the digits of pi is ~60k digits, in our case we have 100k numbers, and most numbers will have 10 digits, which is 1 million digits in total, so the best human can remember only 6% of the digits we will try to remember. In terms of designing an algorithm for helping you remember, what comes to my mind is designing some hash function in terms of the sum of digits of the number or the remainder of the number by some k, but such algorithm will probably have many exceptions that you need to remember and 60 seconds is not enough time to compute things with paper. So you will need to try “all” easy to compute hash functions and check which one has the fewest amount of exceptions that you need to remember, but I dont think such an hash function exists, the hash function will likely have at least 100k false positives, which is more numbers than what you need to remember to begin with. I suppose you will only be able to improve the odds of guessing it right by memorizing the frequency of each digit in each position for example.

»
5 лет назад, скрыть # |
Rev. 2  
Проголосовать: нравится +16 Проголосовать: не нравится

From https://arxiv.org/pdf/1912.08258.pdf: "theoretical lower bound for an approximate membership data structure with a false positive probability e is $$$−\log_2 e$$$ bits per key"

But as VLamarca mentions, the world record for memorizing pi is 60k digits, about 2e5 bits, which is only enough for 2 bits per element. Therefore, a world-class human can at best achieve 25% false positive: 87.5% overall accuracy if the queries are 50% true/false. This ignores the practical difficulty of computing such a hash by hand: realistic implementations like bloom filters and xor filters would be even less efficient.