Блог пользователя fullbuster

Автор fullbuster, история, 8 лет назад, По-английски

I am trying to learn hashing and for this i am solving a problem in which there are n array's of 0 and 1 and given two arrays we have two tell whether they are equal or not.

I have heard that this problem can be solved using hashing but i am unable to come up with a hash function which can give a unique hash value for a given permutation of 0 and 1 in form of array.

Can someone suggest me such a hash function and also how to prove that it is valid.

  • Проголосовать: нравится
  • +4
  • Проголосовать: не нравится

»
8 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

You can use normal Rolling Hash for strings. H[i] = BASE * H[i - 1] + NumAt(i)

In your case, BASE can be 3 and NumAt(i) can be arr[i] + 1.

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    And how do we ensure it won't have any collisions?

    • »
      »
      »
      8 лет назад, # ^ |
        Проголосовать: нравится +3 Проголосовать: не нравится

      Consider a string containing digits from 0 to 9. Let's say that its hash will be its decimal representation. See any collisions?

    • »
      »
      »
      8 лет назад, # ^ |
        Проголосовать: нравится +1 Проголосовать: не нравится

      In rolling hash you actually take the numbers modulo a prime P, because you expect the amount of unique strings to be much more than your data type can handle. So its' really h[i] = (BASE*h[i-1] + NumAt(i)) % P.

      You can't ensure there won't be any collisions, because the only way to perfectly represent 2^n different objects is with 2^n different values. Thing is, with this hash, with prime P and BASE preferably a prime bigger than the size of your alphabet (in this case (0,1)) you can assume your hash values are uniformly and randomly distributed over the interval [0,P-1]. That is, the probability of collision is two different strings having the same random hash value, which is 1/P. By picking P around 10^9 this is already 1/10^9 per query, and by doing tricks like taking a pair of hashes (two different bases / mods and saying s1 = s2 only when (hash1(s1), hash2(s1)) = (hash1(s2),hash2(s2)) you can reduce this probability even further.

»
8 лет назад, # |
  Проголосовать: нравится +40 Проголосовать: не нравится

All permutations of 0 and 1 are only [0,1] and [1,0]. So h([0,1]) = 0 and h([1,0]) = 1 works.

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится -16 Проголосовать: не нравится

    no there is an array consisting of 0 and 1s and that array is permutation

    eg: [1,0,1,1,1,0,1,0]

»
8 лет назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

If you don't want to see any collision, I think you can insert all strings to a binary Trie and use the position of the endpoint to denote a string. Obviously there can't be any collision.