Given a string S of length n, is it possible to compute the hash of its substring [i, j] (from index i to index j, inclusive) in O(1) using some form of precomputation? Perhaps a modification of the rolling hash?
Similar Problem
Problem Statement
I have seen this used in a similar problem in which the string was given in compressed (run-length encoded) form. For example, if the string is "aaabccdeeee"
then the compressed form is:
3 a 1 b 2 c 1 d 4 e
How data was stored
The pairs are stored in an array str[] as:
str[] = [{'a','3'}, {'b','1'}, {'c','2'}....]
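To make the storage format concrete, here is a minimal sketch of how such an array could be built from the uncompressed string. The function name `compress` and the use of `std::pair<char, int>` (an integer count rather than the character '3' shown above) are assumptions made for illustration; the original problem does not specify them.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: run-length encode s into (character, count) pairs.
std::vector<std::pair<char, int>> compress(const std::string& s) {
    std::vector<std::pair<char, int>> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().first == c) {
            ++runs.back().second;     // same character: extend the current run
        } else {
            runs.emplace_back(c, 1);  // different character: start a new run
        }
    }
    return runs;
}
```

For "aaabccdeeee" this yields five pairs: ('a', 3), ('b', 1), ('c', 2), ('d', 1), ('e', 4).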
Hashing Concept Used in the Solutions
Programmers used the following hashing idea to check whether a given substring is a palindrome. Given a substring [i, j] of S, they computed the forward hash of [i, (i+j)/2] and the reverse hash of [(i+j+2)/2, j], and checked whether the two were equal. For example, to check in the string S = "daabac"
whether the substring "aaba" (indices [1, 4], 0-based) is a palindrome, they computed the following:
    h1 = forward_hash("aa")
    h2 = reverse_hash("ba")
    h1 == h2
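The comparison above can be sketched on a plain (uncompressed) string as follows. This is only an illustration under stated assumptions: the names `PalindromeHasher` and `kPalBase`, the base value, the 0-based indexing, and the choice to skip the middle character of odd-length substrings are mine and do not come from the solutions described. Arithmetic is left to wrap modulo 2^64 through unsigned overflow, and equal hashes only indicate a probable palindrome because collisions are possible.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumed base for the polynomial hash (any large odd constant would do here).
const std::uint64_t kPalBase = 1000003ULL;

// Sketch: prefix hashes of a string and of its reverse, modulo 2^64.
struct PalindromeHasher {
    std::size_t n;
    std::vector<std::uint64_t> fwd, rev, pw;

    explicit PalindromeHasher(const std::string& s)
        : n(s.size()), fwd(n + 1, 0), rev(n + 1, 0), pw(n + 1, 1) {
        for (std::size_t k = 0; k < n; ++k) {
            fwd[k + 1] = fwd[k] * kPalBase + static_cast<unsigned char>(s[k]);
            rev[k + 1] = rev[k] * kPalBase + static_cast<unsigned char>(s[n - 1 - k]);
            pw[k + 1] = pw[k] * kPalBase;
        }
    }

    // Hash of s[i..j] read left to right (0-based, inclusive), in O(1).
    std::uint64_t forward(std::size_t i, std::size_t j) const {
        assert(i <= j && j < n);
        return fwd[j + 1] - fwd[i] * pw[j - i + 1];
    }

    // Hash of s[i..j] read right to left, in O(1).
    // Reversed, s[i..j] occupies positions n-1-j .. n-1-i of the reversed string.
    std::uint64_t reverse(std::size_t i, std::size_t j) const {
        assert(i <= j && j < n);
        return rev[n - i] - rev[n - 1 - j] * pw[j - i + 1];
    }

    // Probable palindrome test: first half forward vs. second half backward.
    bool isPalindrome(std::size_t i, std::size_t j) const {
        std::size_t half = (j - i + 1) / 2;  // middle character is skipped if odd
        if (half == 0) return true;          // a single character
        return forward(i, i + half - 1) == reverse(j - half + 1, j);
    }
};
```

With 0-based indices, `PalindromeHasher("daabac").isPalindrome(1, 4)` compares the forward hash of "aa" with the reverse hash of "ba" and returns false, while `isPalindrome(2, 4)` ("aba") returns true.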
Code for the Hashing Concept
The hash precomputation was done as follows:

    /* Computing the prefix hash table. str[] is treated as 1-based here;
       each pair contributes two symbols (character, then count). */
    pre_hash[0] = 0;
    for (int i = 1; i <= len(str); i++) {
        pre_hash[i] = pre_hash[i-1] * very_large_prime + str[i].first;
        pre_hash[i] = pre_hash[i] * very_large_prime + str[i].second;
    }

    /* Computing the suffix hash table: suff_hash[i] hashes the last i pairs
       of str[], visited from the end of the array backwards. */
    suff_hash[0] = 0;
    for (int i = 1; i <= len(str); i++) {
        suff_hash[i] = suff_hash[i-1] * very_large_prime + str[len(str)-i+1].first;
        suff_hash[i] = suff_hash[i] * very_large_prime + str[len(str)-i+1].second;
    }
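The hash was then computed using the following functions:

    /* Calculates the forward hash of substring [i, j] (1-based pair indices).
       Each pair contributes two symbols, hence the exponent 2*(j-i+1). */
    unsigned long long CalculateHash(int i, int j) {
        if (i > j)
            return -1;  /* converts to the largest unsigned long long value */
        unsigned long long ret = pre_hash[j] - POW(very_large_prime, 2*(j-i+1)) * pre_hash[i-1];
        return ret;
    }

    /* Calculates the reverse hash. Note that i and j index suff_hash[], which is
       built from the end of str[]: position k there corresponds to str[len(str)-k+1]. */
    unsigned long long CalculateHash_Reverse(int i, int j) {
        unsigned long long ret = suff_hash[j] - POW(very_large_prime, 2*(j-i+1)) * suff_hash[i-1];
        return ret;
    }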
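The subtraction in CalculateHash can be checked with a short derivation. The notation is introduced here for illustration and is not part of the original code: p stands for very_large_prime, and c_1, ..., c_{2n} are the flattened symbols, with c_{2k-1} = str[k].first and c_{2k} = str[k].second. All arithmetic is assumed to wrap modulo 2^64, as it does for unsigned long long.

```latex
\begin{aligned}
\text{pre\_hash}[k] &= \sum_{t=1}^{2k} c_t\, p^{\,2k-t} \\
\text{pre\_hash}[j] - p^{\,2(j-i+1)}\,\text{pre\_hash}[i-1]
  &= \sum_{t=1}^{2j} c_t\, p^{\,2j-t} \;-\; \sum_{t=1}^{2i-2} c_t\, p^{\,2j-t} \\
  &= \sum_{t=2i-1}^{2j} c_t\, p^{\,2j-t}
\end{aligned}
```

The last line is the polynomial hash of pairs i through j taken on their own, so the result does not depend on anything before position i.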
What I am trying to do
I am looking for a general approach to the above concept. Given a pattern P, I want to check whether P is present in a string S. I know the index i at which it may occur, and I know the length of the pattern, |P|. In short, I want to check whether the hash of S[i, i+|P|-1] and the hash of P match in O(1), using some form of precomputation on S. Is it possible?
This ignores the time taken to compute the hash of P; counting it, the check would be O(1+|P|).
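One way this could look, as a minimal sketch rather than a definitive answer: build prefix hashes and powers of the base over S once in O(n), hash P once in O(|P|), and compare the two values in O(1) per candidate index. The names `SubstringHasher`, `hashOf`, `probablyMatchesAt`, and `kHashBase`, as well as the base value and the 0-based indexing, are assumptions for illustration. Arithmetic wraps modulo 2^64 through unsigned overflow; equal hashes do not prove equality, since collisions are possible, and hashing modulo 2^64 is known to be breakable by adversarially constructed inputs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumed base; all arithmetic wraps modulo 2^64 via unsigned overflow.
const std::uint64_t kHashBase = 911382323ULL;

// Polynomial hash of a whole string, O(|t|). Used once for the pattern P.
std::uint64_t hashOf(const std::string& t) {
    std::uint64_t h = 0;
    for (unsigned char c : t) h = h * kHashBase + c;
    return h;
}

// Prefix hashes of S and powers of the base, built once in O(n).
struct SubstringHasher {
    std::vector<std::uint64_t> pre, pw;

    explicit SubstringHasher(const std::string& s)
        : pre(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (std::size_t k = 0; k < s.size(); ++k) {
            pre[k + 1] = pre[k] * kHashBase + static_cast<unsigned char>(s[k]);
            pw[k + 1] = pw[k] * kHashBase;
        }
    }

    // Hash of S[i .. i+len-1] (0-based) in O(1); len may be 0.
    std::uint64_t hash(std::size_t i, std::size_t len) const {
        assert(i + len < pre.size());
        return pre[i + len] - pre[i] * pw[len];
    }

    // True when the hash of S[i .. i+patternLen-1] equals patternHash.
    // Returns false if the window would run past the end of S.
    bool probablyMatchesAt(std::size_t i, std::uint64_t patternHash,
                           std::size_t patternLen) const {
        if (i + patternLen + 1 > pre.size()) return false;
        return hash(i, patternLen) == patternHash;
    }
};
```

For example, with S = "daabac" and P = "aba", `SubstringHasher(S).probablyMatchesAt(2, hashOf(P), P.size())` is true and the same call with index 1 is false. If certainty is required, a positive result can be confirmed by comparing the characters directly, at O(|P|) cost.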