Array     Bit Manipulation     C++     Hash Table     Medium     String    

Problem Statement:

You are given a binary string s, and a 2D integer array queries where queries[i] = [firsti, secondi].

For the ith query, find the shortest substring of s whose decimal value, val, yields secondi when bitwise XORed with firsti. In other words, val ^ firsti == secondi.

The answer to the ith query is the endpoints (0-indexed) of the substring [lefti, righti] or [-1, -1] if no such substring exists. If there are multiple answers, choose the one with the minimum lefti.

Return an array ans where ans[i] = [lefti, righti] is the answer to the ith query.

A substring is a contiguous non-empty sequence of characters within a string.

 

Example 1:

Input: s = "101101", queries = [[0,5],[1,2]]
Output: [[0,2],[2,3]]
Explanation: For the first query the substring in range [0,2] is "101" which has a decimal value of 5, and 5 ^ 0 = 5, hence the answer to the first query is [0,2]. In the second query, the substring in range [2,3] is "11", and has a decimal value of 3, and 3 ^ 1 = 2. So, [2,3] is returned for the second query. 

Example 2:

Input: s = "0101", queries = [[12,8]]
Output: [[-1,-1]]
Explanation: In this example there is no substring that answers the query, hence [-1,-1] is returned.

Example 3:

Input: s = "1", queries = [[4,5]]
Output: [[0,0]]
Explanation: For this example, the substring in range [0,0] has a decimal value of 1, and 1 ^ 4 = 5. So, the answer is [0,0].

 

Constraints:

  • 1 <= s.length <= 104
  • s[i] is either '0' or '1'.
  • 1 <= queries.length <= 105
  • 0 <= firsti, secondi <= 109

Solution:

Intuition

The idea is to store all (upto 32 digits long) binary substrings of "s" in a hashmap and then check the hashmap for each query.

Here is the logic to enumerate all non-zero binary substrings of a string of length 1 to 32:

 
for (int i=0; i<s.length(); i++)
{
    if (s[i]=='0') continue;
    int num = 0;
    for (int j=i; j<min(i+32,s.length()); j++) 
    {
        num = (num<<1) + (s[j]-'0');
        cout << "substring:" << s.substr(i,j-i+1) << ",value:" << num << endl; 
    }
}
 

We just need to expand this logic to also consider the substring "0" and also to create a hashmap of num: [i,j] instead of just printing.

The hashmap stores the positions of first instance of a substring.

While querying, we are looking for query[0]^query[1] because a^c=b => c=a^b. If we find it in hashmap we append its positions to result else append [-1,-1].

Code

 
class Solution {
public:
    vector<vector<int>> substringXorQueries(string s, vector<vector<int>>& queries) 
    {
        unordered_map<int, vector<int>> H;
        for (int i=0; i<s.length(); i++)
        {
            if (s[i]=='0')
            {
                if (!H.count(0)) H[0]={i,i}; 
                continue;
            }
            int num = 0;
            for (int j=i; j<min(i+32,(int)s.length()); j++) 
            {
                num = (num<<1)+(s[j]-'0');
                if (!H.count(num)) H[num] = {i,j};
            }
        }
        vector<vector<int>> res;
        for (auto &query: queries)
        {
            int val = query[0]^query[1];
            if (H.count(val)) res.push_back(H[val]);
            else res.push_back({-1,-1});
        }
        return res;
    }
};
 

Complexity

TC: $O(n+m)$, SC: $O(n+m)$ where $n=\vert s \vert$, $m=\vert Q \vert $

Note on limits

Actually we can also change the for loop to only consider substrings of upto 30 digits long: for (int j=i; j<min(i+30,(int)s.length()); j++). This is because both first and second are in the range $[0,10^9]$, we know that $10^9$ in binary is 30 digits long. Hence their XOR can be at max 30 digits long ie their maximum value can be 2^{30}-1.

In the current solution written above, the hashmap will also contain -ve numbers which is not useful for us but does us no harm. By changing 32 to 30 in the for loop, we can make hashmap a little bit smaller.