Substring XOR Queries - Clean HashMap solution
Problem Statement:
You are given a binary string s
, and a 2D integer array queries
where queries[i] = [firsti, secondi]
.
For the ith
query, find the shortest substring of s
whose decimal value, val
, yields secondi
when bitwise XORed with firsti
. In other words, val ^ firsti == secondi
.
The answer to the ith
query is the endpoints (0-indexed) of the substring [lefti, righti]
or [-1, -1]
if no such substring exists. If there are multiple answers, choose the one with the minimum lefti
.
Return an array ans
where ans[i] = [lefti, righti]
is the answer to the ith
query.
A substring is a contiguous non-empty sequence of characters within a string.
Example 1:
Input: s = "101101", queries = [[0,5],[1,2]] Output: [[0,2],[2,3]] Explanation: For the first query the substring in range[0,2]
is "101" which has a decimal value of5
, and5 ^ 0 = 5
, hence the answer to the first query is[0,2]
. In the second query, the substring in range[2,3]
is "11", and has a decimal value of 3, and 3^ 1 = 2
. So,[2,3]
is returned for the second query.
Example 2:
Input: s = "0101", queries = [[12,8]]
Output: [[-1,-1]]
Explanation: In this example there is no substring that answers the query, hence [-1,-1] is returned
.
Example 3:
Input: s = "1", queries = [[4,5]] Output: [[0,0]] Explanation: For this example, the substring in range[0,0]
has a decimal value of1
, and1 ^ 4 = 5
. So, the answer is[0,0]
.
Constraints:
1 <= s.length <= 104
s[i]
is either'0'
or'1'
.1 <= queries.length <= 105
0 <= firsti, secondi <= 109
Solution:
Intuition
The idea is to store all (upto 32 digits long) binary substrings of "s"
in a hashmap and then check the hashmap for each query.
Here is the logic to enumerate all non-zero binary substrings of a string of length 1 to 32:
for (int i=0; i<s.length(); i++)
{
if (s[i]=='0') continue;
int num = 0;
for (int j=i; j<min(i+32,s.length()); j++)
{
num = (num<<1) + (s[j]-'0');
cout << "substring:" << s.substr(i,j-i+1) << ",value:" << num << endl;
}
}
We just need to expand this logic to also consider the substring "0"
and also to create a hashmap of num: [i,j]
instead of just printing.
The hashmap stores the positions of first instance of a substring.
While querying, we are looking for query[0]^query[1]
because a^c=b => c=a^b
. If we find it in hashmap we append its positions to result else append [-1,-1]
.
Code
class Solution {
public:
vector<vector<int>> substringXorQueries(string s, vector<vector<int>>& queries)
{
unordered_map<int, vector<int>> H;
for (int i=0; i<s.length(); i++)
{
if (s[i]=='0')
{
if (!H.count(0)) H[0]={i,i};
continue;
}
int num = 0;
for (int j=i; j<min(i+32,(int)s.length()); j++)
{
num = (num<<1)+(s[j]-'0');
if (!H.count(num)) H[num] = {i,j};
}
}
vector<vector<int>> res;
for (auto &query: queries)
{
int val = query[0]^query[1];
if (H.count(val)) res.push_back(H[val]);
else res.push_back({-1,-1});
}
return res;
}
};
Complexity
TC: $O(n+m)$, SC: $O(n+m)$ where $n=\vert s \vert$, $m=\vert Q \vert $
Note on limits
Actually we can also change the for loop to only consider substrings of upto 30 digits long: for (int j=i; j<min(i+30,(int)s.length()); j++)
. This is because both first
and second
are in the range $[0,10^9]$, we know that $10^9$ in binary is 30 digits long. Hence their XOR can be at max 30 digits long ie their maximum value can be 2^{30}-1
.
In the current solution written above, the hashmap will also contain -ve numbers which is not useful for us but does us no harm. By changing 32 to 30 in the for loop, we can make hashmap a little bit smaller.