C++     Dynamic Programming     Hard     Memoization     Recursion     String    

Problem Statement:

Given an input string s and a pattern p, implement regular expression matching with support for '.' and '*' where:

  • '.' Matches any single character.​​​​
  • '*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

 

Example 1:

Input: s = "aa", p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input: s = "aa", p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input: s = "ab", p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

 

Constraints:

  • 1 <= s.length <= 20
  • 1 <= p.length <= 20
  • s contains only lowercase English letters.
  • p contains only lowercase English letters, '.', and '*'.
  • It is guaranteed for each appearance of the character '*', there will be a previous valid character to match.

Solution:

Without * character in pattern, we can have following logic for regex:

 
bool isMatch(string s, string p)
{
    int slen = s.length(), plen = p.length();
    if ((plen==0) && (slen==0)) return true;
    if ((plen==0) || (slen==0)) return false;
    bool fm = (s[0]==p[0] || p[0]=='.');
    return fm && isMatch(s.substr(1), p.substr(1));
}
 

Here fm denotes first character matching. Notice how we have enforced edge cases. We are saying that both s and p must end together for true. With the addition of * character in pattern, things change significantly. Now, p can end after s to account for zero matches in case of *, notice however that s cannot end after p has ended. Also, we will need to add check in s before getting fm.

 
bool isMatch(string s, string p) 
{
    int slen = str.length(), plen = p.length();
    if (plen==0) return (slen==0);
    bool fm = (slen>0) && ((p[0]==s[0]) || (p[0]=='.'));
    if (plen>=2 && p[1]=='*')
        return isMatch(s,p.substr(2))||(fm && isMatch(s.substr(1), p));
    else
        return fm && isMatch(s.substr(1), p.substr(1));
}
 

Here, the non-asterisk logic is same. However for asterisk, we have two cases: 0 match and 1 match. For zero match, we just increment two positions in p and for one match, we increment 1 position in s and again run with the same p (because * signifies any number of matches).

This passes for all tests but gives TLE at the end.

Now, Let us add memoization to get AC.

 
class Solution {
    vector<vector<int>> dp;
public:
    bool isMatch(string s, string p) 
    {
        int slen=s.length(), plen=p.length();
        if (plen==0) return (slen==0);
        if (dp.size()==0) dp = vector<vector<int>>(slen+1, vector<int>(plen+1,2));
        if (dp[slen][plen]!=2) return dp[slen][plen];
        bool fm = (slen>0) && (s[0]==p[0] || p[0]=='.');
        if (plen>=2 && p[1]=='*') 
            return dp[slen][plen] = (fm && isMatch(s.substr(1), p)) || isMatch(s, p.substr(2));
        else
            return dp[slen][plen] = fm && isMatch(s.substr(1), p.substr(1));
    }
};
 

We get AC for this.

Same solution in java:

 
class Solution {
    int[][] dp;
    public boolean isMatch(String s, String p) 
    {
        int slen=s.length(), plen=p.length();
        if (plen==0) return (slen==0);
        if (dp==null)
        {
            dp = new int[slen+1][plen+1];
            for (int[] row: dp) Arrays.fill(row, 2);
        }
        if (dp[slen][plen]!=2) return (dp[slen][plen]==1);
        boolean fm = (slen>0) && (s.charAt(0)==p.charAt(0) || p.charAt(0)=='.');
        boolean ret;
        if (plen>=2 && p.charAt(1)=='*') 
            ret = (fm && isMatch(s.substring(1), p)) || isMatch(s, p.substring(2));
        else
            ret = fm && isMatch(s.substring(1), p.substring(1));
        dp[slen][plen] = (ret)?1:0;
        return ret;
    }
};