Extra Characters in a String - Top-down and Bottom-up DP
Problem Statement:
You are given a 0-indexed string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings.
Return the minimum number of extra characters left over if you break up s optimally.
Example 1:
Input: s = "leetscode", dictionary = ["leet","code","leetcode"]
Output: 1
Explanation: We can break s in two substrings: "leet" from index 0 to 3 and "code" from index 5 to 8. There is only 1 unused character (at index 4), so we return 1.
Example 2:
Input: s = "sayhelloworld", dictionary = ["hello","world"]
Output: 3
Explanation: We can break s in two substrings: "hello" from index 3 to 7 and "world" from index 8 to 12. The characters at indices 0, 1, 2 are not used in any substring and thus are considered as extra characters. Hence, we return 3.
Constraints:
- 1 <= s.length <= 50
- 1 <= dictionary.length <= 50
- 1 <= dictionary[i].length <= 50
- dictionary[i] and s consist of only lowercase English letters
- dictionary contains distinct words
Solution:
Let us think through a basic solution. Here is the logic:
- The largest possible answer is the length of the string. This is the case where every character is skipped, adding 1 per character until we reach the end.
- 1 + dfs(s, index+1) denotes that we have skipped the current index and are looking for the answer starting from the next index.
- We have a chance of getting a lower value if we check set membership for each substring starting at the current index: whenever s.substr(index, len) is in the dictionary, dfs(s, index+len) is a candidate, and we keep the minimum. This is the basic idea.
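To see what these rules produce on Example 1, here is a minimal sketch that applies them to every suffix and returns all the intermediate values. The helper name extraCharTable and the variable table are introduced only for this illustration; they are not part of the solutions in this post.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper for illustration: table[i] is the minimum number of
// extra characters in the suffix s[i..]. table[0] is the final answer.
std::vector<int> extraCharTable(const std::string& s,
                                const std::vector<std::string>& dictionary)
{
    std::unordered_set<std::string> dictSet(dictionary.begin(), dictionary.end());
    const int n = static_cast<int>(s.length());
    std::vector<int> table(n + 1, 0);   // table[n] = 0: the empty suffix has no extras
    for (int index = n - 1; index >= 0; index--)
    {
        // Rule 1: skip s[index] and count it as an extra character.
        table[index] = 1 + table[index + 1];
        // Rule 2: consume a dictionary word that starts at index, if any.
        for (int len = 1; len <= n - index; len++)
            if (dictSet.count(s.substr(index, len)))
                table[index] = std::min(table[index], table[index + len]);
    }
    return table;
}
```

For s = "leetscode" and dictionary = ["leet","code","leetcode"], the table for indices 0 to 9 is 1, 4, 3, 2, 1, 0, 3, 2, 1, 0. Index 5 drops to 0 because "code" covers the rest of the string, and index 0 becomes 1 because "leet" jumps to index 4, where only the 's' is left over.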
 
I will provide three solutions using the above logic. I have kept the variable names the same across all of them so that they are easier to compare.
Naive recursion (TLE)
 
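class Solution {
    unordered_set<string> dictSet;
public:
    // Minimum number of extra characters in the suffix s[index..].
    // s is taken by const reference so it is not copied on every call.
    int dfs(const string& s, int index)
    {
        int n = s.length();
        if (index>=n) return 0;
        // Option 1: skip s[index] and count it as an extra character.
        int res = 1 + dfs(s, index+1);
        // Option 2: consume a dictionary word that starts at index.
        for (int len=1; len<=n-index; len++)
            if (dictSet.count(s.substr(index,len)))
                res = min(res, dfs(s, index+len));
        return res;
    }
    int minExtraChar(string s, vector<string>& dictionary)
    {
        dictSet = unordered_set<string>(dictionary.begin(),dictionary.end());
        return dfs(s,0);
    }
};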
Recursion with memoization (AC)
 
class Solution {
    unordered_set<string> dictSet;
    vector<int> memo;   // memo[index] = answer for s[index..], -1 if not computed yet
public:
    // s is taken by const reference so it is not copied on every call.
    int dfs(const string& s, int index)
    {
        int n = s.length();
        if (index>=n) return 0;
        if (memo[index]!=-1) return memo[index];
        // Option 1: skip s[index] and count it as an extra character.
        int res = 1 + dfs(s, index+1);
        // Option 2: consume a dictionary word that starts at index.
        for (int len=1; len<=n-index; len++)
            if (dictSet.count(s.substr(index,len)))
                res = min(res, dfs(s, index+len));
        return memo[index] = res;
    }
    int minExtraChar(string s, vector<string>& dictionary)
    {
        dictSet = unordered_set<string>(dictionary.begin(),dictionary.end());
        memo = vector<int>(s.length()+1,-1);
        return dfs(s,0);
    }
};
Bottom-up DP (AC)
 
class Solution {
public:
    int minExtraChar(string s, vector<string>& dictionary)
    {
        unordered_set<string> dictSet(dictionary.begin(),dictionary.end());
        int n = s.length();
        // dp[index] = minimum extra characters in s[index..]; dp[n] = 0 for the empty suffix.
        vector<int> dp(n+1,0);
        for (int index=n-1; index>=0; index--)
        {
            // Skip s[index] and count it as an extra character.
            dp[index] = 1 + dp[index+1];
            // Or consume a dictionary word that starts at index.
            for (int len=1; len<=n-index; len++)
                if (dictSet.count(s.substr(index,len)))
                    dp[index] = min(dp[index], dp[index+len]);
        }
        return dp[0];
    }
};
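
The solutions above try every len up to the end of the string, although no substring longer than the longest dictionary word can ever match. One possible refinement, sketched below, caps len at that length. The function name minExtraCharCapped and the variables maxLen and limit are names introduced for this sketch. With the given constraints (everything at most 50) the difference is small, so treat it as an optional variation rather than a required change.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Variant of the bottom-up DP: same recurrence, but len never exceeds the
// length of the longest dictionary word (maxLen).
int minExtraCharCapped(const std::string& s,
                       const std::vector<std::string>& dictionary)
{
    std::unordered_set<std::string> dictSet(dictionary.begin(), dictionary.end());
    int maxLen = 0;
    for (const std::string& word : dictionary)
        maxLen = std::max(maxLen, static_cast<int>(word.length()));

    const int n = static_cast<int>(s.length());
    std::vector<int> dp(n + 1, 0);      // dp[n] = 0: the empty suffix has no extras
    for (int index = n - 1; index >= 0; index--)
    {
        dp[index] = 1 + dp[index + 1];  // skip s[index]
        const int limit = std::min(maxLen, n - index);
        for (int len = 1; len <= limit; len++)
            if (dictSet.count(s.substr(index, len)))
                dp[index] = std::min(dp[index], dp[index + len]);
    }
    return dp[0];
}
```

With n = s.length(), the memoized and bottom-up versions perform on the order of n^2 substring lookups, each of which builds and hashes a substring of up to n characters. The cap reduces the number of lookups to about n * maxLen.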