Given an input string (s) and a pattern (p), implement regular expression matching with support for ‘.’ and ‘*’.
Note:
'.' Matches any single character. '*' Matches zero or more of the preceding element.
Example 1: Input: Pattern: a.b Valid Strings: acb - any character can be present in place of "." aab - same reason as above adb - same reason as above In-Valid Strings: ab - Because in middle some element should be there to fill the place of "." acab - "." can hold only 1 element cb - because "a" is not presenet.
Example 2: Pattern: a*b Valid Strings: b - because * matches 0 or more character of the element behind it ab - same reson as above aab aaab In-Valid Strings: a - because "b" should be present at the end. acb - because * accepts elements of preceding element. As * is preceding by "a". This pattern is invalid.
Example 3: Pattern: a * b . * y Here a* will accept 0 or more occurance of a b one "b" should be present . * 0 or more occurance of any character y one "y" at the end should be prensent. Valid Strings: by bly ably In-Valid Strings: ay ab
We solve this by using bottom-up dynamic programming approach.
First we take a boolean 2D matrix “bArray[i] [j] ” to store the truth values. Then we iterate through all the elements and fill the boolean array.
Where “i” represents the index of string “s”.
and “j” represents the index of pattern “p”
For any value of bArray[i] [j] will be equal to one of the below 4 conditions:
Condition 1: The value of bArray[i] [j] will be equall to the value of bArray[i -1 ] [j -1] when string [i] == pattern [j] || pattern [j] == '.'
Condition 2: The value of bArray[i] [j] will be equall to the value of bArray[i] [j -2] if the pattern [j] == "*".
Condition 3: The value of bArray[i] [j] will be equall to the value of bArray[i - 1] [j] when string [i] == pattern [j - 1] || pattern [j - 1] == '.'
Condition 4: The value of bArray[i] [j] will be false, when string [i] is not equal to pattern [j].
We shall consider an example:
Pattern = x a * b . c String = x a a b y c
The above strings can be visualized as shown below:
There bArray[0] [0] is true. Because we have considered the string is empty and pattern is also empty. Hence it is True. 0th column is false. Because we have considered the string is empty but the pattern is not. Hence we have written it as false.
0th row is also false. Because our pattern doesn’t allow empty strings. If our pattern is of type ” a * b *” then it would accept empty strings.
Here, 0 =zero
Lets start by taking the elements from bArray[1] [1], string[1] = x and pattern [1] = x. Then we have to apply condition 1. hence the value of bArray[1] [1] will be bArray[i-1] [j-1] i.e bArray[0] [0] that is true. Hence bArray[1] [1] = true.
bArray[1] [2], string[1] = x, pattern[2] = a. Condition 4 is applied. Hence the value of bArray[1] [2] will be false.
bArray[1] [3], string [1] =x, pattern[3] = "*". In this case, we can have 0 or more occurance of a. Hence condition 2 is applied, i.e. bArray[i] [j- 2] the value is true. Here the pattern is "x a *" and the string is "x", it can match 0 or more occurance of a. Hence it is true. Hence bArray[1] [3] = true.
bArray[1] [4], string[1] = x, pattern[4] = b contidtion 4, false.
bArray[1] [5], string[1] = x, pattern[5] = ".", Condition 1. The value of bArray[i - 1] [j - 1], bArray[0] [4] is false. Hence the result is false.
bArray[1] [6], string[1] = x, pattern[6] = c. Condition 4, false.
Final representation after pass 1 is below:
Final representation of the matrix after all the passes:
The final answer is “True” hence the string matches the pattern.
Time complexity is O(m * n), because for every string in m = 1, 2, . . and for every element in pattern n = 1, 2, . . . we are calling the function once.
Space complexity is also O(m * n), because we save the boolean entries in temp 2d array.
Solution in C++
#include<iostream> #include<string> using namespace std; bool checkRegrex (string str, string pattern) { bool bArray[str.length()] [pattern.length()]; memset(bArray, false, sizeof(bool)*(str.length()+1)*(pattern.length()+1)); bArray[0] [0] = true; // to handle sitations like a* or a*b* or a*b*c* for(int j = 1; j<=pattern.length() ; j++) { bArray[0][j] = pattern[j-1] == '*' ? bArray[0][j-2] : false; } // we execute codition 1, 2, 3 and 4 in this for loop for(int i = 1; i <= str.length(); i++) { for(int j = 1; j <= pattern.length(); j++) { if (pattern[j - 1] == '.' || pattern[j - 1] == str[i - 1]) { bArray[i][j] = bArray[i-1][j-1]; } else if (pattern[j - 1] == '*') { bArray[i][j] = bArray[i][j - 2]; if (pattern[j-2] == '.' || pattern[j - 2] == str[i - 1]) { bArray[i][j] = bArray[i][j] | bArray[i - 1][j]; } } else { bArray[i][j] = false; } } } return bArray[str.length()][pattern.length()]; } int main() { bool result = false; result = checkRegrex("abab", "a*"); if(result) { cout<<"Pattern and string is a match"<<endl; } else { cout<<"Pattern and string is NOT a match"<<endl; } }
Output:
Pattern and string is a match
No Responses