REGular EXpressions from HackerRank (with Solutions in Python 3)
Click on the subheadings to view the question on the HackerRank website.
The highlighted part of the Test String is the part matched by the regex pattern.
Regex Pattern - wikipedia
Test String - https://en.wikipedia.org/
Regex_Pattern = r'hackerrank' # Do not delete 'r'.
import re
Test_String = input()
match = re.findall(Regex_Pattern, Test_String)
print("Number of matches :", len(match))
dot - The dot (.) matches anything (except for a newline).
Regex Pattern - A.B.C.D.
Test String - A+B+C=DE
regex_pattern = r"^...\....\....\....$" # Do not delete 'r'.
import re
test_string = input()
match = re.match(regex_pattern, test_string) is not None
print(str(match).lower())
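As a quick illustration of the dot, here is a minimal sketch; the sample strings and the names any_char and literal_dot are made up for this example. An unescaped dot accepts any character except a newline, while an escaped dot accepts only a literal '.':

```python
import re

any_char = re.compile(r"A.B")       # unescaped dot: any character except a newline
literal_dot = re.compile(r"A\.B")   # escaped dot: only a literal '.'

print(bool(any_char.fullmatch("A+B")))     # True
print(bool(any_char.fullmatch("A\nB")))    # False
print(bool(literal_dot.fullmatch("A.B")))  # True
print(bool(literal_dot.fullmatch("A+B")))  # False
```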
\d - The expression \d matches any digit [0-9].
\D - The expression \D matches any character that is not a digit.
Regex Pattern - \D\D\D\D
Test String - Hack101
Regex_Pattern = r"\d\d\D\d\d\D\d\d\d\d" # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
\s - \s matches any whitespace character [ \r\n\t\f ].
\S - \S matches any non-white space character.
Regex Pattern - \s
Test String - A B
Regex_Pattern = r"\S\S\s\S\S\s\S\S" # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
\w - The expression \w will match any word character. Word characters include alphanumeric characters (a-z, A-Z and 0-9) and underscores ( _ ).
\W - \W matches any non-word character. Non-word characters include characters other than alphanumeric characters (a-z, A-Z and 0-9) and underscore ( _ ).
Regex Pattern - \w\w\w
Test String - $one
Regex_Pattern = r"\w\w\w\W\w\w\w\w\w\w\w\w\w\w\W\w\w\w" # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
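The sketch below checks the \w example from this section and adds one Python-specific caveat: with str patterns in Python 3, \w also accepts non-ASCII letters unless the re.ASCII flag is passed. The variable names and the accented sample character are assumptions chosen for illustration.

```python
import re

# \w\w\w skips the '$' and matches the three word characters that follow.
three_word_chars = re.findall(r"\w\w\w", "$one")
print(three_word_chars)  # ['one']

# The underscore counts as a word character.
underscore_is_word = bool(re.fullmatch(r"\w", "_"))
print(underscore_is_word)  # True

# "\u00e9" is an accented e: a word character by default, but not under re.ASCII.
unicode_default = bool(re.fullmatch(r"\w", "\u00e9"))
unicode_ascii = bool(re.fullmatch(r"\w", "\u00e9", re.ASCII))
print(unicode_default, unicode_ascii)  # True False
```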
^ - The ^ symbol matches the position at the start of a string.
$ - The $ symbol matches the position at the end of a string.
Regex Pattern - ^123
Test String - 123456
Regex_Pattern = r"^\d\w\w\w\w.$" # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
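To see what the anchors add, the following sketch runs the ^123 example and the solution pattern against a few hand-picked strings. The helper name matches and the sample inputs are assumptions, not part of the challenge.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

print(matches(r"^123", "123456"))  # True: 123 is at the start
print(matches(r"^123", "456123"))  # False: 123 is not at the start
print(matches(r"456$", "123456"))  # True: 456 is at the end

solution = r"^\d\w\w\w\w.$"
print(matches(solution, "0qwer."))   # True
print(matches(solution, "0qwer.x"))  # False: an extra character before the end
```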
[] - The character class [] matches only one out of several characters placed inside the square brackets.
Regex Pattern - [aeiou] is a vowel
Test String - o is a vowel
| e is a vowel
Regex_Pattern = r'^[123][120][xs0][30Aa][xsu][.,]$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
[^] - The negated character class [^] matches any character that is not in the square brackets.
Regex Pattern - [^aeiou] is not a vowel
Test String - k is not a vowel
| p is not a vowel
Regex_Pattern = r'^[\D][^aeiou][^bcDF][\S][^AEIOU][^.,]$' # OR r'^[^\d][^aeiou][^bcDF][^\s][^AEIOU][^.,]$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
A hyphen (-) inside a character class specifies a range of characters where the left and right operands are the respective lower and upper bounds of the range. For example:
- [a-z] is the same as [abcde...wxyz].
- [A-Z] is the same as [ABCDE...WXYZ].
- [0-9] is the same as [0123456789].
In addition, if you use a caret (^) as the first character inside a character class, it will match anything that is not in that range. For example, [^0-9] matches any character that is not a digit in the inclusive range from 0 to 9. It's important to note that, when used outside of (immediately preceding) a character or character class, the caret matches the first character in the string against that character or set of characters.
Regex Pattern - [x-z][4-8][A-K]
Test String - x5F
Regex_Pattern = r'^[a-z][1-9][^a-z][^A-Z][A-Z]' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
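The range and negation rules can be checked with a short sketch. The pattern is the example from this section; the non-matching inputs and variable names are assumptions added for contrast.

```python
import re

range_pattern = re.compile(r"[x-z][4-8][A-K]")

print(bool(range_pattern.fullmatch("x5F")))  # True
print(bool(range_pattern.fullmatch("w5F")))  # False: 'w' is outside x-z
print(bool(range_pattern.fullmatch("x9F")))  # False: '9' is outside 4-8

# A caret as the first character negates the class.
non_digits = re.findall(r"[^0-9]", "a1b2")
print(non_digits)  # ['a', 'b']
```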
{x} - The tool {x} will match exactly x repetitions of character/character class/group.
Regex Pattern - \w{4}
Test String - H_ck
Regex_Pattern = r'^[a-zA-Z02468]{40}[13579\s]{5}$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
{x,y} - The {x,y} tool will match between x and y (both inclusive) repetitions of character/character class/group.
Regex Pattern - \w{1,4}\d{4,}
Test String - Hk132156545654654654
| Hack1021
Regex_Pattern = r'^\d{1,2}[a-zA-Z]{3,}\.{0,3}$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
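A small sketch of both counted quantifiers, using the example patterns from the two sections above. The failing inputs are assumptions added to show where the counts stop matching.

```python
import re

exact = re.compile(r"\w{4}")            # exactly four word characters
bounded = re.compile(r"\w{1,4}\d{4,}")  # one to four word characters, then four or more digits

print(bool(exact.fullmatch("H_ck")))  # True
print(bool(exact.fullmatch("H_c")))   # False: only three characters

print(bool(bounded.fullmatch("Hack1021")))              # True
print(bool(bounded.fullmatch("Hk132156545654654654")))  # True
print(bool(bounded.fullmatch("Hack102")))               # False: only three digits
```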
* - The * tool will match zero or more repetitions of character/character class/group.
Regex Pattern - Ab*s
Test String - As
| Abbbbbs
Regex_Pattern = r'^\d{2,}[a-z]*[A-Z]*$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
+ - The + tool will match one or more repetitions of character/character class/group.
Regex Pattern - Ab+s
Test String - As | Abbbbbs (only Abbbbbs matches)
Regex_Pattern = r'^\d+[A-Z]+[a-z]+$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
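The difference between * and + shows up only when the repeated item is absent. The sketch below compares the two example patterns side by side; the dictionary name results is an assumption.

```python
import re

# Map each test string to (matches Ab*s, matches Ab+s).
results = {
    text: (
        re.fullmatch(r"Ab*s", text) is not None,
        re.fullmatch(r"Ab+s", text) is not None,
    )
    for text in ("As", "Abs", "Abbbbbs")
}

for text, (star, plus) in results.items():
    print(text, star, plus)
# As True False
# Abs True True
# Abbbbbs True True
```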
$ - The $ boundary matcher matches an occurrence of a character/character class/group at the end of a line.
Regex Pattern - \w*s$
Test String - Challenges
| Hints
Regex_Pattern = r'^[a-zA-Z]*[sS]$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
\b - \b asserts a position at a word boundary.
Regex Pattern - \bcat\b
Test String - Acat | A cat (only the cat in "A cat" matches)
Regex_Pattern = r'\b[aeiouAEIOU][a-zA-Z]*\b' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
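The following sketch shows the word-boundary example and then applies the solution pattern with re.findall to a made-up sentence (the sentence and variable names are assumptions) to list the words that start with a vowel:

```python
import re

standalone = bool(re.search(r"\bcat\b", "A cat"))
embedded = bool(re.search(r"\bcat\b", "Acat"))
print(standalone)  # True: cat has a boundary on both sides
print(embedded)    # False: no boundary between 'A' and 'c'

vowel_words = re.findall(r"\b[aeiouAEIOU][a-zA-Z]*\b", "an apple or pear")
print(vowel_words)  # ['an', 'apple', 'or']
```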
() - Parentheses () around a regular expression group that part of the regex together. This allows us to apply quantifiers to the whole group.
These parentheses also create a numbered capturing group, which stores the part of the string matched by the part of the regex inside the parentheses.
These numbered capturing groups can be used for backreferences.
Regex Pattern - It is (not)? your fault
Test String - It is not your fault
| It is your fault
Regex_Pattern = r'(ok){3,}' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
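To illustrate how grouping changes what a quantifier applies to, this sketch contrasts the solution pattern (ok){3,} with the ungrouped ok{3,}. The test strings and variable names are assumptions.

```python
import re

grouped = re.search(r"(ok){3,}", "okokok")
print(grouped is not None)  # True: the whole group "ok" repeats three times
print(grouped.group(1))     # ok (the text captured by the last repetition)

too_few = re.search(r"(ok){3,}", "okok")
print(too_few is not None)  # False: only two repetitions

# Without parentheses the quantifier applies to 'k' alone.
ungrouped_kkk = re.fullmatch(r"ok{3,}", "okkk")
ungrouped_okokok = re.fullmatch(r"ok{3,}", "okokok")
print(ungrouped_kkk is not None, ungrouped_okokok is not None)  # True False
```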
| - Alternations, denoted by the | character, match a single item out of several possible items separated by the vertical bar. When used inside a character class, it will match characters; when used inside a group, it will match entire expressions (i.e., everything to the left or everything to the right of the vertical bar). We must use parentheses to limit the use of alternations.
Regex Pattern - (and|AND|And)
Test String - And the award goes to A and D company
Regex_Pattern = r'^(Mr\.|Mrs\.|Ms\.|Dr\.|Er\.)[a-zA-Z]+$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
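The point about limiting alternation with parentheses can be seen in a short sketch. The names found, titled and unbounded, the sample inputs, and the deliberately shortened two-title patterns are assumptions for illustration.

```python
import re

found = re.findall(r"(and|AND|And)", "And the award goes to A and D company")
print(found)  # ['And', 'and']

titled = re.compile(r"^(Mr\.|Mrs\.)[a-zA-Z]+$")
print(bool(titled.search("Mrs.Smith")))  # True
print(bool(titled.search("Mr.123")))     # False

# Without parentheses the alternation splits the whole pattern in two:
# "^Mr\." OR "Mrs\.[a-zA-Z]+$", so a bare "Mr." prefix is enough to match.
unbounded = re.compile(r"^Mr\.|Mrs\.[a-zA-Z]+$")
print(bool(unbounded.search("Mr.123")))  # True
```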
\group_number - This tool (\1 references the first capturing group) matches the same text as previously matched by the capturing group.
Regex Pattern - (\w)(\w)(\w)(\w)y\4\3\2\1
Test String - malayalam
Regex_Pattern = r'^([a-z])(\w)(\s)(\W)(\d)(\D)([A-Z])([a-zA-Z])([aeiouAEIOU])(\S)\1\2\3\4\5\6\7\8\9\10$' # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
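As a sketch of how backreferences reuse captured text, the example pattern from this section is applied to malayalam and to a misspelled variant (the variant and the variable names are assumptions):

```python
import re

mirror = re.compile(r"(\w)(\w)(\w)(\w)y\4\3\2\1")

hit = mirror.fullmatch("malayalam")
print(hit.groups())  # ('m', 'a', 'l', 'a')

# \2 must repeat the captured 'a', but the text has 'i' there.
miss = mirror.fullmatch("malayalim")
print(miss)  # None
```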
A backreference to a capturing group that matched nothing is different from a backreference to a capturing group that did not participate in the match at all.
Capturing group that matches nothing
Regex Pattern - (b?)o\1
Test String - o
Here, b? is optional and matches nothing. Thus, (b?) matches successfully and captures nothing. o is matched with o, and \1 successfully matches the nothing captured by the group.
Capturing group that didn't participate in the match at all
Regex Pattern - (b)?o\1
Test String - o
In most regex flavors (excluding JavaScript), (b)?o\1 fails to match o. Here, (b) fails to match at all. Since the whole group is optional, the regex engine proceeds to match o. The engine then arrives at \1, which references a group that did not participate in the match attempt at all. Thus, the backreference fails to match.
Regex_Pattern = r"^(\d\d)(-?)(\d\d)\2(\d\d)\2(\d\d)$" # Do not delete 'r'.
import re
print(str(bool(re.search(Regex_Pattern, input()))).lower())
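Python's re module behaves like the "most regex flavors" case described above. The sketch below uses a literal b as the optional character and then tries the solution pattern on a few made-up inputs; all variable names and sample strings are assumptions.

```python
import re

# The group participates and captures an empty string, so \1 matches nothing.
empty_capture = re.fullmatch(r"(b?)o\1", "o")
print(empty_capture is not None)  # True
print(repr(empty_capture.group(1)))  # ''

# The group does not participate at all, so \1 fails.
unset_group = re.fullmatch(r"(b)?o\1", "o")
print(unset_group is not None)  # False

# In the solution, (-?) always participates, so \2 is either '-' or empty.
separator = re.compile(r"^(\d\d)(-?)(\d\d)\2(\d\d)\2(\d\d)$")
print(bool(separator.search("12-34-56-78")))  # True
print(bool(separator.search("12345678")))     # True
print(bool(separator.search("12-3456-78")))   # False: separators are inconsistent
```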
NOTE - Branch reset group is supported by Perl, PHP, Delphi and R.
(?|regex) - A branch reset group consists of alternations and capturing groups, for example (?|(regex1)|(regex2)). The alternatives in a branch reset group share the same capturing group numbers.
Regex Pattern - (?|(Haa)|(Hee)|(bye)|(k))\1
Test String - HaaHaa
| kk
The solution below is written in Perl.
$Regex_Pattern = '^(\d\d)(?|(---)|(-)|(\.)|(:))(\d\d)\2(\d\d)\2(\d\d)$';
$Test_String = <STDIN> ;
if($Test_String =~ /$Regex_Pattern/){
print "true";
} else {
print "false";
}
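Python's re module does not support branch reset groups. For this particular pattern a rough equivalent is possible, because every alternative is a single capture: one ordinary group containing a plain alternation gives the same group number 2 for whichever separator matched. This is a sketch of that workaround, not a general replacement for branch reset; the name same_separator and the sample inputs are assumptions.

```python
import re

# Group 2 holds whichever separator matched; \2 then requires the same one again.
same_separator = re.compile(r"^(\d\d)(---|-|\.|:)(\d\d)\2(\d\d)\2(\d\d)$")

print(bool(same_separator.search("12-34-56-78")))        # True
print(bool(same_separator.search("12:34:56:78")))        # True
print(bool(same_separator.search("12---34---56---78")))  # True
print(bool(same_separator.search("12-34:56.78")))        # False: mixed separators
```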
NOTE - Forward reference is supported by JGsoft, .NET, Java, Perl, PCRE, PHP, Delphi and Ruby regex flavors.
A forward reference is a backreference to a group that appears later in the regex. Forward references are only useful if they're inside a repeated group. Then there may arise a case in which the regex engine evaluates the backreference after the group has already been matched.
Regex Pattern - (\2amigo|(go!))+
Test String - go!go!amigo
The solution below is written in Perl.
$Regex_Pattern = '^(\2tic|(tac))+$';
$Test_String = <STDIN> ;
if($Test_String =~ /$Regex_Pattern/){
print "true";
} else {
print "false";
}
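Python's re module is not among the flavors listed above and rejects the forward reference when compiling the pattern. The sketch below shows that, then gives a hand-written pattern that should accept the same strings. It assumes, as in Perl, that \2 cannot match until group 2 has captured "tac", so the first repetition must be "tac" and each later one is either "tac" or "tactic". The names forward_ref_supported and equivalent are assumptions.

```python
import re

try:
    re.compile(r"^(\2tic|(tac))+$")
    forward_ref_supported = True
except re.error:
    forward_ref_supported = False
print(forward_ref_supported)  # False

# Hand-expanded form: "tac", then any number of "tac" or "tactic".
equivalent = re.compile(r"^tac(?:tac(?:tic)?)*$")

print(bool(equivalent.search("tac")))        # True
print(bool(equivalent.search("tactactic")))  # True
print(bool(equivalent.search("tactic")))     # False: tic without a second tac before it
print(bool(equivalent.search("tictac")))     # False: cannot start with tic
```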