Target Audience: Regex newbies
I would like to talk more about regex this time. This might be helpful for anyone who has found regex too complex to approach.
Let us use regex to validate an email address (like firstname.lastname@email.com or firstname@email.com).
In regex, what matters is the pattern. Let us examine our pattern piece by piece.
How should we validate the first name? (It should start with one or more lowercase letters and can end with any number of digits.)
def FIRSTNAME = /[a-z]+\d*/
eg: james
In Groovy we always define the pattern between forward slashes ‘/’
[a-z] – means any one lowercase letter from a to z
[a-z]+ – means a to z can occur one or more times
\d – means any single digit
\d* – means any digit zero or more times
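To see the first-name pattern at work, here is a small sketch that tries it on a few values. The sample names are made up for illustration, and ==~ (explained further down) is true only when the whole string fits the pattern.

```groovy
def FIRSTNAME = /[a-z]+\d*/

def plainName   = 'james'    ==~ FIRSTNAME   // letters only
def nameWithNum = 'james007' ==~ FIRSTNAME   // letters followed by digits
def startsDigit = '007james' ==~ FIRSTNAME   // digits come first, so no match
def emptyName   = ''         ==~ FIRSTNAME   // + needs at least one letter

println([plainName, nameWithNum, startsDigit, emptyName])   // [true, true, false, false]
```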
def DOT = /[.]?/
[.] – means a literal dot should be present
[.]? – means a dot should be present zero or one time
Please be aware that . without the square brackets means ‘any single character’
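The difference between [.] and a bare . is easy to miss, so here is a short sketch that compares them. The test strings are only examples.

```groovy
def DOT = /[.]?/

def oneDot      = '.'  ==~ DOT    // a single dot is allowed
def noDot       = ''   ==~ DOT    // ? also allows no dot at all
def twoDots     = '..' ==~ DOT    // more than one dot is too many
def letterAsDot = 'x'  ==~ DOT    // [.] never matches a letter
def letterAsAny = 'x'  ==~ /./    // a bare . matches any character

println([oneDot, noDot, twoDots, letterAsDot, letterAsAny])   // [true, true, false, false, true]
```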
def SECONDNAME = /\w*/
eg: gosling
\w – means any word character (a letter, a digit or an underscore)
\w* – means any word character can occur zero or more times
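Here is a small sketch of what \w* accepts and rejects. Because * allows zero occurrences, an empty second name passes too, which is what lets firstname@email.com through later on. The sample values are made up.

```groovy
def SECONDNAME = /\w*/

def lettersOnly = 'gosling'    ==~ SECONDNAME   // plain letters
def withExtras  = 'gosling_99' ==~ SECONDNAME   // underscore and digits are word characters
def emptyName   = ''           ==~ SECONDNAME   // zero characters is fine
def withHyphen  = 'gos-ling'   ==~ SECONDNAME   // a hyphen is not a word character

println([lettersOnly, withExtras, emptyName, withHyphen])   // [true, true, true, false]
```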
def AT = /[@]/
[@] – means the @ symbol must be present
def DOMAIN = /[a-z]+\w*/
eg: sun
[a-z]+ – means a to z can occur one or more times
\w* – means any word character (including digits) can occur zero or more times
def TLD = /[a-z]{2,4}/
eg: com, us, info
[a-z]{2,4} – means a to z can occur 2 to 4 times (ie can be “us” or “com” or “info”)
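The {2,4} range can be checked against a handful of endings. This is only a sketch; the list of endings is an assumption chosen to show both sides of the limit.

```groovy
def TLD = /[a-z]{2,4}/

// Map each sample ending to whether it fits the 2-to-4 letter rule
def results = ['us', 'com', 'info', 'x', 'museum'].collectEntries { [(it): it ==~ TLD] }

println results   // [us:true, com:true, info:true, x:false, museum:false]
```

Note that a longer ending such as "museum" is rejected, so the upper limit is one of the constraints you may want to revisit.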
Now putting it all together:
def emailRegex = /[a-z]+\d*[.]?\w*[@][a-z]+\w*[.]?[a-z]{2,4}/
or even better:
def emailRegex = /$FIRSTNAME$DOT$SECONDNAME$AT$DOMAIN$DOT$TLD/
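If you are curious what the $NAME version turns into, this sketch prints the assembled pattern. It assumes the six definitions from above and relies on Groovy substituting each $NAME inside a slashy string with the text stored in that variable.

```groovy
def FIRSTNAME  = /[a-z]+\d*/
def DOT        = /[.]?/
def SECONDNAME = /\w*/
def AT         = /[@]/
def DOMAIN     = /[a-z]+\w*/
def TLD        = /[a-z]{2,4}/

// Each $NAME is replaced by the pattern text held in that variable
def emailRegex = /$FIRSTNAME$DOT$SECONDNAME$AT$DOMAIN$DOT$TLD/

// toString() turns the interpolated value into a plain String
def expanded = emailRegex.toString()
println expanded   // [a-z]+\d*[.]?\w*[@][a-z]+\w*[.]?[a-z]{2,4}
```

The printed text includes the optional dot twice: once between the names and once between the domain and the TLD.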
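If you want to validate an email in Groovy, here is the code:
def email = "james.gosling@sun.com"

def FIRSTNAME = /[a-z]+\d*/    // letters, then optional digits
def DOT = /[.]?/               // optional dot
def SECONDNAME = /\w*/         // optional word characters
def AT = /[@]/                 // the @ symbol
def DOMAIN = /[a-z]+\w*/       // starts with a letter
def TLD = /[a-z]{2,4}/         // 2 to 4 letters

println(email ==~ /$FIRSTNAME$DOT$SECONDNAME$AT$DOMAIN$DOT$TLD/)   // prints true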
==~ – means the whole string must match the pattern (the result is true or false)
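To make the "whole string" part concrete, this sketch compares ==~ with Groovy's find operator =~, which gives back a java.util.regex.Matcher that can look for the pattern anywhere inside the text. The sample sentence is made up for illustration.

```groovy
def emailRegex = /[a-z]+\d*[.]?\w*[@][a-z]+\w*[.]?[a-z]{2,4}/
def text = 'mail james.gosling@sun.com today'

// ==~ fails because the extra words are not part of an email
def wholeMatch = text ==~ emailRegex

// =~ creates a Matcher; find() looks for the pattern anywhere in the text
def matcher  = text =~ emailRegex
def found    = matcher.find()
def firstHit = found ? matcher.group() : null

println wholeMatch   // false
println firstHit     // james.gosling@sun.com
```

For validation, ==~ is the one you want, since it rejects input with anything extra around the email.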
Try validating other email values, and try tightening your program by bringing in more constraints.
Disclaimer: This program is by no means complete and should not be used for full-fledged email validation. Please google ‘regex’ for better regex patterns.
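As a starting point, here is a sketch that runs a few sample addresses through the pattern and then adds one extra constraint. REQUIRED_DOT is a hypothetical name, not part of the program above; it makes the dot before the TLD mandatory. The sample addresses are made up.

```groovy
def FIRSTNAME    = /[a-z]+\d*/
def DOT          = /[.]?/
def SECONDNAME   = /\w*/
def AT           = /[@]/
def DOMAIN       = /[a-z]+\w*/
def TLD          = /[a-z]{2,4}/
def REQUIRED_DOT = /[.]/   // hypothetical extra constraint: the dot is no longer optional

def original = /$FIRSTNAME$DOT$SECONDNAME$AT$DOMAIN$DOT$TLD/
def stricter = /$FIRSTNAME$DOT$SECONDNAME$AT$DOMAIN$REQUIRED_DOT$TLD/

def samples = ['james.gosling@sun.com', 'james@sun.com', 'James@sun.com', 'james@suncom']

// Check every sample against both patterns
def originalResults = samples.collect { it ==~ original }
def stricterResults = samples.collect { it ==~ stricter }

println originalResults   // [true, true, false, true]
println stricterResults   // [true, true, false, false]
```

With the original pattern, james@suncom slips through because the dot before the TLD is optional. The stricter version rejects it, while both versions still reject James@sun.com since only lowercase letters are allowed.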
