From Issue #158
June 2007
The Internet Engineering Task Force (IETF) document, RFC 3696, “Application
Techniques for Checking and Transformation of Names” by John
Klensin,
gives several valid e-mail addresses that are rejected by many PHP
validation routines. The addresses:
Abc\@def@example.com,
customer/department=shipping@example.com and
!def!xyz%abc@example.com
are all valid. One of the more popular regular expressions found in the
literature rejects all of them:
"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)
↪*(\.[a-z]{2,3})$"
This regular expression allows only the underscore (_) and hyphen
(-) characters, numbers and lowercase alphabetic characters. Even
assuming a preprocessing step that converts uppercase alphabetic
characters to lowercase, the expression rejects addresses with
valid characters, such as the slash (/), equal sign (=), exclamation
point (!) and percent (%). The expression also requires that the
top-level domain has only two or three characters, thus
rejecting valid domains,
such as .museum.
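The rejections are easy to confirm. The following is a minimal sketch, not a recommended validator: it wraps the expression quoted above in PCRE delimiters so that preg_match() can run it (an assumption on my part, since the pattern is quoted without a matching function), and it adds one hypothetical address, curator@example.museum, to exercise the top-level domain limit.

```php
<?php
// The first pattern quoted above, wrapped in PCRE delimiters
// (assumption: the source quotes the bare pattern only).
$pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';

$addresses = array(
    'Abc\\@def@example.com',                     // quoted at sign
    'customer/department=shipping@example.com',  // slash and equal sign
    '!def!xyz%abc@example.com',                  // exclamation point and percent
    'curator@example.museum',                    // hypothetical: long top-level domain
);

$results = array();
foreach ($addresses as $address) {
    // Lowercase first, mirroring the preprocessing step assumed in the text.
    $results[$address] = preg_match($pattern, strtolower($address)) === 1;
    echo $address, ' => ', $results[$address] ? 'accepted' : 'rejected', "\n";
}
```

Every address in the list should be reported as rejected, even though the first three come straight from RFC 3696.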
Another favorite regular expression solution is the following:
"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$"
This regular expression rejects all the valid examples in the preceding paragraph.
It does have the grace to allow uppercase alphabetic characters,
and
it doesn't make the error of assuming a top-level domain name has only
two or three characters. It allows invalid domain names, such as
example..com.
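Both behaviors can be checked with a few lines of PHP. This is a sketch under the same assumption as before: the quoted pattern is wrapped in PCRE delimiters for preg_match(), and the addresses user@example..com and User@Example.museum are hypothetical test inputs, not taken from RFC 3696.

```php
<?php
// The second pattern as quoted, wrapped in PCRE delimiters (assumption).
// The dot after the first domain label is unescaped, so it matches any
// character, and the final character class also admits dots.
$pattern = '/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$/';

$addresses = array(
    'Abc\\@def@example.com',                     // valid, from RFC 3696
    'customer/department=shipping@example.com',  // valid, from RFC 3696
    '!def!xyz%abc@example.com',                  // valid, from RFC 3696
    'User@Example.museum',                       // hypothetical: uppercase, long TLD
    'user@example..com',                         // hypothetical: invalid domain
);

$results = array();
foreach ($addresses as $address) {
    $results[$address] = preg_match($pattern, $address) === 1;
    echo $address, ' => ', $results[$address] ? 'accepted' : 'rejected', "\n";
}
```

The three RFC 3696 addresses should be rejected, while both User@Example.museum and the malformed user@example..com should be accepted.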
Listing 1 shows an example from PHP Dev Shed (www.devshed.com/c/a/PHP/Email-Address-Verification-with-PHP/2).
The code contains (at least) three errors. First, it fails to recognize
many valid e-mail address characters,
such as percent (%). Second, it
splits the e-mail address into user name and domain parts at the at sign
(@). E-mail addresses that contain a quoted at sign, such as
Abc\@def@example.com, will break this code. Third, it fails to check
for host address DNS records. Hosts with a type A DNS entry will accept
e-mail and may not necessarily publish a type MX entry. I'm not
picking on the author at PHP Dev Shed. More than 100 reviewers gave
this a four-out-of-five-star rating.
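The second and third problems suggest two small remedies, sketched below. This is one possible approach, not the code from the listing under discussion: the function names split_address() and domain_accepts_mail() are hypothetical, and the sketch assumes that the last at sign in the address always separates the local part from the domain. The DNS lookup uses PHP's checkdnsrr(), which needs a working resolver, so it is defined but not called here.

```php
<?php
// Split at the LAST at sign, so an at sign inside the local part
// (as in Abc\@def@example.com) stays with the local part.
// Returns array(local, domain), or false if there is no at sign.
function split_address($email)
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return false;
    }
    return array(substr($email, 0, $at), substr($email, $at + 1));
}

// Treat the domain as able to receive mail if it publishes either
// an MX record or an A record. Requires network access to a resolver.
function domain_accepts_mail($domain)
{
    return checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A');
}

list($local, $domain) = split_address('Abc\\@def@example.com');
echo $local, ' | ', $domain, "\n";
```

With the sample address, the local part comes out as Abc\@def and the domain as example.com. Passing that domain to domain_accepts_mail() would then cover hosts that publish only a type A entry.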