[HN Gopher] Your e-mail validation logic is wrong
___________________________________________________________________
Your e-mail validation logic is wrong
Author : Tomte
Score : 240 points
Date : 2021-05-24 11:27 UTC (11 hours ago)
(HTM) web link (www.netmeister.org)
(TXT) w3m dump (www.netmeister.org)
| danrl wrote:
| I have a very short email address in the format a@b.tld and for
| my special friends that don't know how to validate correctly I
| have created abc@ur-email-validation-is-broken.b.tld
|
| I need to use the latter ~5% of the time. Most often I take my
| business to someone else for the sake of principle.
| anttisalmela wrote:
| It doesn't really hurt if some more exotic email addresses are
| not accepted, no one can really use them anyway.
| asddubs wrote:
| in fact it probably catches more mistakes than deliberate weird
| email addresses to be stricter than the standard mandates
| jiofih wrote:
| Well, you can't use them because of attitudes like yours. You're
| not only annoying those of us who want to use plus signs or
| international domains, but ensuring the "exotic" half of the
| world that doesn't use the Latin alphabet is kinda left out.
| anttisalmela wrote:
| Non-ascii alphabets need a lot more support than just
| accepting them. But I wouldn't really consider plus signs
| exotic anyway.
| jiofih wrote:
| What does that even mean? We shouldn't support them because
| it requires more work?
| [deleted]
| abrowne wrote:
| If you are consistent. A couple times I've successfully signed
| up for something with a username+string@gmail.com address but
| then have been unable to unsubscribe because my address is
| "invalid".
| NullPrefix wrote:
| I kinda recall a lawsuit many years ago about an unsubscribe
| confirmation email.
| cratermoon wrote:
| I've decided that the best way to validate email address is to
| not validate them, but require that any signup be finalized by
| the individual following a link emailed to them.
|
| This allows a person to use any damn thing they want as their
| email address, provided it works and they can get the email.
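A minimal sketch of the "finalize by following an emailed link" idea above, assuming the link carries a signed token. The names `SECRET_KEY`, `TOKEN_TTL_SECONDS`, `make_token` and `check_token` are hypothetical, and a real deployment would also need to send the mail and store the pending signup.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-deployment secret
TOKEN_TTL_SECONDS = 24 * 60 * 60      # assumed link lifetime: one day


def _sign(issued: int, address: str) -> str:
    # HMAC over the issue time and the address exactly as the user typed it.
    payload = f"{issued}:{address}".encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def make_token(address: str, now: Optional[float] = None) -> str:
    """Build the token that would be embedded in the confirmation link."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(issued, address)}"


def check_token(address: str, token: str, now: Optional[float] = None) -> bool:
    """True only if the token was issued for this address and has not expired."""
    try:
        issued_text, signature = token.split(":", 1)
        issued = int(issued_text)
    except ValueError:
        return False
    current = time.time() if now is None else now
    expected = _sign(issued, address)
    same = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return same and 0 <= current - issued <= TOKEN_TTL_SECONDS
```

Nothing here inspects the shape of the address: whether it "works" is decided by whether the person can open the link.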
| manmal wrote:
| My cheap-o approach to this is: Check there's an @, and that
| there is a dot afterwards. This excludes local domains
| obviously, but I don't want those anyway.
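The cheap check described above could look roughly like this. It is a sketch, not a validator: the function name `looks_like_email` is made up, and treating the last `@` as the separator and trimming whitespace are assumptions.

```python
def looks_like_email(text: str) -> bool:
    """Lax pre-check: something before an @, and a dot inside what follows."""
    local, at, domain = text.strip().rpartition("@")
    if not at or not local:
        return False
    # Require a dot that is neither the first nor the last character of the
    # domain, which rules out dotless local names such as "localhost".
    return "." in domain[1:-1]


print(looks_like_email("a@b.tld"))         # True
print(looks_like_email("user@localhost"))  # False
```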
| Arubis wrote:
| 100% agreed here. Accept a text field; maybe validate that it
| has an @ in it and a . after the @.
|
| Send that address a confirmation email. Now you've got
| consensual opt-in and you've somewhat protected yourself from
| adding a wrong address to your recurring mailing list.
|
| Prevent abuse with long (seconds) delays between submissions
| from the client. If the user thinks they did it right, they're
| waiting on their email inbox anyway; if they immediately
| realize they made a typo, it'll take 2-3s to fix.
|
| The RFCs were written when manually (not from cron) sending
| email to another user on your local system was a thing that
| actually happened. I'm certain you actively want to avoid that
| now.
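One way to read the "delays between submissions" suggestion is a per-client cooldown enforced on the server. The sketch below assumes that reading; `SubmissionThrottle` and its in-memory dictionary are illustrative only, and the comment may equally have meant a delay in the client.

```python
import time
from typing import Dict, Optional


class SubmissionThrottle:
    """Allow one signup submission per client every `cooldown_seconds`."""

    def __init__(self, cooldown_seconds: float = 3.0) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._last_accepted: Dict[str, float] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        last = self._last_accepted.get(client_id)
        if last is not None and current - last < self.cooldown_seconds:
            return False  # still cooling down; rejected attempts do not reset the clock
        self._last_accepted[client_id] = current
        return True
```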
| eli wrote:
| Yup I've been working in email marketing for a long time and
| this is what I do if I need a regex. I remember when .mobi
| TLD came out and people with those addresses had a terrible
| time signing up for things because a bunch of developers got
| too cute and assumed a TLD could only be 2 or 3 characters.
| You want to be really lax in what you validate.
| serial_dev wrote:
| This is also my preferred approach.
|
| If I can send you an email and you can verify that you have
| access to that email, your email is "valid enough" for me.
|
| Then, the validation is basically "is there an @ and a dot
| after it?". I find that after that, every hour spent on
| improving the validation will just cause more emails falsely
| flagged as invalid and more support requests from the people who
| couldn't sign up with valid emails. It's also code we need to
| maintain, and anyone who edits the validation logic risks
| breaking sign ups completely.
|
| So with more "improvements" to the validation, you just cause
| more problems. Then why do it?
|
| I hear the reputation arguments, but in practice, it never
| happened to any of the organizations I worked for.
|
| What happens though very often is naive engineers trying to
| solve problems the business doesn't have with knowledge they
| lack...
| mro_name wrote:
| > naive engineers trying to solve problems the business
| doesn't have with knowledge they lack.
|
| premature implementation is the source of most evil. :-)
| welder wrote:
| Even if sending emails is 100% free, you still have to worry
| about your sender reputation. [1] Sending a large amount of
| mail to invalid emails will start getting your emails put in
| people's spam folders. That's the reason email validation
| services exist, to prevent sending to invalid emails. [2]
|
| Also, humans make mistakes. You should detect spelling errors
| and typos then suggest corrections. [3]
|
| [1] https://www.mailjet.com/blog/news/3-factors-that-impact-
| your...]
|
| [2] https://www.mailgun.com/email-validation/
|
| [3] https://www.npmjs.com/package/mailcheck
| mro_name wrote:
| Even if 0% free you'll have to do the opt-in anyway, or how
| on earth will you figure out if the recipient wants your
| email?
|
| It's hard to be smart with something like names.
| Avamander wrote:
| Oh don't worry about this at all because spammers are going
| to sign up with legitimate e-mail addresses that are going to
| get your reputation lowered. Very common tactic and you won't
| be saved by some dumb regex that would just probably hurt a
| few real users.
| eli wrote:
| Mickey@mouse.com is a perfectly valid address but it isn't my
| address. If that matters for your application you need to
| spend the capital to send an email. No way around it.
| bobthecowboy wrote:
| Even worse, I have commonfirstnamecommonlastname@gmail.com
| and get several emails a day that I didn't sign up for. Now
| the person who did sign up isn't getting them _and_ I have
| to figure out how to opt out of them. Sometimes these
| website accounts already have payment/personal details
| associated with them, which I now have access to (and
| indeed, sometimes _have to view_) in order to find the
| "stop sending me email" button.
|
| Always send the confirmation "did you sign up?" email.
| Always.
| jdhawk wrote:
| So you use another 3rd party validation service, paying
| $300-500/million addresses.
| deckard1 wrote:
| This is really just a problem for spammers going out and
| either buying mailing lists that haven't been validated or
| scraping the web for email addresses. In the case of the
| spammer, they would probably care a lot more about their
| bounce rate than their false negative rate (i.e. valid
| addresses that fail some sort of validation regex). In fact,
| they would probably tune their validation to actually throw
| away addresses that didn't look correct just to be safe.
|
| Obviously, this is a different scenario than your bank not
| accepting your valid (per RFC) email address. Which is why
| any sort of blanket advice is pretty dumb. Not that I care to
| aid spammers...
|
| The other scenario might be a site that puts up a "paywall"
| type thing, where you are forced to enter an email address to
| gain quick access to something, but doesn't want to bother
| you with going and verifying an email (e.g. instant
| discounts, downloading a PDF, etc.). Or in-person email
| address collection when you buy something in a store. It's
| never a good idea to collect email addresses of people that
| have no desire to subscribe to your marketing.
| radicalriddler wrote:
| I have two things.
|
| The amount of times I've tried to sign up with my protonmail
| account to a service and it doesn't pass validation simply
| because it's a protonmail account (not a gmail, outlook, hotmail
| or aol apparently) makes me wish everyone did follow the RFC. I
| actually emailed a service one time, and they responded that it's
| due to protonmail usually being associated with shady stuff. WTF.
|
| The second. I had to implement an email validator at one of my
| previous jobs, and fell down the RFC rabbit hole. Not only did I
| have to follow the RFC as per my boss's request, but I also had
| to make sure that Amazon SES allowed it. Came out of the office
| wanting to just walk out onto the road. The weird things that not
| only email servers allow, but also, what do email clients allow.
| saurik wrote:
| This is all a massive misunderstanding. An email address is the
| local name and the host; a host can't contain an @, so the only
| thing you frankly need do is split on "last @" and demand the
| user not escape anything. As for validation, go ahead and try to
| resolve the domain to make sure it works (and, if you want to
| verify the local part, do an online check with their server).
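A sketch of the split on "last @" described above, assuming the user typed the address unescaped. `split_address` is a hypothetical helper, and the DNS and server checks mentioned in the comment are left out.

```python
from typing import Tuple


def split_address(address: str) -> Tuple[str, str]:
    """Split an unescaped address into (local part, host) at the last @."""
    local, at, host = address.rpartition("@")
    if not at:
        raise ValueError("address contains no @")
    # A host cannot contain an @, so anything before the last one, including
    # earlier @ characters, belongs to the local part.
    return local, host


print(split_address("odd@local@example.com"))  # ('odd@local', 'example.com')
```

Note that an empty local part passes through here unchanged, which fits the permissive reading the comment argues for.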
|
| If this squicks you for some reason--as maybe that format is non-
| obvious with respect to the lack of a need to escape @--give the
| user _two_ boxes with a hardcoded @ between them and have them
| type the two parts separately: pre-parsed input need not ever be
| escaped, as you aren't going to parse it at all; no need to
| implement " dequoting.
|
| All of these escaping rules are then to support embedding this
| identifier into SMTP. The rules for embedding the same identifier
| into MIME are different... and even more complex! In MIME they
| support random stuff like "comments" in the middle of the
| string... is that part of the email address identifier? No.
|
| An email address simply is not defined by the format you use to
| send it as part of an SMTP command, nor is it defined by the
| format you use to send it as part of a MIME message header :/.
| It is an identifier that exists separately from either of those
| two (different) protocols and one would expect any number of ways
| to escape that content.
|
| To demonstrate how ridiculous this all is, imagine someone comes
| up with a JSON protocol for mail submission and then documents
| how email addresses now should use \u encoding and escape
| quotation marks... does that mean users should type that into
| your app? No.
|
| Hell: your email address form is taking an email address and then
| sending it over HTTP... the escaping rules for HTML form fields
| are different still, yet no one is asking users to type HTML-
| escaped strings into other applications, right?
|
| The core thing wrong then with your email validation is that you
| are simply validating the wrong thing: unless you are developing
| an SMTP server, the rules for how to escape and parse _escaped_
| email addresses in RFC5321 are irrelevant; and, likewise, unless
| you are developing a MIME parser, the rules for how to escape and
| parse _escaped_ email addresses in RFC5322 are also irrelevant.
|
| The only thing that matters from either of these specifications
| is the underlying basic rule for what semantically can exist in a
| hostname and a localpart, and RFC5321 is _extremely_ lax: you can
| use any "ASCII graphic or space", and so excludes only ASCII
| control characters and 8-bit characters... and then, as
| mentioned, another RFC removes the 7-bit limitation and opens up
| the world of Unicode.
|
| (To push on it even further: it isn't even clear to me that one
| should consider the ASCII control character limitation to be
| fundamental to the email address identifier or a weird limitation
| of the current version of SMTP; and since none of those email
| addresses are going to _work_ , I think one may as well just
| consider the local part to be any string of Unicode code points.)
|
| Think about this: it is up to your SMTP library to correctly
| escape the email address you give it for SMTP, and here's the fun
| part: if you give it a _pre-escaped_ email address, then clearly
| it is going to have to _double escape it_ , right? So,
| semantically, these extended discussions of quoted strings and
| character limitations are always just so ridiculous :/... you
| absolutely _should not_ be dealing in SMTP-escaped addresses or
| asking your user to understand SMTP (and the same goes for MIME).
|
| (BTW, if you want some "real hell", one of these two protocols--I
| forgot which... I presume SMTP--seriously supports an _empty_
| local part. If that doesn't tell you everything you need to know
| about how un-opinionated these RFCs are with respect to "anything
| goes" then I don't know what will ;P.)
| grouphugs wrote:
| it's 2021, i am too poor to make a yahoo email account
| [deleted]
| BenjiWiebe wrote:
| Why not accept absolutely anything in the email address field,
| and just require an emailed link to be clicked before marking the
| email as validated?
| rblatz wrote:
| Because it causes conversion drop off.
| zzo38computer wrote:
| I cannot send a message to the email address they provide, but
| not because of anything wrong with the email address itself, but
| because that email address is version 6 internet, and I have
| version 4 internet.
| jacobobryant wrote:
| I just outsource this to Mailgun. User signs up, I send them a
| confirmation email, account doesn't get created till they click
| the link. If the email address is invalid, Mailgun returns an
| error and I show a page that says "We couldn't send an email to
| <address>. If you're sure that's a valid address, please try
| again." (Also use recaptcha for bot detection).
| biztos wrote:
| This is a great run-down of the trouble with e-mail addresses.
|
| I worked in e-mail security for quite a while. "Write an e-mail
| address parser" was my go-to technical interview question.
|
| It was pretty easy to see if the candidate had ever given any
| real thought to e-mail (most had not); and you could also pick up
| a lot of signals about engineering style, for instance if they
| started with a regex (fewer did than I expected). And it was
| trivial to adjust the difficulty: if someone thought the question
| was easy and had a fast solution, you could just throw them a
| test-case like the ones in this article.
|
| (Note: the actual title is "Your E-Mail Validation Logic is
| Wrong" -- and it's only about addresses, the author isn't
| implying that e-mail systems can't validate messages nor for that
| matter addresses.)
| duxup wrote:
| I'd sort of raise the issue that "Writing an email parser from
| scratch is a bad idea due to the sheer complexity involved. If
| you're looking for serious email address validation there may
| be better options out there that have dealt with this
| complexity rather than start from the ground up."
|
| Not to say I wouldn't try just for the sake of working through
| it as an example / 'where would you start' discussion.
|
| But if we're pretending this is a real world task I'd probably
| discuss how this is an endless / possibly ultimately futile
| time sink and there might be better options than starting at
| point A ;)
| LambdaComplex wrote:
| What if the answer they gave was "This is a very hard problem
| that honestly isn't worth solving, just check against a . _@._
| regex and call it a day? "
| CydeWeys wrote:
| "Can you write me a parser that has a <1% false negative and
| <1% false positive rate on real email addresses?"
|
| A similar enough issue happens in coding interviews anyway.
| Sometimes the interviewee is aware of a library that
| essentially solves the problem for you. In those cases I give
| them some credit for knowing of it and then ask them to
| implement it anyway, as if the library didn't exist (because
| there are a large number of problems out there for which a
| solution doesn't yet exist, and when hiring a SWE you need to
| find someone who can write new solutions from scratch for
| those situations; whether a given toy interview problem is
| such a situation doesn't matter for the purpose of evaluating
| said skills).
| biztos wrote:
| I would usually structure the question a bit, give a couple
| test cases with different formats and ask something like
| "write a class..." if in Python, etc. I wasn't trying to
| trap anyone who might actually think /\w+@\w+/ covers the
| range of all possible addresses.
|
| Digression: I do miss the days when you could assume a
| candidate for a position at Aquatic Widgets Incorporated
| would know something about water, or about widgets, or at
| least would have looked up what an aquatic widget is before
| bothering to come in for an interview, but those days have
| long since departed the realm of Software Engineering as
| far as I can tell. Which may be a good thing from the
| engineers' point of view, I'm not sure.
| biztos wrote:
| That would demonstrate an understanding of email -- it _is_ a
| very hard problem -- but probably also an unpleasant attitude
| you might not want in a co-worker. Whether the problem is
| worth solving is very often not your call as an engineer.
| harg wrote:
| > Whether the problem is worth solving is very often not
| your call as an engineer.
|
| IMO well functioning teams do consider the thoughts of
| their technical members when deciding which problems to
| solve.
|
| The person "making the call" on whether having perfect
| email validation is worth solving may not have an
| appreciation of how difficult it actually is, so having a
| discussion with engineers on how much work/time it would
| take should play a big part in prioritising it.
|
| Additionally, things like validating email on signup are
| mostly solved (albeit imperfectly) so one can and should
| use existing implementations and focus on building their
| product.
| serial_dev wrote:
| Yes, and it's a technical question. You wouldn't let
| business people decide which database to use, how to
| store data in a database, how to send data from backend
| to frontend, etc... those questions should be up to the
| technical team to decide.
|
| Password strength requirements and email validation are
| just like the database examples, and if a company doesn't
| let these technical questions be answered by the
| technical people, that's a bad sign.
| duxup wrote:
| >Whether the problem is worth solving is very often not
| your call as an engineer.
|
| True, but as an engineer you do need to provide accurate
| feedback regarding "Hey, this is gonna work much of the
| time but email is hard, this is a complex problem. If we do
| this from scratch we're going to miss a lot of things
| potentially".
| jjk166 wrote:
| No offense but if an organization does not listen to
| engineering in determining how to deal with a technical
| problem, that is an enormous red flag.
|
| While maybe the engineer won't actually make the call, the
| engineer should inform management's understanding of the
| costs of the approach and the efficacy of alternatives, and
| management should go along with that recommendation unless
| they have a good reason not to. Of course tone is
| important, someone saying "fuck no, I ain't doing that"
| likely indeed would be unpleasant to work with, but a
| respectful "I would recommend against doing that" is the
| sign of a confident and intelligent professional.
| mrunkel wrote:
| So only a@b?
|
| I think you're missing some stuff in your regex.
| AnIdiotOnTheNet wrote:
| Eh, that's why you use a validation email. Only bother with
| 'validation' at all to catch something obviously wrong.
| [deleted]
| gvx wrote:
| HN markup strikes again (OP wrote .X@.X, where X is an
| actual asterisk, which HN renders as .<i>@.</i>)!
| eli wrote:
| No the problem is developers confusing validation with
| verification. You can't validate your way to a correct address
| and it's wrong to try.
|
| If your goal is to catch typos you're better off with very lax
| validation plus a library that suggests corrections like
| "gmail.com" for "gnail.com" (both of which are of course
| technically valid domains)
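To illustrate the "suggest corrections" idea, here is a rough sketch using Python's standard `difflib` rather than any particular typo-correction library. The `COMMON_DOMAINS` list, the `suggest_domain` name and the 0.8 cutoff are all assumptions chosen for the example.

```python
import difflib
from typing import Optional

# Illustrative list only; a real one would come from your own signup data.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_domain(address: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a corrected address to offer the user, or None if nothing is close."""
    local, at, domain = address.rpartition("@")
    if not at:
        return None
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None  # already a known domain, nothing to suggest
    matches = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None


print(suggest_domain("jane@gnail.com"))  # jane@gmail.com
```

The result is only a suggestion to show the user ("did you mean ...?"); since the typed domain may be perfectly real, it should never be rewritten silently or rejected.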
| ericcholis wrote:
| In addition to a very simple regex, you can do some light
| verification on DNS and SMTP
|
| - nslookup -type=mx email.com
|
| - _pick the highest priority MX server_
|
| - telnet mx1.email.com 25
|
| - _validate SMTP handshake_
|
| - _Start a connection:_ EHLO email.com
|
| - mail from:<sender@youremail.com>
|
| - rcpt to:<recipient@email.com>
|
| Obviously, this might be outside the capabilities of some hosts
| or users. There's a bunch of services that expose this workflow
| for you as an api. (https://trumail.io/, for example)
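The manual telnet steps above can be sketched with Python's standard `smtplib`. The MX lookup is not included because the standard library has no MX resolver, so `mx_host` is assumed to be known already; `rcpt_probe`, `helo_name` and the `smtp_factory` parameter (there so the function can be exercised without a network) are hypothetical names.

```python
import smtplib
from typing import Callable


def rcpt_probe(mx_host: str, sender: str, recipient: str,
               helo_name: str = "client.example",
               smtp_factory: Callable[..., smtplib.SMTP] = smtplib.SMTP,
               timeout: float = 10.0) -> int:
    """Return the SMTP reply code the server gives to RCPT TO."""
    # Connection errors (OSError and friends) are deliberately not caught:
    # an unreachable server is not the same thing as a rejected mailbox.
    conn = smtp_factory(mx_host, 25, timeout=timeout)
    try:
        conn.ehlo(helo_name)
        conn.mail(sender)
        code, _message = conn.rcpt(recipient)
        return code
    finally:
        try:
            conn.quit()
        except smtplib.SMTPException:
            pass  # the server may already have dropped the connection
```

A 250 only means the server did not refuse the recipient at that moment. A 4xx reply or a failed connection is better treated as "unknown" than as "invalid".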
| teh_klev wrote:
| See point 10 in the article:
|
| _"The domain name does not need to resolve"_
|
| Also the mail server may be temporarily offline or unreachable.
| jusssi wrote:
| Non-resolving or offline mail servers count to your bounce
| rate if you have a 3rd party service handling your outgoing
| mail. So for that purpose, it is an invalid address in the
| sense that you should avoid sending anything to it.
| teh_klev wrote:
| Those are rules made up for the convenience of marketers
| and have nothing to do with the technical aspects of mail
| delivery as defined in the RFCs.
|
| Edit just to clarify:
|
| KPI's such as bounce rates etc aren't a function of how
| mail is delivered (RFC5321). These are KPI's collected and
| collated by non-SMTP applications sitting on top of SMTP
| infrastructure monitoring bounces.
|
| Nowhere in RFC5321 does it mention that a mail server
| should or must not deliver mail in respect of bounce
| rates. These are operator defined metrics outside of the
| scope of RFC5321, that may be aided by additional software
| or services such as spam detection.
| johncolanduoni wrote:
| On the contrary, those kind of rules are made up to try
| and keep marketers in check to some degree. Why would a
| marketer want to get dinged for sending an email to a
| nonresponsive domain?
| teh_klev wrote:
| You're going to need to quote the RFC(s) that
| specifically mention bounce tracking to keep marketers in
| check.
|
| My original reply arose because there are times when a
| receiving domain or destination email address can be
| _"temporarily"_ unavailable. I pointed this out to
| demonstrate that services that pre-validate recipient
| addresses upon submission of a form don't take into
| account transient outages due to any number of valid
| factors.
|
| SMTP was designed with this in mind, i.e. try to re-
| deliver up to some acceptable threshold and then at some
| point give up (the hard bounce which is the thing that
| should cause the "ding", especially if they keep retrying
| beyond "soft bounces").
| johncolanduoni wrote:
| You're going to need to show me the RFC(s) that
| specifically mention bounce tracking is for the
| convenience of marketers. Or maybe give up on every
| practical aspect of a technology defined in RFC(s) being
| covered by those RFC(s). SMTP seems a particularly bad
| example if you expect to be able to write a useful
| program using only the RFC(s), since every MTA has a
| whole host of workarounds for non-spec behavior.
| teh_klev wrote:
| > You're going to need to show me the RFC(s) that
| specifically mention bounce tracking is for the
| convenience of marketers.
|
| Perhaps re-read "jusssi"'s comment then mine. I didn't
| assert that bounce tracking was for the convenience of
| marketers, or suggest it was mentioned in any way in the
| RFC's, _they_ implicitly did and I wanted to point out
| the error in their understanding.
|
| > SMTP seems a particularly bad example if you...etc
|
| But the central theme of this whole HN discussion thread
| is about SMTP.
|
| If you're interested, section 6 of RFC5321[0] is where
| bounce messages are mentioned (just three times in the
| whole RFC - bouncing, bounced and bounce) with no
| reference to marketers. See also 6.1:
|
| _Some delivery failures after the message is accepted by
| SMTP will be unavoidable. For example, it may be
| impossible for the receiving SMTP server to validate all
| the delivery addresses in RCPT command(s) due to a "soft"
| domain system error, because the target is a mailing list
| (see earlier discussion of RCPT), or because the server
| is acting as a relay and has no immediate access to the
| delivering system._
|
| Which brings us back to my original comment, far above,
| that services that check once if an email address is
| "valid" using trumail.io or whatever upon form
| filling are flawed solutions.
|
| [0]: https://datatracker.ietf.org/doc/html/rfc5321
| icedchai wrote:
| Ok, so the article is wrong. For an email to be valid right
| now, yes, the domain part _has to resolve._ If you're
| accepting any email address that might be an email in the
| future, then they are correct, but for 99.9% of use cases:
| yes, the domain has to resolve.
| [deleted]
| jeffbee wrote:
| Close but you also need to fall back to AAAA or A lookup of the
| domain when the MX record doesn't exist. Also, do you really
| want transient unavailability to stop your signup flow? The
| whole point of the way mailers are written is the mail gets
| delivered even in the face of transient unavailability.
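The fallback described above (use the MX records if there are any, otherwise treat the domain itself as the mail host if it has an A or AAAA record) is small enough to sketch. The DNS queries are passed in as functions because Python's standard library has no MX lookup; `delivery_hosts`, `lookup_mx` and `has_address` are hypothetical names.

```python
from typing import Callable, List, Tuple


def delivery_hosts(domain: str,
                   lookup_mx: Callable[[str], List[Tuple[int, str]]],
                   has_address: Callable[[str], bool]) -> List[str]:
    """Hosts to try for `domain`, in order of preference.

    `lookup_mx` returns (preference, host) pairs; `has_address` reports
    whether the name has an A or AAAA record.
    """
    mx_records = lookup_mx(domain)
    if mx_records:
        # A lower preference number means higher priority.
        return [host for _preference, host in sorted(mx_records)]
    if has_address(domain):
        return [domain]  # no MX at all: the domain acts as its own mail host
    return []
```

An empty result says only that nothing resolved right now, so, as the comment points out, it is a weak reason to block a signup.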
| wyldfire wrote:
| > The local part is case-sensitive.
|
| This seems more like a bug than a feature. Maybe in 1983 the
| average email user knew what DNS was and could be expected to
| know one part of the email address would be case sensitive and
| the other not.
|
| But email RFCs are probably like any other RFCs out there and
| specify existing behavior for the sake of interoperability.
| deckard1 wrote:
| yeah, and that's a hill I'm willing to die on.
|
| Imagine the average non-technical person talking to some
| customer service agent on the phone and having to figure out if
| her email is JANEWATSON@gmail.com, JaneWatson@gmail.com,
| janewatson@gmail.com, or Janewatson@gmail.com. Could you
| imagine the horror and complete security nightmare of multiple
| people running around using the _same_ gmail address with
| different case? Those Jane email addresses above would be _four
| different people_. We'd be receiving mail intended for other
| people all day long.
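The practical consequence of the case question can be shown with a small sketch of a lookup key for accounts. Folding the local part is a policy choice made by the application here, not something the RFC grants; `account_key` and `fold_local_case` are made-up names.

```python
def account_key(address: str, fold_local_case: bool = True) -> str:
    """Key for detecting 'the same' address when looking up accounts."""
    local, at, domain = address.strip().rpartition("@")
    if not at:
        raise ValueError("address contains no @")
    if fold_local_case:
        local = local.casefold()  # application policy, see note above
    # Domain names are case-insensitive, so lowering the domain is always safe.
    return f"{local}@{domain.lower()}"


janes = ["JANEWATSON@gmail.com", "JaneWatson@gmail.com",
         "janewatson@gmail.com", "Janewatson@gmail.com"]
print(len({account_key(a) for a in janes}))                         # 1
print(len({account_key(a, fold_local_case=False) for a in janes}))  # 4
```

It seems safest to use such a key only for lookups and to keep sending mail to the address exactly as the user entered it.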
| aeharding wrote:
| I just want my .dev email address to not be rejected.
| permo-w wrote:
| I think the author is stretching the words "valid" and "invalid"
| past their limits here for the sake of hooking you into the
| article. Yeah, in some countries it's a valid social practice to
| spit on the floor in public, but in most, it's not.
|
| let's say I'm a dev at google, and I'm writing some aspect of
| gmail. Is !"PS$%@gmail.com a valid email? No. So the word valid
| is clearly not being used correctly here.
|
| At the core of smtp these emails are allowed, but in practice
| they almost never are, and so the opposite is true. All the cases
| he described are, in practice, invalid.
| duckfang wrote:
| Email validation :
|
| Accept email from user.
|
| Send email to that address with a link to verify.
|
| Go/no-go test if link is clicked.
|
| (If you're doing some fever garbage or otherwise trying to parse
| it, you're doing it wrong.)
| swiley wrote:
| This doesn't work with hotmail where sometimes a robot will
| click the link but refuse to deliver the mail.
| JoyfulPanda wrote:
| There is an awesome talk about E-Mail by Ricardo Signes:
|
| https://www.youtube.com/watch?v=JENdgiAPD6c
|
| The first 5 minutes are perl specific, but the rest is email and
| just hilarious.
| sneak wrote:
| This is a bunch of weird edge cases that nobody uses in real life
| except maybe the plus trick.
|
| American Express and Walgreens don't let you set a
| whatever@whatever.email address because they check for a TLD
| known at the time of their app's validation code, or something.
| novok wrote:
| I use an amex@whatever.com & walgreens@whatever.com email for
| both amex and walgreens?
|
| I've run into 2 old-ish institutions that didn't quite work
| with my whatever@whatever.com and had to modify it slightly for
| them.
| sneak wrote:
| .email is a new-ish TLD.
| csours wrote:
| This feels like a discussion for backend implementations/email
| forwarders, not for email signups... but hey while this has some
| attention - For god's sake, put a button that says "This ain't
| me", at least for important stuff.
|
| I'm sorry, but I just can't bring in Clyde's truck for the oil
| change, cause Clyde ain't me!
|
| I also cannot attend Cassidy's parent teacher conference,
| apologies, I am not in Ohio.
| cratermoon wrote:
| >This feels like a discussion for backend implementations/email
| forwarders, not for email signups...
|
| And yet I've worked multiple places where product people asked
| for "simple email validation" on user signup. If they insist, I
| ask them to provide some actual test cases that they care
| about. Sometimes the product folks can be convinced to drop the
| validation requirement if they can be shown that anyone who
| can't sign up because their email address doesn't validate will
| simply move on and not sign up.
|
| In the case where your product is B2B and all the employees of
| your customers are users (say an HR product), then the first
| time a VIP at an important customer complains, that's usually
| enough to convince your stakeholders to disable the email
| validation.
| cphoover wrote:
| If your validation function works for 99.99% of your users' email
| addresses and it's a big unnecessary lift to get that other .01%
| your logic is not wrong.
| LinAGKar wrote:
| Just get rid of that pointless filtering altogether.
| tgv wrote:
| I don't think I've seen a bang path since 1990. The claim "Your
| E-Mail Validation Logic is Wrong" is just pedantry.
| icedchai wrote:
| A little later for me. I last used bang paths in 1994, when I
| had a UUCP feed.
| strken wrote:
| What you've seen since the 80s ended is unfortunately only a
| subset of all the horrible edge cases your users will run
| into.
| ok123456 wrote:
| What sendmail rules would you even use in 2021 to process a
| bang path? Just deny.
| mfbx9da4 wrote:
| I once had a crack at building a sensible email validation
| library.
|
| * Validate the string contains "@" and a "." to the right of it.
|
| * Validate common typos
|
| * Validate disposable emails
|
| * Validate MX records
|
| * Validate SMTP server and mailbox
|
| https://github.com/mfbx9da4/deep-email-validator
|
| I don't have the time to keep it maintained but it works for the
| most part!
| quercusa wrote:
| These days, I think 80% of email validation is just catching
| 'gmial.com'
| Yaina wrote:
| What this article really showed me is that this RFC is actually
| pretty harmful.
|
| Supporting all of the rules outlined in the spec is probably a
| huge burden for maintainers of mail clients and servers.
| Obviously some parts of the spec are going to be omitted. It's
| hard to blame them for it, but the same person that rightfully
| skipped over implementing the routing thingy might've also
| wrongfully assumed there won't be a Japanese character in the
| address. And that's what's so bad.
|
| You might introduce more issues in your system, by taking the
| full spec into consideration for your validation, instead of
| using the whatwg regex someone posted here.
| nradov wrote:
| Well if there are problems with the RFC then you should work
| with the IETF to correct those. They have an open standards
| development process.
| awestroke wrote:
| Another option is to just ignore the RFC
| forgetfulness wrote:
| That does mean that there will only be an ad-hoc
| undocumented standard for email addresses, rather than one
| that's serviceable.
|
| Web application validation forms add a different layer to
| the standard and are sort of hard to tame; anyone can push
| together a few lines of PHP or Javascript code and conjure
| their own email address standard out of thin air.
| gifnamething wrote:
| Will be? There _is_ an ad-hoc standard.
|
| If the standard fails to be used, the standard is
| defective.
| numpad0 wrote:
| Isn't it just not very nice to ignore a Request for
| Comments
| buro9 wrote:
| I have a tld that was recently created (2014) and I still cannot
| use it in an email address reliably.
|
| The domain in question being david.kitchen, so an email may be
| email@david.kitchen
|
| The issue I encounter more than any other is trivial: Most sites
| still have a tld validation that only accepts domains that end in
| net|com|org and some other small list of accepted suffixes such
| as co.uk
|
| The list of TLDs is constantly expanding
| https://newgtlds.icann.org/en/program-status/sunrise-claims-...
| so even `[a-z0-9.-]+@[a-z0-9.-]+\\.[a-z0-9]+` would be better
| than what I see in the wild.
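|
| To make the difference concrete, here is a small sketch comparing
| a hard-coded suffix list with the looser pattern above. The STALE
| pattern is an invented stand-in for what such sites do, not taken
| from any real one, and both patterns are anchored here so they
| must match the whole string.
|

```python
import re

# Invented example of a hard-coded suffix list (an assumption, for illustration).
STALE = re.compile(r"^[a-z0-9.-]+@[a-z0-9.-]+\.(com|net|org|co\.uk)$", re.IGNORECASE)

# The looser pattern from the comment, anchored at both ends.
PERMISSIVE = re.compile(r"^[a-z0-9.-]+@[a-z0-9.-]+\.[a-z0-9]+$", re.IGNORECASE)

for address in ("email@david.kitchen", "someone@example.com"):
    # Print whether each pattern accepts the address.
    print(address, bool(STALE.match(address)), bool(PERMISSIVE.match(address)))
```

| The looser pattern accepts the .kitchen address, but it still
| rejects a "+" in the local part and anything non-ASCII, so it is
| only better than a suffix list, not a recommendation.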
| toyg wrote:
| yeah, I routinely use .email and .cloud and it's so annoying
| when the occasional site goes "THAT IS NOT AN EMAIL ADDRESS
| !!111!1 YOU HAXXXOR!".
| sparrc wrote:
| I have the same issue. I use sparr.email and it fails
| validation on a few critical websites, namely my online
| utilities account (seattle public utilities) and payroll
| processor (ADP).
| tmk1108 wrote:
| I managed to buy firstname.dev a while ago and this was one of
| my fears of using it as my email address. I ended up switching
| to a .com one just to avoid any issues. I certainly don't want
| government services emails not to work just because maybe they
| didn't account for .dev TLD
| vidarh wrote:
| I was involved in setting up .name back in 2001. We spent ages
| contacting people with validation rules based on the old set of
| TLDs. Given that was the first expansion of the gTLD space in a
| long time, it wasn't so unreasonable _then_. But it's just
| astounding that it's still an issue 20 years later.
| duped wrote:
| sometimes you get regressions too. Kaiser Permanente
| invalidated my email address earlier this year.
| arkitaip wrote:
| Someone found a sleeper regexp on Stack Overflow...
| znpy wrote:
| and the regex you provide doesn't even account for unicode..
| flerchin wrote:
| The one I see over and over is failing to trim the email before
| doing validation. This is especially egregious at account
| creation where you want no friction. Users enter their email with
| their smartphone, and it may append a space at the end. More than
| once, I've had a relative call me trying to figure out why
| $website wouldn't accept their email as valid.
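|
| A minimal sketch of trimming before validating. The function
| names are made up, and the check itself is deliberately crude;
| the only point is that strip() runs first.
|

```python
def normalize_input(raw: str) -> str:
    """Strip surrounding whitespace (spaces, tabs, newlines) before validating."""
    return raw.strip()


def looks_like_email(raw: str) -> bool:
    """Crude check applied to the trimmed value, never to the raw field."""
    address = normalize_input(raw)
    local, at, domain = address.rpartition("@")
    has_parts = bool(at and local and domain)
    # Whitespace inside the address is still treated as a mistake.
    return has_parts and not any(ch.isspace() for ch in address)
```

| The trimmed value is also what should be stored, so that later
| lookups at login compare against the same string.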
| schwinn140 wrote:
| Link is just hanging.
|
| Also, anyone notice the OP posts the same couple of posts
| constantly?
| hutrdvnj wrote:
| There is no point in many cases. Even if you can verify that the
| email address is syntactically valid, you'll still need to check
| that it was not mistyped, and that it actually goes to the person
| you think it does. The only way to do that is to send them an
| email and have them click a link to verify.
|
| However, if you still want to validate an email address then use
| a library. All popular programming languages have email
| validation libraries. Yes, it's an extra dependency if it's not
| included in the std lib or the framework you use, but email
| validation is wrong in 99% of the cases, if you wrote it
| yourself.
| yawaramin wrote:
| Or use the browser. HTML form validation has <input
| type="email"> which checks that the entry is a valid email
| address.
| u801e wrote:
| I provide my email address with the +companyname suffix on the
| local part as a way to filter my email into various folders based
| on the To header contents.
|
| Unfortunately, many websites are configured to reject email
| addresses that contain a plus character. I've also encountered
| websites in the past that did accept the + character when
| creating the account where the email address serves as the user
| name, but then could not log in because their log in form
| rejected the + character in the user name.
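|
| For anyone unsure what the suffix looks like to code, a small
| sketch of splitting it out. The helper name is hypothetical, and
| the separator is a parameter because the character used is a
| convention of the receiving mail server, not something a website
| can know.
|

```python
from typing import Optional, Tuple


def split_subaddress(address: str, separator: str = "+") -> Tuple[str, Optional[str], str]:
    """Return (user, tag, domain); tag is None when no separator is present."""
    local, _, domain = address.rpartition("@")   # split on the final "@"
    user, sep, tag = local.partition(separator)  # split on the first separator
    return user, (tag if sep else None), domain


print(split_subaddress("username+companyname@example.com"))
# ('username', 'companyname', 'example.com')
```

| Whatever a site decides about "+", the signup form and the login
| form need to share one validator so they cannot disagree.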
| theshrike79 wrote:
| Fastmail allows for companyname@youraccount.fastmail.com -style
| addresses. Even for your own domains.
|
| Much more reliable than the + -thing, which breaks in the
| weirdest of places.
| aidenn0 wrote:
| I've been using fastmail for years and didn't know that.
| Thanks!
| theandrewbailey wrote:
| I use Fastmail with my own domain name and unlimited email
| inboxes, so I use companyname@mydomain.com to sort incoming
| mail.
| jetpackjoe wrote:
| I do the same thing and believe it or not I've seen websites
| reject emails with their own name in the email.
| hateful wrote:
| I had one do that. When I give the address in person I get
| "do you work here?"
|
| I had to switch my hosting provider at one point because
| they stopped supporting catch-all. I have no idea how many
| "addresses" I've used, since I don't create a specific
| email for each, so I had to get new hosting (note: this was
| over 10 years ago)
| Semaphor wrote:
| I recently got a letter from a company's law department
| and had to explain the whole thing :D
| rpadovani wrote:
| I use a catch-all to have a <website>@<mydomain>.com login for
| every website.
|
| Samsung doesn't accept emails with "samsung" as prefix, so I
| have samsun@mydomain.com for them. I have no idea what the
| logic behind it is.
| SAI_Peregrinus wrote:
| I got sick of companies rejecting email with "+", and bought a
| domain to use for email (among other reasons). Now I've got a
| wildcard entry in DNS, so any valid local part gets routed to
| my inbox. So instead of "username+company@example.com" I can do
| "company@example.com".
| axaxs wrote:
| Can you explain the DNS part? AFAIK the sender just looks for
| MX on the domain itself, regardless of local part.
| toomanybeersies wrote:
| The actual address in the email header should still contain
| the subdomain though.
| hug wrote:
| The address "company@example.com" doesn't point to a
| subdomain, though, the only reference to the company is
| the local part of the address, and so has nothing to do
| with DNS.
|
| If he said he used "joe@company.example.com", then it's
| possible he has a wildcard MX record for *.example.com,
| but that's not at all what he said, although perhaps it's
| what he meant.
|
| Regardless, the question remains unanswered.
| ElFitz wrote:
| I ended up giving up on that after one too many websites
| rejecting my custom domain (which I'm the only one using) on
| signup. These lazy / ignorant colleagues are _annoying_ -_-'
| bcrosby95 wrote:
| I've been using a similar scheme for about 7 years now and
| have never had my email rejected by a website on signup.
| 90minuteAPI wrote:
| The American Kennel Club rejected mine because the domain
| was "too similar" to their name. I guess just because it
| had a "kc" in it? Completely bewildering.
| psutor wrote:
| I use this scheme (company@mydomain.com) and one that I
| remember blocking for this reason is Aliexpress/Alibaba -
| aliexpress@mydomain.com was rejected so I use
| ali@mydomain.com.
|
| No idea what sort of security this is supposed to
| provide.
| ElFitz wrote:
| It happens rarely, but some only accept a very limited
| number of domains (ie Gmail, Outlook, etc).
|
| They probably see it as some sort of security / anti-spam
| mechanism.
| toomanybeersies wrote:
| I use a .xyz domain for my personal email, and I sort of
| regret it.
|
| My emails have a tendency to become spam filter bycatch, to
| the point that when I was job hunting last year I'd have to
| ring people after I sent them my resumes etc. to confirm
| they actually received my email.
|
| And when I give people my email address, I usually have to
| assure them that steve@stevetech.xyz is a legitimate email
| address and not a joke (it's not actually steve, but you
| get the point).
| psutor wrote:
| I host my own email server, and .xyz is one of the 2 or 3
| TLDs I went in the config files and manually blocked
| since nothing but spam comes from it (and lots of it).
|
| Definitely would not recommend using it for your personal
| address.
| richardwhiuk wrote:
| That causes weird behaviour in places, where they assume the
| bit before the @ is a "username".
| 8ytecoder wrote:
| I've been using my own domain with wildcard emails for many
| years now. I'm yet to encounter a single scenario of
| inferred names.
| choward wrote:
| I've been using this strategy for years and have not
| encountered that issue before. That would mean the part
| before the @ would have to be unique across all domains.
| That doesn't make any sense. You couldn't have
| webmaster@domain1.com and webmaster@domain2.com registered
| for example.
| CydeWeys wrote:
| Or ben@gmail.com and ben@hotmail.com couldn't both be
| registered. This scheme is so obviously flawed I can't
| imagine it's widely implemented.
| MivLives wrote:
| What provider do you use for email? That does sound nice.
| btmiller wrote:
| My time to shine! https://btmiller.com/2019/12/12/regain-
| control-over-your-inb...
| xyst wrote:
| The paid version of gmail (google workspace/gsuite) offers
| this as well (they call it "aliases"). I haven't explored
| the option myself, but I do recall seeing something like
| this in the admin panel. Whether they charge for it or not
| is probably something I should look into.
|
| At some point, I need to migrate away from google and build
| out my own personal mail server.
| Angostura wrote:
| In the UK, my domain name provider offers free e-mail
| forwarding for (I think) 10 specific e-mail addresses, plus a
| catch-all forwarder for anything else. Works quite well.
| gpm wrote:
| I use migadu for this.
|
| I also use greg-*@domain instead of *@domain, since their
| docs claim that setting up *@domain tends to attract more
| spam.
| nullify88 wrote:
| Another Migadu user here slowly degoogling myself. $19 a
| year is a bargain for my usage and the features I get.
| bluehatbrit wrote:
| Also a migadu user, I'm a huge fan and can't speak highly
| enough of them. Their pricing model is a perfect fit for
| me and their support address is really quick to respond.
| _rs wrote:
| Huh, how did I not ever hear/find out about this when I
| was choosing a provider... I think this is the first time
| I've seen them mentioned on HN, despite searching through
| quite a few de-googling threads. Will definitely take a
| closer look!
| fooey wrote:
| https://forwardemail.net/ is fantastic if all you want is
| to forward domains somewhere else.
|
| It's a freemium model, but I've never needed anything in
| the paid tier
| fk33 wrote:
| mailbox.org also provides the functionality to use your own
| domain and have a wildcard entry, where all emails go
| into your inbox.
| sammorrowdrums wrote:
| I use ProtonMail and sign up to everything with
| <service>@<custom-domain> so I can track what they do with
| my email.
|
| It's not cheap from PM, and there are loads of hosting
| providers that will provide catch-all email for free with
| your hosting package (but with some usually pretty poor
| webmail client) or if you use a mail client it should work
| too.
|
| I like having good webmail and mail app and other things so
| I pay, but there are plenty of good options available.
| Sadly self-hosting email server is not really an option for
| a variety of reasons, but you should easily be able to use
| catch-all e-mail addresses.
| 8ytecoder wrote:
| Fastmail supports it. The best part about fastmail is that
| you can reply from the same address you got the email for.
| This is useful in customer service scenarios that identify
| your account based on email address.
| tstrimple wrote:
| I've tried something similar with Fastmail, and it works
| out well for the most part. I have run into more than a
| couple services which won't accept email addresses not on a
| whitelisted domain for some reason and I had to use an
| @gmail.com address which forwards to my domain.
| tmk1108 wrote:
| Out of curiosity, are those popular services? I'm in
| the process of setting up email on my own domain and it would
| suck having to fallback to Gmail if some service uses an
| accepted list of domains.
| bluGill wrote:
| fastmail is reasonably popular. Gmail is bigger, but
| fastmail is big enough that they cannot be ignored,
| unlike when I ran my own personal server and often found
| myself in blacklists without any knowable way to get off.
| SAI_Peregrinus wrote:
| I'm on fastmail.
| ryandrake wrote:
| I just set up my mail server to use - rather than +, and
| don't encounter this problem.
| fullstop wrote:
| Ages ago, back in myspace days, their system would permit +
| when creating an account, but could not handle this in their
| forgot password / password reset system. I never was able to
| delete my account because of this.
| torstenvl wrote:
| All social media accounts are deletable on a long enough
| timescale.
| fullstop wrote:
| As it happens, it was eventually done for me:
|
| https://mashable.com/article/myspace-data-loss/
| inopinatus wrote:
| Sony's SEN used to have an account creation page that would
| permit +, but subsequent sign-in interpreted it as a URL-
| encoded whitespace. No login for you
| ezekg wrote:
| lol you should have tried to enter a URL-encoded plus sign,
| %2B.
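|
| The mechanics, sketched with Python's standard urllib.parse: in
| form-style decoding a bare "+" means a space, so an address has
| to be percent-encoded before it goes into a query string or form
| body. The address is just an example value.
|

```python
from urllib.parse import quote, unquote_plus, urlencode

address = "user+tag@example.com"

# Decoding a raw "+" as form data turns it into a space.
mangled = unquote_plus(address)
print(mangled)                        # user tag@example.com

# Percent-encoding first makes the round trip lossless.
encoded = quote(address, safe="")
print(encoded)                        # user%2Btag%40example.com
print(unquote_plus(encoded))          # user+tag@example.com

# urlencode() applies the same escaping when building a query string.
print(urlencode({"email": address}))  # email=user%2Btag%40example.com
```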
| innocenat wrote:
| I find that a lot of websites don't allow + sign precisely
| because of Gmail usage.
| caymanjim wrote:
| I got sick of + not being accepted and switched to using - for
| all my aliases, which works everywhere I've tried. It's
| annoying, but practical (assuming you run your own mail server,
| or have the ability to manage it client-side).
| 8ytecoder wrote:
| Plenty of hosted solutions support wildcard - including
| GSuite and Fastmail.
| vidarh wrote:
| If you use Gmail here's a fallback option: Gmail ignores "." in
| the local part. So foo.bar is the same as f.ooba.r to Gmail.
| Obviously quite limited and more hassle to keep track of.
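|
| From the site's side, a rough sketch of collapsing such variants
| for duplicate detection. The function name is made up, the
| "+suffix" stripping is an extra assumption on top of the dot
| rule, and it is only applied to gmail.com because other providers
| treat dots as significant.
|

```python
def gmail_canonical(address: str) -> str:
    """Collapse dot and +suffix variants of a gmail.com address.

    Addresses on any other domain are returned unchanged.
    """
    local, _, domain = address.rpartition("@")
    domain = domain.lower()
    if domain != "gmail.com":
        return address
    # Drop everything from the first "+", then remove dots.
    local = local.partition("+")[0].replace(".", "").lower()
    return f"{local}@{domain}"


print(gmail_canonical("f.ooba.r@gmail.com"))  # foobar@gmail.com
```

| This is for comparing accounts only; mail should still be sent
| to the address exactly as the user typed it.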
| grey-area wrote:
| This pattern is often abused by spambots trying to avoid dupe
| detection, so using it excessively may lead to your login
| being treated as spam.
| caymanjim wrote:
| One of my primary pet peeves with Gmail. It leads to a lot
| more junk mail arriving in my inbox. My real Gmail address is
| 'first.m.last', and almost all the spam I get is addressed to
| 'firstmlast'. Gmail is great at filtering out spam so that I
| don't see most of it, but if not for their unconventional
| filtering of recipients, I'd get even less. I also get a lot
| of email from idiots who don't know their own address and
| provide mine instead, and literally all of that would bounce
| without their . handling.
| ben509 wrote:
| Same here. I send everything that is firstlast@gmail
| straight to junk.
|
| > I also get a lot of email from idiots who don't know
| their own address
|
| Holy crap there are a lot of them. I've got one bank
| sending me the dude's statements. He's also been on some
| interesting trips, seen all his hotel stays, etc.
| brewdad wrote:
| Same. I don't have a very common name but there are at
| least two other people who share it. One has used my
| GMail address to apply for jobs and for his unemployment
| benefits. I'm guessing he isn't having much luck with
| either one.
|
| The other finally figured it out but his wife still
| hasn't after more than a decade. It gets really old
| receiving reminders to service a vehicle I've never owned
| from a dealership 2000 miles away among other similar
| crap.
| judge2020 wrote:
| I thought it would be nice to have my name without numbers
| as my gmail, but with all the stories I've heard, I think
| I'm glad I have the numbers now.
| ThalesX wrote:
| I used this wonderful trick to sign up for my government issued
| eID (it was something else but works for explaining). What they
| decided to do is to simply remove the + and not let me know
| about it.
|
| my_email+service@foo.bar thus became my_emailservice@foo.bar
|
| I tried logging in, resetting passwords, nothing worked. I had
| to go to the authorities and make a written request to allow
| them to interrogate the database by the equivalent of my social
| security number, and that's when we realized they just stripped
| the +.
| jbgreer wrote:
| Ditto, with the same hassles mentioned by you and others, such
| that I'm actively looking at email services that handle this
| sort of thing better using approaches such as mentioned below -
| domain@mydomain style registration addresses.
| agustif wrote:
| You can have unlimited handles with fastmail if you're
| looking for that
| moojd wrote:
| I was unable to provide my email address for a retail rewards
| program last week because the input field for the domain was a
| dropdown in their POS. Not the TLD, the entire part of the
| email after '@'!
| StavrosK wrote:
| Jeez, wow. How many domains were in that box?
| bassdropvroom wrote:
| "There are other emails besides gmail and hotmail? Woah!" -
| the person who thought that was a good idea, probably.
| skhr0680 wrote:
| Until about a decade ago, this was extremely common in Japan.
| RIP mobile email, another victim of smartphones in general
| and the iPhone in particular
| jacobkg wrote:
| Yes this is terrible. On the other hand, if your goal is to
| prevent people from signing up using disposable domains, the
| blacklist approach (which I have tried before) is a never
| ending game of whack a mole.
|
| Sounds like this was in person at store though which is extra
| weird because seems unlikely that scammers would be trying to
| sign up en masse at a physical location (unlike if the form
| is connected to the internet)
| chrismorgan wrote:
| I think the best syntax validation technique for email addresses
| now is found in the HTML spec:
| https://html.spec.whatwg.org/multipage/input.html#valid-e-ma....
| As they say, this is a wilful violation of RFC 5322, because
| that's simultaneously too strict, too vague and too lax to be
| useful. They give a grammar, and the following regular expression
| implementing it:
|
| /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
|
| Remember that the web is a platform that lives and breathes this
| stuff. A lot of thought went into this grammar for valid email
| addresses. This is a good way of filtering out obviously bad
| stuff while allowing all realistic and sane inputs.
|
| One part of all this that I'm _not_ aware of the situation around
| is "8. You can put emojis in the local part." The HTML spec's
| validator is all ASCII. It does remind you to punycode the domain
| labels, but makes no mention of internationalised local parts,
| and I've never learned about non-ASCII local parts or how well
| they're supported. I gather they may require the sender to be
| capable as well as the receiver, whereas internationalised domain
| names were made compatible with all systems via punycode.
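|
| For server-side use, here is a sketch of the same expression in
| Python. The ^ and $ anchors are dropped in favour of fullmatch(),
| since Python's $ would also accept a trailing newline; the
| constant and function names are my own.
|

```python
import re

# The HTML spec's pattern, split across lines for readability.
WHATWG_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def is_html_valid_email(address: str) -> bool:
    """True if the whole string matches the HTML spec's email grammar."""
    return WHATWG_EMAIL.fullmatch(address) is not None


for candidate in ("user+tag@example.com", "user@[127.0.0.1]", '"a b"@example.com'):
    print(candidate, is_html_valid_email(candidate))
```

| Plus signs and new TLDs pass; IP literals, quoted local parts and
| non-ASCII characters do not.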
| yjftsjthsd-h wrote:
| > while allowing all realistic and sane inputs.
|
| Isn't that a way of saying "while disallowing perfectly valid
| options"?
| jcranmer wrote:
| What's disallowed are a) IP literal addresses and b)
| localparts that require quoting. These email addresses are
| highly likely to break many processing steps anyways; I've
| only ever seen category b in sendmail configs (it can be
| useful for internal email rerouting purposes).
|
| There's a distinction to be drawn between the requirements of
| the actual MTA/MUA/MSA layers and user applications built on
| top of them. For the latter, considering emails to be invalid
| if they contain IP literals or quoted localparts is going to
| be more helpful than harmful (there's less scope for
| vulnerabilities in doing so). It's just like assuming email
| addresses are case insensitive: it's inappropriate if you're
| an MTA, but for everybody else, go ahead and assume they are.
| Avamander wrote:
| > What's disallowed are [...]
|
| A-ha, but here you're wrong because you've excluded IDNs.
| This is really why you should not try to be clever.
| jcranmer wrote:
| IDN A-labels would still be accepted. Using the U-label
| is likely to require the same level of support as EAI,
| because without EAI support, non-ASCII strings are likely
| to horribly, horribly screw up the lower levels of the
| stack, and I wouldn't recommend supporting EAI without
| actually testing to make sure your stack can really
| handle EAI. (Not to mention EAI localparts being their
| own can of worms).
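|
| To show what accepting the A-label means in practice, a sketch
| using Python's built-in "idna" codec. Caveat: that codec follows
| the older IDNA 2003 rules, so treat this as an illustration, not
| a complete IDN implementation. The helper names are made up, and
| the local part is deliberately left untouched.
|

```python
def to_a_label(domain: str) -> str:
    """Convert a Unicode domain (U-labels) to its ASCII form (A-labels)."""
    return domain.encode("idna").decode("ascii")


def ascii_domain(address: str) -> str:
    """Rewrite only the domain of an address; the local part is kept as-is."""
    local, _, domain = address.rpartition("@")
    return f"{local}@{to_a_label(domain)}"


print(to_a_label("bücher.example"))         # xn--bcher-kva.example
print(ascii_domain("info@bücher.example"))  # info@xn--bcher-kva.example
```

| After this step the domain is plain ASCII and passes an
| ASCII-only validator; a non-ASCII local part would still need
| EAI support end to end.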
| crazygringo wrote:
| Huh. Interesting this doesn't support international email [1]
| addresses, e.g. квіточка@пошта.укр or
| Dörte@Sörensen.example.com.
|
| Seeing as the web has _long_ supported Unicode, where are
| e-mail addresses currently at in that evolution?
|
| Are full Unicode e-mail addresses something that is decently
| supported today, or still largely theoretical? Is this regex
| sufficient? What kind of e-mail addresses do people in China
| most commonly use, for instance?
|
| [1] https://en.wikipedia.org/wiki/International_email
| Avamander wrote:
| > where are e-mail addresses currently at in that evolution?
|
| Baby shoes because of anglosphere programmers that can't
| fathom people wanting to use their own alphabets and thus
| forget to support it.
| xeromal wrote:
| This is a pretty pessimistic take. The real answer lies
| somewhere between budget and speed. If someone asked me to
| support non-latin alphabet, I'd have no idea where to start
| and the amount of people that would use that feature isn't
| worth the consideration. It's not that I don't fathom it,
| it's that I don't have time for that shit.
| crazygringo wrote:
| You don't need to be so accusatory or ungenerous about it.
|
| Clearly "anglosphere programmers" fathom it every day when
| they use UTF-8 almost universally in webpages. Also, you
| know, things like emoji are pretty popular in the
| "anglosphere" as well.
|
| It's obvious that the real reason is an ancient e-mail RFC,
| and that while upgrading webpages to UTF-8 was relatively
| easy, in that it only needs 2 parties to support it -- the
| browser and the server -- upgrading e-mail is almost
| infinitely more complicated, because you have to wait for
| virtually all email code in the world to be upgraded, since
| an e-mail address is pretty useless if it doesn't work
| everywhere.
|
| In other words, it's a coordination problem. Not an
| ignorance problem.
|
| And unfortunately, Punycode [1] doesn't seem to be a
| particularly viable stepping-stone/compatibility solution
| here. E.g. if a user tries to use domeinMing Li
| @example.com and it fails, asking them to instead type in a
| seemingly-gibberish eckwd4c7cu47r2wf@example.com, where
| that could also conflict with a real e-mail address of that
| name.
|
| [1] https://en.wikipedia.org/wiki/Punycode
| eli wrote:
| How sure are you that the 61 character limit won't change in
| some future DNS improvements? People used to think TLDs would
| only ever be up to 3 characters long.
|
| More importantly, what problem is this even trying to solve?
| Someone accidentally typing a 300 character domain? If they are
| intentionally feeding you gibberish they'll just give you more
| realistic looking gibberish.
| biztos wrote:
| That regular expression fails to validate a bunch of the
| examples from the article. And also single-word addresses,
| which are pretty useful if you want to route email locally.
|
| So what makes it the best?
|
| [Edit: it also assumes you've already parsed out the "real"
| address from the rest of the text field, which to me makes it a
| half-validator at most.]
| crazygringo wrote:
| Yes, that's the explicit point of it.
|
| But it would seem to be the best for general-purpose web use,
| e.g. signing up for a newsletter with an e-mail address
| that's pretty much guaranteed not to break anything.
|
| Instead of being conservative in output, it's intentionally
| being conservative in input.
| judge2020 wrote:
| Hope you're not in PHP, Perl, or Ruby!
|
| http://emailregex.com/
| jcranmer wrote:
| That's a pretty bad page, given that it gives regexes that
| match very different things for different languages, without
| a) explanation what the differences are, and b) any rationale
| for why you may or may not want to choose between the
| different versions, let alone c) why different languages
| "deserve" different versions of the regex.
|
| This is already a field where there is a lot of
| misinformation flying around, and a page that merely
| regurgitates all of that misinformation without the
| perspicacity to realize that its purported information is
| internally incoherent is not helpful.
| mk81 wrote:
| Clicked thinking this might be useful information; was mostly
| disappointed.
| burke wrote:
| I've never seen much point in trying to do better than .+@.+,
| unless you're going to pull out the (gargantuan!) authoritative
| version for some reason.
| kevinmchugh wrote:
| Implementing the authoritative version is a waste since you'll
| also need to keep an up-to-date list of TLDs, and more
| importantly, you might have a typo in the input that gives a
| valid-but-incorrect email.
|
| After doing your simple regex, the best move is to just send a
| verification email and wait for the user to click the link, if
| you really need to be sure.
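|
| A minimal sketch of the link part, assuming an HMAC-signed token
| so nothing has to be stored until the user clicks. The secret
| handling, the example.com URL and all function names are
| placeholders, and a real version would also want an expiry time.
|

```python
import hashlib
import hmac
import secrets
from urllib.parse import urlencode

# Assumption: in practice this key is loaded from server-side config
# so that it stays the same across restarts.
SECRET_KEY = secrets.token_bytes(32)


def make_token(address: str) -> str:
    """Sign the address so the link cannot be forged for another one."""
    return hmac.new(SECRET_KEY, address.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_token(address: str, token: str) -> bool:
    """Constant-time comparison of the expected and supplied tokens."""
    expected = make_token(address).encode("ascii")
    return hmac.compare_digest(expected, token.encode("utf-8"))


def confirmation_link(address: str) -> str:
    """Build the URL to email out; the host and path are hypothetical."""
    query = urlencode({"email": address, "token": make_token(address)})
    return f"https://example.com/confirm?{query}"
```

| The account only becomes active once verify_token() succeeds for
| the address and token taken from the clicked link.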
| kristaps wrote:
| Yep, these arcane rules are maybe relevant to the 5 or so
| people writing mailservers, but not to web developers.
| wffurr wrote:
| Yeah, do simple validation, and then just send an email. Even a
| validated email can still be non-deliverable if there's a typo
| in the domain or the first portion.
| dkersten wrote:
| Or the user typos their address and it goes to the void or to
| someone else. Even a valid address can't be assumed correct.
| I get a ton of emails to my old gmail that aren't meant for
| me because some people are too dumb to get their email
| addresses correctly (even for someone's covid vaccination
| appointment confirmation and details recently...) or just
| make a mistake.
| michaelt wrote:
| Too many sites refuse to let me register as
| "><script>alert("XSS");</script>@example.com
|
| The oppression must end!
| SAI_Peregrinus wrote:
| I think the mismatched quotes actually do make that one
| invalid.
| Avamander wrote:
| Honestly if that'd ever work, the website has bigger problems
| anyways.
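|
| The fix for that bigger problem is on the output side. A sketch
| with the standard html.escape; the function name and the markup
| it produces are only an example.
|

```python
import html


def render_email_field(address: str) -> str:
    """Escape the stored address before placing it in an HTML attribute."""
    return f'<input type="email" value="{html.escape(address, quote=True)}">'


print(render_email_field('"><script>alert("XSS");</script>@example.com'))
```

| The angle brackets and quotes come out as entities, so the
| address is displayed as text instead of being run as markup.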
| mro_name wrote:
| shouldn't it read Bobby Tables rather than a mere XSS?
| StavrosK wrote:
| I gave a presentation on this topic in FOSDEM a few years ago:
|
| https://www.youtube.com/watch?v=xxX81WmXjPg
|
| (Loudness warning)
| tehwebguy wrote:
| Can't seem to find it anymore but wasn't there a post about how
| the letter "d" by itself was a valid email address at one point?
| crystaln wrote:
| So <any chars>@<any chars> seems beyond good enough. There is no
| benefit to validating beyond that for almost all cases.
| lilyball wrote:
| I'm of the opinion that all you should validate is that there is
| an @, the text to the right side (of the final @) is a well-
| formed dotted DNS domain, and that there exists at least one
| (non-whitespace) character to the left of the (final) @.
|
| Yes, I can craft garbage emails that pass this quite easily, but
| who cares? If I'm crafting fake emails I can make valid ones too.
| This rule ensures I typed the @ and the dot in my domain (we
| really don't need to support dotless email domains and it's
| better to catch "foo@gmailcom") and it won't reject all the weird
| random emails people might have.
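|
| Those three checks, sketched in code. "Well-formed dotted DNS
| domain" is interpreted here as two or more ASCII
| letter-digit-hyphen labels, which is my assumption; an
| internationalised domain would need converting to its ASCII form
| first.
|

```python
import re

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen; 1-63 chars.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def plausible(address: str) -> bool:
    """Apply the three checks: final @, non-blank left side, dotted domain."""
    local, at, domain = address.rpartition("@")  # splits on the final "@"
    if not at:
        return False
    if not any(not ch.isspace() for ch in local):
        return False  # nothing but whitespace (or nothing at all) before the @
    labels = domain.split(".")
    return len(labels) >= 2 and all(LABEL.fullmatch(label) for label in labels)
```

| With this, "foo@gmailcom" is rejected while "a@b@example.com" is
| accepted, since only the final @ matters.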
| LogicX wrote:
| Nice article.
|
| My only technical nit would be the statement "if there was an MX
| record".
|
| Many systems will fall back to an A record to attempt delivery in
| absence of a MX record.
| pmontra wrote:
| I just check that the string contains at least an @ character.
| That ensures that we're not rejecting people with uncommon
| patterns in their email address and takes very little time to
| design, develop and test.
|
| In a project we're doing something fancier: we check the result
| of sending mail and store it in the database record for the
| account (Mandrill notifies us on a webhook.) Then we might take
| actions for bouncing addresses. The actual impact on the project
| has been zero so far.
| _wldu wrote:
| Perfect is the enemy of good.
|
| If it is a string that has an @ sign, a dot and is at least six
| characters long, it's probably a valid email address.
|
| _a@b.cc_
|
| No need to go further than this. It's not worth the time.
| criddell wrote:
| You don't need the dot. Item 12 in the list is "You can have
| dotless domain names."
| Biganon wrote:
| Do you know many people who own a TLD?
| criddell wrote:
| Last time I checked, some of the TLDs did have an MX
| record. Perhaps they use it for support or something? I
| could imagine emailing info@tld or admin@tld or
| support@tld.
| harg wrote:
| indeed, the point the parent comment was making is that
| effective email validation need not perfectly implement the
| RFC.
|
| dotless domains are going to be so rare in practice (unless
| your project has some niche use-case) that you can probably
| ignore them and call them invalid for the sake of simplicity.
| criddell wrote:
| It's worth alerting the user that they likely made a
| mistake. However, I think they should be allowed to
| continue with a dotless address.
| k__ wrote:
| My main email, which is only 8 characters long, got denied
| sometimes based on length.
|
| Doesn't happen very often, though.
| slifin wrote:
| There's a lot of busy work in computing in the name of preventing
| mistakes
|
| At a certain level of complexity it's easier to just let mistakes
| happen and provide correction tools if & when required
|
| I remember getting our IPs blacklisted trying to programmatically
| ask email servers if the email address we were provided was real
| eldenbishop wrote:
| It's not even about crazy emails. My wife works for an Aerospace
| company so her work email was blahblah@blah.aero and a huge
| number of websites still don't recognize that as a valid top
| level domain.
| xyst wrote:
| g-mail smtp is returning this error:
|
| "Verify that you have addressed this message correctly. Check
| your SMTP server settings in Mail preferences and verify any
| advanced settings with your system administrator.
|
| The server response was: The recipient address
| <'*+-/=?^_`{|}~#$@[ipv6:2001:470:30:84:e276:63ff:fe72:3900]> is
| not a valid RFC-5321 address. <...> - gsmtp"
|
| Not even Google engineers can get it right. We are doomed.
| ds wrote:
| If your email is <RFC>fan 69(tm)@root I am not going to let you
| sign up. Sending emails costs money and bouncing emails affects
| your sender reputation. Also, for every user out there using
| <RFC>fan 69(tm)@root as their email address, there are going to be
| thousands of people accidentally entering their email address
| incorrectly and not getting an alert about it. Yes you could do
| fancy shit like checking mx records and whatnot, but come on, I'm
| not going to maintain/build that infrastructure for the one out
| of a million people who are trying to use that address.
|
| Developer time is precious at a startup and supporting <RFC>fan
| 69(tm)@root while still denying b ob@gmailcom is very, very far
| down the list of things to do.
|
| In summary: I don't suggest doing 'perfect' email validation to
| RFC spec. You will save money/devtime and make more of your users
| happy by not doing it.
| toomanybeersies wrote:
| I also came to the same conclusion some years ago. Or more
| specifically, my manager brought me around after I tried
| arguing that it was worth the time to make sure that users
| could use an IPv6 address as their domain (the lack of periods
| after the @ would cause
| `user@2001:0db8:85a3:0000:0000:8a2e:0370:7334` to fail
| validation)
|
| He made a very convincing argument that while an IP address is
| technically a valid domain, how many legitimate users were
| seriously using an IP address as their email domain? (zero)
| welder wrote:
| Yes, in practice I've found the exact same thing. Either use an
| email validation service or be more restrictive than the RFC.
| [1] Also prompting "Did you mean bob@gmail.com?" when the user
| types "bob@gmaail.com" helps a lot with human input errors. [2]
|
| [1] https://www.mailgun.com/email-validation/
|
| [2] https://www.npmjs.com/package/mailcheck
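The "Did you mean ...?" prompt described above can be sketched with the Python standard library. This is only an illustration of the idea, not how Mailgun or mailcheck work internally; the domain list, the `suggest_address` name, and the 0.85 similarity cutoff are all assumptions chosen for the example. It only ever returns a suggestion, so the caller can show it without blocking signup.

```python
import difflib

# Hypothetical short list of popular domains; a real deployment would use a larger one.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "outlook.com", "hotmail.com", "icloud.com"]


def suggest_address(address, cutoff=0.85):
    """Return a suggested correction for a likely domain typo, or None."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        return None  # nothing sensible to suggest
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None  # already a known domain
    # Pick the single closest known domain above the similarity cutoff.
    matches = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=cutoff)
    if not matches:
        return None  # unfamiliar domain (e.g. a .co address): stay silent
    return f"{local}@{matches[0]}"
```

A high cutoff keeps the prompt quiet for legitimate but unfamiliar domains; the trade-off is that more distant typos go unflagged.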
| ivraatiems wrote:
| Except that as someone with an email at a .co domain, I get
| really irritated when it asks me "do you mean
| [mydomain].com?"
|
| I always have to tell people, in real life, "it's .co, not
| .com," just in case - humans do this too.
| cygned wrote:
| Worse, I had services trying to be smart correcting .co to
| .com
| welder wrote:
| Yep, it's solving for the majority case. As long as it
| doesn't block signup you can just ignore it.
| 3np wrote:
| How about not doing any pre-validation (save for whitespace
| stripping) and have a validation e-mail (which you should
| require anyway) take care of any typos?
|
| With precious dev time, you can do better by doing less.
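The approach described above (strip whitespace, skip syntax checks, let a confirmation e-mail decide) might look roughly like this. It is a minimal sketch under stated assumptions: the in-memory dictionaries stand in for a real datastore, the function names are hypothetical, and actually sending the mail is left out.

```python
import secrets

# Assumption: in-memory stand-ins for a real datastore.
pending = {}       # token -> address awaiting confirmation
confirmed = set()  # addresses whose owner followed the emailed link


def start_signup(raw_address):
    """Record a pending signup and return the token to embed in the emailed link."""
    address = raw_address.strip()  # the only pre-validation: whitespace stripping
    token = secrets.token_urlsafe(32)
    pending[token] = address
    # A real application would now send an email containing a link with this token.
    return token


def confirm(token):
    """Mark the address as confirmed if the token is known; tokens are single-use."""
    address = pending.pop(token, None)
    if address is None:
        return False
    confirmed.add(address)
    return True
```

As other replies note, this still sends one message to whatever was typed, so pairing it with bot protection on the form is a reasonable precaution.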
| edoceo wrote:
| You risk sending a junk message tho, which affects your
| sender-spam score with other providers.
|
| I just make folks email me first.
| jacobobryant wrote:
| I do the validation email, works great. Just be sure to
| protect the sign up form with some type of bot detection (I
| use recaptcha, but simpler methods are fine for most
| sites).
| josephcsible wrote:
| This logic is why so many Web sites today won't let you use a
| plus sign in email addresses, which ruins a really nice Gmail
| feature.
| kyrra wrote:
| As people said, true spammers know to just strip off the "+"
| in the email address. This is actually a fun reason to set up
| your own domain and set up email forwarding for *@example.com
| to go to your main gmail or whatever account, then the
| "username" part of the email I just set to the domain of the
| account I'm signing up for. So I'll use amazon@example.com
| when signing up at Amazon (or whatever site).
| em-bee wrote:
| well, you could turn it around and use + addresses
| everywhere, so that any legitimate response must be to one
| of your + addresses. then treat anything without + as spam.
| frereubu wrote:
| Didn't you find you got a deluge of spam to generic
| addresses like admin@, info@, offers@ and so on? I tried
| this, although it was probably about 15 years ago now, and
| reverted it because I got about the same amount of spam as
| genuine emails.
| progforlyfe wrote:
| Although hardly anyone uses yahoo mail anymore, they
| actually have this feature built in. Basically email
| aliases.
| shard wrote:
| That makes me want to use an email address of the form
| +myname@mydomain.com, just to see how websites would handle
| stripping out everything starting from the +.
| zzo38computer wrote:
| I do a similar thing, except that the email is actually
| hosted at my domain rather than being forwarded, and that I
| have a list of email addresses that I accept and reject all
| others; if I receive too much spam at one address, I
| disable receiving at that address.
|
| I have found this to work; I hardly receive any spam at
| all, and do not need any separate spam filter.
| judge2020 wrote:
| Note that you should only do this for maybe 6-18
| characters, some sites will test send an email to [30-100
| character random string]@example.com and see if it bounces
| - if it doesn't, it'll suspect that domain to be some
| spammer with a catch-all email inbox and block it.
| alufers wrote:
| Do you know what sites do that? I have my own domain and
| I haven't seen anybody do that. The obvious solution is
| to configure your mail server to only accept usernames
| before the '@' that adhere to some rule which only you
| know. Like checking if it is a palindrome or something
| obscure like this.
| bbarnett wrote:
| I watch multiple corps' mail logs extensively, and this is
| not even remotely a common thing.
|
| Worse, I personally know at least 5 or 6 people who use a
| catch-all. It seems like a very poor method to reliably
| catch spammers.
| jasonjayr wrote:
| That's a terrible approach, plenty of valid, legitimate
| non-spamming domains use catchalls of arbitrary length
| for all sorts of reasons.
|
| Additionally, sending a test email like that might also
| get the sender placed on a black list for triggering a
| spam trap inadvertently.
| pmontra wrote:
| That's a worrying strategy because there are many reasons
| for using a catchall. Example: one email per site to
| track companies selling personal data, then maybe bounce
| that single email address.
|
| Do you know any site blocking domains with a catchall?
| LorenPechtel wrote:
| Yeah, if you have a domain of your own the sensible thing
| is a catchall, use a different address everywhere and
| block the ones that spam.
| academia_hack wrote:
| The + is also useful for knowing who sold your email
| address on or was responsible for a data breach. If I start
| getting spam to <my name>+hulu@gmail.com, then I know I
| could chase down Hulu on Twitter for an explanation.
| rootusrootus wrote:
| > So I'll use amazon@example.com when signing up at Amazon
|
| I go a little farther. I figure an attentive spammer might
| figure out that if I use amazon@johnsmith.net to sign up
| for Amazon, I may have exactly the scheme where
| *@johnsmith.net will work, so they can just add that to the
| spam list as a wildcard and pick a new address every time.
| So instead, I use john101@johnsmith.net, john102, john103,
| etc, to try and obscure my strategy and prolong the life of
| the domain forwarding.
| licebmi__at__ wrote:
| I kinda imagine that spammers go for low-hanging fruit. So
| spammers won't bother with defeating a catchall domain
| forwarding, as it's unlikely to give them returns.
| Although a motivated attacker might decide to try to send
| interesting phishing.
| willcipriano wrote:
| I just have an entire domain for the purposes of spam.
| Anything sent to there ends up in my bulk folder. I use
| amazon@domain.com so I can tell who sells my email or
| gets hacked. Never noticed someone trying to send an email
| to any addresses I haven't previously used.
| wastholm wrote:
| > Never noticed someone trying to send an email to any
| addresses I haven't previously used.
|
| At least a few years ago, I noticed a lot of spam to
| <random first name>@<my domain> -- i.e., completely made-
| up addresses that I had never used. Since messages sent
| to those addresses were guaranteed to be spam, I started
| treating them as free training data for the spam filter.
|
| I don't know if this still happens, though, because I
| haven't looked.
| Grollicus wrote:
| This is currently happening to my email domain. Gets
| rejected as it doesn't have a valid hash (recipient
| name), but the logfiles are full of <3
| letters>@mydomain.com and <english_word>@mydomain.com
| rejections.
| klyrs wrote:
| Yeah, this is an age-old issue -- in the early 00s, my
| mom got a domain and used the email
| <first_initial>@<domain>.com. She gave up battling the
| deluge of spam after about a year. We looked through the
| logs, and saw that her next choice of handle was
| getting tons of spam too, because it was also short.
| batch12 wrote:
| I do the same thing. I use whatever@domain.email. The
| addresses are temporary if I want them to be and I can
| automatically lock the senders to a list that is either
| automatically learned after x days or manually curated.
| I've seen some 'marketing' mail get filtered but no hacks
| yet.
| cgriswald wrote:
| I've got amazon@domain.com email for my domain and I've
| never created such an account, much less given it out.
| Without some uniqueness in the username, I'm not sure you
| can tell a company sold or lost your data.
| bcrosby95 wrote:
| Spamming is a numbers game. I kinda doubt enough people
| are using this scheme to make figuring this out
| worthwhile for a spammer.
| nerdponx wrote:
| I've wondered about this with big companies like
| Facebook, Google, Amazon, etc. as well as behind-the-
| scenes spyware/ad firms who are all probably very
| interested in linking my identity across user accounts,
| email addresses, device fingerprints, etc. I've hoped
| that there aren't enough people doing it (yet) for these
| orgs to find it worth the effort.
| rootusrootus wrote:
| Given the sheer amount of money involved, I believe it is
| likely that there are players in the market who are far
| more capable than we give them credit for.
| macNchz wrote:
| There very much are companies doing this and selling it
| as a service...here's an API that you can query with a
| piece of contact information to retrieve all sorts of
| additional information, including hashes of alternate
| email addresses, mobile device ids, social media
| profiles, and plenty of other stuff:
| https://platform.fullcontact.com/docs/apis/enrich/person-
| ins...
| rootusrootus wrote:
| At a certain point -- probably the moment it becomes a
| business unto itself -- this kind of data collection
| should be subject to all the same rules we've come up
| with for credit bureaus. It should be a legal requirement
| that I can get the entire profile they have built for me.
| nerdponx wrote:
| I was wondering specifically if they have special cases
| to identify such "personal" email domains and use them
| for record linkage.
|
| It seems like an obvious thing to try, but maybe not
| worth the effort of implementing it, given the high risk
| of false positives and the low % of people who actually
| do stuff like this (not to mention they're probably not
| people who click on ads anyway).
| typicalbender wrote:
| Hard truth is you're not worth enough for a spammer to
| look for that pattern, it's a numbers game and you're
| just making it harder on yourself.
|
| Also unless you're keeping a lookup table you're losing a
| great benefit of the wildcard. You can, and I have caught
| a few places, tell when a company sells your email. If I
| get an email from company XYZ to my email abc@example.com
| I know exactly who sold my email and to whom.
| rootusrootus wrote:
| I agree that I'm probably not worth the effort, but if
| this kind of domain wildcard strategy were to become more
| popular it is entirely feasible for a rudimentary machine
| learning algorithm to detect its use.
|
| > unless you're keeping a lookup table you're losing a
| great benefit of the wildcard
|
| That's true, I don't keep a lookup table per se, though I
| do have a deleted items folder that I could look back in.
| I'm not sure what I would do, though, if I knew what
| particular company sold my email address? Send them a
| nastygram they will just ignore? I just block the address
| and move on.
| j_wtf_all_taken wrote:
| I don't really think that's the same. Forgetting the "+" in
| the validation regular expression is different from
| refusing to implement all kinds of extra checks to support
| very weird and very unused things.
| [deleted]
| ridaj wrote:
| Why do so many people responding to this seem to assume the
| plus sign is to fool spammers? Of course it's not useful for
| antispam. It's mostly meant to make it easier to trace where
| a (legit) email comes from, for example to set up filters.
| https://gmail.googleblog.com/2008/03/2-hidden-ways-to-get-
| mo...
| jjav wrote:
| > This logic is why so many Web sites today won't let you use
| a plus sign in email addresses, which ruins a really nice
| Gmail feature.
|
| Contrary to popular belief, it is not a gmail feature.
|
| I first heard of the + as destination filtering in the very
| early 90s at CMU where it was broadly used. Every single
| email address I've had since then has supported the same (and
| notably, apart from a test account, I've never used gmail
| much, so that's not including gmail).
| tyoma wrote:
| The best are sites that let you sign up with a '+' but not
| log in. Zappos used to be the most prominent example.
| tshaddox wrote:
| I've seen sites send emails where the unsubscribe link
| doesn't work because the URL contains the email address I
| signed up with and that email address contains a character
| that their web server doesn't play well with.
| scubbo wrote:
| I once had a site _silently strip_ the + from signup email.
| So when I submitted `myname+yoursite@gmail.com` as my email
| address, they started sending mail to
| `mynameyoursite@gmail.com`. Madness.
| not2b wrote:
| This is common; spammers know the semantics of '+' for
| gmail and will strip it. You need to assume that it will
| happen.
| cgriswald wrote:
| GP said the site stripped the "+" only, essentially
| sending his email to another address entirely. Spammers
| strip the "+" and whatever follows it, so the spam ends
| up at the same address.
| dangoldin wrote:
| Interesting. How did that work? Does that mean that they
| would only create the user account under the + suffix? I
| imagine they must have had two email fields - the canonical
| email for login and then a separate notification email?
| nybble41 wrote:
| I've run across at least one _banking_ site which accepted
| a password on the sign-up page which was later rejected by
| the login page. The validation scripts on the login page
| used a more limited set of permissible special characters
| which didn't include parentheses. Fortunately it was only
| a client-side check, so it was relatively simple to bypass
| it once using developer tools and change the password.
| wtetzner wrote:
| Why would you ever validate the characters of a password
| on the login page? What a weird thing to do.
| marcod wrote:
| American Express at one point let me set a password over
| 8 characters, but logging in after only worked if I
| provided only the first 8.
| lozaning wrote:
| At one point I know they also weren't case sensitive.
| PebblesRox wrote:
| Reminds me of a patio11 post (which I haven't been able to
| track down) where he said he gets people signing up with a
| '+' but then forgetting to include the extra part when they
| log in later. His login code accepts both versions and
| increments a counter to track how many people were too
| smart for their own good.
| raffijacobs wrote:
| Can't you put "." anywhere in your Gmail address to use the same
| address multiple times?
| josephcsible wrote:
| Yes, assuming the websites following this logic don't block
| that too, but then you have to keep track of a mapping of
| dots to websites yourself instead of it being obvious from
| what you put after the plus sign.
| alfon wrote:
| Or use a password manager.
| gumby wrote:
| I configured my mail server to use _ as a sub mailbox
| identifier to stop creeps who block +. I assume they are
| doing it to make sure their precious spam shows up in my
| inbox.
| toxik wrote:
| OTOH, it being a standardized thing, a spammer would
| absolutely just strip that plus part off. Better do it
| secretly like a catch-all.
| gxnxcxcx wrote:
| I think that when using email aliases to identify spam
| sources, the crucial part is that you can filter the
| stripped address (as well as any unapproved alias) to be
| directly identified as spam and then the +alias part
| becomes a key to properly get into the inbox.
|
| That whole setup for tidiness is broken the moment a
| desired website does not accept an alias in your address,
| of course.
| nybble41 wrote:
| It's not really all that standardized. The use of a '+'
| character to indicate an alias or label is merely
| convention--if you run your own server you can set the
| separator to any character you wish, or disable the feature
| altogether. As far as the RFCs are concerned the '+'
| character is just part of the account name and there is no
| reason why it cannot be a _mandatory_ part of the account
| name on any particular server, such that stripping off the
| '+' and any trailing characters results in an invalid
| e-mail address, or even someone else's e-mail account. For
| sending email or using an email address as an account
| identifier it's definitely incorrect to treat
| abc+xyz@example.com and abc@example.com as equivalent. The
| same goes for account names which differ only in
| capitalization or placement of periods: some servers are
| case-insensitive and ignore periods in account names (e.g.
| Google) but these are server-specific traits and compliant
| email senders should not assume that every server will work
| the same way.
|
| The '+' alias feature is a fairly common configuration,
| though, so for source labels it's better to either treat
| all unlabeled messages as spam or else use a more opaque
| labeling scheme (unique-hash@example.com) which doesn't
| hint at an alternative untracked email address.
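The "more opaque labeling scheme" mentioned above could be sketched as follows, assuming you run a catch-all on your own domain. The secret key, the `example.com` domain, the 16-character truncation, and both function names are assumptions for illustration. Because the hash cannot be reversed, the sketch keeps a list of sites you have signed up with and recomputes their aliases to identify the source of a message.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-private-random-key"  # assumption: known only to you
DOMAIN = "example.com"                             # assumption: your catch-all domain


def alias_for(site):
    """Derive an opaque, per-site address that does not reveal a base mailbox."""
    digest = hmac.new(SECRET_KEY, site.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{digest[:16]}@{DOMAIN}"


def site_for(address, known_sites):
    """Return the site an alias was issued to, or None if it was never issued."""
    candidate = address.lower().encode("utf-8")
    for site in known_sites:
        if hmac.compare_digest(alias_for(site).encode("utf-8"), candidate):
            return site
    return None
```

Mail to any recipient for which `site_for` returns `None` can be rejected or treated as spam, which also covers guessed addresses such as `amazon@` or `admin@`.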
| stonogo wrote:
| Subaddressing is standardized in RFC 5233.
| nybble41 wrote:
| For the Sieve Email Filtering Language, yes. Which is not
| actually part of SMTP. And even in RFC 5233 the specific
| separator sequence is up to the server; the RFC only
| specifies queries for ":user", ":detail", and
| ":localpart" to filter on the different fields
| independent of the choice of separator.
| alkonaut wrote:
| Sites likely prefer your canonical/standard email address
| over any plus version. It would be easy to trim anything
| after the plus too I guess and just email you at your normal
| address
| mderazon wrote:
| Spammers aside, I'm interested to know what strategy
| different SaaS companies use with regard to users creating an
| account with a + alias. Do you let users create multiple
| accounts with the same email but different + aliases? Or do
| you recognize that it's an alias and say that the account
| already exists?
|
| Not all email providers support the + notation, so you'd have to
| run a domain lookup against some hard-coded list.
| markonen wrote:
| You absolutely should check the MX records, though. It's easy
| and catches tons of typos. I was floored by the difference when
| I implemented this as a pre-check before a Stripe Checkout form.
| throwaway09223 wrote:
| How do you reconcile your concern for the cost of sending
| emails with your unwillingness to do super basic validation
| like checking an MX record?
| nawgz wrote:
| From where I sit, both of those concerns sit on the same side
| of the fence. GP argues against spending extensive developer
| time on validating edge-case emails, in no small part to avoid
| bounces. Doing MX or other validation to confirm that an
| edge-case address is valid within your service does nothing to
| imply that others have put in the same costly and nearly
| superfluous support, so accepting such addresses likely leads
| to more bounces and accordingly degrades trust in the business
| as a sender.
| jcelerier wrote:
| > Sending emails cost money and bouncing emails affects your
| sender reputation.
|
| that works as long as <RFC>fan 69(tm)@root does not write
| articles for ZDNet
| vorpalhex wrote:
| My email address is valid and has been valid for a really long
| time.. but about 5% of ecommerce shops refuse to accept it.. so
| they don't get my money.
|
| Don't get clever, just follow the spec.
| skeeter2020 wrote:
| >> Don't get clever, just follow the spec.
|
| I'd suggest being clever is wasting countless hours to handle
| your edge case. Or writing your own email validation in the
| first place.
| shard wrote:
| > wasting countless hours
|
| Isn't email validation a solved problem in that there are
| services or ready software which provide RFC-compliant
| validation? If some company is wasting countless hours to
| do something because of Not Invented Here syndrome, isn't
| that the same as some company deciding to write
| cryptography algorithms on their own and reaping what they
| sow?
| alkonaut wrote:
| Your money is likely a minuscule part of the revenue and
| supporting your email would likely cost more. This was the
| point, that it _is_ probably clever to choose a validation
| that covers 99.99% of customer emails rather than cover the
| whole spec.
| jchw wrote:
| If you can show that "just follow the spec" ends up opening
| up more opportunity than it closes off, then you can convince
| people. However, when gmail, outlook, etc. do not allow these
| zany e-mail addresses, you're going to have a hella hard time
| convincing me of this unless you are in the 1% of spenders.
| BenjiWiebe wrote:
| Do GMail et al actually prevent you from sending to and
| receiving from these zany addresses? Or merely prevent you
| from creating one @gmail.com?
| jchw wrote:
| Creating one. But when you consider just how many
| customers are using gmail and outlook addresses, and not
| to mention, GSuite/fastmail/etc. addresses under custom
| domains, it makes more sense why rejecting
| @gmail.com@gmail.com is worth more than allowing some
| crazy e-mail feature that is effectively not used.
| not2b wrote:
| The routing features are obsolete; they go back to the
| days when lots of email users weren't on the Internet
| directly and had to use relays. They are still in the
| spec, yes.
| jchw wrote:
| I assume it comes from similar lineage as UUCP paths.
| Either way, email standards are a bit ridiculous. It
| needs the kind of rehaul that occurred with HTML5 of
| looking at what email implementations actually do and
| pushing them in one direction. I suspect that is not
| happening ever, so failing that there will probably
| always be things in the spec that just simply don't work
| across everything anymore.
| lisper wrote:
| > about 5% of ecommerce shops refuse to accept it
|
| That's surprising to me because there is nothing particularly
| weird about your email address. What exactly do they complain
| about?
| mixmastamyk wrote:
| Quotes included or not?
| lisper wrote:
| Not. Obviously, or the rejection ratio would be a lot
| higher than 5%.
| brlcad wrote:
| I would assume because it's only 2-chars (me) and they're
| filtering anything <3 as invalid.
| lisper wrote:
| Yeah, that's what I would guess as well. But there's a
| big difference between "follow the [ridiculously
| complicated] spec to the letter" and "don't do obviously
| stupid things like filter out email addresses with short
| names". The latter is good advice, the former not so much
| IMHO.
| Domenic_S wrote:
| For a couple glorious years I had a 2-letter email
| address at a single-letter .com domain. It was rejected a
| surprisingly small number of times.
| isoskeles wrote:
| Ah, that's a good assumption. My initial assumption was
| some sites have a very dumb whitelist of valid email
| domains. This seems more reasonable (although, also
| dumb).
| yupper32 wrote:
| If 5% of ecommerce shops refuse to accept it, it's likely you
| being clever.
|
| My email is refused by 0% of ecommerce shops... because I
| just have a normal email.
|
| Don't be clever, pick a better email.
| woah wrote:
| If you aren't accepting very normal email addresses at
| perfectly valid TLDs, then you are a bad programmer. At
| least import a list of the new TLDs every ten years.
| yupper32 wrote:
| Of course they're a bad programmer. But we live in real
| life, where bad programmers exist.
|
| Get a big brand .com email and you'll never run into an
| issue.
| drdaeman wrote:
| What's "normal", though?
| "<8-10latinalphanumerics>@gmail.com?"
|
| My email is just "me@<my-last-name>.al"[1] which is just a
| tiny bit "unusual" - and over the years it got refused by a
| couple stores because of TLD. And Albania is not Cocos
| Islands, they're surely not popular with spammers.
|
| If a store believes there's only ".com" gTLD and nothing
| else (this had really happened to me, some galaxy-brain
| made a form with a hardcoded ".com" suffix; not even ".net"
| or ".org" were accepted, unfortunately I don't remember the
| site) - well, fuck that store, their loss not mine. Worst
| case, if I really want something they sell, I'll give them
| a throwaway email - which will contribute to their mail
| bounces after some time.
|
| __________
|
| [1] ".al" is a ccTLD for Albania which is not a country of
| my citizenship or residence. I've picked the domain name as
| hack - because my first name is Aleksei and my first and
| middle names form "A.L." initials as well. That, and
| because all relevant .name domains were already taken.
| yupper32 wrote:
| Might sound strange but yes, me@<my-last-name>.al _is_
| being clever. You found a nice short clean email by
| buying a domain from Albania and setting up a me@
| address. That's clever.
|
| Think about it this way: either you can get some big
| brand .com email with no special username and never have
| an issue, or you can flail around 5% of the time and yell
| at the clouds.
|
| Should everyone accept your email? Of course! I'm just
| saying you live in real life, and in real life people
| suck at building email forms. The problems you run into
| are on you.
| oarsinsync wrote:
| > Should everyone accept your email? Of course! I'm just
| saying you live in real life, and in real life people
| suck at building email forms. The problems you run into
| are on you.
|
| No, the problems they run into are caused by (at best)
| mediocre developers. They're entirely to blame. We have
| specs and standards for a reason.
| yupper32 wrote:
| I honestly don't understand what you're trying to say.
| What's actionable about your view? You going to call up
| every business that doesn't accept your email and tell
| them their programmers suck? Businesses like this are
| never going away. It's a losing battle.
|
| Instead you can just get a big name .com email and call
| it a day. Live your life without trying to make some
| statement about email standards.
| unoti wrote:
| Totally agree with this. Trying to be perfect is a good road to
| paralysis and not getting things done. Software is like people:
| it's ok to not be perfect, especially if they're always trying
| hard to be better and doing good things for society.
| jcranmer wrote:
| The basic rule of thumb I use is this: are you implementing email
| at the MTA level (needing to build/parse RFC 5321 commands or
| RFC 5322 blobs directly), or are you using email closer to a
| "universal internet ID" purpose (i.e., application
| perspective)?
|
| If you are in the former category, then yes, follow the spec to
| the letter. If you're in the latter, then screw the precise
| guidelines of the spec and reject emails that are very unlikely
| to be valid: no quoted localparts, no IP address literals. In
| addition, go ahead and say that email is case-insensitive (more
| precisely, case-preserving).
|
| The hard part is if you're writing an email client, because
| you're basically forced to have your hands in both pies.
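For the application-level case described above, a deliberately stricter-than-RFC check might look like the sketch below. It is one possible reading of that rule of thumb, not a reference implementation: the function names are hypothetical, it handles ASCII only (an internationalized domain would need converting to its ASCII form first, which is not shown), and it does not enforce length limits.

```python
import re

# Unquoted local part: runs of ordinary characters separated by single dots.
_LOCAL = re.compile(r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*")
# One hostname label: letters, digits, inner hyphens.
_LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?")


def looks_deliverable(address):
    """Accept ordinary addresses; reject quoted local parts and IP literals."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    if not _LOCAL.fullmatch(local):
        return False  # quoted strings, spaces, leading/trailing/double dots
    labels = domain.split(".")
    if len(labels) < 2:
        return False  # dotless hosts such as "user@root"
    if not all(_LABEL.fullmatch(label) for label in labels):
        return False  # "[IPv6:...]" literals and empty labels
    if labels[-1].isdigit():
        return False  # bare IPv4 address used as the domain
    return True


def same_mailbox(a, b):
    """Compare case-insensitively; store the address exactly as the user typed it."""
    return a.casefold() == b.casefold()
```

Note that plus signs, multiple dots, single-letter local parts, subdomains, and arbitrary TLDs all pass, since rejecting those is exactly the over-restriction other commenters complain about.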
| WindyLakeReturn wrote:
| It depends upon where you are validating email input.
|
| For the initial email input, your logic works fine. Once it is
| applied downstream in a process, it begins to get messy.
| Someone might do an incorrect email validation that happens to
| block emails that you have already accepted or which you are
| importing from a valid source. Someone has already given the
| example of a login field not allowing them to use the email
| they signed up with. If such upgrades occur later in a project's
| life cycle, not only might you have to spend developer's time,
| you may also have a production outage.
|
| Personally, I suggest using some, even if imperfect, validation
| when gathering the email initially (for the reasons you point
| out) and then not validating that information any further.
| paulmd wrote:
| I actually run into this all the time with passwords using a
| password manager. Lots of places will accept the _creation_
| of a password that's long/complex/etc but then when you
| actually try to log in with it it won't accept a long
| password, won't accept certain characters, will silently
| truncate it and throw an invalid password error, etc.
|
| Sometimes disabling Javascript will fix it, sometimes not. I
| occasionally have to resort to using "I forgot my password"
| until I figure out what the actual underlying requirements of
| the passwords are.
| CodeMage wrote:
| As a user, I got burned by that several times. Now, when I
| create a new account somewhere, the first thing I do is log
| out and try to log back in.
| zerd wrote:
| Etrade lets you create a 32-character password, but if you
| enable 2FA you suddenly can't log in because apparently they
| concatenate them together and then check the length. So
| make sure your password is max 26 characters. (they
| might've fixed this but I haven't tried).
| sbierwagen wrote:
| Like GP mentioned, Etrade also does the thing where it
| accepts the . character on password creation, but not
| login. That was fun to figure out.
| hsbauauvhabzb wrote:
| Curious, can you login with 26 characters and your MFA
| seed to bypass MFA entirely?
| feanaro wrote:
| I don't encounter this very often myself. So far the only
| place I've seen this is Paypal. _facepalm_
| lcuff wrote:
| Yup! Same thing with the ridiculous verify-my-identity
| questions. One I encounter all the time is the local
| community college, which let me use spaces in my answers on
| creation, but not at entry time. Grrrr.
| novok wrote:
| I've run into this with labcorp. Their desktop webapp takes
| subdomain emails, but their mobile iOS health webpage login
| thinks a subdomain email is invalid and disables the login
| button. They also don't let you change your account email so
| you can never really fix this issue properly.
| scotu wrote:
| I found websites not allowing perfectly valid TLDs, so maybe
| they could start by not hardcoding .com in their regex. (.email)
| paulmd wrote:
| This sounds great but what you think is "common" probably
| isn't.
|
| When I was validating myself for Amazon Prime Student, I
| literally had Amazon refuse to accept my student email in the
| form first.m.last@myschool.edu because there were two '.'s in
| the mailbox portion. I had to send an email to support and it
| was eventually dutifully fixed.
|
| And that's not an uncommon format for, you know, _school
| emails_. And that's an Amazon engineer who should have known.
|
| I imagine there's developers who think "domain.tld" is the only
| thing valid to put in the domain portion, and that's going to
| fail with "domain.co.uk", or uncommon TLDs, or other perfectly
| valid constructs. And sure "it's only x% of the users" but it's
| a pain in the ass if you're that user. You need to be
| reasonably permissive.
|
| (but on the other hand "myname@..." is not valid either, and
| that will fail and cost you money as well... hence leading us
| back to 'just follow the spec')
| the_arun wrote:
| Instead of every developer implementing validation logic,
| shouldn't we have validation libraries to take care of this?
| NikolaNovak wrote:
| I get your point, but it ends up pretty arbitrary who picks up
| what part of spec to implement / which part of spec they deem
| "common sense".
|
| e.g. It drives me BONKERS how many systems absolutely reject my
| single-letter email (~"N@domain.com"), which I created
| specifically to make it easy and safe to type on mobile devices
| etc. Others will reject the "+" sign, or underscore, or
| dot/period, or (brilliantly) two periods or underscores, etc etc
| etc :=/
| Strom wrote:
| There are also blacklists for names you can use. My real
| e-mail is _admin@myname.com_ but Facebook doesn't allow me
| to use that e-mail, warning me that only personal e-mails are
| allowed. Paradoxically I ended up using my work e-mail to get
| around the restriction.
| hwbehrens wrote:
| My email address ends with the .cc TLD, and the number of
| websites which say "Did you mean to type .ca?" and then
| _refuse to let me continue_ without changing it drives me
| similarly batty.
| [deleted]
| forty wrote:
| 100% agree. This is especially true if the email address is
| going to be displayed somewhere, for example. It's generally a
| good idea to limit email addresses to a subset of what the RFC
| allows.
|
| To adapt from a famous quote: "all email validation logics are
| wrong, but some of them are useful" ;)
| goto11 wrote:
| But why restrict the syntax arbitrarily in the first place? It
| is not going to catch the common typos anyway. Most typos will
| just result in a wrong but still syntactically valid email
| address.
| harryf wrote:
| I've always wondered if it's possible to have a valid email
| address which is also an SQL injection attack, XSS, or similar?
| novok wrote:
| I've run into some places where a subdomain email is not ok,
| which has been pretty annoying. All email validators should be
| able to at least take first.last+company@subdomain.example.com
| macksd wrote:
| Especially when using a country TLD, suffixes like .co.za are
| appended to the name of the actual ISP or email provider.
| eli wrote:
| There are also incredibly low stakes in allowing a technically
| invalid email address to pass validation. Just use a very
| permissive pattern (e.g. contains an '@') and be done with it.
|
| No matter what you will constantly be getting addresses that
| conform to the spec but cannot actually receive mail.
| fencepost wrote:
| Hah, even beyond the question of address format variations there
| are also commercial services that do some level of email address
| validation - and one of them regards my business email address as
| invalid (firstname@companyname.com) - adding another letter
| works, adding punctuation works, it's just my specific first
| name.
|
| Unfortunately it's done via black box on their server(s), so it's
| not like I can even dig through the code and figure out what's
| going wrong.
| devfatigue wrote:
| What if we flipped email validation around and made the users
| email a one-time code to validate their email?
| roachpepe wrote:
| Not really an email issue, so sorry if this is somewhat off
| topic, but on the subject of validation I can't not bring this
| up: remember the guy whose name was Null?
| https://www.wired.com/2015/11/null/
| mLuby wrote:
| Validation errors are common, but warnings are not.
|
| I'd like to see more of "Patterns like [what you entered] are
| uncommon--are you sure?" instead of "Patterns like [what you
| entered] are not allowed--change it to proceed."
| richeyryan wrote:
| I recently implemented this using the great Mailcheck library.
| So if someone types "gnail.com" or "gmail.con" it detects it
| and we can show "Did you mean gmail.com?". If someone ignores
| the suggestion, fair enough. If someone purposely wants to give
| us a junk email, fair enough. At least we're not frustrating
| them needlessly.
|
| https://github.com/mailcheck/mailcheck
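|
| A minimal sketch of the "warn, don't block" idea above. Mailcheck
| is a JavaScript library; this does not use it or reproduce its
| algorithm. It swaps in Python's standard difflib for the fuzzy
| match. COMMON_DOMAINS, suggest_correction, and the 0.8 cutoff are
| assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical list of popular domains; a real deployment would tune this.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_correction(address, cutoff=0.8):
    """Return a 'did you mean' suggestion, or None if nothing looks off."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return None  # not enough structure to suggest anything
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None  # already a known domain, nothing to suggest
    # Pick the single closest known domain, if it is similar enough.
    matches = get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None
```

| The result is only a hint to show next to the field. The form
| should still accept the address as typed if the user ignores it.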
| zzo38computer wrote:
| Mostly, yes. However, some things should probably still be
| prohibited, such as:
|
| - An email address ending with ".invalid", unless invalid email
| addresses are supposed to be allowed (which in some cases is
| useful, but you can then disable sending email to such an
| address, using it only for identification). (I do use such an
| email address for identification on NNTP.)
|
| - Email addresses without at least one at sign.
|
| - Email addresses containing control characters (at least ASCII
| control characters).
|
| - If the domain name does not resolve or resolves to a loopback
| address or LAN address (except for some specialized cases where
| such a thing is desirable). The same is true for literal IP
| addresses; if it is a loopback or LAN address then it should be
| disallowed, but otherwise it can be allowed.
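|
| The checks listed above could be sketched roughly as follows. This
| is an illustration only: should_reject and its reason strings are
| hypothetical names, the DNS lookup is left out so the example runs
| offline, and only bracketed IP literals are inspected.

```python
import ipaddress


def should_reject(address):
    """Return a reason string if the address trips a check, else None."""
    if "@" not in address:
        return "no at sign"
    # ASCII control characters: code points 0-31 and 127 (DEL).
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in address):
        return "control character"
    domain = address.rpartition("@")[2].lower().rstrip(".")
    if domain == "invalid" or domain.endswith(".invalid"):
        return "reserved .invalid domain"
    # Literal IP domains are written in brackets, e.g. user@[127.0.0.1].
    if domain.startswith("[") and domain.endswith("]"):
        literal = domain[1:-1]
        if literal.startswith("ipv6:"):
            literal = literal[len("ipv6:"):]
        try:
            ip = ipaddress.ip_address(literal)
        except ValueError:
            return "malformed IP literal"
        if ip.is_loopback or ip.is_private:
            return "loopback or LAN address"
    return None
```

| Everything that passes these checks is accepted here, which keeps
| the filter permissive for unusual but deliverable addresses.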
| high_byte wrote:
| "How to Hack Things with These 13 Simple Tricks"
| mro_name wrote:
| Actually email validation is simple: do the opt-in.
|
| If confirmed, it's valid.
| user3939382 wrote:
| TFA won't load for me, but I'd like to make a short PSA: RFC
| 5322.
|
| Lookin' at you, Walgreens.
| sparrc wrote:
| I have my own custom email domain (sparr.email) that fails
| validation surprisingly often.
| Wronnay wrote:
| https://web.archive.org/web/20210408080002/https://www.netme...
| mooreds wrote:
| Hah, just went over the email address validation logic in our app
| last week because a client asked.
|
| Turns out we do minimal validation (make sure there is a local
| and a domain, that there are not two periods next to each other,
| and a few other things) but what we really rely on is
| deliverability.
|
| In other words, if your email needs to be verified, we'll try to
| send an email to the address you provide. If the link is clicked
| (or the code entered), that's good enough for us.
|
| Applications using our service (we're an auth provider) can
| decide for themselves if they need email address validation. It's
| a boolean flag on the user object. If they do, they can use the
| functionality we provide to ensure it.
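|
| A rough sketch of the approach described above: a minimal syntax
| check, then a boolean flag that flips once the emailed code comes
| back. This is not the provider's actual code. User,
| looks_plausible, start_verification, and confirm are hypothetical
| names, and sending the email itself is left to the caller.

```python
import secrets
from dataclasses import dataclass


def looks_plausible(address):
    """Minimal check: a local part, a domain, and no '..' anywhere."""
    local, at, domain = address.rpartition("@")
    return bool(at and local and domain) and ".." not in address


@dataclass
class User:
    email: str
    verified: bool = False  # the boolean flag on the user object
    token: str = ""         # hypothetical field for the pending code


def start_verification(user):
    """Generate a one-time code; the caller would email it to the user."""
    user.token = secrets.token_urlsafe(16)
    return user.token


def confirm(user, submitted):
    """Mark the user verified if the submitted code matches."""
    if user.token and secrets.compare_digest(
        user.token.encode(), submitted.encode()
    ):
        user.verified = True
        user.token = ""  # the code is single-use
    return user.verified
```

| Deliverability, not syntax, decides the outcome here: a wrong or
| unreachable address simply never gets its code confirmed.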
| slavik81 wrote:
| I created myusername@hotmail.com in 1999, then immediately lost
| the password. I couldn't recover my account or use the same name
| again, so I created myusername_@hotmail.com.
|
| In the twenty-two years that have followed, the only website that
| has had a problem with my email address is Chapters Indigo, which
| explicitly rejects it as invalid.
|
| For email validation, keeping it simple is best.
| arkitaip wrote:
| Most of these are just overcomplicating validation. What really
| matters is account verification, i.e. sending an email to the
| specified email address in order to verify its authenticity
| before sending any kind of email (transactional, marketing) to
| the account.
|
| At this point, not doing email verification should be considered
| a dark pattern because it causes so much trouble when people's
| email addresses are used without their permission.
| delecti wrote:
| And "permission" isn't even the only issue. Months of Doordash
| account emails were lost to the ether because I made a typo
| (gmail.lcom) in my personal email, and it was basically
| impossible to change the email on an account (their SMS
| verification seems broken). It does explain why I never got
| order confirmations though, that had seemed odd.
| jerf wrote:
| It seems to be a common antipattern for somewhat smaller
| sites to make the email address the primary key on the
| account too, and then it's virtually impossible to ever
| change it after that. As you scale up it becomes impossible
| to ignore the fact that people change email addresses
| sometimes, but I've lost track of the number of smaller sites
| that assume it's a safe primary key.
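|
| One common way around the antipattern above is a surrogate key,
| with the email kept as an ordinary, changeable attribute. The
| in-memory store below is a minimal sketch under that assumption.
| Account, AccountStore, and their methods are hypothetical names.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Account:
    email: str
    # Surrogate key: generated once, never derived from the email.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class AccountStore:
    def __init__(self):
        self._by_id = {}        # primary index, keyed by the stable id
        self._id_by_email = {}  # secondary lookup, updated on change

    def create(self, email):
        if email in self._id_by_email:
            raise ValueError("email already in use")
        account = Account(email)
        self._by_id[account.id] = account
        self._id_by_email[email] = account.id
        return account

    def change_email(self, account_id, new_email):
        if new_email in self._id_by_email:
            raise ValueError("email already in use")
        account = self._by_id[account_id]
        del self._id_by_email[account.email]
        account.email = new_email
        self._id_by_email[new_email] = account_id

    def find_by_email(self, email):
        account_id = self._id_by_email.get(email)
        return self._by_id.get(account_id) if account_id else None
```

| Anything that references the account uses the id, so changing
| the email touches one record and one lookup entry.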
| cperciva wrote:
| _raises hand_
|
| I can confirm that this is a very stupid mistake to make.
| :-(
| dokem wrote:
| Sorry no, your email address is wrong. Make a new one.
| bcrl wrote:
| I think the best email address I ever knew of was n@ai.
| Unfortunately, the .ai TLD eventually decided it wasn't a good
| idea to have an MX record resolving on the TLD.
| arkitaip wrote:
| It's cute but doomed to fail because it goes against everything
| most people understand about email addresses.
| [deleted]
| [deleted]
| bilater wrote:
| Unpopular opinion: Just ignore these edge cases and focus on the
| 99.99% of the sane population that doesn't get off on having a
| weird email address.
|
| More generally: If an edge case exists and has nothing to do with
| accessibility (that is, it was not caused by a user having a
| different workflow, like needing a screen reader or being in a
| less developed part of the world with slow internet), then you
| should dismiss it and not make your code/life unnecessarily
| complicated.
| jillesvangurp wrote:
| The only relevant email validation is verifying a user can
| click on a link or enter some code sent to them to the address
| specified by them. Without verifying ownership, the email
| address is worthless as an identifier so making it conform to
| some syntax is not that relevant and you should not use
| unverified email addresses as identifiers (identity theft is a
| thing).
|
| Obsessing about regular expressions for these addresses is
| generally a waste of time except for maybe preventing a lot of
| failed attempts to send stuff to a clearly invalid email
| address. A simple check that the string contains '@' is probably
| good enough for that. Worst case the email address does not work
| and you discard the entered information after some reasonable
| time frame. The user has the option to try again and do a better
| job of typing their email address.
___________________________________________________________________
(page generated 2021-05-24 23:01 UTC)