[HN Gopher] Protecting your email address via SVG instead of Jav...
___________________________________________________________________
Protecting your email address via SVG instead of JavaScript
Author : FrostKiwi
Score : 256 points
Date : 2024-05-13 07:33 UTC (15 hours ago)
(HTM) web link (rouninmedia.github.io)
(TXT) w3m dump (rouninmedia.github.io)
| cwillu wrote:
| Email is still plain-text within an xml document referenced in
| the page source.
| _joel wrote:
| The idea being that spam bots don't parse svg's looking for
| email addresses, just the page html. I'm not sure how effective
| this really is with modern spam protection, however.
| turboturbo wrote:
| The idea also seems to be that spam bots don't look for
| `href="mailto:something"` in the DOM
| rrr_oh_man wrote:
| That seems surprising, tbh
| edave64 wrote:
| The mailto is inside the SVG, not the HTML document. So
| that's not "also" it's the same idea of bots not looking at
| the svg at all
| majestic5762 wrote:
| yeah, useless stuff portrayed as smart
| shanehoban wrote:
| Try to query it though via document.querySelectorAll('a') for
| example. It's a good first line of defense as a lot of scraping
| techniques do this approach.
|
| However, if you have a headless browser setup for scraping, and
| simply fetch the current URL while on the page[0], you can get
| the plain text, and do a regex search for email addresses which
| will get you the email address - albeit this is a strange
| approach to take I admit.
|
| [0]: fetch('./').then((res) => res.text()).then((text) =>
| console.log(text))
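
A minimal Python sketch of the "fetch the source and regex it" approach described above. The pattern is deliberately loose, and the two sample strings are hypothetical stand-ins for a page that references an external SVG and for the SVG itself; nothing here is taken from the linked article.

```python
import re

# Deliberately loose pattern; real address syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(source: str) -> list[str]:
    """Return unique email-like strings in order of first appearance."""
    return list(dict.fromkeys(EMAIL_RE.findall(source)))


# Hypothetical page source: the address lives only in the external SVG,
# so the HTML alone gives the regex nothing to match.
html_source = '<object data="email.svg" type="image/svg+xml"></object>'
svg_source = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<a href="mailto:me@example.com"><text>me@example.com</text></a></svg>'
)

print(find_emails(html_source))  # []
print(find_emails(svg_source))   # ['me@example.com']
```

The second call shows why the protection depends on the scraper never requesting the SVG: once it does, the same regex works unchanged.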
| nolok wrote:
| > It's a good first line of defense as a lot of scraping
| techniques do this approach.
|
| Most basic scrapers (the ones that are not built for testing,
| devtools or automation) actually work on the raw text, without
| any interpretation. They grep the source code; they don't run
| a DOM and JavaScript engine, because that makes a major
| difference in computing needs and speed.
|
| I am not saying there is no evil scraper doing DOM
| evaluation, there are tons, I am reacting to your "FIRST line
| of defense", that one is scrambling the raw text, which is
| why we got there.
|
| What parent is saying, is that this is trying to upgrade the
| defense that we have generated to stop the threat that
| evolved, but it forgot why we got there and thus makes itself
| vulnerable to the original threat.
| cqqxo4zV46cp wrote:
| If they're saying it, I think that they're wrong. One of
| those naively written scrapers won't pick up an email
| address 'protected' in this way. It's simply continuing the
| game of cat and mouse.
| animuchan wrote:
| Absolutely. The basic tools just fetch sites recursively
| and use regular expressions. The advanced tools are
| Chromium-based, so will render SVGs just fine (and then
| potentially run OCR / AI to extract text even from JPEGs).
|
| This technique protects from a "neither here nor there"
| subset of programs, I wonder how large is that set in
| practice.
| nkozyra wrote:
| You can just query for all the image elements and then read
| any svg using the document model.
|
| This is trivial to overcome for most basic scrapers and not
| much harder even if you try to obfuscate with paths for more
| sophisticated ones.
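
As a rough illustration of reading an SVG through a document model rather than with a regex, here is a sketch using Python's standard `xml.etree.ElementTree`. The sample SVG is hypothetical; it assumes the address sits in an `href` (or legacy `xlink:href`) attribute, as the thread describes.

```python
import xml.etree.ElementTree as ET

# Older SVGs put links in the XLink namespace; newer ones use plain href.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def mailto_targets(svg_text: str) -> list[str]:
    """Collect the address part of every mailto: link in an SVG document."""
    root = ET.fromstring(svg_text)
    found = []
    for element in root.iter():
        for attr in ("href", XLINK_HREF):
            value = element.get(attr, "")
            if value.lower().startswith("mailto:"):
                found.append(value[len("mailto:"):])
    return found


# Hypothetical SVG, not the one from the article.
sample_svg = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <a href="mailto:me@example.com"><text x="0" y="15">Email us!</text></a>
  <a xlink:href="mailto:other@example.com"><text x="0" y="35">Other</text></a>
</svg>"""

print(mailto_targets(sample_svg))  # ['me@example.com', 'other@example.com']
```

Drawing the address as paths instead of text would defeat this particular sketch, which is the obfuscation step the comment alludes to.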
| throwaway11460 wrote:
| Don't have time to test myself right now - what about
| accessibility, can a screen reader read it?
| Operyl wrote:
| Given the entire bottom section, it seems like accessibility
| was taken into account here.
| throwaway11460 wrote:
| Unfortunately then I think it won't help at all - going
| through the accessibility tree is a standard web crawling
| play.
| gostsamo wrote:
| I tested and it seems accessible on the live demo. Not sure if it is
| as protected as the author claims though, but it might throw
| some bots for a spin.
| rrr_oh_man wrote:
| Man, I've always wondered how to test apps with a (simulated)
| screen reader, but never got too far
| throwaway11460 wrote:
| I use this: https://chromewebstore.google.com/detail/aria-
| devtools/dneem...
|
| Not sure about desktop apps.
| gostsamo wrote:
| My secret is that I'm not simulating. Being blind forces
| you into it. :D
|
| For testing purposes, the nvda screen reader is free and
| open source. I'm not sure if there is a driver for it to
| have an api access to what it would output, but it might be
| a fun project to try for a11y testing purposes.
| dylan604 wrote:
| > but it might throw some bots for a spin.
|
| Until some bot dev sees this, accepts the challenge, and then
| solves it as a function within their package that never needs
| updating again because it is now done. So, live it up while
| it is not solved. After that, just shrug your shoulders at
| yet another idea no longer being useful
| gostsamo wrote:
| The key in this case is that this is not a problem for me
| even if someone implements such a protection.
|
| The rest is mice and traps.
| SahAssar wrote:
| If concealing it in an object tag works then you could just have
| the object tag show it as plain text or html, right? Not sure why
| its an svg.
| juped wrote:
| probably because the scraper has "that's an image, skip it"
| logic
| okasaki wrote:
| Is it still necessary to obfuscate email addresses? Mine isn't
| and I get around 50 generic spam emails per month to gmail.
| ale42 wrote:
| I think that nowadays most spam lists come from data breaches
| and address-collecting malware. It's cheaper than running a bot
| to scan the web for addresses. We get spam on addresses that
| were never published online.
| RaoulP wrote:
| I think so too. And I think the majority of data breaches
| that have led to spam for me are from ages ago, from random
| services I signed up for as a teenager.
|
| For a few years after that I did the "+" Gmail alias thing,
| to try to filter and catch companies. But I realised that's
| easy and obvious to strip, so it wasn't worth the effort
| (although I have caught PayPal leaking my email somehow).
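
To show how little work stripping a "+" alias takes, here is a short sketch. The function name and the sample addresses are made up for illustration.

```python
def strip_plus_tag(address: str) -> str:
    """Drop a '+tag' suffix from the local part of an address."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No '@' at all: not an address, return it untouched.
        return address
    base = local.split("+", 1)[0]
    return f"{base}@{domain}"


print(strip_plus_tag("raoul+paypal@example.com"))  # raoul@example.com
print(strip_plus_tag("raoul@example.com"))         # raoul@example.com
```

A delimiter that is not a widely known convention (or a fully separate address per service) is harder to normalise this way, since the sender cannot tell which part is the tag.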
| ale42 wrote:
| If you self-host your email, you can use "." as a delimiter
| instead of the "+". People would already need to know they
| can strip that part...
| RaoulP wrote:
| Sounds good! I might go even further and just use a
| custom address for each service, i.e. paypal@example.com
| or something.
|
| But self-hosting email is an adventure I'm nervous to
| embark on.
| nobody9999 wrote:
| >Sounds good! I might go even further and just use a
| custom address for each service, i.e. paypal@example.com
| or something.
|
| Which is exactly what I do. As soon as I see spam sent to
| any particular email address, I know who it is that
| leaked the address and I can block it without issue.
|
| >But self-hosting email is an adventure I'm nervous to
| embark on.
|
| Why are you nervous about it? I've been doing so for
| decades and haven't had many issues at all. There are a
| bunch of all-in-one solutions like mailinabox[0] (I roll
| my own, but as I said, I've been doing this for decades)
| and others which would likely make things simpler for
| you. Go for it! You won't be disappointed.
|
| [0] https://en.wikipedia.org/wiki/Mail-in-a-Box
| samatman wrote:
| Anecdotally, sending mail to example.com from
| example@mydomain.com can cause a whole host of human-
| factors problems which can be eliminated with something
| like RaoulPtoExample@mydomain.com.
| martyvis wrote:
| Is that all. I get around 70 genuine spam emails to my Gmail
| account every day now (all detected correctly by Gmail)
| tempestn wrote:
| I get a similar volume, and gmail likely detects almost all
| of them. Problem is, it also falsely detects the occasional
| non-spam message, so I do need to periodically scan through
| the spam box, which is a bit of a pain when it contains
| hundreds or thousands of emails.
| RaoulP wrote:
| I think this is a valid question. I see lots of effort at
| obfuscation but don't know if there's still a need.
|
| I barely get spam and have a bigger issue with false positives
| in my spam folder. On the other hand I don't think there are
| many pages on the web that display my email address, so I'm
| curious about others' experience.
| sitzkrieg wrote:
| it isnt but people like to make a problem of it with elaborate
| whatifs
| fp64 wrote:
| I don't get it, I can just curl the svg and grep for mailto?
| rany_ wrote:
| Yes, but these scraper bots aren't that sophisticated.
| winternewt wrote:
| But they will be as soon as this sees widespread use.
| _joel wrote:
| it won't be widespread imho, not when you share your email
| address with other parties that then lose/sell your
| details. fastmail like 'temporal' email addresses could
| help, however.
| amsterdorn wrote:
| Querying DOM nodes is inherently more complicated than a
| regex on unparsed HTML.
| fp64 wrote:
| Crawl every link, now including SVG, and grep all 'mailto:'
| does not sound super sophisticated?
|
|     wget --recursive --quiet $BASE_URL && grep -roh 'mailto:\([^"]*\)'
|
| works on the example and just prints the email
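
For readers without GNU grep, the second half of that pipeline can be approximated in Python. This is only a sketch: it assumes the site has already been mirrored to a local directory, and the temporary files below are hypothetical stand-ins for such a mirror.

```python
import re
import tempfile
from pathlib import Path

# Capture everything after 'mailto:' up to a quote, whitespace or '>'.
MAILTO_RE = re.compile(r"mailto:([^\"'\s>]+)")


def grep_mailto(root: Path) -> list[str]:
    """Recursively scan every file under root for mailto: targets."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits.extend(MAILTO_RE.findall(text))
    return hits


# Hypothetical mirrored site: HTML that embeds an SVG holding the link.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "index.html").write_text(
        '<object data="email.svg"></object>', encoding="utf-8"
    )
    (site / "email.svg").write_text(
        '<svg><a href="mailto:me@example.com"/></svg>', encoding="utf-8"
    )
    print(grep_mailto(site))  # ['me@example.com']
```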
| planede wrote:
| I think the idea is that email scraper bots typically don't
| bother downloading images referenced by <img> tags.
| magnat wrote:
| > even when a human visitor has their JavaScript turned off, the
| email address displayed on the page remains usable
|
| NoScript on Firefox with default settings doesn't render <object>
| tags (replaces them with placeholders), so this technique doesn't
| work here.
|
| https://imgur.com/2tCAgAf
| Laaas wrote:
| uBlock Origin can block JS too FWIW. There's a convenient
| button for it in the extended menu.
| brettermeier wrote:
| Thank you, didn't know that!
| jaeh wrote:
| it's the same in chromium.
| yau8edq12i wrote:
| That's a different thing, though. Not sure why you'd make this
| point.
| dannyobrien wrote:
| I would like to push back on the idea that you should obfuscate
| your email address at all.
|
| My email address is danny@spesh.com. I get a lot of spam --
| possibly, since I have been distributing that address
| deliberately on the web and inadvertently in hacked datadumps, a
| near maximum amount of spam.
|
| But the benefits of having people easily find a way to contact me
| directly has for me far outweighed the (largely solved) challenge
| of discarding automated spam.
|
| Publish your email address! It's okay! Very little bad will
| happen, and people will be able to contact you without going through
| some strange social media intermediary!
| parasti wrote:
| This is appropriate advice for the average HN reader. For
| everyone else, probably not. I've seen first hand otherwise
| intelligent people being unable to discern an obvious (to me)
| online scam from a legitimate business. These are the people
| spammers are targeting. These are the people that need to
| obfuscate their email address.
| _joel wrote:
| So you're saying the same people unable to discern a spam
| email know how to embed a mailto: link in an XML document
| and write webpages. Ok.
| parasti wrote:
| Never said that. I'm a web developer. People ask me to add
| their emails to web pages. Comment quality on here seems to
| have taken a dive.
| richrichardsson wrote:
| Even sophisticated users can slip up in the right
| circumstances.
|
| Personal anecdote: one morning, whilst still quite sleepy
| received a _very_ well crafted Namecheap phishing expedition.
| I half knew the product they were claiming was lapsed was
| actually fine, but I had just recently renewed so I thought
| perhaps there had been a problem I missed, and it was
| convincing enough that I clicked the link before doing the
| normal sanity checks. Thankfully the address it went to didn't
| resolve. Hopefully I would have noticed the obviously
| incorrect URL before I entered any details, and I have 2FA
| enabled, but still, I should and do know better, it was just
| perfect timing for a well crafted attack...
| SushiHippie wrote:
| > in hacked datadumps
|
| https://haveibeenpwned.com/
|
| 45 data breaches and 7 pastes
|
| Wow, I don't know if I've ever seen a real address in so many
| breaches haha
| cyptus wrote:
| there is a quite big stackoverflow discussion about ideas how to
| protect your email on your website:
| https://stackoverflow.com/q/163628/1216595
| zigzag312 wrote:
| Sadly stackoverflow closed the discussion. Even though
| discussion is both interesting and valuable.
| karol wrote:
| Spam filters work in 2024.
|
| Does the fact someone independently discovers Gauss method to sum
| up all the numbers 1...100 today make it worth sharing?
|
| My point is that this is a primitive and easy to break workaround
| and better methods exist.
| geuis wrote:
| Why? What's the point?
|
| All you're doing is making it slightly more difficult for the
| people that want to contact you to do so.
|
| OCR has been a thing for years.
|
| Just put your email out there. That's what spam filters are for.
|
| charles@geuis.com. There. Scrape it. Spam it. I don't care.
|
| Edit:
|
| Yes, thank you for signing me up for the DNC (already a member),
| some random Trump org, something about Scientology, and another
| random christian-based website. Honestly, I'm kind of sad at the
| lack of originality given the otherwise extremely ingenious
| community we have here.
| Maxatar wrote:
| But you just proved the point. You might not care to be signed
| up for some random Trump org, Scientology, or whatever, but
| other people do care and if you want to author a website that
| responsibly uses people's emails without subjecting them to
| unnecessary spam, then it's worth taking these techniques (not
| necessarily this specific one) into consideration.
|
| While OCR does exist it's incredibly expensive compared to text
| scraping. The main way to combat spam is to make the cost of
| spamming more expensive than the benefit.
| ceving wrote:
| It does not work if you change the font-size.
| brap wrote:
| I've been using the same gmail address for like 20 years.
|
| I don't think I got a single spam email in the last 5-10 years.
|
| SMS, on the other hand...
| rvnx wrote:
| A couple of modern spammers send you spam from Gmail and say "I
| included my colleague in CC please hit 'reply all' if you are
| interested"
| Etheryte wrote:
| While the specific claim made about copying is true, you can
| right click and select copy email address, simply selecting the
| text and doing copy does not work. Similarly if you do select all
| into copy etc, so all in all, I wouldn't expect a regular user to
| be able to successfully copy this.
| miki123211 wrote:
| While there's nothing stopping this technique from being
| accessible in principle, the example given in the article is a
| really bad one.
|
| The article uses "Email us!" as the label on the svg and <a>
| elements, which effectively hides the actual email address from
| screen readers. Using aria labels in this way is a really bad
| practice, a screen reader user should have the same experience as
| anybody else unless there's a very good reason to do otherwise,
| and if you think your reason is a good reason, you're probably
| wrong.
|
| The proper way to do this would be to put the actual email
| address in the labels.
| 47282847 wrote:
| Isn't the whole point of the exercise to not have the document
| contain the email address in a (machine-)readable format?
| Doe-_ wrote:
| The email address wouldn't be in the document directly, only
| in the SVG. Whether the title of the SVG contains "Email us"
| or the email address wouldn't affect how it works.
|
| If the scraper is searching the DOM rather than simply
| downloading the webpages, then the email will be found
| regardless.
| janosdebugs wrote:
| The NVDA screen reader reads this text as: "This is my email
| frame link email us." That is by no means equivalent to
| actually seeing the email address. I found that HTML entity
| encoding every single character of the link takes care of any
| spam problem already and is much more accessible.
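
A small sketch of the "entity-encode every single character" idea, using decimal character references. The address is a placeholder, and the helper name is made up for illustration.

```python
import html


def entity_encode(text: str) -> str:
    """Replace every character with its decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


address = "me@example.com"  # placeholder address
encoded_text = entity_encode(address)
encoded_href = entity_encode("mailto:" + address)
link = f'<a href="{encoded_href}">{encoded_text}</a>'

print(link)
# The raw markup no longer contains a literal '@' ...
print("@" in link)                             # False
# ... but a browser (or html.unescape) recovers the original text.
print(html.unescape(encoded_text) == address)  # True
```

Because the browser decodes the references before building the accessibility tree, a screen reader sees the plain address, which is the accessibility advantage the comment points to.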
| matteason wrote:
| This can also affect voice dictation software like Dragon - if
| a user says 'Click myemail@mydomain.tld' it won't activate the
| link as Dragon is expecting 'Click email us', as that's now
| what the browser exposes as the link text.
|
| That point might be academic anyway as I'm not sure Dragon
| would activate a link inside an SVG
| nloomans wrote:
| I tested the example using the TalkBack screenreader on Android.
| With Firefox I was able to select and click on the link, but it
| did not announce the email address. With Chromium it completely
| ignored the existence of the SVG email. I was unable to select it
| and it was like the email wasn't there at all.
|
| So yeah, I wouldn't call this accessible.
| yreg wrote:
| > Email addresses published on webpages usually need to be
| protected from email-harvesting spambots.
|
| Do they though?
|
| I have had my email address published on my website in a <a
| href="mailto:... for like 20 years and I don't get spam that
| would get through the spam filter.
|
| I use both Gmail and (for some other addresses) a webmail hosted
| by a local company which uses some other filter. Both work well,
| so it's not something only Google can do.
| xyst wrote:
| this used to be a problem in the early 00s. I don't think spam
| filtering was as good back then so protecting your public email
| from spam was necessary.
|
| Also this was a time when mail boxes were often allocated 10-25
| megabytes. So spam bots could easily flood your email.
| WirelessGigabit wrote:
| When I signed up for Hotmail it was 2MB.
|
| Then on April 1st, 2004 Google launched what wasn't an April 1st
| joke... Gmail with 1GB! I remember getting a beta invite and
| inviting others.
| zufallsheld wrote:
| I host my own Mailserver and all addresses that are publicly
| visible get spam, e.g. my blog or my mail that was visible on
| github.
| nozzlegear wrote:
| Same here, I've had my email plainly visible on my website in
| mailto links and on Github, and I don't get any spam that
| breaks through Fastmail's spam filters.
| digging wrote:
| My preference is to not have my email harvested at all when
| possible, even if I don't personally see the spam emails. (I'm
| not saying it's a critical privacy/security issue, but a
| preference.)
| jakubmazanec wrote:
| So then you never use your email, right?
| digging wrote:
| What?
| r-w wrote:
| I think they're obliquely referring to the scanning
| practices of major providers like Gmail, which most people
| use to filter their spam.
| adrianpike wrote:
| I've also had my email posted in mailto's in a half dozen
| places for... a long time. I remember in the early 00's when
| I'd cargo cult the old "type the whole email out as adrian at
| adrianpike dot com" thing on forums thinking it would work as
| some mystical talisman, and it turns out considering emails to
| be secret isn't worth the time.
| a_random_canuck wrote:
| They do. My wife lost her 10-year-old Instagram account to a
| well crafted phishing attack against an email she had
| published...
|
| Instagram/Meta's customer support is absolutely atrocious and
| disgraceful on this front. They basically treat my wife like
| she's also a spammer and there's no way to recover the account
| or undo any of the changes the spammers made.
|
| It's hilarious how they ask you to "appeal" a ban by clicking a
| single button without giving any chance to rectify what the
| spammers did to her account. Of course their automated bots
| just reject your appeal almost instantly. Shameful.
| hoherd wrote:
| This gave me "Press F to appeal ban" images.
| crtasm wrote:
| Does her email show up on any leaks on
| https://haveibeenpwned.com/ ? I'm wondering if not publishing
| it would have made any difference to receiving phishing
| messages.
| dgb23 wrote:
| This could happen to anyone. You're tired or thinking of
| something else, the attack weirdly aligns and you don't
| notice it until it's too late.
| chefandy wrote:
| Would such an attacker be stymied by this? It seems like
| automated email harvesting wouldn't be a big time saver for
| any attack that required a well-crafted anything. I don't
| know anything about that particular attack, though.
| qingcharles wrote:
| Clicking the appeal button is like a trap to permanently ban
| your account.
|
| You can get it back by paying off a Meta employee through a
| site like Swapd. It's either that or get your comment to the
| front page of HN. Those are the only two customer support
| channels for Meta or Google.
| qingcharles wrote:
| I have two people I designed web sites for in the last year and
| I put both their email addresses in the footer and neither one
| of their accounts has received a single spam message in all of
| that time (not even something dropped into the Spam folder).
| Both sites are popular and have thousands of visitors and get
| scraped by every search engine and AI bot you can think of.
| r-w wrote:
| Interesting. Maybe footer emails tend to be support contact
| addresses rather than personal inboxes. Otherwise I'd find
| that discrepancy very surprising.
| paradox460 wrote:
| The practice of email address "obfuscation" feels like a relic
| of a bygone era, one that was never actually sound in its
| methodology, but spread. A form of cargo-cultism has kept it
| alive
| SoftTalker wrote:
| Yeah just looking at this, it appears to add about 1K of
| overhead and at least one additional http request for
| something that ultimately boils down to a mailto: link, so it
| can still be scraped, and just adds bloat to your web page.
| crazygringo wrote:
| Exactly.
|
| I definitely recall in the early 2000's it absolutely _did_
| lead to spam, and e-mail obfuscation techniques were a real
| thing that genuinely helped.
|
| But by 2015 or so it didn't matter at all anymore, in my
| personal experience. It didn't even lead to spam that needed to be
| filtered. Spammers just stopped looking for e-mails that way.
|
| Which makes perfect sense -- most people don't have their
| e-mail address listed anywhere online in the first place, but
| you can _purchase_ gigantic lists of e-mail addresses. That
| either originate from companies that sell their own user lists,
| or people who hacked the companies' servers.
|
| These days if you want to send spam, trawling the web for
| e-mails makes zero sense. It's practically the least efficient
| thing you could do.
| r-w wrote:
| Unless you're the one trying to sell them, in which case
| that's part of doing business :)
| treflop wrote:
| I've been having all my email addresses posted plain text
| since like 2005 and I've signed up on like every website
| imaginable (my password manager has over 2,000 entries) and
| I've never had a spam problem, at least on Gmail.
| 4u00u wrote:
| very recently, within a day of publishing an email on a footer
| of a page i got a phishing email that was not filtered by spam
| and looked very genuine
| dhosek wrote:
| My thoughts exactly. On the other hand, an email address I used
| with Usenet ca 1999-2001 has had a consistent flood of spam. I
| think most spammers are using the same 20+-year-old list of
| emails.
|
| The email address on my website doesn't even get stuff that
| goes to the spam filter. Nothing, nada zilch.
|
| I do think that there are some mailing lists that get generated
| by trying to guess emails, brute-forcing gmail addresses by
| trying dictionary attacks of the FIRSTNAME.LASTNAME variety or
| 1-10 letters. I get a tiny amount of spam sent to a
| domain@domain.com address I have, but that's typically on the
| order of one message a year.
|
| And all else aside, the overall volume of spam email has
| declined dramatically, even ignoring the effect of the gmail
| spam filter. I'm guessing that email as a spam vector just
| doesn't make sense anymore and most of what goes out is a mix
| of 419 scammers trying to make their quotas and would-be
| scammers who've been scammed into buying that 20-year-old list
| of emails.
| janmo wrote:
| Here is what I do:
|
| <span class="contact-email">rea<span
| class="hidden">nospam</span>l@mai<span
| class="hidden">sjs</span>l.com</span>
|
| I still receive "spam" tho, but it seems they manually collected
| the email because what I receive are B2B proposals clearly
| targeted at the topic of my website.
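
The hidden-span trick can be sketched from the scraper's side. A naive tag-stripper picks up the decoy text, while anything that honours the hiding rule recovers the real address. This assumes the `hidden` class maps to `display: none` in the site's CSS (the comment does not show the stylesheet), and the small parser below is a simplified illustration that ignores void elements.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, optionally skipping elements with class 'hidden'."""

    def __init__(self, skip_hidden: bool):
        super().__init__()
        self.skip_hidden = skip_hidden
        self.hidden_stack = []  # one flag per open element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.hidden_stack.append("hidden" in classes)

    def handle_endtag(self, tag):
        if self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        if self.skip_hidden and any(self.hidden_stack):
            return
        self.parts.append(data)


def extract_text(markup: str, skip_hidden: bool) -> str:
    parser = TextCollector(skip_hidden)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


markup = (
    '<span class="contact-email">rea<span class="hidden">nospam</span>'
    'l@mai<span class="hidden">sjs</span>l.com</span>'
)

print(extract_text(markup, skip_hidden=False))  # reanospaml@maisjsl.com
print(extract_text(markup, skip_hidden=True))   # real@mail.com
```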
| jszymborski wrote:
| If the scraper uses a headless browser, I think that it might
| defeat your method. That said, using a headless browser to
| crawl for emails is relatively expensive so perhaps the spam is
| not from your site.
| dns_snek wrote:
| Is there really a point to any of this? It's a fun exercise, but
| also a complete waste of time if you're actually trying to hide
| from spammers. You're making a piece of information public by
| sharing it with the entire world, yet somehow expecting it to
| only stay accessible to the "good guys".
|
| Unless you change your email address at least monthly, all it
| takes is for _one_ person or company to share your contact with
| someone else or enter it into a database/CRM, or _one_ service
| to get breached, then your email address is on a list that
| eventually gets propagated to every spammer worldwide. If you use
| that email with any regularity, the chance of those things
| happening can be rounded up to 100%.
|
| If hiding your email address from scrapers actually worked, spam
| wouldn't exist. I never published my personal contact anywhere,
| yet I get dozens of spam emails per week. They all get filtered
| as spam, it's not a big deal.
| muzster wrote:
| Heavily guarded fortress would indicate something of value
| inside, and the big crooks may spend a little more effort. In the
| age of AI, this becomes even easier.
|
|     {
|       "model": "gpt-4-turbo",
|       "messages": [
|         {
|           "role": "system",
|           "content": [
|             {
|               "type": "text",
|               "text": "return a json array of all valid emails found in the image."
|             }
|           ]
|         },
|         {
|           "role": "user",
|           "content": [
|             {
|               "type": "image_url",
|               "image_url": {
|                 "url": "data:image/png;base64,{{ INSERT_BASE64_PNG_DATA }}"
|               }
|             }
|           ]
|         }
|       ],
|       "temperature": 0.5,
|       "max_tokens": 2048,
|       "top_p": 1.0,
|       "frequency_penalty": 0.0,
|       "presence_penalty": 0.0
|     }
|
| Edit: Converting web page to an image is trivial.
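
For illustration only, the request body quoted above could be assembled like this. The sketch just builds and prints the JSON; it makes no network call, and the placeholder bytes are not a real screenshot. The field names mirror the comment's payload and are not checked here against any current API documentation.

```python
import base64
import json


def build_request(png_bytes: bytes) -> dict:
    """Build the chat payload from the comment for a PNG screenshot."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return {
        "model": "gpt-4-turbo",
        "messages": [
            {
                "role": "system",
                "content": [
                    {
                        "type": "text",
                        "text": "return a json array of all valid emails "
                                "found in the image.",
                    }
                ],
            },
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{encoded}"
                        },
                    }
                ],
            },
        ],
        "temperature": 0.5,
        "max_tokens": 2048,
        "top_p": 1.0,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
    }


# Placeholder bytes (just the PNG signature), standing in for a screenshot.
placeholder_png = b"\x89PNG\r\n\x1a\n"
payload = build_request(placeholder_png)
print(json.dumps(payload, indent=2))
```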
| zipping1549 wrote:
| It won't make sense cost wise though
| omneity wrote:
| Except the cost is only going down over time
| internetter wrote:
| We've had OCR for _decades_ before GPT. I suspect GPT might
| perform _worse_ than OCR. What a waste.
| muzster wrote:
| Agreed - it's a waste. GPT is not too bad at reading text
| from image and with the added bonus that you can reason with
| it.
| hhsectech wrote:
| Interesting idea...but could a crawler not just incorporate some
| AI like LLava2 or convert the SVG to a JPG and use OCR to get the
| email addresses out?
|
| It just seems like this adds a couple of steps to existing
| crawler scripts.
| mediumsmart wrote:
| this works if you write it into the html on fullmoon tuesdays :
|
| _<a href="mailto:some.dude@the.otherdudes.site">some.&#x64;ude@the.ot&#104;erdudes.si&#x74;e</a>_
| kevin_thibedeau wrote:
| That works for humans. There's no reason to believe bots aren't
| handling entity parsing.
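
How cheap entity parsing is for a bot can be seen in a sketch like the following: one standard-library call before the usual regex. The encoded snippet is a made-up example, not the markup from the comment above.

```python
import html
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails_after_decoding(source: str) -> list[str]:
    """Decode character references first, then run the usual regex."""
    return EMAIL_RE.findall(html.unescape(source))


# Hypothetical markup mixing hex and decimal character references.
encoded_page = "<p>&#x6d;&#x65;&#64;example&#46;com</p>"

print(EMAIL_RE.findall(encoded_page))            # []
print(find_emails_after_decoding(encoded_page))  # ['me@example.com']
```

Whether real harvesters bother with this step is exactly what the thread disagrees about; the sketch only shows that the cost of doing so is one extra function call.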
| robszumski wrote:
| In my experience they haven't been in the past, but LLMs
| change the game by doing it by default.
| rishikeshs wrote:
| how does this work
| xyst wrote:
| Kind of neat but I would rather just have a "throwaway" email if
| I was sharing globally.
|
| In my case, I setup an email alias with a sieve rule (if email
| sent to alias move to "public inquiry" folder). Prior to
| processing rule, spam assassin takes care of the non technical
| folks that couldn't be bothered to run their spam campaign
| through spam assassin testers. Or even nontechnical folks that
| wouldn't know how to setup their domain for sending email (spf,
| dkim, dmarc, ...)
| throwaway598 wrote:
| My domain: 24 years registered to me. A .com.
|
| My email address: Listed at the top of the front page. In a H3
| tag.
|
| This email address's spam problem: Not a problem. 15ish per day
| get to me including Junk folder. Thanks Purelymail.
|
| What is a problem: Transactional email unrelated to transactions,
| Promotional email which is newsletter junk spam, Social networks
| complaining of not being used.
| zufallsheld wrote:
| 15 spam mails seems like quite a lot to me. I blacklisted addresses
| for less.
| SoftTalker wrote:
| > Social networks complaining of not being used
|
| This is my biggest one. I get more spam from Facebook begging
| me to log in than I do from almost anything else. I haven't
| used the account in about 7 years, you'd think they'd figure it
| out.
| kevincox wrote:
| > you'd think they'd figure it out.
|
| Cost of sending spam: Effectively zero.
|
| Cost of pissing off inactive user: Essentially zero.
|
| Cost of convincing inactive user to come back: Positive.
|
| Add in a bunch of other factors like some product manager
| twisting stats to make it look like they are getting users
| back even if they really aren't and you see why it happens.
| emayljames wrote:
| a much easier way is to convert the email address into html
| entities. It then displays and can be copied, but the actual
| source code doesnt have the email address.
| seanvelasco wrote:
| i bought a premium .app domain a few months ago. not published
| in websites yet. no history of previous owners. just a fact that
| it's listed as a premium domain on registrars.
|
| first emails I received after the gmail welcome email were b2b
| sales from construction companies (i'm not in this field),
| shopify optimizations (i don't run one), agencies suggesting how
| i improve the ui/ux of my site (no website yet).
|
| thankfully, they're all in the spam folder. i'm using google
| workspace.
|
| i believe these spammers get their leads on newly-registered
| domains. so, how do we protect ourselves from that?
| hu3 wrote:
| I believe the only effective protection against these fresh
| domain spammers is what you did: some pretty good anti-spam
| mechanism such as Gmail.
| franky47 wrote:
| Ironically, the only spam I receive these days comes from the
| address I used here for the "Who wants to be hired" threads.
| zaxomi wrote:
| Cool.
|
| 1 hour later.
|
| Spam-scraper updated to support this.
| mrbluecoat wrote:
| Exactly
| ChrisMarshallNY wrote:
| That's a pretty cool trick.
|
| I was not aware that we could embed CSS in SVG.
| iforgotmysocks wrote:
| I just have a simple contact page that sends message to discord
| webhook
| dxs wrote:
| This is fun [2008]:
| https://web.archive.org/web/20180908103745/http://techblog.t...
|
| "Nine ways to obfuscate e-mail addresses compared
|
| "When displaying an e-mail address on a website you obviously
| want to obfuscate it to avoid it getting harvested by spammers.
| But which obfuscation method is the best one? I drove a test to
| find out."
| cantSpellSober wrote:
| Can't be copied and pasted.
|
| It's _your_ domain, why not just have "contact@example.com" for
| incoming mail instead?
|
| (Novel approach, thanks for sharing!)
| kees99 wrote:
| Not only "protecting your email" is pointless like others have
| already pointed out, it's actively harmful.
|
| There are a fair few sites, where most all content is perfectly
| readable without JS, except things like "1920x1080@60Hz" are
| displayed as literal "[email protected]" text.
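
One plausible reason for such false positives is an over-eager "looks like an email" pattern. The two patterns below are hypothetical and only illustrate the failure mode; the thread does not say what rule any particular obfuscation service applies.

```python
import re

# Hypothetical loose rule: anything non-blank around an '@'.
LOOSE_RE = re.compile(r"\S+@\S+")

# Hypothetical stricter rule: the domain must end in a dot plus letters.
STRICTER_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

for sample in ("1920x1080@60Hz", "contact@example.com"):
    loose = bool(LOOSE_RE.search(sample))
    strict = bool(STRICTER_RE.search(sample))
    print(f"{sample}: loose={loose} stricter={strict}")
```

The display mode string matches the loose rule but not the stricter one, while the real address matches both.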
| digging wrote:
| > There are a fair few sites, where most all content is
| perfectly readable without JS, except things like
| "1920x1080@60Hz" are displayed as literal "[email protected]"
| text.
|
| Do you have one on hand? That sounds absurd and I've never seen
| it
| tentacleuno wrote:
| Mastodon instances fronted by Cloudflare (with Email
| Protection on) are good examples.
| helsinkiandrew wrote:
| Don't modern spam filters filter out most mails received this way
| and most spammers purchase lists for specific targeted domains
| - house owners, porn users, dentists etc. rather than blindly
| scraping the web?
| dartos wrote:
| Idk LLM powered scraping can pull the email out of this without
| any issue
| stkdump wrote:
| It even uses the exact same syntax as in html, so as long as
| svg content isn't specifically excluded, normal web scraping
| would just work without modification.
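|
| A rough sketch of that point, assuming a scraper that simply runs a
| pattern over whatever text it downloads. The SVG source, the address
| and the function name are made up for illustration; the same pattern
| works unchanged on HTML.

```javascript
// Example SVG source with a made-up address.
const svgSource =
  '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 24">' +
  '<a href="mailto:myemail@mydomain.tld"><text x="50%" y="50%">' +
  'myemail@mydomain.tld</text></a></svg>';

// A deliberately naive pattern of the kind a basic scraper might use.
const MAILTO_PATTERN = /mailto:([^"'?\s>]+)/g;

// Return every unique address found in mailto: links in the given text.
function extractMailtoAddresses(text) {
  const found = new Set();
  for (const match of text.matchAll(MAILTO_PATTERN)) {
    found.add(match[1]);
  }
  return [...found];
}

console.log(extractMailtoAddresses(svgSource));
```

| The only thing the SVG trick changes is whether the scraper fetches
| the .svg file in the first place.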
| judge2020 wrote:
| Perhaps, but I think OCR is more likely.
| portaouflop wrote:
| Maybe I'm too stupid but I don't get why you would want to do
| this at all. Had my email in plaintext on the website for ages
| and never had an issue with spam...
| robbyiq999 wrote:
| How about posting two email addresses, a hidden one and the actual
| one, and using the hidden one to filter the actual one?
| JohnFen wrote:
| This has been my approach since the mid '90s. It works very
| well.
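|
| One possible reading of this, sketched under assumptions: the hidden
| address is only ever found by scrapers, so any sender who writes to
| it gets dropped from the real inbox. Both addresses, the message
| shape and the function name are hypothetical.

```javascript
// Hypothetical addresses: HONEYPOT is hidden from human visitors but present
// in the page source; REAL is the one people are meant to use.
const HONEYPOT = 'trap@example.com';
const REAL = 'contact@example.com';

// Given messages shaped like { from, to, subject }, return the ones addressed
// to REAL whose sender has never written to HONEYPOT.
function filterWithHoneypot(messages) {
  const flaggedSenders = new Set(
    messages.filter((m) => m.to === HONEYPOT).map((m) => m.from)
  );
  return messages.filter(
    (m) => m.to === REAL && !flaggedSenders.has(m.from)
  );
}
```

| Spammers often rotate sender addresses, so a real setup might match
| on message content or sending host instead of the From address.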
| butz wrote:
| I assume that nowadays emails are pulled directly from hacked
| mailbox contacts list. Nobody has the time to go through each
| individual website and collect emails one by one.
| Closi wrote:
| I assume that emails are pulled from every method available.
| Tagbert wrote:
| Nobody. Web crawler bots.
| donatj wrote:
| A friend of mine is an absolute wizard and has been building
| essentially "responsive images" as SVGs with JS inside. They
| adapt to their size programmatically. It's... interesting.
|
| The fact that SVGs can even have JS embedded feels both untapped
| and kind of dangerous.
| soperj wrote:
| SVGs are responsive out of the box? I'm confused about what the
| Javascript would be doing to help that situation within the
| svg.
| asynchronous wrote:
| I think they're talking about dynamically actually changing
| the image itself, not just resizing
| johnny99k wrote:
| This has been known in the security community for quite some
| time.
| alemanek wrote:
| That sounds super interesting. Does your friend have a GitHub
| or site that shows what they're doing on that front? If so,
| could you post a link?
|
| This is super far out of my wheelhouse technically as a backend
| engineer but it sounds really cool.
| replete wrote:
| <a href="{rewritten by js}">domain.com</a>
|
| a::before { content: "username@" }
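|
| A minimal sketch of the "rewritten by js" half, assuming the two
| parts of the address are stored in data attributes. The attribute
| names and the function name are hypothetical and not part of the
| snippet above.

```javascript
// Assemble a mailto: URL from parts, so the full address never appears
// as a single string in the HTML source.
function buildMailtoHref(user, domain) {
  return 'mailto:' + user + '@' + domain;
}

// Browser-only wiring. Assumes markup such as:
//   <a data-user="username" data-domain="domain.com">domain.com</a>
if (typeof document !== 'undefined') {
  for (const link of document.querySelectorAll('a[data-user][data-domain]')) {
    link.href = buildMailtoHref(link.dataset.user, link.dataset.domain);
  }
}
```

| Text generated by ::before is not part of the DOM, so in most
| browsers it cannot be selected or copied, and it is absent from the
| raw HTML a text-only scraper sees.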
| CM30 wrote:
| I think the main thing people forget with stuff like this is that
| yes, all these setups are possible (or even trivial) to bypass,
| but you're not really dealing with a dedicated adversary that's
| targeting you in particular.
|
| Spammers probably aren't going to update their tools to take into
| account every possible way every site obfuscates their email
| addresses, so the main trick to dealing with them would be to do
| something other sites/services don't. If you or your company
| become successful enough that people are actually targeting you
| in particular, then congrats, you're probably in a good place
| anyway.
| cmiller1 wrote:
| > Spammers probably aren't going to update their tools to take
| into account every possible way every site obfuscates their
| email addresses
|
| But this is also sort of a security through obscurity approach,
| if enough people adopt one of these methods of obfuscation then
| the spammers absolutely will change their tools.
| sircastor wrote:
| I think I get more unsolicited email from related businesses
| trying to get a foot in the door with my company - I assume
| they're connecting dots either from LinkedIn or Github (probably
| both). This is an interesting solution to the problem, but I
| don't genuinely think that anyone is scraping websites for email
| addresses anymore. I don't think it's cost effective for the
| modern spammer.
| readmemyrights wrote:
| Funny I'm seeing this now. I've finally made the first tentative
| steps into making a website, and noticed that pandoc has an
| --email-obfuscation option, so the whole topic was on my mind. I
| don't remember the last time I received an actual spam email (not
| counting desperate marketers trying to remind me of that one
| website I tried ages ago). Funnily enough, the new frontier seems
| to be WhatsApp and SMS of all things. A month or two back I got
| a job offer from an Indonesian phone number on WhatsApp, and
| then something similar directly to my SMS. I didn't publish my
| phone number anywhere online; the closest thing to making it
| public was joining my college's WhatsApp group and giving my
| phone number to a bank for a student credit card, and honestly I
| wouldn't put leaking it to some spam agency beyond either.
|
| I'm using VoiceOver on macOS Chromium and I have the same
| experience as the NVDA user, although if I interact with the
| "link" I'll eventually find the email. If I wasn't aware of the
| obfuscation, however, I probably would just think the webpage was
| weird, saying "this is an email" but actually giving a mailto:
| link. In general, if you're doing something special to improve
| accessibility then odds are you're doing it wrong, and if it's
| anything web related the odds are at least 90%. Most
| accessibility issues on the internet are developers trying to be
| smart by using ARIA labels or such which usually just make it
| worse. The example I have to deal with most often are manpages on
| man.openbsd.org. All of their cross references to other manpages
| say something like "openssl, section 1" instead of "openssl(1)",
| which is what's displayed on the screen and what the browser's
| find command sees while searching.
|
| For completeness, I also tried the page with various terminal
| browsers, specifically lynx, felinks, w3m, and edbrowse. None,
| and I mean NONE of them could display the svg properly, they
| couldn't even recognise it as an image.
| _blk wrote:
| Seems like a great solution, but I'd like to embed the data
| directly rather than linking an external file. One issue I see
| then is that dumb scrapers just look for the email address in the
| page source, which includes an embedded SVG (whereas they might
| not fetch external <object> or <img> files). So for direct embeds,
| if the string is not otherwise encoded, that could leak the email
| address.
|
| While this obviously (re)introduces JS into the mix, how would a
| simple compressed string fare against base64 svg embedding?
|
| ```
| const compressedBase64Svg = '...';
|
| // decompress() is a placeholder for whatever routine matches the
| // compression used when the string was generated.
| function decompressAndInsertSVG(encodedData) {
|   const decodedData = atob(encodedData);
|   const decompressedSvg = decompress(decodedData);
|   const svgContainer = document.getElementById('svgContainer');
|   svgContainer.innerHTML = decompressedSvg;
| }
|
| decompressAndInsertSVG(compressedBase64Svg);
| ```
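|
| For the Base64 half of that question, a small sketch under
| assumptions: the SVG and address are made up, and the compression
| step is left out. It only shows that the embedded string no longer
| contains the address in a form a plain-text pattern would match.

```javascript
// Hypothetical SVG with a made-up address; ASCII only, so btoa() is safe here.
const svg =
  '<svg xmlns="http://www.w3.org/2000/svg"><a href="mailto:myemail@mydomain.tld">' +
  '<text y="16">myemail@mydomain.tld</text></a></svg>';

// Build step: Base64-encode the markup before embedding it in the page.
const encodedSvg = btoa(svg);

// Page side: decode at runtime to recover the original markup.
const decodedSvg = atob(encodedSvg);

// The Base64 alphabet has no "@", so the encoded string cannot contain one.
console.log(encodedSvg.includes('@'), decodedSvg === svg);
```

| Base64 is trivially reversible, so this only helps against scrapers
| that never try to decode what they find.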
| nojs wrote:
| This is a cool trick. The email is in cleartext in the source,
| meaning mailto works and copy-paste works. But most scrapers
| probably skip the .svg file.
| pdonis wrote:
| _> most scrapers probably skip the .svg file_
|
| But they won't as soon as they realize it's just easy to parse
| text that contains data they're looking for.
| CodeWriter23 wrote:
| Seems kind of easy to defeat: just read the SVG to extract the
| email address from the mailto: link contained therein. Bonus: the
| harvesting bots will now download all SVG files going forward.
| kindawinda wrote:
| google might start indexing your email
| saint-loup wrote:
| At that point, isn't adding a good old contact form a simpler
| solution? You can link it with your email address or other
| channels. It can even work with static websites; I hooked up
| mine with Nextcloud Forms.
|
| I appreciate the hacker creativity on display here, but as others
| said, obfuscating an email address raises accessibility issues.
| Hiding content from some programs and not others (spam bots vs
| assistive technologies) seems inherently a losing game, for you
| or for users.
| kelnos wrote:
| I gave up on this sort of thing. Spam filters are good enough
| nowadays that I don't think I see an increase in spam by having
| my email address publicly available without obfuscation. (That
| is, an increase beyond other spam sources, like crappy companies
| who have my email address for a legitimate purpose, but sell it
| to third parties.) In general I see less than 1 spam email hit my
| inbox per day, and that's fine.
|
| Granted, this may depend on email provider and spam filter, so
| YMMV, but it hasn't been an issue for me.
| niutech wrote:
| This requires loading an external SVG file, better use an inline
| version:
|
| <object data="data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20viewBox%3D%220%200%20200%2024%22%3E%3Ca%20href%3D%22mailto%3Amyemail%40mydomain.tld%22%3E%3Ctext%20x%3D%2250%25%22%20y%3D%2250%25%22%20dominant-baseline%3D%22middle%22%20text-anchor%3D%22middle%22%3Emyemail%40mydomain.tld%3C%2Ftext%3E%3C%2Fa%3E%3C%2Fsvg%3E" type="image/svg+xml"></object>
|
| Also have a look at this:
| https://spencermortensen.com/articles/email-obfuscation/
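|
| A small sketch of how an inline version like that could be
| generated, assuming the markup is percent-encoded with the standard
| encodeURIComponent(). The function and variable names are
| hypothetical; the SVG markup mirrors the one in the comment above.

```javascript
// Build a data: URI for an SVG, percent-encoding the markup so it can sit
// inside an HTML attribute value.
function svgToDataUri(svgMarkup) {
  return 'data:image/svg+xml,' + encodeURIComponent(svgMarkup);
}

const svgMarkup =
  '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 24">' +
  '<a href="mailto:myemail@mydomain.tld">' +
  '<text x="50%" y="50%" dominant-baseline="middle" text-anchor="middle">' +
  'myemail@mydomain.tld</text></a></svg>';

const dataUri = svgToDataUri(svgMarkup);
const objectTag = `<object data="${dataUri}" type="image/svg+xml"></object>`;
console.log(objectTag);
```

| The "@" becomes "%40" in the page source, which hides the address
| from a plain pattern match but not from a scraper that URL-decodes
| attribute values first.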
___________________________________________________________________
(page generated 2024-05-13 23:01 UTC)