[HN Gopher] Filtering out newsletter signup forms embedded in we...
       ___________________________________________________________________
        
       Filtering out newsletter signup forms embedded in web articles
        
       Author : k1m
       Score  : 47 points
       Date   : 2021-06-14 08:30 UTC (14 hours ago)
        
 (HTM) web link (www.fivefilters.org)
 (TXT) w3m dump (www.fivefilters.org)
        
       | kilroy123 wrote:
       | Am I the only one who doesn't think these are that big of a deal?
       | I'd gladly take this over a big ugly ad. Hell ANY ad at all.
        
       | rchaud wrote:
       | This by far is my biggest pet peeve of reading online: an article
       | can't even be shown in full. It has to be broken up into
       | disembodied sections, with ads, newsletter signups, or a block of
       | links appearing every few paragraphs.
       | 
       | Fortunately, the NYT appears to have scaled back on this. I'm
       | finding that I can do "Print to PDF" on the majority of articles
       | (excl. interactive stories) and the layout is very clean with no
       | interruptions.
       | 
       | If that changes, I still have the option of going to FF's
       | "reading view" and Print to PDF from there.
        
         | k1m wrote:
         | > This by far is my biggest pet peeve of reading online: an
         | article can't even be shown in full. It has to be broken up
         | into disembodied sections, with ads, newsletter signups, or a
         | block of links appearing every few paragraphs.
         | 
         | Absolutely. I'm actually a little worried that with the growing
         | popularity of utility-first CSS and build tools that replace
         | semantic information on HTML elements with randomly generated
         | values, it's going to be increasingly difficult to have filter
         | lists remove these.
         | 
         | Currently the selectors in Fanboy's Annoyance List target
         | semantic attributes, e.g. "newsletter-widget", but already
         | there are a lot of sites that don't contain any semantic info
         | around the sections that aren't a part of the content. The BBC
         | website is one example, they have a list of related links
         | between paragraphs of text on many articles, e.g.
         | https://www.bbc.com/news/uk-57464097. This is the HTML markup:
         | <div data-component="unordered-list-block"
         |      class="ssrcss-uf6wea-RichTextComponentWrapper e1xue1i84">
         |   <div class="ssrcss-18snukc-RichTextContainer e5tfeyi1">
         |     <div class="ssrcss-1pzprxn-BulletListContainer e5tfeyi0">
         |       <ul role="list">
         |         <li>[related link 1]</li>
         |         <li>[related link 2]</li>
         |         <li>[related link 3]</li>
         |       </ul>
         |     </div>
         |   </div>
         | </div>
         | 
         | These are just related links randomly inserted into the
         | content, with no clear attribute values marking them as such,
         | which makes automatic removal difficult.
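         | 
         | A minimal Python sketch of the difficulty described above,
         | assuming a filter that does a plain substring test on the class
         | attribute (roughly what a [class*="..."] rule does). The
         | semantic_html sample and the find_matches helper are
         | hypothetical, and the second sample is a shortened copy of the
         | BBC fragment quoted above:

```python
from html.parser import HTMLParser
from typing import List


class ClassSubstringMatcher(HTMLParser):
    """Collect tags whose class attribute contains a given substring,
    mimicking a [class*="..."] style filter rule."""

    def __init__(self, needle: str) -> None:
        super().__init__()
        self.needle = needle
        self.matches: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        class_value = dict(attrs).get("class") or ""
        if self.needle in class_value:
            self.matches.append(f"{tag}.{class_value}")


def find_matches(html: str, needle: str) -> List[str]:
    """Hypothetical helper: return the tags a substring rule would hide."""
    matcher = ClassSubstringMatcher(needle)
    matcher.feed(html)
    matcher.close()
    return matcher.matches


# Hypothetical markup that still carries a semantic class name.
semantic_html = '<div class="newsletter-widget"><form></form></div>'

# Shortened version of the BBC-style markup quoted in the comment.
generated_html = (
    '<div data-component="unordered-list-block" '
    'class="ssrcss-uf6wea-RichTextComponentWrapper e1xue1i84">'
    '<ul role="list"><li>[related link 1]</li></ul></div>'
)

semantic_hits = find_matches(semantic_html, "newsletter-widget")
generated_hits = find_matches(generated_html, "related")
print(semantic_hits)
print(generated_hits)
```

         | 
         | The first call finds the semantically named wrapper; the second
         | finds nothing, because none of the generated class values says
         | what the block is for.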
         | 
         | > If that changes, I still have the option of going to FF's
         | "reading view" and Print to PDF from there.
         | 
         | FF's reading view does well on a lot of articles. It handles
         | the BBC situation above quite well, but not others (see for
         | example the screenshot in our article with the newsletter
         | signup).
        
           | shortformblog wrote:
           | FWIW: The reason why they aren't easy to automatically remove
           | is because they are often put in manually by editors.
        
           | rchaud wrote:
           | I'm not a web developer (more like a hobbyist), so I never
           | used tools like Webpack to run 'builds'. Someone recently
           | explained this phenomenon to me when I asked about the
           | nonsensical class names I was seeing on some websites.
           | 
           | I now understand that those machine-generated classes are
           | designed to prevent styling conflicts on modern websites
           | where multiple developers' work may appear on a single page.
           | But with adblockers currently set up to detect cruft via
           | class names, I'm thinking that the clock is ticking on how
           | long these lists remain accurate.
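           | 
           | A rough Python sketch of that idea, under the assumption that
           | a build step derives each class name from a hash of the source
           | file and the original name. The scoped_class_name helper and
           | the file names are made up for illustration; real build tools
           | use their own naming schemes:

```python
import hashlib


def scoped_class_name(source_file: str, class_name: str) -> str:
    """Return a short, deterministic, meaningless class name.

    Hypothetical scheme for illustration only; it is not the algorithm
    of any particular build tool.
    """
    key = f"{source_file}:{class_name}".encode("utf-8")
    digest = hashlib.sha1(key).hexdigest()
    # Prefix with a letter so the result is always a valid CSS identifier.
    return "c" + digest[:8]


# Two developers both pick the class name "title" in different files.
header_title = scoped_class_name("Header.css", "title")
footer_title = scoped_class_name("Footer.css", "title")

# A semantic name that a filter list might target today.
widget = scoped_class_name("Article.css", "newsletter-widget")

print(header_title, footer_title, widget)
print("newsletter" in widget)
```

           | 
           | The two "title" classes no longer collide, but the hashed name
           | for "newsletter-widget" gives a filter list nothing to match
           | on.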
        
       | bartvk wrote:
       | Personally I use the "No, Thanks" extension.
       | 
       | It costs a bit of money. Daniel Kladnik, the developer, provides
       | you with an invoice so you can deduct it as a business, or ask
       | your boss to pay.
       | 
       | https://www.no-thanks-extension.com/
       | 
       | Note that I haven't compared it to Fanboy's Annoyance List
       | mentioned in the article. Ages ago, I just wanted a no-configure
       | option.
        
         | shantnutiwari wrote:
         | Another vote for No thanks-- just works out of the box and
         | blocks most (all?) irritating full screen "popups" that everyone
         | seems to love so much
        
         | traceroute66 wrote:
         | I'm afraid I take issue with your "ask your boss to pay"
         | statement.
         | 
         | As someone who has been involved with running small businesses,
         | I can tell you that "the boss" very quickly gets annoyed with
         | employees who nickel-and-dime their employer and try to get "the
         | boss" to pay for things that the employee should be paying for.
         | 
         | Sure if a piece of software, hardware or a service is an
         | essential part of your work (e.g. mobile phones for field
         | engineers) then it should be paid for in an appropriate manner
         | by "the boss" (or a suitable alternative provided, e.g. company
         | phone). No question, no argument.
         | 
         | But frankly, ad-blockers or other non-essentials? Forget about
         | it. Being a "good boss" includes being able to draw a line in
         | the sand somewhere. A "bad boss" who thinks "the company" can
         | pay for everything "just because" is doing themselves, their
         | employees and their company a disservice.
         | Businesses are not charities and everyone should be doing their
         | bit to keep overheads in control.
         | 
         | Not only that, but in some jurisdictions the local tax
         | authorities differentiate between items that meet a
         | demonstrable business need and those that do not.
         | This tax differentiation may not only impact the business
         | itself, but your own personal tax situation too.
        
           | Permit wrote:
           | > Being a "good boss" includes being able to draw a line in
           | the sand somewhere. A "bad boss" who thinks "the company" can
           | pay for everything "just because" is doing themselves, their
           | employees and their company a disservice.
           | 
           | I have never met an employee who would characterize a "good
           | boss" this way. At the very least I would not expect these
           | qualities to win you any favours with prospective employees.
           | 
           | No one looks at "free work lunches" as some negative
           | reflection of their boss' financial acumen. The closest
           | complaint I could think of is that the employee would rather
           | that money go directly into their pocket. Maybe there is a
           | middle ground here where you could allocate $50/year for
           | these sort of expenditures in order to prevent them from
           | getting out of hand?
           | 
           | I'm not sure, but I would consider reflecting on what really
           | makes a "good boss" and whether or not your employees would
           | agree with you on it. To me the distinction between a good
           | boss and a bad one is always decided by the employees, never
           | the boss.
        
             | traceroute66 wrote:
             | > Maybe there is a middle ground here where you could
             | allocate $50/year for these sort of expenditures in order
             | to prevent them from getting out of hand?
             | 
             | Here I would agree and indeed I think certain people
             | misunderstood my original comment.
             | 
             | What I was trying to say originally is a bad boss is one
             | who has an open-wallet policy to "but it's just $10 a year",
             | because that sort of policy will absolutely come back and
             | bite them or their company in the backside one day once the
             | numbers start adding up. So yes, absolutely, set a hard
             | dollar limit per employee and we're in agreement.
             | 
             | Because yes, if you don't, things will soon get out of hand
             | (as I posted in more detail a little further down).
        
           | criddell wrote:
           | It can cost more to get approval on some small amount than
           | the actual invoice is. Some places don't require approval for
           | amounts under $x.
        
           | unknown_error wrote:
           | Sorry, but if the employees have to worry about $10 a year
           | because of nickel-and-diming from the boss, you're either not
           | paying them enough or your process for small reimbursements
           | sucks. Why don't the employees have enough freedom to get the
           | tools to do their job in an appropriate way instead of
           | wasting your time and theirs discussing whether $10/yr for
           | adblocking is an essential part of their job? That argument
           | alone would cost you more than $10, not to mention the
           | productivity loss from intrusive advertising.
           | 
           | Yeesh, I'd hate to work for a boss like you...
        
             | traceroute66 wrote:
             | unknown_error
             | 
             | I'm sorry but the "it's only $10/year, the boss can pay for
             | it" argument is complete BS.
             | 
             | That "it's only $10/year" multiplied by how many employees?
             | It soon adds up.
             | 
             | Then we move into the same BS "only $10/year" argument that
             | charities use when they are asking you to donate "only $5"
             | or that Disney+ uses when it says it's "only $10/month".
             | 
             | The "only $5" to the charity and the "only $10/month" to
             | Disney+ is ON TOP OF all the other household expenses. So
             | really, you are foolish to buy into the "it's only" school
             | of nonsensical argument.
             | 
             | So we soon find ourselves in an exponential situation. One
             | day we say "yes, the boss will pay $10/year for this" ....
             | roll forward a few weeks then it's "oh, just another
             | $10/year for that" ... etc. etc. ... multiplied by X
             | employees .... soon equals $$$$.
             | 
             | That's why a good boss is one who is capable of standing
             | their ground and putting that line in the sand.
             | 
             | Otherwise the overheads soon run away from you and you find
             | yourself spending tens of thousands of dollars a year on
             | "but it's just $10" non-essential stuff.
             | 
             | Basic business good practice.
        
               | edmundsauto wrote:
               | What if you just looked at it as a cost for employee
               | satisfaction? If I ask my boss for a license to try a new
               | tool, or something to make my workflow better, it's a win
               | for the company if it works out. If it doesn't work out,
               | it's a win for my morale, because I don't have a
               | skinflint of a boss.
               | 
               | When a manager evaluates what is "essential", it creates
               | a dynamic where they either agree with me (which wastes
               | both our time), or they disagree (which creates
               | conflict). For $10/year, it's better to avoid the
               | conflict and just buy the stupid tool that makes the IC's
               | life better.
        
               | unknown_error wrote:
               | It's not so much the particular dollar amount, but that
               | it sounds like your process for evaluating whether a tool
               | is "essential" is basically "whatever the boss feels like
               | in any given moment" instead of having a transparent
               | evaluation process that weighs, say, $10/yr/employee vs
               | the potential risk of malware/ransomware or even
               | simply lost productivity from intrusive ads. Is there
               | ANYTHING an employee should be able to spend money on
               | without your direct approval? Is there a process for
               | evaluating what is an essential business expense, per
               | position? How small a business is yours?
               | 
               | It's the lack of agency and trust in your people, rather
               | than the specific dollar amount, that bothers me. If it
               | was a $10,000 or even $1,000 purchase (such as buying
               | that ad blocker for the whole company), yes, of course it
               | should merit more evaluation, but is there really a need
               | to micromanage every single transaction instead of having
               | an expense account of some sort and a procedure for small
               | purchases and reimbursements?
               | 
               | Not arguing that all your employees should be able to
               | arbitrarily spend company funds willy nilly, just that
               | they deserve some say in how they do their jobs, and the
               | tools they need for it, instead of simply being
               | told "no, because if we scaled that $10 up to all the X
               | employees it would be too expensive..." Did all the X
               | employees even ask for that..?
        
       | ta988 wrote:
       | Website owners are desperate to capture your attention, they have
       | a few seconds to do so. What they don't realize is that being
       | annoying is also what causes people to only stay for a few
       | seconds on their site. For me it is a good sign of someone who
       | cares more about advertisement than content so it makes me close
       | the page 99% of the time.
        
         | unknown_error wrote:
         | Totally agreed... if only I could get management to listen too.
         | If it were up to me as a dev, there would NEVER be
         | popups/overlays. We spend forever optimizing UX before and
         | during site design, only to have some middle manager come in at
         | the last minute and say "We need a newsletter modal signup to
         | drive conversions." They end up chasing signup numbers instead
         | of overall user satisfaction.
        
       | tzahifadida wrote:
       | Actually I think that a lot of newsletters from professionals I
       | am interested in are worthwhile, and these are edge cases. I would
       | be more worried about not seeing them because they were filtered
       | out by some plugin. I would say that if a site has a harassing
       | newsletter then the whole site is probably not worth my time.
        
         | inshadows wrote:
         | I also think that hiding newsletter sign up forms is bad,
         | because you lose the ability to detect that the site is just
         | blog spam. Real blogs offer RSS/Atom. Only those freaks
         | fighting for user engagement dumping zero content articles
         | offer newsletters.
        
           | shortformblog wrote:
           | Hi, I run a long-form newsletter. I send it out twice a week.
           | I spend many bleary nights putting content together for it,
           | and its pieces occasionally appear here on Hacker News. I do
           | a lot of research for it--and pay lots of money for tools to
           | access that research. I often do interviews. It often brings
           | in contributors. Those contributors are paid.
           | 
           | It has an RSS feed as a service to readers. But it was built
           | as a newsletter because I specifically wanted to experiment
           | with that form and felt that I could do interesting things
           | with it.
           | 
           | Since I started it six and a half years ago, a couple
           | interesting things have happened in the sector: One, the
           | interest in building a business model around blogs has
           | shifted over to the newsletter space. And two, people who
           | wouldn't have paid to access a blog now are willing to pay to
           | access a newsletter.
           | 
           | I like to keep my content open to a large number of people,
           | so I rely on newsletter sponsorships as a business model, but
           | my advertising approach is very minimal compared to other
           | publishers. I'm not particularly aggressive with my signup
           | form--I run it at the top of my front page and put a pop-up
           | option at the bottom of the article, and sometimes I
           | implement an occasional interstitial when an article seems
           | like it's doing very well and I want to catch the user's eye.
           | 
           | But I don't do as much as I could. It probably could be
           | larger if I did.
           | 
           | But this isn't about me. This is about this comment. Which is
           | to say: This is a really cruel thing to say about
           | newsletters. I know a lot of creators who put hours of work
           | into their newsletters each week on top of full-time gigs,
           | grinding it out with the goal of hoping to do something with
           | that newsletter.
           | 
           | It also ignores the business realities of the newsletter
           | space. Trying to build a blog into a business is really tough
           | these days (it was back then too, something I know because I
           | was blogging back then). But newsletters have created a path
           | of opportunity for those who want to build things
           | independently.
           | 
           | And while I will never claim that the work they do is
           | perfect, it's certainly not blog spam.
           | 
           | So I request, hey, maybe research the space before you use
           | such a broad brush. Thanks.
        
             | [deleted]
        
           | mschuster91 wrote:
           | > Real blogs offer RSS/Atom. Only those freaks fighting for
           | user engagement dumping zero content articles offer
           | newsletters.
           | 
           | The thing is, ever since Google Reader got terminated, the
           | user base of RSS/Atom clients has been steadily dropping
           | down. Firefox IIRC terminated "dynamic bookmarks" _years_
           | ago, and even back when it existed it didn't support
           | notification.
           | 
           | Yes, us nerd crowd knows that RSS exists, but ever tried
           | asking your boss or your teenage kid what "Atom" is?
           | 
           | Actual mail newsletters are the only way left other than even
           | more obnoxious browser push nags to get your users notified
           | that you have new content.
        
             | inshadows wrote:
             | Your point is valid. Still, I find that most "blogs"
             | offering newsletters these days are frivolous content
             | dumping grounds, and the reason for their existence is
             | likely just building Internet persona or something.
             | 
             | EDIT: Lots of such noise (often appearing here on HN) comes
             | from Substack.
        
               | k1m wrote:
               | > the reason for their existence is likely just building
               | Internet persona or something.
               | 
               | Another appeal of newsletters I think is distrust
               | (rightly in my opinion) in corporate algorithms powering
               | people's feeds. I'd love to see more widespread adoption
               | of RSS again, rather than a push to get everyone signed
               | up to a bunch of different email newsletters. Perhaps
               | once people feel overwhelmed with their inbox getting
               | filled with email newsletters, RSS will make a
               | resurgence.
               | 
               | I wrote the piece linked here, and it wasn't intended to
               | be an anti-newsletter post. Just against newsletters
               | being pushed on readers in the middle of the article
               | they've started reading, as opposed to a sidebar or at
               | the end of the article. We have web apps that produce
               | stripped-down articles to be read on e-readers or printed
               | out. These newsletter signup requests often blend into
               | the content as just another paragraph of text, which can
               | be a little jarring when you're engrossed in reading.
               | 
               | Quite a few of the writers I like are on Substack, and
               | earn money through it. Substack also offers full-text RSS
               | feeds (for publicly-accessible content, not sure what
               | level of RSS support there is for paid content - perhaps
               | that's still email-only). And their Reader has RSS
               | support too -
               | https://news.ycombinator.com/item?id=25444507 - I hope
               | that's a sign that they'll be doing more with RSS in the
               | future.
        
             | Breza wrote:
             | I still miss Google Reader. I wonder if Google wishes it
             | had kept it running now that their efforts to launch social
             | networks failed and Twitter/Facebook have taken over much
             | of the space that RSS used to occupy. Google could have
             | built such a huge dataset of individual-level reader
             | preferences (not saying that's the best use of RSS, just
             | that there's a business case).
        
         | Fnoord wrote:
         | If I want to sign-up to a newsletter from a company I will
         | usually find it either manually, or I subscribe to their RSS
         | feed. In almost every case I visit a website, I do not want to
         | subscribe to a newsletter. Therefore, it distracts from my
         | goal. To draw a parallel: in almost every case I start up an
         | Android application, I do not want to rate the Android
         | application. I don't want to spend any time on either of such
         | nonsense. In fact, from today on, I will give a bad rating to
         | Android applications which ask me to rate them, just because I'm
         | done with that spam. As for newsletters, if I want them, I will
         | find them. Take for example Bruce Schneier's blog. You can find
         | the newsletter without getting spammed about it. That's the
         | correct way.
        
       | greggturkington wrote:
       | I keep a long list of these I can contribute. The authors need to
       | take more advantage of fuzzy attribute matching. Really go after
       | those patterns you see frequently from Mailchimp and other major
       | newsletter vendors first!
       | [class*="mc_embed_signup"]
       | [id*="mc_embed_signup"]
       | [class*="FreeNewsletter" i]
       | [class*="inline-newsletter"]
       | [class*="newsletter-article"]
       | [class*="newsletter-form"]
       | [class*="newsletter-signup"]
       | [class*="newsletter-tout"]
       | [class*="newsletter-widget"]
       | [class*="NewsletterCard" i]
       | [class*="NewsletterSignup" i]
       | [class*="newssignup"]
       | [id*="SignupForm" i]
       | [id*="signupWrapper" i]
       | 
       | CSS isn't so simple anymore though: with CSS-in-JS we're moving
       | away from deterministic class names. We can still target other
       | attributes with CSS though, which I don't see used in these lists
       | nearly enough, for example data attributes:
       | [data-title*="Mailchimp"]
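       | 
       | As a hedged illustration of what rules like these do, here is a
       | small Python sketch of substring matching on arbitrary attributes,
       | including the case-insensitive "i" flag. The RULES subset, the
       | sample markup and the css-... class names are invented for the
       | example, and str.lower() only approximates CSS's ASCII
       | case-insensitivity:

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple

# (attribute, substring, case_insensitive), mirroring [attr*="value" i].
Rule = Tuple[str, str, bool]

# A small subset of the selectors listed above, rewritten as tuples.
RULES: List[Rule] = [
    ("class", "newsletter-signup", False),
    ("class", "NewsletterCard", True),
    ("id", "mc_embed_signup", False),
    ("data-title", "Mailchimp", False),
]


def rule_matches(rule: Rule, attrs: Dict[str, Optional[str]]) -> bool:
    """Return True if the element's attributes satisfy one rule."""
    name, needle, ignore_case = rule
    value = attrs.get(name)
    if value is None:
        return False
    if ignore_case:
        # Approximation of the CSS "i" flag (ASCII case-insensitive).
        return needle.lower() in value.lower()
    return needle in value


class RuleMatcher(HTMLParser):
    """Record (tag, attribute, substring) for each element a rule hits."""

    def __init__(self, rules: List[Rule]) -> None:
        super().__init__()
        self.rules = rules
        self.hits: List[Tuple[str, str, str]] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        for rule in self.rules:
            if rule_matches(rule, attr_map):
                self.hits.append((tag, rule[0], rule[1]))
                break  # one matching rule is enough to hide the element


def matched_elements(html: str, rules: List[Rule] = RULES):
    matcher = RuleMatcher(rules)
    matcher.feed(html)
    matcher.close()
    return matcher.hits


# Invented sample: one semantic class, one data attribute, one plain tag.
sample = (
    '<div class="article-newslettercard-wrap">a</div>'
    '<section class="css-1x2y3z" data-title="Join our Mailchimp list">b</section>'
    '<p class="css-9q8r7s">c</p>'
)
hits = matched_elements(sample)
print(hits)
```

       | 
       | The section element has a meaningless class name but is still
       | caught through its data-title attribute, which is the point about
       | targeting attributes other than class.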
        
       ___________________________________________________________________
       (page generated 2021-06-14 23:02 UTC)