[HN Gopher] You either die an MVP or live long enough to build c...
       ___________________________________________________________________
        
       You either die an MVP or live long enough to build content
       moderation
        
       Author : mmcclure
       Score  : 559 points
       Date   : 2021-09-28 15:53 UTC (7 hours ago)
        
 (HTM) web link (mux.com)
 (TXT) w3m dump (mux.com)
        
       | ModernMech wrote:
       | Our MVP needed content moderation. We put a database tool up and
       | immediately, our first user started using it to create a public-
       | facing database of his porn collection. It was... quite the
       | collection.
        
         | dkh wrote:
         | It's inevitable that these folks would show up pretty quickly,
         | but your very first user? Extremely impressive!
        
       | view wrote:
       | I'm building a human content moderation service as an API for
       | developers. It's not fully ready yet (expected next week) but
       | people can sign up and start exploring the docs. I'd love to hear
       | your feedback and any features you might want to see:
       | 
       | https://moderationsystem.com/
        
         | abraae wrote:
         | Feedback: I went to the site with one question in mind - what
         | type of moderation do you use?
         | 
          | e.g. would your service be useful to a Christian social
          | network? Or to a goth band forum? In the US? In Iceland?
         | 
         | i.e. how do you sync your service to the style, nuance and
         | voice of your customer?
        
       | jonshariat wrote:
       | I'll never forget having to be a moderator for a somewhat popular
       | forum back in the day and oh man did I learn how a few people can
       | make your life hell.
       | 
        | One thing not mentioned much in these discussions is the poor
        | moderators. Having to look at all that stuff, some of which
        | can be very disturbing or shocking (think death, gore, etc.,
        | as well as the racy things), really takes a toll on the mind.
        | The more automation there is, the less moderators have to deal
        | with, and what's left is usually the tamer middle-ground
        | content.
        
         | wffurr wrote:
         | Someone still ends up reviewing images for the ML training
         | dataset.
         | 
         | That's still a huge improvement over every mod everywhere
         | seeing the same images repeatedly, but someone has to make the
         | call at some point.
        
         | [deleted]
        
         | Goronmon wrote:
          | _I'll never forget having to be a moderator for a somewhat
          | popular forum back in the day and oh man did I learn how a
          | few people can make your life hell._
         | 
         | I was also a mod for a popular gaming forum way back in the
         | day. It was pretty miserable looking back.
         | 
          | Personally, for me, the extreme/shocking content wasn't the
          | biggest issue. That stuff was quick and easy to deal with. If
          | you saw that type of content you just immediately deleted it
          | and permanently banned the account. Quick and easy.
         | 
         | What was a lot harder were the toxic users that just stuck
         | around. Not doing anything bad enough to necessarily warrant a
         | permanent ban, but just a constant stream of shitty behavior.
         | Especially sometimes when the most toxic users were also some
         | of the most popular users.
        
           | nicbou wrote:
           | That was also my experience moderating a medium-sized city
           | subreddit. Bigger problems were easily dealt with. Toxicity
           | was a lot harder to deal with, especially when it's so easy
           | to create a throwaway account. I quit when one user decided
           | to target me personally, and kept evading bans to cause more
           | grief.
           | 
           | All of this crap, and your reward is more complaints, more
           | demands.
        
           | zy0n911 wrote:
            | This forum didn't happen to start with a T and end with a
            | G, did it? (Shortened acronym)
        
           | sroussey wrote:
           | We had some of the crazy people track us down and call in
           | bomb/death threats to our office building.
           | 
            | So many thought we were in collusion with a specific forum
            | moderator (out of a million forums) and got so incensed.
            | And this was in the early 2000s, which we think of as a
            | saner time.
        
             | lelandfe wrote:
             | A close friend of mine is a primary contributor to an
             | extremely popular console emulator. He learned quickly to
             | author under an alias which he keeps secret - even from
             | most of our friend group.
             | 
             | It's bizarre that he has to keep this real love of his,
             | which he's devoted hundreds and hundreds of hours to, so
             | close to his chest.
             | 
             | But sadly The Greater Internet Fuckwad Theory holds true
             | today.
        
           | user-the-name wrote:
           | > What was a lot harder were the toxic users that just stuck
           | around. Not doing anything bad enough to necessarily warrant
           | a permanent ban, but just a constant stream of shitty
           | behavior. Especially sometimes when the most toxic users were
           | also some of the most popular users.
           | 
           | What people find out, again and again, is that you just ban
           | those users. Don't need an excuse. Just ban them. Even if
           | they are popular. Your community will be much better once you
           | do.
        
             | doublerabbit wrote:
             | > What people find out, again and again, is that you just
             | ban those users. Don't need an excuse. Just ban them. Even
             | if they are popular.
             | 
              | Rule #1 of moderation: keep bans in an open, transparent
              | log and you'll find the community responds positively.
        
         | saas_sam wrote:
          | There have been a bunch of articles lately about the horrors
          | that Facebook moderators have to pore over. FB has been
          | forced to pay $MMs to some of them for mental health:
         | https://www.bbc.com/news/technology-52642633
        
         | wainstead wrote:
         | > Having to look at all that stuff, some of which can be very
         | disturbing or shocking
         | 
         | Yup, was the designated person to report all child porn for our
         | photo-sharing website. It was horrific. Some of those images
         | still haunt me today, they were so awful. And the way the
         | reporting to the NCMEC[0] server worked, you had to upload
         | _every single image_ individually. They did not accept zip
         | files or anything at the time. It was a giant web form that
         | would take about forty image files at once.
         | 
         | [0] https://www.missingkids.org/HOME
        
         | echelon wrote:
         | > I'll never forget having to be a moderator for a somewhat
         | popular forum back in the day
         | 
         | Similar experience, though I'll say that the worst was dealing
         | with other teenagers that threatened suicide when you banned
         | them. That always took a lot of effort to de-escalate and was a
         | complete drain on personal mental health.
         | 
         | I could deal with porn, shock images, and script kiddie
         | defacements, but having people threaten to kill themselves was
         | human and personal. It hurt, especially when the other person
         | was legitimately having a personal crisis.
         | 
         | I still think about some of these people and wonder if they're
         | okay.
        
           | xwdv wrote:
           | Several years ago a popular gaming forum with a significant
           | teenage audience I used to read had declared a simple policy
           | toward threats of suicide. If you were threatening to kill
            | yourself, _do it_, and stop messaging the mods; they are
            | not here to talk you down from a ledge. It seemed pretty
            | effective.
        
             | fragmede wrote:
             | That's horrible! Did you run that plan past any lawyers?
        
         | ericd wrote:
         | Even without seeing that stuff, seeing a constant stream of bad
         | behaviors with the probably-good behavior filtered out can
         | subtly change your priors about people - it makes you start
         | thinking people suck more in general, kind of like how watching
         | news where they show the worst of the worst makes one trust
         | people less.
         | 
         | I definitely used to notice this after some time working on our
         | moderation queues.
        
       | tomcam wrote:
       | There is an untold story here which is how incredibly well HN is
       | moderated. Don't know how it can remain so good. I feel like the
       | center cannot hold. The site seems to me hideously understaffed,
       | yet they do a pretty much perfect job of moderating. Would love
       | to know if it is all human, supplemented by ML, or what.
        
       | duxup wrote:
       | So true.
       | 
       | In HN articles where we discuss social media moderation there's
       | often this idea that "they shouldn't be doing this at all". But I
       | think for most companies and users ... they won't like what a
        | completely moderation-free site looks like.
       | 
       | So here we are with this painful problem.
       | 
        | I kinda wish there were a "real person with an honest
        | identity" type system where we could interact without the
        | land of bots and dishonest users and so forth. But
       | that obviously brings its own issues.
        
         | throwawaysea wrote:
         | The problem is where the line is drawn. Arguments against
         | explicitly illegal content (like CSAM) or unhelpful content
         | (like spam) are used to justify content moderation that goes
         | far beyond. Moderation on the biggest social media platforms
         | (like Facebook, Twitter, TikTok, etc.) includes more than just
         | basic moderation to make the platform viable. They include a
         | number of elements that are more like censorship or propaganda.
         | These platforms ultimately bias their audience towards one set
         | of values/ideologies based on the moderation policies they
         | implement. And given that most of these companies are based in
         | highly-progressive areas and/or have employee bases that are
         | highly-progressive, it is pretty clear what their biases are.
         | This present reality is unacceptable for any society that
         | values free and open discourse.
        
           | Nasrudith wrote:
            | Propaganda and bias are ultimately subjective notions.
            | Suggest, back in the day, that kings were no different
            | from the rest of us aside from what their parents did to
            | seize power, and it would have been slammed as wicked
            | propaganda denying their divine right. Heck, just suggest
            | that disfavored groups are people!
        
         | commandlinefan wrote:
         | > they won't like what a completely moderation free site looks
         | like.
         | 
         | So you say, but we've never actually had a chance to see one.
          | We _have_ seen content moderation slippery-slope its way to
          | highly opinionated censorship... every time it's been tried.
        
           | duxup wrote:
           | >but we've never actually had a chance to see one
           | 
            | Have we not seen site after site that starts out this way
            | eventually begin moderating, every time? And we have
            | darkweb sites...
           | 
           | What are we missing?
        
           | unethical_ban wrote:
           | This is false. 4chan, some early reddit, voat, and a number
           | of other sites.
           | 
            | Moreover, that's the point: it's as impossible to have a
            | healthy anonymous forum available to the world as it is to
            | have a large society with no laws or government that isn't
            | dystopian.
        
             | commandlinefan wrote:
             | > early reddit
             | 
             | Reddit is the best example of why heavy moderation is worse
             | than lack of moderation.
        
               | ronsor wrote:
               | That's only because the most powerful reddit moderators
               | are basically corrupt dictators.
        
               | commandlinefan wrote:
               | It's not just reddit, though - if you're going to start
               | moderating, you have a meta-problem to address which is
               | what to do about opinionated moderators.
        
               | unethical_ban wrote:
               | All mods are opinionated. They're either humans, or bots
               | programmed by humans. I don't see why you're trying to
               | deny humanity to moderators.
        
               | unethical_ban wrote:
               | It's a difficult problem.
               | 
               | No moderation = porn, gore and nazi discussion. Period.
               | 
               | "Light" moderation often = bad-faith trolls taking
               | advantage of your moderation.
               | 
               | HN does a good job of moderating for civility and effort
               | of the post, rather than ideology.
               | 
               | Here's the thing: Talking about controversial topics, or
               | debating with people who think differently than you,
                | takes a lot of effort online, because there are so many
                | trolls and others baiting you into defending a position
                | without putting any effort into explaining their own.
               | 
               | So yeah, heavy moderation is an unfortunate necessity in
               | some forums.
        
           | ziml77 wrote:
           | There used to be a fair chance that you'd stumble on CSAM on
           | 4chan. Without filtering and aggressive moderation, that's
           | what ends up happening (yes 4chan did have moderation to
           | delete the stuff back then and dish out IP bans, but it
           | wasn't fast enough to save people from seeing those things)
        
           | Veen wrote:
           | One alternative to moderation is to make people pay to post.
            | If you set the price high enough, you'll dissuade most
            | spammers and some other species of asshole. Unfortunately,
           | people who hate moderation also tend to hate paying for
           | online services.
        
         | bogwog wrote:
         | > I kinda wish there was an imaginary "real person with an
         | honest identity" type system that did exist where we could
         | interact without the land of bots and dishonest users and so
         | forth. But that obviously brings its own issues.
         | 
          | That sounds like it could be done in a way that isn't
          | terrible.
         | As a user, you sign up with an identity provider by submitting
         | personal documents and/or an interview to prove that you're a
         | real person.
         | 
         | Then, when you sign up to an app/service, you login with the ID
         | provider and come up with a username for that service.
         | 
          | The ID provider does not give the website provider any of
          | your personal information; they just verify that you exist
          | (and you log in through their secure portal).
         | 
         | The identity providers could further protect privacy by
         | automatically deleting all of your personal documents from
         | their databases as soon as the verification process is
         | complete. They could also have a policy to not store any logs,
         | such as the list of services you've signed up for.
         | 
          | This could still be gamed (ex: a phone scammer tricking
          | someone into getting verified with a provider to get a valid
          | ID), but it'd make things much harder and costlier.
         | 
         | Am I missing anything obvious that would make this a terrible
         | idea?
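          | 
          | A rough sketch of the handoff I'm imagining (all names
          | hypothetical; a real design would use asymmetric signatures
          | so services could verify without holding the provider's
          | key):
          | 
          |   import hashlib, hmac, secrets
          | 
          |   PROVIDER_KEY = secrets.token_bytes(32)  # provider-only
          | 
          |   def issue_credential(person_id: str, service: str):
          |       # Runs after document checks pass. The pseudonym is
          |       # stable per (person, service): re-verifying the same
          |       # person yields the same ID, so bans stick, but two
          |       # services can't correlate the same person.
          |       pseudonym = hashlib.sha256(
          |           f"{person_id}:{service}".encode()).hexdigest()[:16]
          |       sig = hmac.new(PROVIDER_KEY,
          |                      f"{pseudonym}:{service}".encode(),
          |                      hashlib.sha256).hexdigest()
          |       return pseudonym, sig
          | 
          |   def verify_credential(pseudonym, service, sig) -> bool:
          |       # Provider-side check: the service never sees your
          |       # documents, only the opaque pseudonym it stores as
          |       # the account key.
          |       expected = hmac.new(PROVIDER_KEY,
          |                           f"{pseudonym}:{service}".encode(),
          |                           hashlib.sha256).hexdigest()
          |       return hmac.compare_digest(expected, sig)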
        
           | freefruit wrote:
           | Governor Dao has a better working system for this already.
           | Authenticated anonymous identities. They are currently
           | targeting the NFT space, but this tech could also be applied
           | to online communities.
           | https://authentication.governordao.org/
        
           | duxup wrote:
           | The thing I was thinking of when it comes to the ID provider
           | is the potential power an ID provider has... if say you
           | wanted to be validated, but still have some anonymity in some
           | cases ... or just if they decide to invalidate you for any
           | given reason.
           | 
           | Granted all very hypothetical stuff, I'd give it a spin for
           | sure.
        
           | PeterisP wrote:
           | Your idea becomes useless for deterrence from the
           | "automatically deleting" point. The only benefits for having
           | users who are "real person with an honest identity" accrue
            | when people either can't make many accounts (so banning a
            | user actually bans them instead of simply making them get
            | a new account) or when you can identify them in case of
            | fraud
           | or child sexual abuse material or some such.
           | 
           | So at the very least the "identity provider" absolutely needs
           | to keep a list of all real identities offered to a particular
           | service; otherwise the bad actors will just re-verify some
           | identity as many times as needed.
           | 
           | But if you give up the "hard privacy" requirement, then
           | technically it's possible. It would also mean that the
           | identity provider would sometimes get subpoenas to reveal
           | these identities.
        
         | Nasrudith wrote:
         | I suspect that even if we could we would wind up with a
         | disturbing "reverse Turing test". The dark suggestion that
          | there isn't any difference between us and other
          | malfunctioning machine-learning arrangements trained on huge
          | data sets. They may objectively be Homo sapiens with honest
          | identities, devoted to something which makes them act
          | indistinguishably from a bot.
        
       | rtkwe wrote:
        | It's really funny: every time a "we don't censor" platform
        | pops up catering to the American right, they speed-run going
        | from moderation == censorship to we're-moderating-our-platform
        | in record time. Turns out moderation is really important to
        | making a platform for a community.
        
         | User23 wrote:
         | How is that funny? Every platform has to block illegal content.
         | Every platform wants to block low value content like spam. Many
         | platforms want to block obscenity and pornography. None of this
         | is in any way news to any of the platforms you're alluding to.
         | 
         | The interesting distinction between platforms is not whether
         | they moderate, but what lawful and non-abusive (of the platform
         | itself) content they permit.
         | 
         | Edit: Child is incorrect. The vast majority of moderation on
         | free speech platforms is criminal threats and other illegal
         | speech.
        
           | rtkwe wrote:
            | It's not illegal content I'm talking about. I'm
            | specifically thinking of sites like Parler and Gab, which
            | very loudly and specifically start out as anti-censorship
            | havens for anything legal, aimed at people on the American
            | right who feel they're being censored off of the regular
            | platforms. Then they quickly learn that no moderation means
            | you'll get absolutely flooded with trolls who aren't fans
            | of your chosen ideology and are willing to spam and troll
            | you. That quickly encourages them to start actually
            | moderating, in some regards the exact thing they were
            | created in opposition to, because what they're actually mad
            | about is specific moderation decisions, not the idea of
            | moderation in general.
        
           | thomascgalvin wrote:
           | The irony comes from the fact that their moderation almost
           | always falls into two categories:
           | 
           | 1. They have to moderate the content that got them kicked off
           | the original platform, because it turns out nobody wants to
           | buy ad space on a forum dedicated to why the Jews are
           | responsible for all of the world's evils; and
           | 
           | 2. They choose to moderate dissenting political opinions,
           | which is just bald hypocrisy.
        
         | Sohcahtoa82 wrote:
         | If you create a platform with absolutely zero censorship, you
         | become a repository for child porn. I participated in Freenet
         | many years ago because I liked its ideas (And thought it would
         | have been a nice way to pirate games without my ISP being able
         | to know), but it got a reputation for being used for CP, and I
         | promptly deleted it, because I want no part in that.
         | 
         | If you merely censor illegal content, you will become a home
         | for disinformation and ultra right-wing conspiracies. See
         | Parler.
         | 
         | In either case, and especially the first, you're likely to get
         | kicked off your hosting platform and get a lot of attention
         | from the government.
         | 
          | I don't think it's possible to create a "we don't censor"
          | platform without hosting it in some foreign country that
          | doesn't care about US laws and also hiding that you're the
          | one running it.
        
           | throwawaysea wrote:
           | > If you merely censor illegal content, you will become a
           | home for disinformation and ultra right-wing conspiracies.
           | See Parler.
           | 
           | What do you count as disinformation and why is it a problem?
           | If you disagree with something you can ignore it and move on,
           | or engage with it and respond with your own counter-argument.
           | It doesn't seem like a problem that reduces the viability of
           | the entire platform. It is also strange to me that you seem
           | to think a lack of censorship only favors "ultra right-wing"
           | conspiracies. I saw a lot of disinformation about policing
           | being spread throughout 2020 without much evidence. Those who
           | pushed those narratives did not face any moderation for their
           | misinformation. I recall as well when Twitter, Medium, and
           | others banned discussions of the lab leak theory. The pro-
           | moderation crowd unwittingly aided in the CCP's avoidance of
           | accountability and smeared some very rational speculation as
           | disinformation. I don't think I want anyone - whether the
           | government, powerful private companies, or biased moderators
           | - to become the arbiters of permitted opinions.
           | 
           | > In either case, and especially the first, you're likely to
           | get kicked off your hosting platform and get a lot of
           | attention from the government.
           | 
           | It's also curious to me that you mention Parler, because
           | January 6th was organized more on other platforms than on
           | Parler. Silicon Valley acted in unison against Parler because
           | they share the same political biases among their leaders and
           | employees, and because they share that degree of monopolistic
           | power (https://greenwald.substack.com/p/how-silicon-valley-
           | in-a-sho...). The darker part of this saga is that sitting
           | members of the US government pressured private companies
           | (Apple, Google, Amazon) to censor the speech of their
           | political adversaries by banning Parler
           | (https://greenwald.substack.com/p/congress-escalates-
           | pressure...), in what can only be called an abuse of power
           | and authority. When tech companies are facing anti-trust
           | scrutiny and regulatory pressure on other issues, why
           | wouldn't they seek favor by doing the incoming government's
           | bidding and deplatforming Parler? I feel like the actions
           | observed in the Parler saga are less about moderation and
           | more about bias and power.
        
             | PeterisP wrote:
             | It is a problem because you'll get thrown out by your
             | hosting and other service providers if you don't moderate
             | your content; so if you want to keep running your service,
              | not moderating is simply not a practical option. _That_
              | is why Parler is mentioned; they are a demonstration that
              | it's not practical to keep operating without accepting a
              | duty to moderate (as Parler did eventually), even if you
              | try really, really hard.
             | 
             | And while there are a lot of conspiracies, all of which
             | will be on your site if you don't moderate, most of them
             | will be tolerated by others but it's the ultra right-wing
             | conspiracies / nazis / holocaust deniers that will cause
             | your service threats of disconnection; so you'll either
             | start moderating or get your service killed in order to
             | protect _them_.
             | 
             | I understand you don't want anyone - whether the
             | government, powerful private companies, or biased
             | moderators - to become the arbiters of permitted opinions;
             | however, you don't really get to choose (and neither do I);
             | currently there _are_ de facto arbiters in this world.
        
             | CamperBob2 wrote:
             | _If you disagree with something you can ignore it and move
             | on, or engage with it and respond with your own counter-
             | argument._
             | 
             | I sympathize with a lot of what you wrote (and didn't
             | downvote it), but the Trump era has highlighted a serious
             | problem with your specific point above. The marginal cost
             | of bullshit is zero. It takes basically no effort to post
             | more of it, while it always takes at least a small amount
             | of effort to debunk it.
             | 
             | Worse, the bullshitter usually has the first-mover
             | advantage. To claim the initiative, all you have to do is
             | post a new thread predicated on far-right propaganda or
             | conspiracy theories or hijack an existing one. Once lost,
             | the rhetorical high ground is difficult and time-consuming
             | to reclaim. As soon as you argue with the shitposter, they
             | effortlessly shift their role from aggressor to victim, as
             | some would suggest is happening in this very conversation.
             | 
             | I've always maintained that the antidote to bad speech is
             | more speech. A few years ago I would have died on this hill
             | at your side. But principles that don't work in practice
             | are useless... and this one, having been tested, simply
             | doesn't work in practice. The sheer quantity of bullshit
             | has an ironclad quality all its own.
        
             | Sohcahtoa82 wrote:
             | > What do you count as disinformation
             | 
             | I won't get into the argument of "Well then who is the
             | arbiter of truth?" because honestly, I don't have an
             | answer. It can't be the government for obvious reasons, but
             | it also can't be private corporations, and certainly can't
             | be the general public. That leaves...nobody. Maybe a non-
             | profit organization, but even those could easily be
             | corrupted.
             | 
             | > and why is it a problem?
             | 
             | Nearly 700,000 US deaths from COVID so far, a number that
             | continues to rise due to anti-vax disinformation convincing
             | people to not vaccinate.
             | 
              | Disinformation is _literally killing people_ by
              | contributing to the continued spread of a pandemic. It's
              | absolutely insane to me that you would genuinely ask why
              | disinformation is a problem.
             | 
             | Just because I don't have a solution to a problem doesn't
             | mean the problem doesn't exist.
             | 
             | > If you disagree with something you can ignore it and move
             | on, or engage with it and respond with your own counter-
             | argument.
             | 
             | If this was an effective approach, Tucker Carlson would
             | have been off the air ages ago, QAnon would have been
             | dismissed as a crackpot by everybody, and disinformation
             | wouldn't be a problem.
        
               | throwawaysea wrote:
               | If you don't mind answering another question, why the
               | hate for Tucker Carlson? I don't follow him but people
               | have complained about him so much that I've now seen a
               | few videos to see what the fuss is about. I didn't see
               | anything wrong within those few clips I saw (maybe an
               | hour's worth) - it didn't seem any different from any
               | other mainstream news in that one side of the argument
               | was being presented, with a lot of conviction. But I did
               | not see misinformation. I am sure there's some non-zero
               | amount of misinformation that can be found from scouring
               | his clips, but that's true for anyone and any source, and
               | I certainly don't think he should be "off air" for it. I
               | can't help but think that a lot of the character attacks
               | against him are simply made because he's a prominent and
               | successful voice on the "other side", and his
               | effectiveness is a risk to political adversaries.
               | 
               | As someone who wants to see Tucker Carlson off the air,
               | do you see your position on the matter differently? Are
               | there conservative voices you support being platformed,
               | and what makes them different for you?
        
               | CamperBob2 wrote:
               | It's not a good sign when the "News" network you work for
               | has to go to court to argue that no reasonable person
               | would take you seriously.[1]
               | 
               | Just over the past year, Carlson has peddled such obvious
               | falsehoods as claiming the COVID-19 vaccines don't work,
               | the Green New Deal was responsible for Texas' winter
               | storm power-grid failure, immigrants are making the
               | Potomac River "dirtier and dirtier," and that there's no
               | evidence that white supremacists played a role in the
               | violent Jan. 6 Capitol riots. [2]
               | 
               | We don't need this guy on the public airwaves. He should
               | get a blog... that is, if he can find a hosting provider
               | who will tolerate his views.
               | 
               | 1: https://slate.com/news-and-politics/2020/09/judge-
               | rules-fox-...
               | 
               | 2: https://www.thedailybeast.com/tucker-carlson-admits-i-
               | lie-if...
        
           | fragmede wrote:
           | > See Parler.
           | 
            | Gab may be a more useful point of reference, but what's
            | hilarious about right-wing platforms is that their own
            | censorship is fine; it's other people's censorship that's
            | the problem. (Similar to how immigrants having fake papers
            | is wrong but having a fake vaccination card is sticking it
            | to the man.) Go post some pro-vaccine or pro-mask-mandate
            | things, and see how long you last before being
            | deplatformed.
        
       | madrox wrote:
        | I considered doing a moderation-as-a-service startup a couple
        | years ago. I didn't end up doing it because I came to the
        | conclusion that global communities aren't the future. I think
        | platforms supporting more silo'd communities that make and
       | enforce their own rules are how it will look. Discord and Twitch
       | use this model, and while they have their problems, the problems
       | look quite different from the ones outlined here.
        
         | dcow wrote:
         | I agree that this is the future. I even feel US politics have
         | suffered from being hoisted onto a global stage. When everyone
         | on the internet (globe) can weigh in on or muck with the
         | politics of your smaller community (country or state) you're
         | going to get into situations that make it hard to practically
          | make decisions and run a country. One of the foundational
          | principles of the US is the ability to justifiably oppress
          | minority factions for the good of the majority, checked by
          | systems of power distribution so that it's not simply mob
          | rule, and limited so as not to impinge on a set of
          | inalienable rights afforded to all citizens. Yet on the
          | global theatre the assumption is that minority opinions now
          | take precedence over the majority. And what's worse, 100
          | people screaming on twitter now has the same impact as
          | 200,000 marching on Washington (to be clear, 200k marching
          | on Washington is significant and should matter; 100 people
          | on twitter should not).
         | 
          | So what? Well, now, when we need to oppress minority
          | factions more than ever in the face of a public health
          | crisis and tell people sorry, suck it up, you live in
          | America where the majority says to mask up and get
          | vaccinated if you want to be in public, we "for some reason"
          | fumble around for months on end at critical moments in
          | curbing the spread of the pandemic, because a few
          | anti-vaxxers all of a sudden have infinite civil liberties
          | and a global platform (note, one that they didn't have when
          | we solved previous public health crises). My fear is that
          | we've become a society of "piss off, I can do what I want"
          | rather than one of calculated and ideally minimized
          | oppression.
         | 
         | I also don't understand as a society why we have to hold
         | platforms accountable for content. If the problem is a bunch of
         | illicit material showing up, implement KYC requirements so that
         | individuals are exposed legally to the consequences of posting
         | illegal material. Anonymity is a tool/privilege to be used, not
         | abused, and distinctly not a fundamental human right in the US.
         | Make the default less anonymous (but still private, that is
         | something we're supposed to care about constitutionally) and I
         | suspect a lot of content moderation problems go away.
        
           | dragonwriter wrote:
           | > 100 people screaming on twitter now has the same impact as
           | 200,000 matching on Washington
           | 
            | It doesn't, though, unless at least one of those 100 has a
            | giant following. But "one person with a media megaphone is
            | louder than 100,000 without" isn't new; it's older than
            | radio competing with newspapers.
        
       | mfrye0 wrote:
       | While not a site with user generated content, our version of this
       | was a huge increase in spam accounts.
       | 
       | We offer an API with a free tier. So naturally people would
       | create multiple accounts to avoid having to pay.
       | 
       | It's been a huge lesson in rate limiting, IP blocking, and
       | verifying accounts.
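        | 
        | For anyone fighting the same battle, the core of it is just a
        | token bucket per API key or IP. A minimal in-memory sketch
        | (hypothetical names, nobody's production code):
        | 
        |   import time
        |   from collections import defaultdict
        | 
        |   RATE, BURST = 10.0, 100.0  # refill/sec, bucket capacity
        |   buckets = defaultdict(
        |       lambda: {"tokens": BURST, "ts": time.monotonic()})
        | 
        |   def allow(key: str) -> bool:
        |       # key = API key or client IP; refill, then try to spend.
        |       b, now = buckets[key], time.monotonic()
        |       b["tokens"] = min(BURST,
        |                         b["tokens"] + (now - b["ts"]) * RATE)
        |       b["ts"] = now
        |       if b["tokens"] >= 1:
        |           b["tokens"] -= 1
        |           return True
        |       return False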
        
       | mooreds wrote:
       | I agree. My employer has a moderation product (for comments,
       | usernames, etc): https://cleanspeak.com/
       | 
       | I don't work with it much, but from what I can see it's
       | surprisingly complicated to filter out comments quickly without
       | impacting user experience. I guess you know you've succeeded when
       | the pottymouths join your platform :).
        
         | floren wrote:
         | Can I say "Scunthorpe" under your platform?
        
           | mooreds wrote:
           | Can't say I've tested that one. Depends on how you have it
           | configured, from a brief review of the docs:
           | https://cleanspeak.com/docs/user-guides/cleanspeak-3.x.pdf
           | 
           | As I said, I haven't done much with this product.
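            | 
            | (For context, the Scunthorpe problem is naive substring
            | matching flagging innocent words. A toy illustration with
            | a hypothetical word list; this is not CleanSpeak's
            | algorithm:)
            | 
            |   import re
            | 
            |   BANNED = {"cunt"}  # hypothetical word list
            | 
            |   def naive(text):    # substring match: false positives
            |       return any(w in text.lower() for w in BANNED)
            | 
            |   def by_word(text):  # match whole words only
            |       words = re.findall(r"[a-z]+", text.lower())
            |       return any(w in BANNED for w in words)
            | 
            |   print(naive("Scunthorpe United"))    # True  (blocked!)
            |   print(by_word("Scunthorpe United"))  # False (passes)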
        
       | seany wrote:
        | Freenet/lbry/tor hidden sites all exist (and get used all the
        | time) and moderation is 100% not required there at all. I hope
        | at some point the weird moralization of nudity will stop.
        
         | cowmoo728 wrote:
         | Have you gone on darknet sites? They have moderation too, or
         | else they get filled with CP and terrorist propaganda just like
         | every other service. I guess that's "fine" if you're anonymous
         | and don't think the FBI will find you. But if you're running a
         | business on the clearnet there's a real name and address and
         | there will be real life consequences. The FBI gets interested
         | real fast if you don't moderate posts that encourage terrorist
         | acts.
        
           | seany wrote:
           | Yes, near daily.
        
             | Loughla wrote:
             | While that's great, you didn't really address the actual
             | point of that post. I would like to hear your take on that.
        
         | brodouevencode wrote:
          | Exactly. The author does not make a sufficient case for why
          | content moderation is necessary, and doesn't even touch
          | client-side moderation.
        
           | makeitdouble wrote:
           | The article is about business running platforms with UGC.
           | 
           | While free forums on the darknet might get away with a tad
           | more lax policies, if you're a registered business hoping to
           | make any money you won't have a choice but to moderate in
            | some way. At the very least it will be to follow your
            | country's laws, and more often than not your clients will
            | require you to do so.
        
             | brodouevencode wrote:
              | I'm talking about moderation outside of what the law
              | prescribes - that's a whole different topic with more
              | grey area than not. If it's what the law prescribes then
              | I'm not contending it.
             | 
             | > you won't have a choice but to moderate in some way
             | 
             | This is the way that moderation has been done in the past
             | 10-15 years, but does it have to be? Why couldn't a
             | platform provide user-level controls over what they see
             | instead of making those decisions for them? Early forum
             | software actually did somewhat of a good job of this, and I
             | remember building phpBB extensions that enabled more user-
             | level control. Even with this you can go from super
             | granular to just a couple of primary options. It becomes a
             | tagging/filtering mechanism on behalf of the client.
             | 
             | Edit: UGC platforms may discover that there's some value in
             | finding what filtering options their users use.
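              | 
              | A sketch of what I mean (hypothetical tags, not any
              | real forum's API): the server only labels, the client
              | decides what to hide.
              | 
              |   posts = [
              |       {"id": 1, "tags": {"nsfw"}},
              |       {"id": 2, "tags": set()},
              |       {"id": 3, "tags": {"gore", "nsfw"}},
              |   ]
              | 
              |   def visible(posts, blocked):
              |       # User-level moderation: hide only what
              |       # this user opted out of, not a global
              |       # decision made for everyone.
              |       return [p for p in posts
              |               if not (p["tags"] & blocked)]
              | 
              |   print(visible(posts, {"nsfw"}))  # only post 2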
        
               | makeitdouble wrote:
               | To be clear, just following legal requirements is no
               | simple task in most countries, and it might already
               | require a significant moderation effort depending on how
               | motivated your users are.
               | 
               | > does it have to be?
               | 
               | In most cases moderation is less about what users want or
               | don't want to see, and more about what you want your
               | platform to be.
               | 
               | For instance people are OK with product suggestions when
               | they go on Amazon, but if your job posting site becomes
               | an endless stream of Amazon links you'll want to curb
               | that. And perhaps your users find interesting products in
               | all these links, but from your perspective it will kill
               | your business (except if you pivot into becoming a
               | product listing site of course)
        
             | commandlinefan wrote:
             | The only real argument in favor of moderation is that the
             | platform owner is legally _obligated_ to do so - which
             | could and should be changed.
        
               | mindslight wrote:
               | Or better yet, do away with these "platform owners" by
               | using decentralized technologies.
        
         | [deleted]
        
         | [deleted]
        
         | nicbou wrote:
         | > moralization of nudity
         | 
         | Even porn sites need moderation. Trying to stop sexual abuse
         | and child pornography isn't weird moralization.
         | 
         | > and it's 100% not required there at all
         | 
         | I'm not super familiar with the darkweb, but I assume that
         | darkweb platforms also have active moderation, even if it's
         | only to keep griefers out. Pornography is not the only use case
         | for moderation.
        
           | quantumBerry wrote:
           | >Trying to stop sexual abuse and child pornography isn't
           | weird moralization.
           | 
           | How does moderation of content prevent sexual abuse or CP? If
           | anything I'd argue it creates more, because those that seek
           | the images instead have to produce their own if they cannot
           | find them.
        
             | npteljes wrote:
             | Why do you think OP tried to stop it by moderation? They
             | simply wanted to not propagate it further.
        
       | shuntress wrote:
       | It really frustrates me that this level of abuse is just accepted
       | as a fact of nature.
       | 
       | It feels like watching masked bandits stick up a bank then walk
       | away casually to the next bank down the street while the bank
       | manager says "Drat! Too bad we didn't stop that one at the door.
       | Oh well, at least they only got one register"
       | 
       | I know all the responses to this are going to be _" mUh PoLicE
       | sTaTe"_ but I really wish there was some system of accountability
       | for breach of trust online.
        
       | sneak wrote:
        | Really, the future for content moderation is feeds published
        | by site operators and volunteer moderators that individual
        | readers can opt in or out of for filtering.
       | 
       | Relying simply on a central authority to decide what you should
       | be allowed to read is a system with utterly predictable failure
       | modes (not the least of which is too much volume for the
       | centralized mods).
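        | 
        | Mechanically it's simple: a feed is just a published set of
        | content IDs to hide, and the reader unions whichever feeds
        | they subscribe to. A sketch with hypothetical feed names:
        | 
        |   feeds = {
        |       "site-operator": {"post-17", "post-99"},
        |       "volunteer-strict": {"post-17", "post-23"},
        |   }
        | 
        |   def hidden(subscriptions):
        |       # The reader, not a central authority, picks the feeds.
        |       return set().union(*(feeds[s] for s in subscriptions))
        | 
        |   def render(timeline, subscriptions):
        |       h = hidden(subscriptions)
        |       return [p for p in timeline if p not in h]
        | 
        |   print(render(["post-17", "post-42"], ["site-operator"]))
        |   # -> ['post-42']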
        
       | Pxtl wrote:
       | At this point I'm surprised companies aren't blanket-requiring
       | phone numbers so they have something a little more concrete to
       | ban, much less going whole hog and demanding government-issued ID
       | like drivers' licenses or something.
        
       | sneak wrote:
       | This article opens up with an anecdote about policing morality
       | and imposing one's own local norms on the legal speech of others.
       | 
       | NSFW is a euphemism. Please don't push your American puritanical
       | worldview on every user of your UGC platform.
        
         | regiostech wrote:
         | If they own the platform, why shouldn't they get to make the
         | rules?
        
           | sneak wrote:
           | Legally, they are within their rights to.
           | 
           | Practically, it makes them an asshole.
           | 
           | Just because you are legally entitled to do something doesn't
           | mean it's good for your users or society: e.g. the widespread
           | practice of IAPs, or Apple's censorship of the App Store.
        
         | dkh wrote:
         | Okay, so disregarding that label, most platforms will have a
         | target or focus or niche (well, the ones that want a chance of
         | surviving, anyway) and they will thus be very wise to tailor
         | the rules around that, and create the conditions ideal for
         | fostering that type of content.
         | 
         | For instance, if you were starting, say, a TikTok-esque video
         | app but for super-quick tutorial videos, wouldn't it make sense
         | for upload criteria to require it be some sort of tutorial,
         | stay within some time limit, and probably _not_ be just a
         | gratuitous video of a bunch of people having sex? Call it
         | whatever you want --  "NSFW" is just a shortcut, a heuristic
         | that most people understand the meaning of regardless of
         | whether it is actually safe or unsafe at their place of work.
         | But there can be no denying that platforms/communities serving
         | some interest or demographic will have their own unique
         | requirements, their policies and standards will reflect this,
         | and very often this will preclude "NSFW" content.
         | 
         | A lot of people get bent out of shape about this and view these
         | sorts of policies solely as some sort of censorship issue, but
         | many fail to realize that most of the time it's just about
         | creating the ideal conditions for the community/platform to
         | take hold.
        
       | loco053100 wrote:
        | You use my phone as if it were yours; you control my phone
        | and do what you want with it. You are brainless rats. I'm
        | reporting you. Seven years of control, seven years you have
        | impaired my life; just imagining that alone, you are sick
        | beyond belief. I want you gone, so I'm going to a lawyer and
        | finally suing you. Someone wanted to blackmail me next year
        | for 20,000 dollars. I'll hit him on the head 21,000 times.
        | Seven years you have impaired my life; now it's payday for
        | you, or there will never be peace!
        
       | quantumBerry wrote:
       | The internet itself is unmoderated in any useful sense for
       | content, yet it has lived longer than most of these cheesy
       | "moderated" products that seek to impose their morality on you.
        
         | oblio wrote:
         | Unmoderated? Probably 99% of traffic goes to the same top 500
         | sites which are heavily moderated.
        
           | quantumBerry wrote:
            | As late as 2018, torrent traffic was reported to be the
            | fifth-largest category of internet traffic [1].
           | 
           | [1] https://torrentfreak.com/netflix-dominates-internet-
           | traffic-...
        
           | commandlinefan wrote:
           | Maybe so (that sounds believable, anyway). What he's saying
           | is that the other 1% is unmoderated because there's no
           | central authority [1]. The problem here isn't that people
           | will share bad things if you don't stop them, the problem is
           | that you're in a position of being held responsible for
           | something outside your control. If it's illegal, it should be
           | reported (or found by law enforcement whose job it is to
           | enforce the law) and if it's offensive, offer some user-side
           | filtering.
           | 
           | [1] this is starting to change, though - Amazon took Parler
           | offline completely at the hosting level. Although they
           | eventually found another hosting provider, it's not
           | unimaginable that in the near future, service providers will
           | collaborate to moderate the underlying traffic itself.
        
         | Supermancho wrote:
         | The "internet" isn't liable, so moderate is in the form of
         | transparent traffic shaping. When disruptions are small, costs
         | are either absorbed in aggregate by infrastructure owners (and
         | user attention) until traffic is literally moderated away with
         | routing.
        
         | majormajor wrote:
         | The internet is very moderated, on the contrary, in terms of
         | UGC.
         | 
         | Traditional, non-social, websites have single or known-group
         | authors. When one of them is defaced or modified we call it
         | "hacking" not "unmoderated content." We assume NASA's site has
         | NASA-posted content. We assume Apple's site has Apple-posted
         | content.
         | 
         | Sites with different standards for what they'd publish have
         | been around for decades (for gore, for porn, etc) but many of
         | these still exist in a traditional curated-by-someone fashion,
         | or are more open to UGC but still have some level of
         | moderation.
        
           | quantumBerry wrote:
            | The internet is not moderated in any useful sense for
            | content. Drug markets like White House Market, and before
            | that Silk Road, have persisted for years. Tor and other
            | darknet websites host content that is nearly universally
            | disdained by governments and even most individuals; I
            | hesitate to even name here what that heinous content is
            | (you and I both know some examples).
           | 
           | > We assume NASA's site has NASA-posted content. We assume
           | Apple's site has Apple-posted content.
           | 
           | Trust in identity is not the same thing as useful moderation
           | of content. That's useful moderation of identity.
           | 
           | >Sites with different standards for what they'd publish have
           | been around for decades (for gore, for porn, etc) but many of
           | these still exist in a traditional curated-by-someone
           | fashion, or are more open to UGC but still have some level of
           | moderation.
           | 
            | Those sites _choose_ to moderate their content; that
            | doesn't exclude others that don't.
        
             | jjulius wrote:
             | >The internet is not moderated in any useful sense for
             | content. Drug markets like white house market, and before
             | that silk road...
             | 
             | You mean the Silk Road that the US government "moderated"
             | out of existence, along with other Tor marketplaces over
             | the years? The same ones that suggest White House Market's
             | existence is also likely to be limited?
        
               | quantumBerry wrote:
                | I suppose in the sense that Gabby Petito was moderated
                | off the internet, Ross Ulbricht was moderated off of
                | the internet and into a cage permanently for the
                | heinous crime of facilitating voluntary, peaceful
                | trade. Tor marketplaces were definitely not gone for
                | years; the same content just moved under new banners.
                | You can literally find the same content and more on WHM
                | today as you did under Ulbricht's banner before he was
                | kidnapped by government thugs.
        
               | [deleted]
        
               | jjulius wrote:
               | >I suppose in the sense that Gabby Pettito was moderated
               | off the internet, Ross Ulbricht was moderated off of the
               | internet and into a cage permanently for the heinous
               | crime of facilitating voluntarily peaceful trade.
               | 
               | Oh hello, strawman.
               | 
               | >Tor marketplaces were definitely not gone for years, the
               | same content just moved under new banners. You can
               | literally find the same content and more on WHM today as
               | you did under Ulbricht's banner before he was kidnapped
               | by government thugs.
               | 
               | And the only reason that happens is by virtue of Tor
               | making it difficult to track the source of those sites
               | and their operators. That doesn't mean that "moderators"
               | (governments, etc.) aren't putting forth their best
               | efforts to track them down and shut them down. It is
               | nearly inevitable that WHM will see a similar fate to
               | Silk Road, AlphaBay, DarkMarket, etc.. They're being shut
               | down as quickly as they can be.
        
               | quantumBerry wrote:
               | >Oh hello, strawman.
               | 
                | Glad to know you finally admit that being kidnapped by
                | a 3rd party is not really what most of us think of as
                | "moderation", and thus you have made a straw man.
                | Although in the strict sense I guess it is true that
                | moderation could merely mean some 3rd party entity came
                | along and violently kept me from communicating. If you
                | don't like me posting cat pictures on reddit, you could
                | crack my skull or lock me in a cage and steal my PC,
                | and you would have "moderated" me, but I wouldn't call
                | that reddit moderation.
        
               | jjulius wrote:
               | ... wow. Talk about going from 0-100 entirely too fast.
               | 
               | I was talking specifically about sites such as Silk Road
               | and others being taken offline (which is exactly what
               | _you_ were talking about, too), not once did I mention
               | his arrest nor did I allude to it. Glancing at your
               | username, I seem to recall previous comments from you in
               | threads about drug use being legalized. On the broad
               | topic of drug legalization - _again_ - you and I agree,
               | but you would do well to prevent your biases from
               | creeping in and causing you to misunderstand posts and
               | /or lash out at others.
        
               | quantumBerry wrote:
                | I apologize; maybe you are not familiar with the
                | details of the Silk Road. Ross Ulbricht was the
                | administrator and creator of the Silk Road, allegedly.
                | It's quite probable that without his arrest, it would
                | have persisted, even if on newly acquired hardware. I
                | would argue his arrest was integral in these violent
                | thugs "moderating" the Silk Road away, like the mob
                | "moderates" away their competition.
               | 
               | Instead, after his arrest the content ended up on new
               | platforms rather than the Silk Road platform.
               | 
               | > biases from creeping in and causing you to
               | misunderstand posts and/or lash out at others.
               | 
               | Yes my bias is in complete, unrestricted free speech.
               | Every single piece of content, regardless of how damaging
               | or vulgar anyone thinks it is and regardless of if it
               | portrays even the worst of crimes. I admit I am colored
               | by that bias.
               | 
               | > lash out at others.
               | 
               | What are you talking about? You feel attacked because
               | your poorly constructed argument was laid open. Your case
               | is pretty clear. Even if the system of the internet has
               | no useful filter of content (whether that is true or
               | not), if a third party such as DEA comes along and
               | decides to seize equipment and throw the operator in
               | jail, you consider that content moderation. And I'm
               | willing to admit from a practical perspective, that could
               | be considered a form of moderation by a violent third
               | party.
               | 
               | ---------------
               | 
               | Edit due to waiting on timeout to reply below:
               | 
                | His arrest goes hand in hand with the shutdown. It was
                | integral. You can't say you weren't mentioning Ulbricht's
                | arrest when that arrest WAS, in part, the takedown of
                | Silk Road. The very fact that you said you weren't
                | speaking of the arrest led me to say you "may not be
                | familiar" (note the uncertain wording, which did not
                | speak in certainties, though your bias keeps you from
                | seeing that.)
               | 
                | >poor interpretations of other people's comments, and
                | then angrily respond to them as such.
               | 
               | I think you're projecting. If there's any anger, it must
               | be yours.
               | 
               | >Yeah, again, you're injecting your own biases as you
               | create assumptions about my comments
               | 
                | Your comment appeared to be a rebuttal to my statement
                | that "The internet itself is unmoderated in any useful
                | sense for content." If it wasn't actually a rebuttal but
                | an agreement, I apologize for misunderstanding; you were
                | actually supporting that argument.
               | 
               | >See how I used "moderated" in quotes in my very first
               | response? That suggests that I'm using the term rather
               | loosely.
               | 
               | >If something's illegal - even if you and I think it
               | shouldn't be - then it's typically going to be removed at
               | some point, even if it takes a while because something
               | like Tor makes it difficult. And in that sense, yes, the
               | internet is "moderated" for that content. That's all I've
               | said/argued, and I truly don't understand how that is so
               | difficult for you to grasp.
               | 
               | The illegal content has only progressively proliferated
               | since the advent of the internet, and we've yet to see an
               | effective mechanism to moderate the content of the
                | internet as a whole. Virtually every category of such
                | content has not only not been removed but has increased.
               | 
               | >That's all I've said/argued, and I truly don't
               | understand how that is so difficult for you to grasp.
               | 
                | Yes, and I'm arguing that this is incorrect; it hasn't
                | been moderated. At best it has passed from platform to
                | platform, but no effective mechanism has managed to
                | censor the internet as a whole.
               | 
                | Sometimes I wonder whether all this talk of anger,
                | misinterpretations, and clouded judgement is just you
                | repeating to me what your own psychologist told you.
        
               | jjulius wrote:
               | Yeah, again, you're injecting your own biases as you
               | create assumptions about my comments, rather than
               | stopping to ask what I mean before you fly off the
               | handle. See how I used "moderated" in quotes in my very
               | first response? That suggests that I'm using the term
               | rather loosely.
                | 
                |  _All I've said_ was that that's how illegal content is
                | moderated on the internet - _it is removed_. Silk Road
               | was removed, AlphaBay was removed, DarkMarket was
               | removed, many others have been removed, and many more
               | will continue to be removed even if Tor makes that a slow
               | process. At no point did I bring up whether or not I
               | thought it was  "right" to remove them, or to treat
               | Ulbricht in that manner (again, you're assuming I don't
                | know what happened). I said _"moderating"_ with quotes,
               | _for lack of a better word_.
               | 
               | If something's illegal - even if you and I think it
               | shouldn't be - then it's typically going to be removed at
               | some point, even if it takes a while because something
                | like Tor makes it difficult. And _in that sense_, yes,
                | the internet is "moderated" for that content. _That's
                | all I've said/argued_, and I truly don't understand how
                | that is so difficult for you to grasp.
               | 
               | >It's quite probable that without his arrest, it would
               | have persisted even if on newly acquired hardware. ...
               | Instead, after his arrest the content ended up on new
               | platforms rather than the Silk Road platform.
               | 
               | For implying that I don't know what happened, you seem to
               | be forgetting that other Silk Road staff started Silk
               | Road 2.0 after his arrest, but that was also shut down.
               | 
               | >What are you talking about? You feel attacked because
               | your poorly constructed argument was laid open.
               | 
                | Nope. You allow your biases to creep into your poor
               | interpretations of other people's comments, and then
               | angrily respond to them as such. My initial response was
               | simple, but your strongly held beliefs have clouded your
               | responses.
        
         | munificent wrote:
         | It looks like you're getting downvoted, but I think this is a
         | good point and worth thinking about.
         | 
         | I believe one key difference here is _group identity
         | perception_. If you like thinking in business terms, you could
         | say  "branding".
         | 
         | Facebook, Reddit, HN, Twitter, etc. all must care about content
         | moderation because there is a feedback loop they have to worry
         | about:
         | 
         | 1. Toxic content gets posted.
         | 
         | 2. Users who dislike that content see it _and associate it with
         | the site_. They stop using it.
         | 
         | 3. The relative fraction of users _not_ posting toxic content
         | goes down.
         | 
         | 4. Go to 1.
         | 
         | Run several iterations of that and if you aren't careful, your
         | "free" site is now completely overrun and forever associated
         | with one specific subculture. Tumblr -> porn, Voat -> right-
         | wing extremism, etc.
         | 
         | Step 2 is the key step here. If a user sees some content they
         | don't like and _associates it with the entire site_ it can tilt
         | the userbase.
         | 
         | The web as a whole avoids that because "the web" is not a
         | single group or brand in the minds of most users. When someone
         | sees something horrible on the web, they think "this site
         | sucks" not "the web sucks".
         | 
         | Reddit is an interesting example of trying to thread that
         | needle with subreddits. As far as I can tell, Reddit as a whole
         | isn't strongly associated with porn, but there are a _lot_ of
         | pornographic subreddits. During the Trump years, it _did_ get a
         | lot of press and negative attention around right-wing extremism
         | because of The_Donald and other similar subreddits, but it has
         | been able to survive that better than other apps like Gab or
         | Voat.
         | 
         | There are still many many thriving, wholesome, positive
         | communities on Reddit. So, if there is a takeaway, it might be
         | to preemptively silo and partition your communities so that a
         | toxic one doesn't take down others with it.
        
           | Nasrudith wrote:
            | I personally see "plausible deniability" as the cynical
            | real distinction for what gets people to share blame, not
            | actual affiliations or whose servers it is run on. Any
            | number of objectionable sites are run on AWS, and you
            | basically need to be an international scandal or to
            | violate preexisting terms to get booted, like some
            | malware-to-governments merchants were. Amazon's policies
            | did not care whether it was legal, just whether you were
            | doing it unauthorized. A wise move when international law
            | is really like the Pirate Code.
            | 
            | The interlinking between the pages themselves and common
            | branding is what creates the associations. Distributed
            | Twitter alternatives like Mastodon can even share the
            | same branding, but it is on a per-network basis and
            | complex enough to allow for some "innocent" questionable
            | connections.
        
       | makeitdouble wrote:
        | Interesting article. I wonder how many UGC platforms got away
        | with a third-party solution to help their moderation. It feels
        | like a core part of the business, one that would directly
        | affect whether it sinks or floats.
        | 
        | > It's the dirty little product secret that no one talks about.
        | 
        | Hmmm. I'd say it's the first thing people have in mind when UGC
        | comes on the table. A bit like how nobody thinks lightly of
        | storing credit card info; that's part of the culture at this
        | point, I think.
        
       | dylanjha wrote:
       | Hey there :). Author here. It was a fun experiment to play around
       | with some different strategies for adding content moderation to
       | https://stream.new
       | 
       | Hive ended up being the one I landed on after trying Google
       | Vision first (https://cloud.google.com/vision).
       | 
       | The other one I was looking at is Clarity.ai but I didn't get a
       | chance to try that one yet.
        
         | donclark wrote:
         | stream.new seems really cool. However there is no account
         | button to see all of your video URLs, or a download option for
         | the video. If there was, I would probably make it my default
         | (not sure if that is what you want)
        
           | dylanjha wrote:
           | Thanks! Yeah that would be a significant improvement.
           | 
           | This started as a little demo project with Nextjs + Mux and
           | then evolved into more of an actual product
           | (https://github.com/muxinc/stream.new).
           | 
            | Right now the lightweight utility aspect of stream.new feels
            | right, but if we continue to build on it as a standalone
            | free product, then adding the concept of an "account" with
            | saved videos makes a ton of sense.
        
         | DelightOne wrote:
         | Is there a way to delete videos again?
        
         | tomjen3 wrote:
          | One thing that wasn't obvious to me: why did you care about
          | NSFW uploads? As I understood it, you want to become the
          | Imgur of video, and Imgur only became so big because it
          | allowed NSFW stuff.
        
           | dkh wrote:
            | Not involved with this project, but there are a couple of
            | big reasons most would care about this.
            | 
            | * Child porn and similar content that is a level beyond
            | simply "NSFW"
            | 
            | * Uploaders of NSFW stuff are always in need of a new
            | platform they haven't been kicked off yet, and newer
            | platforms are likely to be dominated by this type of
            | content. Unless you want your platform to gain a reputation
            | as the place for mostly NSFW content, you probably don't
            | want this.
        
       | [deleted]
        
       | djyaz1200 wrote:
       | LOL at the title, I think the silver lining is your moderation
       | becomes a barrier to entry/competitive advantage if done well and
       | kept hidden. It's one of the last things in software that's hard
       | to copy.
        
       | ufmace wrote:
       | Another particular headache as you get bigger is that humans are
       | better for accuracy, but less consistent. No matter how specific
       | you think your rules are, there will be edge cases where it isn't
       | clear what side they're on. Different human moderators may rule
       | differently on the same content, or even the same moderator at a
       | different time of day. When users find these edge cases,
       | inevitably somebody will get upset that you blocked X but not Y.
       | 
        | And then you have to keep the actually abusive users from
        | exploiting the fact that moderator X usually approves their
        | just-barely-over-the-line images, if they ever figure out how
        | your approval requests are routed.
        
       | jfengel wrote:
       | If you host blobs for free, somebody is going to use you as their
       | host. Even if you just hosted audio, I'm sure somebody will
       | quickly come along with a steganography tool to hide their
       | content on your site (and use your bandwidth).
       | 
       | Similarly, if you make compute power available, people will use
       | you to mine cryptocurrency. Even if all you host is text,
       | somebody will come along to be abusive. When you put a computer
       | on the Internet, it's open to the entire world, including the
       | very worst people.
       | 
       | If you're hosting a community, start from the beginning by
       | knowing who your community is and how they will tell you who they
        | are. If the answer is "everybody", then know what _everybody_
        | means -- it means some people won't want to be there, because
        | some people will make life hard for them.
       | 
        | It's no longer 1991, when you could assume that such people
        | wouldn't find you. They _will_ find you -- for money, or the
        | lulz. You have to plan for that on day 1. You can't fix it
        | after the fact.
        
         | pjc50 wrote:
         | > Even if you just hosted audio
         | 
         | Absolute worst case! You're going to end up DMCA'd by the
         | entire music industry.
         | 
         | > It's no longer 1991, when you could assume that such people
         | wouldn't find you.
         | 
          | Even back in the nineties, there was abuse... but the internet
         | was so much smaller, and it was possible to manually ban them.
         | Except on USENET. The labour of dealing with spam fell to a
         | small number of people, one of whom wrote this astonishing
         | rant: https://www.eyrie.org/~eagle/writing/rant.html
         | 
         | (and partly disowned it, but I think he was right first time)
        
           | Sohcahtoa82 wrote:
           | > Absolute worst case! You're going to end up DMCA'd by the
           | entire music industry.
           | 
           | You'll even get DMCA'd for your own content!
           | 
           | In late 2018, musician TheFatRat had one of his YouTube
           | videos taken down due to a DMCA report: https://twitter.com/t
           | hisisthefatrat/status/10729330469391933...
           | 
           | Herman Li, the lead guitarist for Dragonforce, had his Twitch
           | account suspended due to supposed DMCA violations because he
           | played his own music on stream:
           | https://www.kotaku.com.au/2020/10/twitch-dragonforce-
           | herman-...
        
         | htrp wrote:
         | > If you host blobs for free, somebody is going to use you as
         | their host. Even if you just hosted audio, I'm sure somebody
         | will quickly come along with a steganography tool to hide their
         | content on your site (and use your bandwidth).
         | 
          | This feels more like a theoretical example than something
          | that has actually happened. Do you have any examples of
          | steganography being used for bandwidth redirection/hosting?
        
           | MauranKilom wrote:
           | Many examples (maybe not exactly steganography, but same
           | spirit) in this discussion from three weeks ago:
           | 
           | https://news.ycombinator.com/item?id=28431716
        
           | joshuamorton wrote:
           | https://handwiki.org/wiki/GmailFS
           | 
           | https://github.com/maxchehab/redditfs
           | 
           | etc etc.
        
           | michaelpb wrote:
           | Not at all theoretical, this happens all the time. There are
           | tools like StegoShare:
           | https://en.wikipedia.org/wiki/StegoShare
           | 
           | I googled and one of the top articles was this (I didn't read
           | it): https://www.deccanchronicle.com/technology/in-other-
           | news/070...
           | 
           | I couldn't find the article I had read a few years back, but
           | I remember this sort of thing being used to host content on
           | Facebook, Wikipedia, Reddit, etc, before they cracked down on
           | it.
        
           | jcun4128 wrote:
           | Recently saw a post about using imgur to host websites by the
           | website code embedded in images (steganography?)
        
           | TravisHusky wrote:
            | I did this when I was in high school for fun. I found a
            | poorly designed comment system somebody had built and used
            | it to transfer files around, hidden in gibberish comments.
            | Maybe not
           | the most common thing, but more common than you would expect.
        
         | michaelpb wrote:
         | Yeah, there's an entire category of "idea guys" who don't get
         | this. They repeatedly try to crack the code on a truly
         | moderation-free or purely crowd-moderated platform, and it
         | never, ever, ever works.
         | 
         | It almost always boils down to a poor understanding of how
         | humans work (usually some sort of "homo economicus") or how
         | computers work (usually some sort of "AI magic wand").
        
           | seph-reed wrote:
           | I still theorize crowd-moderated platforms are possible, as
           | long as there's really good gate-keeping.
           | 
           | My bet is some real-world tie, one which is time consuming
           | and expensive to create. From there it should be possible to
           | create moderation tools that keep the rest going.
           | 
           | An example of a real world tie would be a trust network that
           | requires status with in-person communities and local
           | businesses. And not just "accept the hot chick friend
           | request," but an explicit "I'm staking my reputation by
           | saying this person is real."
           | 
           | But once you let bots in, it's over.
        
             | pjc50 wrote:
             | The nearest was Advogato; it didn't have an abuse problem,
             | but it did end up as a ghost town. https://web.archive.org/
             | web/20170715120119/http://advogato.o...
        
             | Sohcahtoa82 wrote:
             | Whatever happened to Something Awful? Are they still
             | around?
             | 
             | They charged a one-time $10 fee to access their forums. If
             | you got banned, you could pay $10 to get a new account. It
              | made being a total dick expensive. I've heard it called
              | the Idiot Tax.
        
             | WorldMaker wrote:
             | Moderation is labor and you get what you pay for. Which is
             | not that crowd-moderation cannot work, but that for good
             | crowd-moderation you still have to treat it as a labor
             | pool, have a very good idea of how you are
             | incentivizing/paying for it, and what "metrics/qualities"
             | those incentives are designed to optimize for.
             | 
             | (In some cases it actually is far cheaper to pay a small
             | moderator pool a good wage than to pay an entire community
             | a bad wage to "crowd-moderate" if you actually test the
             | business plan versus alternatives.)
        
             | snowwrestler wrote:
             | Slashdot's meta-moderation system worked well for a long
             | time. One set of people could make moderation decisions
             | directly on content, and then another unrelated set of
             | people would review the moderation decisions and support or
             | revert them.
             | 
             | It was all tied to karma and permissions in ways I can't
             | quite remember. But essentially there was no way for a
             | motivated bad-faith group to both moderate and meta-
             | moderate themselves, and the incentives marginalized bad
             | faith actors over time.
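              | 
              | A rough sketch of that shape (the karma rules here are
              | invented for illustration; as noted, the real tie-ins
              | are fuzzy):
              | 
              |     import random
              |     random.seed(1)
              | 
              |     mods = {f"mod{i}": 0 for i in range(5)}  # karma
              |     decisions = [("mod0", "spam"), ("mod1", "troll")]
              | 
              |     for who, label in decisions:
              |         # An unrelated set reviews each decision; the
              |         # majority verdict adjusts the moderator's karma.
              |         pool = [m for m in mods if m != who]
              |         votes = [random.random() < 0.8
              |                  for _ in random.sample(pool, 3)]
              |         mods[who] += 1 if sum(votes) >= 2 else -1
              |     print(mods)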
        
           | 999900000999 wrote:
            | Idea Guys are useless for many other reasons.
            | 
            | Generally they'll want to make a half-baked social media
            | network without understanding that you need to pay for
            | things like hosting, or a programmer's time. I've made the
            | mistake of writing code for these folks.
            | 
            | Guaranteed they'll never appreciate it, and this includes
            | nonprofit coding groups. Never-ending scope creep, vague
            | requirements, etc.
            | 
            | My rule is that unless you're one of my best friends, I
            | simply will not build your project for you. However, the
            | few times I have built something for a friend, I found the
            | experience very rewarding; it can be good to develop with
            | someone else who can give you feedback, so you actually
            | know you're building something someone would like.
        
         | mijustin wrote:
         | > If you host blobs for _free_
         | 
         | This is the key distinction. If you charge money, from the
         | beginning, most of your content moderation woes go away.
         | 
         | At Transistor.fm we host podcasts and charge money for it
          | (starting at $19/month). We've had very few problems with
          | questionable content.
         | 
         | We're a counterpoint to the narrative here: small (4 full-time
         | people), profitable, and calm.
         | 
         | > Even if you just hosted audio
         | 
         | Most DMCA takedown requests these days are handled through the
         | big podcast directories (Spotify, Apple Podcasts). We haven't
         | had to write/implement any fingerprinting tech.
        
       | jedberg wrote:
       | It always blew people's minds when I told them that 50% of the
       | engineering time at reddit was spent on moderating. What's
       | interesting though is that we didn't even have any moderation for
       | the first year or so, because the community would just downvote
       | spam.
       | 
       | It wasn't until we got vaguely popular that suddenly we were
       | completely overwhelmed with spam and had to do something about
       | it.
        
       | canadapups wrote:
        | I run an online marketplace. It's a constant battle against
        | scammers putting up fake items to sell. While I do run "content
        | moderation" to identify the scams, the fakes are identical or
        | nearly identical to the real items, so content moderation isn't
        | the solution for me. As other commenters point out, it's just a
        | war of attrition, a cycle of escalation driven by a few bad
        | actors.
        | 
        | The only effective method I have now is fingerprinting (i.e.,
        | invading users' privacy). Browsers are becoming more privacy-
        | oriented, so as time goes on fingerprinting will become less
        | effective and more people will be scammed online. I don't think
        | those who want privacy at all costs understand the trade-off.
        | 
        | In a few months I will move to a voluntary fingerprinting/
        | identification scheme (like the GDPR cookie opt-in), where you
        | either identify yourself or don't use my website... which may
        | leave me as a "die an MVP" example.
        
       | patall wrote:
        | One interesting approach I heard about in this domain is the
        | Trolldrossel (German for "troll throttle") by Linus Neumann
        | from the CCC. He implemented a captcha for a comment server
        | that would fail a certain percentage of the time when it
        | encountered certain keywords in the comment, even when the
        | captcha was solved correctly.
        | 
        | While I have no notes about the effects, and the corresponding
        | talk seems to have vanished from the internet, it supposedly
        | worked quite well: it forced 'obscene' comments through
        | additional rounds of captcha without revealing that this was
        | the reason they failed, demotivating the submitter.
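        | 
        | In code the idea is tiny; a sketch reconstructed from that
        | description (keyword list and failure rate are placeholders):
        | 
        |     import random
        | 
        |     FLAGGED = {"keyword_a", "keyword_b"}
        |     FAIL_RATE = 0.6   # chance of a silent bogus failure
        | 
        |     def captcha_result(comment: str, solved: bool) -> bool:
        |         if not solved:
        |             return False
        |         if any(w in comment.lower() for w in FLAGGED):
        |             # Fail some of the time without saying why,
        |             # wearing the submitter down with extra rounds.
        |             return random.random() > FAIL_RATE
        |         return True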
        
         | carom wrote:
         | I don't speak this language but I assume this is it.
         | 
         | https://linus-neumann.de/2013/05/die-trolldrossel-erkenntnis...
        
       | baybal2 wrote:
        | What about not treating users like kiddies needing supervision?
        | 
        | I think the Silicon Valley MVP TrendySpeak crowd needs to open
        | its eyes to the realities of life.
        | 
        | Slashdot is a very peculiar case of nearly no spam filtering,
        | yet very good user content moderation.
        | 
        | Kuro5hin was also kind of interesting with its old user rating
        | system.
        
         | MattGaiser wrote:
         | Turn on your ability to see dead comments if it is not on
         | already. Even HN would be a landfill without moderation.
        
           | commandlinefan wrote:
           | I always read "dead" comments. I believe that many of them
           | should not have been "killed".
        
             | BeFlatXIII wrote:
             | Reddit rule of thumb: the downvoted comments are far more
             | likely to introduce a new idea than the upvoted ones.
             | However, it's like comparing the likelihood of dying in a
             | car crash instead of a plane wreck: both events are rare
             | enough as not to be worth considering in planning your
             | daily life.
        
               | JasonFruit wrote:
               | I often find material of value in dead comments. I think
               | it's worth considering; I recommend it.
        
         | lostlogin wrote:
          | Slashdot describes its moderation as 'mass moderation', a
          | description given to the phase that came _after_ it had 400
          | mods.
         | 
         | https://slashdot.org/moderation.shtml
        
         | AnimalMuppet wrote:
         | > What about not treating users like kiddies needing
         | supervision?
         | 
          | There are two kinds of "users" here - writers and readers. As
          | a reader, in this world with the humans we've got, I do _not_
          | want to be subjected to everything someone wants to write. It
          | makes a platform unusable to readers who want to read
          | something that isn't a troll, spam, or propaganda.
         | 
         | The trick is to do that without crimping the users who are
         | writers...
        
         | jrochkind1 wrote:
          | Because if you don't do any content moderation, your site
          | will turn into a horrid wasteland that most of your users
          | don't want to be on, thus defeating your purpose, whether
          | that purpose is making money or any other reason to attract
          | users.
         | 
         | If your purpose is being 8chan, you'll be good.
         | 
         | > Slashdot is a very peculiar case of near no spam filtering,
         | yet very good user content moderation
         | 
         | So, you're saying they have very good user content moderation.
         | Which means they have content moderation. Sounds like it's
         | human (rather than automated), and they have unpaid volunteers
         | doing it. That's a model that works for some. A model of...
         | content moderation.
        
       | godshatter wrote:
       | I guess I would rather see a system that focuses on scanning all
       | images for illegal content (presumably there are services where
       | you can hash the image and check for known child porn images, for
       | example), and focus on tagging all other images for certain
        | things (like David Hasselhoff's bare chest or whatever concerns
       | your users). Give the users tools to flag images as illegal
       | content, or for misapplied or missing tags, and the tools to
       | determine which type of content they wish to see. Prioritize
       | handling known illegal content found earlier, then user-flagged
       | possibly illegal content, then missing or misapplied tags. Handle
       | DMCA take down requests according to the letter of the law.
       | 
       | Let the users help you, and let them choose what they want to
       | see. Use conservative defaults if you wish, but trying to guess
       | what users might find objectionable and filtering that out ahead
       | of time sounds like a losing proposition to me. They'll tell you
       | what they don't like. When they do, make a new tag and start
       | scanning for that with the AI goodness.
       | 
       | Of course, this is what I would like to see as a user. I'm
       | probably an atypical user. And I'm not the person about to bet
       | their life savings on a start-up, either, so take this with a
       | grain of salt. I just wish that content providers would stop
       | trying to save me from the evils of life or whatever their
       | motivation is.
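        | 
        | A skeleton of that pipeline (a real system would use
        | perceptual hashes from a vetted industry list; the exact-match
        | SHA-256 below is a stand-in, and all names are illustrative):
        | 
        |     import hashlib, heapq
        | 
        |     KNOWN_BAD = set()   # vetted hash list, loaded elsewhere
        |     PRIORITY = {"known_illegal": 0, "flagged_illegal": 1,
        |                 "tag_dispute": 2}
        |     queue = []          # heap of (priority, item_id, reason)
        | 
        |     def ingest(item_id, image_bytes):
        |         if hashlib.sha256(image_bytes).hexdigest() in KNOWN_BAD:
        |             heapq.heappush(queue, (0, item_id, "known_illegal"))
        |             return "held"
        |         return "published"  # tagged, filterable by the user
        | 
        |     def flag(item_id, reason):
        |         heapq.heappush(queue,
        |                        (PRIORITY[reason], item_id, reason))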
        
         | godshatter wrote:
         | That downvote was fast. See, user moderation works :)
        
       | loco053100 wrote:
        | I'm going to a lawyer. You've been hacking my phone since
        | 2004. And there are insults, bullying, remote control,
        | everything wiretapped. This shit ends now. Come out into
        | reality, you cunts. Do you want to destroy my life? It's been
        | 7 years already??? I can prove all of it, and it will finally
        | happen. Then we'll see each other in court, in reality.
        
       | dkh wrote:
       | It drives me absolutely nuts when I encounter a video platform
       | upstart that has not adequately prepared (or prepared at all) for
       | the inevitable onslaught of undesirable and illegal content that
       | users will soon start uploading if the platform has really any
       | traction at all. No UGC site/app is immune. Even when prepared,
       | it is an eternal, constantly-evolving battle as users find more
       | clever ways to try to hide their uploads or themselves. If you
       | aren't ready for it at all, you may never be able to catch up.
       | And while a lot of the undesired content could just be really
       | annoying to get rid of, some is catastrophic -- a user uploading
       | a single video of something like child porn that is publicly
       | visible can be the death knell for the platform.
       | 
       | I'm going to go ahead and refute some of the counterarguments
       | I've heard a million times over the years just to get it out of
       | the way.
       | 
       |  _"It could be a while before it's necessary."_
       | 
       | People seeking to upload and share unsavory content are
       | constantly getting kicked off every other platform for doing so,
       | and thus are always on the lookout for something new to try where
       | they might be able to get away with it, at least for now. They
       | are the earliest adopters imaginable.
       | 
       |  _"Just let users flag content"_
       | 
       | Lots of issues here, but here's a couple big ones.
       | 
        | 1. You cannot afford for something like child porn to be
        | visible long enough to be flagged, or to be seen by anyone. If
       | something like this gets uploaded and is visible publicly, you
       | could be screwed. I worked on a video platform once that had been
       | around a couple years and was fairly mature. One video containing
       | child porn managed to get uploaded and be publicly visible for
       | about one minute before being removed. It was a year before the
       | resulting back-and-forth with federal agencies subsided and the
       | reputation of the platform had recovered.
       | 
       | 2. People uploading things like pirated content tend to do so in
       | bulk. You might see people uploading hundreds of videos of TV
       | shows or whatever. It may exceed legitimate uploads in the early
       | days of a platform. You do not want to burden users with this
       | level of moderation, and actually they aren't likely to stick
       | around anyway if good videos are lost in a sea of crap that
       | needed to be moderated.
       | 
       |  _"We'll just use (some moderation API, tool, etc.)"_
       | 
       | Yes, please do, but I'm not aware of anything that works 100%.
       | Even if you filter out 99% of the bad stuff, if the 1% that gets
       | through is kiddie porn, say goodnight. These tools get better all
       | the time, but users who are serious about uploading this kind of
       | stuff also continue to find new and interesting ways to trick
       | them. As recently as 2017 a pretty big video platform I worked on
       | was only able to stop everything with a combination of automated
       | systems as well as a team overseas that literally checked every
       | video manually. (We built a number of tools that enabled them to
       | do this pretty quickly.)
        | 
        |  _"Content shouldn't be moderated"_
        | 
        | Child porn? Hundreds of pirated episodes of _Friends_ instead of
        | legitimate user videos? (Even if you are pro-piracy, you don't
        | want to pay to host and serve this stuff, and you don't want it
        | to distract from legit original content from your users.) What
       | about when some community of white supremacists gets wind of your
       | new platform and their users bomb it with all their videos?
       | 
       | Do not take this stuff lightly.
       | 
       | EDIT: I've spent most of the last decade as an engineer working
       | on UGC and streaming video platforms
        
         | BeFlatXIII wrote:
         | > 2. People uploading things like pirated content tend to do so
         | in bulk. You might see people uploading hundreds of videos of
         | TV shows or whatever. It may exceed legitimate uploads in the
         | early days of a platform. You do not want to burden users with
         | this level of moderation
         | 
         | Not to mention that viewers aren't likely to flag the complete
         | discography camrip of My Little Pony unless they're stupid or
         | have an axe to grind (either against the IP that was uploaded,
         | piracy in general, or the specific uploader). The viewers are
         | often drawn to platforms specifically because they are flooded
         | with piracy in their early days.
        
           | dkh wrote:
           | Exactly. Setting aside all legal concerns and whatever
           | anyone's philosophy is about piracy or moderated content, you
           | still have the enormous concern about what kind of community
           | you are fostering and what kind of people you are attracting
           | based on what content you allow to be surfaced.
        
         | commandlinefan wrote:
         | > It was a year before the resulting back-and-forth with
         | federal agencies
         | 
         | You're blaming lack of content moderation and not a law
         | enforcement system that holds you responsible for something you
         | had no control over when it actually failed to do its own job
         | in this case?
        
           | PragmaticPulp wrote:
           | > a law enforcement system that holds you responsible for
           | something you had no control over when it actually failed to
           | do its own job in this case?
           | 
           | Investigating these issues _is_ their job. They don't show up
           | assuming the site operator is the guilty party, but they do
           | need their cooperation in collecting evidence so they can
           | pursue the case.
           | 
           | It's analogous to a crime being committed on your property.
           | They don't show up to charge the property owner for a crime
           | someone else committed, but they do need access to the
           | property and cooperation for their investigation.
        
           | dkh wrote:
           | We weren't held responsible, but it was still investigated
           | and required our cooperation and was not the best use of our
           | resources. Honestly, the public reputation part was far and
           | away the more unfortunate consequence.
           | 
           | Trust me, I have numerous concerns around the legal issues
           | and the chain of responsibility, but what choice do you have?
           | Are you going to start a fight with them out of principle and
           | hope this works out in your favor? While still devoting the
           | time and energy to the video platform you set out to build in
           | the first place?
        
         | tester756 wrote:
          | How about manually approving content before it's published,
          | and only for paid users? :P
        
       | phkahler wrote:
       | Here's a thought. Have a platform where identity is verified.
       | Users can post publicly or within their circle. Any illegal or
       | fraudulent content can be handled by the legal system due to the
       | lack of anonymity. Beyond that, let users form groups for topics
       | like reddit.
       | 
       | Where does this fall down?
        
         | teddyh wrote:
         | Laws are not the same for all people within a circle.
        
         | Nasrudith wrote:
          | We tried that with "real ID" policies. It just made people
          | commit to their shittiness openly. Not to mention they
          | always have repudiation available. Even if we go with full-
          | fledged cryptography, the opsec will fail at scale.
        
         | abraae wrote:
         | I would do that but make it pseudonymous.
         | 
         | i.e. illegal or fraudulent content is clearly owned by the user
         | irongoat, and you know who irongoat is (at least you know their
         | email, IP address) but no-one on your site knows who irongoat
         | is.
         | 
         | If the content is bad enough, authorities will get in touch for
         | you to tell them what email address irongoat is associated
         | with.
         | 
         | But as long as the content is OK, irongoat can say what he/she
         | wants, with no PII visible.
        
         | PerkinWarwick wrote:
         | What I'd like is something identical to Facebook (preferably
         | hosted in a box on my desk) that I invite all my friends to,
         | and they have the ability to invite people to. 2nd gen is
         | probably far enough.
         | 
         | Anyone cuts up rough, I go by their house and make them
         | miserable.
        
           | srfvtgb wrote:
           | You lose out on the network effect with this solution.
           | 
            | If one person gets to pick who's allowed onto a network,
            | then most people won't bother using it. Maybe your friends
            | will join, since they are allowed to add _their_ friends,
            | but those friends of friends wouldn't bother because most
            | of their friends can't join, meaning your friends won't
            | bother either unless they really want to talk to you
            | specifically.
           | 
           | It would work for groups where most people know each other,
           | but there are already options for that that allow users to be
           | in groups that aren't all owned by the same person e.g.
           | Discord, Signal, Facebook.
        
           | teddyh wrote:
           | You never plan to have any friends beyond your immediate
           | neighborhood?
        
             | PerkinWarwick wrote:
             | I'll fly in if need be.
        
         | tomcam wrote:
          | I like the general idea, but one place where it falls down
          | is in free-press situations, for example if you are a
          | whistleblower or dissident. The other is if you are a
          | battered spouse and want to discuss it without the batterer
          | being able to identify you.
        
         | didericis wrote:
         | It incentivizes homogeneity and greatly decreases the amount of
         | discussion on controversial topics. If posts are permanent,
         | tied to your identity, and potentially subject to legal
         | punishments, people with minority opinions become much more
         | skittish.
         | 
         | That could be considered only a risk if the moderation is bad,
         | but bad moderation becomes more likely over time due to
         | feedback loops. An optimally permissive moderator will risk
         | inviting an overly strict moderator due to that permissiveness.
         | An overly restrictive moderator will not. There is a greater
         | likelihood of moderation becoming increasingly restrictive over
         | time because the moderation narrows the pool of moderators.
        
       | EthOptimist wrote:
        | Laws requiring content moderation in social media will only
        | serve as a form of regulatory capture that inhibits competition.
        
       | shadowgovt wrote:
       | Ages ago on This Week in Tech, Leo Laporte and perennial guest
       | John C. Dvorak discussed the notion (in the context of Yahoo
       | Groups at the time) that nearly every hosted content platform
       | starts default-open to maximize the size of their user base but
       | eventually passes over the "horse porn event horizon." Sooner or
       | later, a critical mass of users with specific predilections would
       | find the forum and use it for communicating on their, uh, topic
       | of choice. And the owner at that point had two options:
       | 
       | - Do nothing and let the forum stay as open as it had been
       | 
        | - Start moderating, so that the platform could ever be
        | ad-supported or supported via mainstream sponsorship or
        | partnership, or be purchased by a bigger company
       | 
       | ... Few hosts choose the first option.
        
       | [deleted]
        
       | derekzhouzhen wrote:
        | There is no correct way to do content moderation, so on my
        | website, https://roastidio.us, I don't plan to have any. My
        | rules are simple (sketched in code below):
        | 
        | * You can say whatever you want to say, but only one person,
        | the one you replied to, gets to see it other than yourself.
        | 
        | * The one you replied to gets to approve whether it can be
        | shown to other people. No further reply is allowed until this
        | one is approved.
        | 
        | * Even if the one you replied to doesn't approve your comment,
        | the comment will not be deleted and will still be visible to
        | you.
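        | 
        | A minimal sketch of those three rules (field names are
        | illustrative, not roastidio.us internals):
        | 
        |     from dataclasses import dataclass
        | 
        |     @dataclass
        |     class Reply:
        |         author: str
        |         parent_author: str
        |         approved: bool = False
        | 
        |     def visible_to(r: Reply, viewer: str) -> bool:
        |         # Rules 1 and 3: the author and the person replied
        |         # to always see it; everyone else only if approved.
        |         return r.approved or viewer in (r.author,
        |                                         r.parent_author)
        | 
        |     def can_reply_to(r: Reply) -> bool:
        |         return r.approved   # rule 2: blocked until approved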
        
         | makeitdouble wrote:
         | Don't you still have to deal with user reports, repeated spam,
         | stalking, verbal abuse etc. ?
         | 
         | I'd assume a comment being visible to only one person doesn't
         | completely remove all the user to user issues the other
         | platforms face.
        
           | derekzhouzhen wrote:
            | Flame wars, abuse, etc. only happen if there are other
            | people watching. Why would anyone try to abuse someone
            | about whom they have next to zero info, for no personal
            | gain, when no one can witness their triumph anyway?
        
             | ketzo wrote:
             | Is this a rhetorical question? Because people do that
             | literally all the time. This is an extremely real problem.
        
       | devmor wrote:
       | As someone working on a platform with content moderation as a
       | core feature, it is so much work. I thoroughly understand why so
       | many platforms ignore it for so long.
       | 
       | Thankfully, we have some nice tools these days. I use Google's
       | Perspective API to automatically hold back text input for manual
       | moderation, which takes a lot of the man hours out of it for my
       | moderation team.
       | 
        | The rest is handled by the users of the platform themselves,
        | plus metrics about content reports to curtail abuse.
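        | 
        | For reference, the shape of such a call is roughly this (check
        | Google's docs for the current attributes; the key and
        | threshold here are placeholders):
        | 
        |     import requests
        | 
        |     URL = ("https://commentanalyzer.googleapis.com/"
        |            "v1alpha1/comments:analyze")
        | 
        |     def hold_for_review(text, api_key, threshold=0.8):
        |         body = {"comment": {"text": text},
        |                 "requestedAttributes": {"TOXICITY": {}}}
        |         resp = requests.post(URL, params={"key": api_key},
        |                              json=body)
        |         resp.raise_for_status()
        |         score = (resp.json()["attributeScores"]["TOXICITY"]
        |                  ["summaryScore"]["value"])
        |         return score >= threshold  # queue for a human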
        
       | Alir3z4 wrote:
        | Around a year ago we got hit badly on our [blogging
        | platform][0] by people/groups submitting fake customer-support
        | descriptions of other big companies: Microsoft, Facebook,
        | Comcast, etc.
        | 
        | Rolled out a machine learning model and trained it on the
        | database. 99% of them vanished.
        | 
        | The next day, the model didn't work; the success rate was
        | around 5%.
        | 
        | It turned out they had learned the trick and were now using
        | symbols from different languages to make the text look like
        | English.
        | 
        | Trained again, and the success rate went back up.
        | 
        | An hour later, the success rate had fallen again.
        | 
        | This time, they had mixed their content with valid content
        | from our own blogging platform. They would take content from
        | our own blog or from other people's posts and mix it in to
        | fool the machine learning.
        | 
        | Trained it again, and it was a success.
        | 
        | Once in a while such content appears and the model fails to
        | catch it. It only takes a couple of minutes to mark the bad
        | posts and have the model retrained and redeployed, and then
        | boom, the bad content is gone.
        | 
        | The text extraction, slicing through good and bad content,
        | telling foreign symbols from the sane alphabet, and many other
        | things were challenging at first, but overall it was pretty
        | exciting to make it happen.
        | 
        | Through all this we didn't use any third-party platform to do
        | the job; the whole thing was built by ourselves with a little
        | TensorFlow, Keras, scikit-learn, and some other spices.
        | 
        | Worth noting: it was all text, no images or videos. Once we
        | get hit with that, we'll deal with it.
       | 
       | [0]: https://www.gonevis.com
       | 
        | edit: Here's the training code that did the initial work:
        | https://gist.github.com/Alir3z4/6b26353928633f7db59f40f71c8f...
        | It's pretty basic stuff. Later it was changed to cover more
        | edge cases, and it got even simpler and easier. Contrary to
        | what you might expect, the better it got, the simpler it
        | became :shrug
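        | 
        | One way to blunt the homoglyph trick before classification is
        | a normalization pass; a sketch (the mapping is a tiny
        | illustrative subset of the real confusables tables):
        | 
        |     import unicodedata
        | 
        |     CONFUSABLES = str.maketrans({
        |         "\u0430": "a", "\u0435": "e", "\u043e": "o",
        |         "\u0440": "p", "\u0441": "c", "\u0455": "s",
        |     })  # a few Cyrillic look-alikes
        | 
        |     def normalize(text: str) -> str:
        |         folded = unicodedata.normalize("NFKC", text)
        |         return folded.translate(CONFUSABLES)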
        
         | LuisMondragon wrote:
          | > It turned out they had learned the trick and were now
          | > using symbols from different languages to make the text
          | > look like English.
          | 
          | I wonder if you could train an ML model using text as
          | _images_. For example, taken as strings, "porn" and "p0rn"
          | are not very similar, but visually they are.
        
         | oblio wrote:
         | Do you have a mechanism for appealing the automated process?
        
           | Alir3z4 wrote:
            | Not sure if I understand correctly, but if you mean how
            | retraining and deployment work: nothing fancy, tbh.
            | 
            | For the first several deployments, while handling edge
            | cases and debugging, it was all done manually on my own
            | laptop and shipped into the cluster as a Docker image.
            | 
            | Later, when we started classifying more content on the
            | platform itself:
            | 
            | - A webhook triggers CI to train the model with the new
            | ham and spam content.
            | 
            | - A new Docker image is built and deployed to the cluster.
            | 
            | The pipeline fails if the success rate in
            | validation/testing is below 95%.
            | 
            | Still, when classifying bad content, the whole process is
            | automated.
            | 
            | Special thanks to GitLab CI, shell scripts, Python, and
            | Docker Swarm.
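            | 
            | The gate itself can be a few lines at the end of the
            | training job (threshold and names are placeholders):
            | 
            |     import sys
            | 
            |     def gate(model, X_val, y_val, threshold=0.95):
            |         acc = model.score(X_val, y_val)
            |         print(f"validation accuracy: {acc:.3f}")
            |         if acc < threshold:
            |             # A nonzero exit fails the CI job, so the
            |             # previous image keeps serving traffic.
            |             sys.exit(1)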
        
             | nerd_light wrote:
             | I think they meant (and I am interested in hearing about)
             | appealing a "block" decision that was made by your
             | automation.
             | 
             | If I'm a real human and trying to post a "good" post, but
             | the model classifies it as bad and automatically blocks it,
             | how do I appeal that decision? Can I? Or is my post totally
             | blocked with no recourse?
        
               | Alir3z4 wrote:
               | Oh got it.
               | 
               | Thanks for clarification.
               | 
               | When a post gets published, it is sent to the machine
               | learning image via REST.
               | 
               | If it's bad, the post is kept as a Draft.
               | 
               | A new record gets created in another database table to
               | keep track of them; the accuracy rate is recorded as
               | well.
               | 
               | This was done to make sure no irreversible action was
               | taken on good content.
               | 
               | Blogs with more than 1 year of history would not go
               | through moderation; no action was taken on them, we
               | just recorded the classifier's verdict for future
               | reference.
               | 
               | Later, someone from our team (usually me) would check
               | them by eye and pull the trigger on them; confirmed
               | items went back in to make the training better.
               | 
               | If something passed moderation but was indeed spam, it
               | went into another training iteration.
               | 
               | We had to do this for over a month. Over that time the
               | success rate was around 99%, and no blog was wiped
               | from our database by machine classification unless
               | confirmed by someone.
               | 
               | At that time the whole model was trained for that
               | specific content. Later other types of spam showed up,
               | for which we trained different models.
               | 
               | Overall, the machine's actions were logged, and
               | content/users/blogs would get labeled with bad marks.
               | 
               | They would be displayed on a report page until someone
               | made the final decision. Through the whole time, the
               | user would be shadow banned (shadow banning didn't
               | help much, though) and their content would not be
               | published.
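               | 
               | In rough pseudo-Python, the enforcement step looked
               | something like this (all names here are hypothetical;
               | a sketch, not the actual code):
               | 
               |     from dataclasses import dataclass
               | 
               |     import requests
               | 
               |     CLASSIFIER_URL = "http://classifier/classify"
               | 
               |     @dataclass
               |     class Post:
               |         id: int
               |         text: str
               |         status: str = "published"
               | 
               |     def record_verdict(pid, spam, score):
               |         # stand-in for the audit-table insert
               |         print(pid, spam, score)
               | 
               |     def moderate(post: Post, blog_age_days: int):
               |         r = requests.post(CLASSIFIER_URL,
               |                           json={"text": post.text})
               |         v = r.json()
               |         record_verdict(post.id, v["spam"], v["score"])
               |         if blog_age_days > 365:
               |             # old blogs: record only, take no action
               |             return
               |         if v["spam"]:
               |             # reversible; a human makes the final call
               |             post.status = "draft"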
        
               | nerd_light wrote:
               | Thanks for the detailed response! And nice to hear how
               | much you've managed to keep humans involved in the
               | process. I used to work on a content review automation
               | system for a big company, so it's always fun to hear
               | about how others handle similar cases.
               | 
               | And there's a lot of overlap between how that system
               | acted and what you're describing. It makes me wonder
               | if there's space for a company that offers this sort
               | of model training + content tagging + review tooling
               | capability as a service, or if there are too many
               | variations in what "good" and "bad" input is to make
               | it generalizable.
        
             | fallingknife wrote:
             | I think he's talking about false positives.
        
           | bliteben wrote:
           | What are they, Google? Wait, Google doesn't even have that.
        
             | oblio wrote:
             | My comment was targeted at companies trying to take
             | Google's crappy approach to this problem.
        
         | vadfa wrote:
         | What software/libraries have you used for your machine learning
         | moderation system?
        
           | Alir3z4 wrote:
           | TensorFlow and scikit-learn to train and build the model.
           | 
           | On the front, FastAPI (behind uvicorn) to accept calls
           | via a REST API.
           | 
           | Deployed via Docker.
           | 
           | To be honest, TensorFlow and scikit-learn may not be the
           | right fit for everything.
           | 
           | Every situation needs a different approach and a
           | different solution.
           | 
           | Worth noting, the most time-consuming part was dealing
           | with the data itself, not model training or machine
           | learning.
           | 
           | Within a couple of hours you'd notice you're just staring
           | at charts and tuning parameters.
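           | 
           | The serving side can be quite small; a minimal sketch,
           | assuming a pickled scikit-learn pipeline (the file name
           | is hypothetical):
           | 
           |     import joblib
           |     from fastapi import FastAPI
           |     from pydantic import BaseModel
           | 
           |     app = FastAPI()
           |     model = joblib.load("spam_model.joblib")
           | 
           |     class Item(BaseModel):
           |         text: str
           | 
           |     @app.post("/classify")
           |     def classify(item: Item):
           |         # probability of the "spam" class
           |         score = float(model.predict_proba([item.text])[0][1])
           |         return {"spam": score >= 0.5, "score": score}
           | 
           | Served behind uvicorn as usual (uvicorn app:app).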
        
             | flal_ wrote:
             | "Worth nothing, the most time consuming part was dealing
             | with data itself and not model training or machine
             | learning."
             | 
             | Become a data scientist they said. Yay, artificial
             | intelligence...
        
               | Alir3z4 wrote:
               | Hahaha. Yes, I felt that every second while dealing with
               | it.
        
         | YetAnotherNick wrote:
         | How do you detect the ground truth for training the model? Do
         | you manually label it?
        
           | Alir3z4 wrote:
           | Yes, simple classification. Nothing fancy.
           | 
           | Basically, I pulled the database into a CSV file, and
           | anything that was published before the bad content
           | arrived was classified as HAM.
           | 
           | We had content that was known to be OK, so that was
           | marked as HAM, and then all of our new bad content was
           | marked as SPAM.
           | 
           | For some hours after deploying to production, HAM content
           | got wrongly marked and the model got trained on it as
           | well, which caused a lot of confusion, but the problem
           | was taken care of once the model was properly tuned and
           | it was safe to let it run automated.
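           | 
           | A minimal sketch of that kind of labeling + training run,
           | assuming a CSV export with text and label columns (all
           | names hypothetical):
           | 
           |     import pandas as pd
           |     from sklearn.feature_extraction.text import TfidfVectorizer
           |     from sklearn.linear_model import LogisticRegression
           |     from sklearn.pipeline import make_pipeline
           | 
           |     # label = "ham" for anything published before the
           |     # attack, "spam" for the flood of scam posts
           |     df = pd.read_csv("posts.csv")
           | 
           |     clf = make_pipeline(TfidfVectorizer(),
           |                         LogisticRegression(max_iter=1000))
           |     clf.fit(df["text"], df["label"])
           |     print(clf.predict(["Call this number for support!"]))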
        
             | benjaminjackman wrote:
             | Hmm I wonder if it picked up timestamps as its initial
             | filter.
        
         | KorematsuFredt wrote:
         | Did you try rate limiting, shadow banning, IP banning, etc.?
        
         | Y_Y wrote:
         | If your eyes can "normalize" unusual symbols to a common
         | one to make an English word then so can a lookup table. I
         | feel like this isn't a case where you'd reach first for a
         | neural net.
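         | 
         | As a toy sketch (the table here is tiny and hypothetical;
         | real ones are much larger):
         | 
         |     # a few common substitutions
         |     HOMOGLYPHS = {"0": "o", "1": "l", "3": "e", "@": "a",
         |                   "\u0430": "a",  # Cyrillic a
         |                   "\u03bf": "o"}  # Greek omicron
         | 
         |     def fold(text: str) -> str:
         |         return "".join(HOMOGLYPHS.get(c, c)
         |                        for c in text.lower())
         | 
         |     assert fold("p0rn") == "porn"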
        
           | nradov wrote:
           | Those are known as homoglyphs. This issue has been studied
           | and best practices are documented. There's no complete
           | solution but it can be mitigated.
           | 
           | https://en.wikipedia.org/wiki/Homoglyph
        
           | Alir3z4 wrote:
           | Yeah, then someone has to create or find that whole table
           | and maintain it.
           | 
           | The initial problem wasn't those symbols but the content
           | itself; the symbols and special characters entered the
           | picture later.
           | 
           | Later on, as mentioned in my original comment, they would
           | mix positive content from other blog posts that had been
           | published/passed moderation into their bad content.
           | 
           | We probably could have used a different method, but at
           | that time we needed something quick, and it worked and
           | still works with very little tweaking.
           | 
           | We don't have a massive amount of threats or abusers
           | anymore to know the exact effect, but again, so far it
           | works.
           | 
           | Back then, they were coming in at several thousand per
           | minute; IP blocking, range blocking, user-agent
           | filtering, captchas and anything of the sort didn't work
           | on them.
        
             | jffry wrote:
             | The good news is that the Unicode consortium has a report
             | on this issue, and the tables already exist for
             | normalization and mapping of confusables to their ASCII
             | lookalikes: https://www.unicode.org/reports/tr39/
        
               | Alir3z4 wrote:
               | Oh, that's nice.
               | 
               | I guess I can use that the next time I work on the
               | data cleaning for that model.
               | 
               | Thanks.
        
               | wanderingstan wrote:
               | As I commented above:
               | 
               | I built a Python library for finding strings obfuscated
               | this way. It was critical when moderating our
               | Telegram channel before an ICO.
               | https://github.com/wanderingstan/Confusables E.g. "Hel10"
               | would match "Hello"
        
           | tgsovlerkhgsel wrote:
           | You can hardcode a rule for this specific bypass. Or you
           | just retrain the neural net, it learns very quickly that
           | the presence of these symbols = bad, and you'll have
           | spent less time than writing and testing a custom
           | solution.
        
           | infogulch wrote:
           | If you can identify text written with mixed glyphs, just
           | ban it outright. Normal users don't use text like this;
           | the mere binary presence of such "homoglyph" text is
           | probably a better spam signal than whatever your neural
           | net outputs when run after normalization.
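           | 
           | Detecting the mixing itself is cheap; a rough sketch that
           | uses the first word of each character's Unicode name as a
           | crude script tag:
           | 
           |     import unicodedata
           | 
           |     def scripts(text: str) -> set:
           |         return {unicodedata.name(c, "UNKNOWN").split()[0]
           |                 for c in text if c.isalpha()}
           | 
           |     print(scripts("paypal"))  # {'LATIN'}
           |     # with a Cyrillic lookalike 'a' swapped in, the same
           |     # word reports {'LATIN', 'CYRILLIC'} -- suspicious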
        
             | falcolas wrote:
             | > Normal users don't use text like this
             | 
             | They kinda do. Check out the shrug "emoji", table flip, and
             | so forth. Then there's the meme of adding text above and
             | below by abusing Unicode's "super" and "sub" modifications.
             | 
             | You could restrict input to only ever represent ASCII,
             | but then you've knocked out the ability to expand
             | internationally.
        
             | NavinF wrote:
             | > Normal users don't use text like this
             | 
             | Sounds like you live in a filter bubble.
             | 
             | (╯°□°)╯︵ ┻━┻
             | 
             | (ノ*ヮ*)ノ*:・゚✧
        
               | jart wrote:
               | I know right? There are so many times when I've wanted
               | to use something like box-drawing unicode characters
               | (cp437) to explain a complicated concept on hacker
               | news, but alas I couldn't, due to widespread computer
               | fraud and abuse. How are we going to build a more
               | inclusive internet that serves the interests of ALL
               | people around the world, regardless of native
               | language, if the bad guys are forcing administrators
               | to ban unicode? (╯°□°)╯︵ ┻━┻
        
             | chefandy wrote:
             | > Normal users don't use text like this
             | 
             | I think that depends on the users. People copying and
             | pasting bits of text that were in English or another
             | common language-- think documentation, code, news
             | articles, tweets, etc.-- with a different character set
             | could be problematic.
             | 
             | Also, some apps marketed as "fonts for social media"
             | would be caught up in this, since they output Unicode
             | math symbols. A user base with young people getting
             | bounced or shadow banned for trying to express
             | themselves or distinguish themselves from their peers
             | would be like ಠ_ಠ (Kannada letter ttha)
             | 
             | I think targeting the language they're using is a
             | better bet.
             | 
             | ¯\_(ツ)_/¯ (Hiragana letter tsu)
        
               | ronsor wrote:
               | I can especially echo the "social media fonts" trend.
               | They're quite popular on certain Discord guilds at least.
               | 
               |  _Minor nitpick, but tsu is the katakana tsu._
        
               | chefandy wrote:
               | Oh, fact. Not a Japanese speaker. (or reader)
        
               | v_london wrote:
               | Bizarrely, I'm seeing recruiters using this on LinkedIn.
        
               | chefandy wrote:
               | Huh. For any specific purpose? Does it seem like
               | they're avoiding paying for recruiter accounts or
               | something by evading algorithms designed to detect
               | their activity, or is it just for the heck of it?
        
           | jffry wrote:
           | In fact, the Unicode consortium provides a report and an
           | extensive list of "confusable" symbols, which you could
           | use alongside Unicode normalization tables to map
           | adversarial text back into more ASCII-equivalent text
           | before running it through anti-spam mechanisms that are
           | interested in the content of the message.
           | 
           | https://www.unicode.org/reports/tr39/
        
             | wanderingstan wrote:
             | I built a Python library for finding strings obfuscated
              | this way. It was critical when moderating our Telegram
              | channel before an ICO.
             | 
             | https://github.com/wanderingstan/Confusables
             | 
             | E.g. "Hel10" would match "Hello"
        
               | Alir3z4 wrote:
               | I should have searched with better eyes; I could have
               | found this and saved some hours.
        
               | jffry wrote:
               | I only learned about it myself after spending too
               | long building my own half-baked version. I think it's
               | written in pretty opaque language, which makes it
               | hard to find even if you know what you want.
               | 
               | Maybe somebody else on here will see it and learn about
               | it before they need it, and at least you still have a new
               | tool to reach for in the future.
        
               | loco053100 wrote:
               | My phone is full of crappy hacking apps. My phone has
               | been hacked since 2004. You are little maggots, but
               | the maggots are going to my lawyer now; I won't put
               | up with it any longer. I have enough of what I need!
               | Don't you have a life? Apparently pathetic.
        
               | airstrike wrote:
               | You should post this to Show HN. Also you have a typo on
               | your README ("characgters")
        
               | jart wrote:
               | The one built into Python will get you most of the
               | way there:
               | 
               |     >>> import unicodedata
               |     >>> unicodedata.normalize('NFKD', 'Ｈｅｌ１０ｈｅｌｌｏ')
               |     'Hel10hello'
               | 
               | Obviously it isn't going to remap leetspeak
               | characters like 1 -> l, but it covers a lot of cases.
        
               | wanderingstan wrote:
               | Thanks, I've fixed the typo! It was such a simple
               | project, hardly seems worthy of a "Show HN".
        
               | 8note wrote:
               | Test for the library: would it catch that that typo still
               | refers to characters?
        
               | jonplackett wrote:
               | I've seen crazier things get to #1
        
             | jonplackett wrote:
             | There should be a 'bad symbols' list you can block like you
             | do expletives. There's surely zero need to support that
             | kind of thing in comments.
        
         | [deleted]
        
         | drdebug wrote:
         | Would you be able to share information about the tools you
         | used? It seems they could be very useful for other blogs
         | and platforms!
        
           | Alir3z4 wrote:
           | Basically https://gist.github.com/Alir3z4/6b26353928633f7db59
           | f40f71c8f...
           | 
           | but I don't think it will be useful for everyone; each
           | use case has different requirements and needs a different
           | solution.
        
         | ummonk wrote:
         | This story needs to be turned into a writeup and submitted to
         | HN on its own.
        
         | [deleted]
        
         | 5faulker wrote:
         | Tale of humans man...
        
         | cowmoo728 wrote:
         | I'm interested in why these people were doing this. Were they
         | hoping to get non-tech-savvy people that were searching for
         | computer help? I guess that's a good audience of unwitting
         | users to attempt to hack, but was the goal to get them to
         | submit to one of the remote tech support scams? Were they
         | embedding malware into your blogging platform, or getting ad
         | revenue out of this somehow?
        
           | Alir3z4 wrote:
           | > Were they hoping to get non-tech-savvy people that were
           | searching for computer help?
           | 
           | Yes.
           | 
           | They would create these posts and get onto search results
           | quickly (the platform does pretty good SEO optimization
           | out of the box), and they would write good quality posts
           | as well.
           | 
           | They would also share these posts on some other websites,
           | especially via social media accounts.
           | 
           | We don't have Google Analytics or the like to see exactly
           | where they would come from. I noticed huge traffic to
           | such pages by looking at the logs.
           | 
           | Our nginx log parser was alerting us about sudden spikes
           | on certain blogs and hits on a pre-defined list of words
           | we keep.
           | 
           | That's when we noticed something was going on.
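           | 
           | The alerting side doesn't need to be fancy; a crude
           | sketch of that kind of log scan (the log format and the
           | word list are assumptions):
           | 
           |     import collections
           |     import re
           | 
           |     WATCHED = re.compile(r"support|helpline|antivirus",
           |                          re.I)
           |     hits = collections.Counter()
           | 
           |     with open("access.log") as log:
           |         for line in log:
           |             # combined format: ... "GET /path HTTP/1.1" ...
           |             try:
           |                 path = line.split('"')[1].split()[1]
           |             except IndexError:
           |                 continue
           |             if WATCHED.search(path):
           |                 hits[path] += 1
           | 
           |     # alert when any single path spikes past a threshold
           |     for path, n in hits.most_common(5):
           |         if n > 1000:
           |             print("ALERT:", path, n)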
           | 
           | It didn't take more than a couple of hours (while we were
           | working on the model) before we received an email from
           | the data center people about hosting phishing content,
           | and not much longer before we received emails from some
           | of the impersonated companies as well.
           | 
           | > Were they embedding malware into your blogging
           | platform, or getting ad revenue out of this somehow?
           | 
           | No. On the blogging platform we have everything bleached;
           | nothing goes in without passing through sanitizers.
           | 
           | They would simply convince people to call those US
           | numbers.
           | 
           | I actually called one of those numbers and yeah, it was
           | one of those customer support desks in some other part of
           | planet earth, definitely not from the company he was
           | pretending to be, and he very quickly asked me to install
           | TeamViewer on my machine. I really wanted to let them
           | access the Windows install in my VirtualBox and have some
           | fun with them, but well, someone had to fix the
           | moderation issue :D
        
             | pier25 wrote:
             | Why didn't you ban the user when you found out the scam?
        
               | Alir3z4 wrote:
               | We did.
               | 
               | We banned, shadow banned, deleted, recorded and did
               | many other things to them to make sure they weren't
               | reaching their goal.
               | 
               | The problem was, it wasn't a single user to deal
               | with: thousands and thousands, with no stop.
               | 
               | Flooding the platform like a zombie apocalypse.
        
               | pier25 wrote:
               | Jesus... what a nightmare.
               | 
               | Were these all using your free plans?
        
               | Alir3z4 wrote:
               | Oh yes.
               | 
               | Never ending.
               | 
               | We still keep the free plans even though there are
               | abusers; that won't be a reason to retire them. So
               | many people use them for legitimate reasons and keep
               | their personal writing there.
               | 
               | It's unfortunate, but well, it happens.
        
               | pier25 wrote:
               | Why do you think they were using your platform to do
               | this?
               | 
               | Is it because these scammers were non technical?
               | 
               | Or maybe the anonymity provided by your platform?
        
               | Alir3z4 wrote:
               | Two theories:
               | 
               | 1. Highly technical, because the flow was scripted to
               | work with our website: bypassing captcha and email
               | verification by using many different domains and
               | email accounts, and also highly distributed across
               | many IPs.
               | 
               | 2. Non-technical, where they paid some people to do
               | it manually. This doesn't seem likely, given the way
               | they walked through many steps like a piece of cake.
               | 
               | However, our platform was/is a target for several
               | reasons:
               | 
               | 1. Easy to register and start blogging. 2. Free plan
               | with no hard limit. 3. Quick rankings due to the SEO
               | implementation out of the box. 4. Absence of any
               | moderation before the attack.
               | 
               | And probably some other reasons that made their job
               | easier and us a better target.
        
         | HodorTheCoder wrote:
         | If you don't mind me asking, what sentence embeddings model
         | (bert/roberta/etc) did you have the best luck with for your
         | classifier? I like the quick retrain that can be done with an
         | approach like this, though I have found that if you throw too
         | many different SPAM profiles at a classifier it starts to
         | degrade, and you might have to build multiple and ensemble
         | them. The embedding backend can help a lot with that.
        
           | Alir3z4 wrote:
           | Tried BERT but didn't get proper results; I probably
           | wasn't working with it correctly.
           | 
           | Here's the old source I have on my computer that did the
           | training:
           | 
           | https://gist.github.com/Alir3z4/6b26353928633f7db59f40f71c8f.
           | ..
           | 
           | This did the early work and was later changed to fit
           | other cases.
           | 
           | Pretty basic stuff.
        
         | neilv wrote:
         | What was the motivation of those attackers?
        
         | nothrowaways wrote:
         | I would recommend a different approach, such as using
         | metadata like their location, etc.
        
           | Alir3z4 wrote:
           | It was a coordinated attack from many different locations
           | and IPs.
           | 
           | IP blocking was in place at the application level and
           | later via a Cloudflare blacklist. Still, they would flood
           | in with different IPs and browsers.
        
             | nothrowaways wrote:
             | Still, I would consider working with such metadata. I
             | have seen many interesting results based on it.
        
         | bcrosby95 wrote:
         | When we had this problem, we added an input element positioned
         | off-screen and ignored submissions that populated it. Cleaned
         | it all up.
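         | 
         | The server side of that trick is nearly a one-liner; a
         | sketch, with the field name being hypothetical:
         | 
         |     def looks_like_bot(form: dict) -> bool:
         |         # the "website" field is rendered off-screen via
         |         # CSS, so humans never fill it in -- bots do
         |         return bool(form.get("website", "").strip())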
        
           | Spone wrote:
           | This is usually called a "honeypot", in case anyone wants
           | to search for tools implementing it.
        
         | tdeck wrote:
         | For adversarial problems like this, a shadowban approach can
         | sometimes be necessary. Perhaps people can still see their
         | blogs but GoogleBot gets blocked from indexing them, or they
         | only appear to someone with the spammer's cookies. That way it
         | takes them longer to catch on and evade the model.
         | 
         | Of course, that means you'll need to at least spot check your
         | bans because you can't rely on legit users escalating to you.
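         | 
         | The crawler half can be a single response header on flagged
         | blogs; a sketch, assuming you track a per-blog shadow-ban
         | flag:
         | 
         |     def extra_headers(shadow_banned: bool) -> dict:
         |         # the author still sees the blog render normally,
         |         # but crawlers are told to drop it from the index
         |         if shadow_banned:
         |             return {"X-Robots-Tag": "noindex, nofollow"}
         |         return {}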
        
           | andrei_says_ wrote:
           | I'd presume that large scale spammers check their work in
           | incognito via a different network. It's their job.
        
           | Alir3z4 wrote:
           | Yep, a shadow ban was in place as well.
           | 
           | The thing is, the people posting weren't really the ones
           | behind the content. It appeared their computers had been
           | infected by some kind of malicious file and made part of
           | a bigger network. (A botnet?)
           | 
           | From the thousands of different IPs in different
           | countries around the globe, I could see that these were
           | likely compromised personal computers.
           | 
           | Very few of them were computers from hosting companies;
           | the rest were normal people's computers.
           | 
           | I'm sure these machines were doing the job while someone
           | else was testing the result.
           | 
           | When we did the shadow banning, it didn't make a dent in
           | their effort.
           | 
           | The way they changed emails, changed usernames and tried
           | to look unique was prepared specifically for our platform
           | (I would guess so).
           | 
           | Whenever we countered their attack, they would go silent
           | for a while and then attack again. They would adjust.
           | 
           | A shadow ban is effective when the attackers themselves
           | are not aware of it; in our case it was tricky to know
           | who the observer was.
        
             | jart wrote:
             | Did they map to ISP ASNs? Country geolocation doesn't say
             | much anymore since there's so many VPN providers whose
             | business is to buy a CIDR in every country and resell
             | access.
        
               | Alir3z4 wrote:
               | Yes, almost all of them mapped to ISP ASNs.
               | 
               | Very few of them were from AWS, OVH and other hosting
               | providers; very, very few.
               | 
               | We ran each IP against blacklists and paid IP
               | reputation checkers. The majority of the IPs were
               | clean.
               | 
               | Back then we had an IP reputation check, but it was a
               | headache to maintain, so we disabled it later;
               | however, even at the time, very, very few of them got
               | stopped by the IP reputation checks.
        
               | kbenson wrote:
               | As someone who's used IP proxying services that
               | provide millions of IPs for scraping purposes: that
               | is a very mature industry. They advertise (and I
               | believe them) "millions" of IPs, even for ones you
               | might consider hard to supply, like mobile IPs, and
               | they let you slice and dice them however you want.
               | Datacenter IPs? Residential IPs? Mobile IPs?[1] What
               | state or city would you like them in? Would you like
               | the site you're hitting to not have been accessed by
               | this IP (through proxying at least), and if so for
               | how many days? Do you want some mix of that? Make
               | your own configurations and set them up as proxy
               | endpoints, etc.
               | 
               | Fighting abuse at the level of IP address attributes
               | seems like a losing game to me. Honestly, the best I
               | saw at this (3-5 years ago at least) for traffic was
               | Distil Networks, where they put a proxy device in
               | front and examine your traffic and captcha or block
               | based on that.
               | 
               | Since you have content being submitted, there's a lot
               | more you can use to classify, such as how you used
               | ML, so that's good. Part of me worries that this is
               | all sort of reminiscent of infections and antibiotics
               | though. The continual back-and-forth of you finding a
               | block and them finding a workaround feels kind of
               | like you were training the spammers (even if you were
               | training yourself at the same time). At some point
               | maybe we'll find that most of the forum spam is ML-
               | generated low-information content that also happens
               | to be astroturfing that is hard to distinguish from
               | real people's opinions.
               | 
               | 1: Fun fact, to my knowledge anonymous mobile IPs are
               | provided by a bunch of apps opting into an SDK (like an
               | advertising/metrics SDK) which while their app is open
               | (at least I hope that's a requirement) registers itself
               | to the proxying service so it can be handed out for use
               | by paying proxy customers. Think about _that_ next time
               | you play your free  "ad-supported" mobile game.
        
             | bryan_w wrote:
             | Yup, they utilize "residential proxies" to hide their
             | behavior, which makes the situation worse because
             | sometimes it will affect your legit users.
        
           | mrkurt wrote:
           | We shadow ban abusive users on Fly.io and it works great.
           | Everything seems to work right up until they try to connect
           | to a deployed app.
           | 
           | It took me a while to realize that ramping up the frustration
           | level is, itself, a helpful deterrent.
        
             | Nextgrid wrote:
             | Are you doing this because of resource usage?
             | 
             | If resources are free then you could even actually deploy
             | their app and either whitelist it for their own IP or only
             | allow very few requests before taking it down.
             | 
             | This would be even more frustrating and could ruin whatever
             | they plan to do with their abusive app in the first place.
             | Let's say they deploy their malware/phishing page, test it
             | a couple of times (possibly from a different IP) and it
             | works. They then start spamming the malicious link and
             | waste decent amounts of time/money/processing power, not
             | realizing that the link was dead after the first 10 hits.
        
               | mrkurt wrote:
               | We're primarily trying to prevent fraudulent payments
               | combined with expensive VMs. Throttling CPU to almost
               | nothing on high risk accounts sounds delightfully
               | irritating.
               | 
               | We also get the less resource intensive, but still
               | harmful abusive apps that port scan the internet. Those
               | are relatively easy to detect. We generally don't want to
               | be a source of port scans so we shut them off pretty
               | quickly.
        
             | ljm wrote:
             | I remember an old mailing list discussion on sourcehut,
             | because sourcehut provides a build service that you can use
             | for automation.
             | 
             | The decision from sircmpwn was, in the end, to charge
             | money for the service. Charging money and
             | KnowYourCustomer will kill most exploits dead.
             | 
             | In this sense, this is turning the frustration level to 11.
             | You can use the service to a certain extent, without
             | frustration, but if you want to get serious then you're
             | going to have to jump through some hoops.
             | 
             | Dedicated people will still find a way through, but you've
             | cut off 95% of the flow and killed the low-effort attempts.
             | Now, you can focus on the serious shit.
        
             | sdenton4 wrote:
             | Yeah, in the end you wind up with a small set of persistent
             | adversaries who have been tweaking their abuse alongside
             | your fixes, and a hopefully much higher wall for new
             | abusers to scale.
             | 
             | If possible, it can help to hold back new systems and
             | release a bunch of orthogonal anti abuse systems at once.
             | Then the attackers need to find multiple tweaks instead of
             | just evading one new system.
        
         | pbreit wrote:
         | How did the lousy content affect your legit users?
        
           | Alir3z4 wrote:
           | Every blog is pretty isolated from other users.
           | 
           | It's not that content pops up for everyone whenever
           | someone posts something.
           | 
           | There's a Feed page where you can read what the people
           | you follow have published.
           | 
           | There's an Explore page where the latest content is
           | visible without any filter or categorization. This is
           | where such content would appear, but only blogs older
           | than 7 days would show up there (we have removed that
           | delay in recent versions).
           | 
           | Basically, no one noticed them.
           | 
           | We did disclose the issue to some of the platform's old
           | users when they complained about their posts not getting
           | published. That was the first issue, in the first 15
           | minutes of the machine learning model classifying wrongly
           | due to being fed mixed content (where bad content was
           | mixed by the spammers with good content from those exact
           | blogs).
           | 
           | Other than several bloggers reporting that their posts
           | wouldn't go through as expected, no one else got
           | affected, and I hope no one was lured by those scammers
           | while their content was published on our platform.
        
       | snowwrestler wrote:
       | It's not covered in this post, but IP infringement is another
       | moment in life when content moderation becomes necessary. You
       | have to be above a certain scale for large IP owners to notice or
       | care, but if you're growing and allow users to upload media,
       | you'll eventually need to start handling DMCA requests at
       | minimum.
       | 
       | Also worth noting that the infamous Section 230 is what allows
       | companies to take these sort of best-effort, do-the-best-you-can
       | approaches to content moderation without fear of lawsuit if they
       | don't get it perfect.
        
         | doh wrote:
         | We built a service [0] to cover copyright infringement and
         | other forms of abuse, including CSAM and IBSA. While in the
         | US the DMCA is enough, the EU has a new law [1] that goes
         | way beyond the requirements of the previous law (the
         | E-Commerce Directive, which is analogous to the DMCA).
         | 
         | [0] https://pex.com
         | 
         | [1]
         | https://en.wikipedia.org/wiki/Directive_on_Copyright_in_the_...
        
         | Nasrudith wrote:
         | What is infamous about it? People have to lie constantly to
         | attack it and create outright alternative universes with
         | distinctions that don't really exist.
        
       | Animats wrote:
       | Roblox is working on a "moderation" system that can ban a user
       | within 100ms after saying a bad word in voice. But their average
       | user is 13 years old.
       | 
       | Interestingly, Second Life, the virtual world built of user-
       | created content, does not have this problem. Second Life has real
       | estate with strong property rights. Property owners can eject or
       | ban people from their own property. So moderation, such as it is,
       | is the responsibility of landowners. Operators of clubs ban
       | people regularly, and some share ban lists. Linden Lab generally
       | takes the position that what you and your guests do on your own
       | land is your own business, provided that it isn't visible or
       | audible beyond the parcel boundary. This works well in practice.
       | 
       | There are more and less restrictive areas. There's the "adult
       | continent", which allows adult content in public view. But
       | there's not that much in public view. Activity is mostly in
       | private homes or clubs. At the other extreme, there's a giant
       | planned unit development (60,000 houses and growing) which mostly
       | looks like upper-middle class American suburbia. It has more
       | rules and a HOA covenant. Users can choose to live or visit
       | either, or both.
       | 
       | Because it's a big 3D world, about the size of Greater London,
       | most problems are local. There's a certain amount of griefing,
       | but the world is so big that the impact is limited. Spam in
       | Second Life consists of putting up large billboards along roads.
       | 
       | Second Life has a governance group. It's about six people, for a
       | system that averages 30,000 to 50,000 concurrent connected users.
       | They deal mostly with reported incidents that fall into narrow
       | categories. Things like someone putting a tree on their property
       | that has a branch sticking out into a road and interferes with
       | traffic.
       | 
       | There's getting to be an assumption that the Internet must be
       | heavily censored. That is not correct. There are other
       | approaches. It helps that Second Life is not indexed by Google
       | and doesn't have "sharing".
        
         | dexwiz wrote:
         | For some reason I remember everyone's behavior on old school
         | message boards as much better than modern social media. Sure,
         | you have your degenerate boards, but just don't go there.
         | Moderation and censorship will always exist, but they seem
         | to work better when they are more locally applied.
        
           | sroussey wrote:
           | Having run a platform with a million or so of those, this
           | is somewhat true. But there were spammers posting across
           | many communities, and the boards whose mods left got
           | littered. We had to set forums to automatically switch to
           | requiring moderator approval to post, which at least
           | preserved the board's history, but was a pain for a
           | moderator if they returned.
        
           | fragmede wrote:
           | Gosh, imagine how the users of the Internet from before us
           | felt when we all joined!
           | 
           | https://en.wikipedia.org/wiki/Eternal_September
        
           | petermcneeley wrote:
           | Echoing the same sentiment: Counter-Strike was exactly
           | the same. There was an insane diversity of servers. Some
           | were literally labeled Adult Content, and way at the
           | other extreme some were 'christian', where saying the
           | word "shit" would get you banned.
        
             | doublerabbit wrote:
             | A true sense of authority keeps everyone at bay; if a
             | mod goes rogue it all collapses. But when a mod can be
             | held accountable for their actions, everyone acts as a
             | community and holds the peace. A transparent modlog
             | could really make a community.
             | 
             | Clans were more than just a bunch of mates playing a
             | game. They were free, open communities where everyone
             | was treated with respect regardless of who you were.
             | Q3Arena was my first FPS at 13 and I fell in love with
             | just the community spirit.
             | 
             | Organized clan-wars between X and Y, joining rival
             | clan-servers just to poke around and have fun are days
             | which are now lost. It's the same experience as
             | inserting a VHS cassette and hitting play, knowing you
             | were going to get a real feel of an experience.
             | 
             | I may have hit the tequila a bit too stiff tonight and
             | this really hits hard, but I do wonder if the same
             | experience will ever make a comeback.
        
           | gmmeyer wrote:
           | I remember it being a huge mix! It really depended on what
           | boards you were on. There were boards I was a member of in
           | 2003 that had very strong moderation and they were great!
           | 
           | I was also on some basically unmoderated boards and saw some
           | stuff I wish I didn't see.
           | 
           | I think this is more indicative of the communities you were a
           | part of than the actual behavioral norms of people at the
           | time.
        
         | fragmede wrote:
         | It sounds like Second Life lived long enough to build content
         | moderation, pushed the work of content moderation onto its
         | users, and in a hilarious psychological trick worthy of
         | Machiavelli, made the users think they own a piece of
         | something (they don't) so that what other users do on
         | "your" land is up to you. My job would also love it if I
         | paid them to work there instead of the other way around.
         | 
         | The Internet must be heavily censored to be suitable for
         | mainstream consumption and the tools described make Second Life
         | sound like no exception.
         | 
         |  _> You either die an MVP or live long enough to build content
         | moderation_
        
         | armchairhacker wrote:
         | The biggest reason IMO for moderation in the first place is
         | because if you don't block/censor some people, they will
         | block/censor others. Either by spamming, making others feel
         | intimidated or unwelcome, making others upset, creating "bad
         | vibes" or a boring atmosphere, etc.
         | 
         | So in theory, passing on moderation to the users seems natural.
         | The users form groups where they decide what's ok and what's
         | banned, and people join the groups where they're welcome and
         | get along. Plus, what's tolerable for some people is offensive
         | or intimidating for others and vice versa: e.g. "black
         | culture", dark humor.
         | 
         | If you choose the self-moderation route you still have to deal
         | with legal implications. Fortunately, I believe what's
         | blatantly illegal on the internet is more narrow, and you can
         | employ a smaller team of moderators to filter it out. Though I
         | can't speak much to that.
         | 
         | In practice, self-moderation _can_ be useful, and I think it 's
         | the best and only real way to allow maximum discourse. But
         | self-moderation alone is not enough. Bad communities can still
         | taint your entire ecosystem and scare people away from the good
         | ones. Trolls and spammers make up a minority of people, but
         | they have outsized influence and even more outsized
         | coverage from the news etc. Not to mention they can brigade
         | and spam small good communities and easily overwhelm
         | moderators who are doing this as volunteers.
         | 
         | The only times I've really seen moderation succeed are when the
         | community is largely good, reasonable, dedicated people, so the
         | few bad people get overwhelmed and pushed out. I suspect Second
         | Life is of this category. If your community is mostly toxic
         | people, there's no form of moderation which will make your
         | product viable: you need to basically force much of your
         | userbase out and replace them, and probably overhaul your site
         | in the process.
        
         | 908B64B197 wrote:
         | > Roblox is working on a "moderation" system that can ban a
         | user within 100ms after saying a bad word in voice. But their
         | average user is 13 years old.
         | 
         | Reminds me of XBox live more than 10 years ago. They banned the
         | word "Gay" since it was used as a slur by (by their estimate)
         | 98% of users.
         | 
         | But there was a two percent population that simply used it
         | legitimately. [0]
         | 
         | [0] http://www.mtv.com/news/1605966/microsoft-apologizes-for-
         | xbo...
        
       | BrianOnHN wrote:
       | So then the moderation market must be a big one, with AI
       | companies leading the pack, instead of the now [sufficiently
       | abused] abuse-reporting systems.
       | 
       | What's going on? The powers that be should fund these
       | moderation companies asap; it could be a more centralized and
       | hidden censorship tool: a SaaS that all companies at
       | sufficient scale require for [lawful] moderation.
        
       | GCA10 wrote:
       | Can't help but think back to W. Edwards Deming's distinction
       | between after-the-fact efforts to "inspect" quality into the
       | process -- as opposed to before-the-fact efforts to build quality
       | into the process.
       | 
       | OP offers a first-rate review (strategy + tactics!) for the
       | inspection approach.
       | 
       | But, the unspoken alternative is to rethink the on-ramp to
       | content-creation privileges, so that only people with net-
       | positive value to the community get in. That surely means a more
       | detailed registration and vetting process. Plus perhaps some way
       | of insisting on real names and validating them.
       | 
       | I can see why MVPs skip this step. And why venture firms still
       | embrace some version of "move fast and break things," even if we
       | keep learning the consequences after the IPO.
       | 
       | But sites (mostly government or non-profit) that want to serve a
       | single community quite vigilantly, without maximizing for early
       | growth, do offer another path.
        
         | caseyross wrote:
         | Absolutely this. Don't build the ship and then run around
         | plugging leaks --- plan out the ship well enough to prevent
         | leaks in the first place.
         | 
         | This is hard, and rare, because it requires predicting how all
         | sorts of different people are going to interact with the
         | community. Traditionally, this hasn't been something that the
         | people who start software companies are particularly interested
         | in, or good at. And a laser focus on user growth only compounds
         | the problem.
        
           | fragmede wrote:
           | Maybe when the Internet was new. But whether you count the
           | Internet's birth in the 1980's with the original cross-
           | content and cross-country links, or around the first dot com
           | boom and bust in 2001, or with the iPhone in 2007, we _know_
           | how  "the Internet" is going to interact with "the
           | community". We knew this back in _2016_ when Microsoft
           | released their  "AI" chatbot to Twitter, and Twitter taught
           | it to be a racist asshole in less than 24 hours+ and the
           | Internet, collectively said _duh._ Of course that was going
           | to happen.
           | 
           | Anyone who's started a new community these days knows they
           | have to start with a sort of code of conduct. That's non-
           | negotiable these days. Would it be better if platforms like
           | Discord did more to address the issue? Absolutely.
           | 
           | You're totally right it isn't easy - but the Internet's a few
           | decades old by now and we _know_ what 's going to happen to
           | your warm cosy website that allows commenting. The instant
           | the trolls find it, _you either die an MVP or live long
           | enough to build content moderation_.
           | 
           | +: https://www.theverge.com/2016/3/24/11297050/tay-microsoft-
           | ch...
        
           | dkh wrote:
           | In this case, a mix of both is required. While you absolutely
           | must plan ahead and implement as many safeguards as you can
           | prior to launch, that's simply the beginning, and it is
           | incredibly naive to think that all the leaks can be
           | prevented. (Or, honestly, that really any aspect of a
           | community can be perfectly master-planned in advance.) To
           | operate anything like a UGC platform is to be eternally
           | engaged in a battle against ever-evolving and increasingly
           | clever methods someone will come up with to exploit,
           | sabotage, or otherwise harm your platform.
           | 
           | This is totally fine -- you just need to acknowledge this and
           | try not to drop the ball when things seem like they're
           | running smoothly. Employing every tactic at your disposal
           | from the very beginning should be viewed as a prerequisite,
           | one that will start you off in a strong position and able to
           | evolve without first having to play catch-up.
        
       | fungiblecog wrote:
       | How about building something useful rather than yet another
       | (anti-) "social" media platform?
        
       ___________________________________________________________________
       (page generated 2021-09-28 23:00 UTC)