[HN Gopher] Characterizing and identifying shills in social medi...
       ___________________________________________________________________
        
       Characterizing and identifying shills in social media (2017) [pdf]
        
       Author : uLogMicheal
       Score  : 40 points
       Date   : 2023-07-12 16:04 UTC (6 hours ago)
        
 (HTM) web link (sbp-brims.org)
 (TXT) w3m dump (sbp-brims.org)
        
       | nonameiguess wrote:
       | Not sure about the title of this submission. The construction is
       | ambiguous. The shills identified here were not funded by the DoD.
       | They seem to have probably been funded by political campaigns.
       | This study was funded by the DoD.
        
         | Der_Einzige wrote:
         | Sounds like what a shill would say
        
         | uLogMicheal wrote:
         | Tried to fit it all, maybe "DoD Report:" would have been
         | better? On another note, would be great to see the amount of
         | money that flows from our gov into such programs.
        
           | dang wrote:
           | The site guidelines ask you to "_Please use the original
           | title, unless it is misleading or linkbait; don't
           | editorialize._"
           | 
           | If the study was DoD funded, that's a great piece of
           | information to share--but please do so via a comment in the
           | thread. Title fields are not supposed to be for sharing the
           | details that the submitter considers important. (Other social
           | news sites work that way, but HN intentionally doesn't.)
           | 
           | Past explanations here in case helpful:
           | https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
        
             | uLogMicheal wrote:
             | Were there any other actions taken to moderate this post
             | beyond the title change? Genuinely curious, momentum halted
             | and post fell to page 4 shortly after.
        
               | dang wrote:
               | It got all the actions: first a moderator downweighted it
               | (so that it would be lower on the front page, but not off
               | it), then it set off the flamewar detector, and it also
               | got flagged by users. I've rolled back most of that now.
        
               | uLogMicheal wrote:
               | A sad reality of discussion surrounding this topic. Thank
               | you for the transparency.
        
           | HWR_14 wrote:
           | Why lead with the DoD at all? If anything, the funding source
           | is of secondary interest.
        
             | uLogMicheal wrote:
             | Why do people care more when they see YC in the title of a
             | post?
        
               | HWR_14 wrote:
               | YC hosts this forum. If we were on a DoD forum it would
               | matter more.
        
       | 2OEH8eoCRo0 wrote:
       | They don't mention /u/Cant_Trust_Hillary on Reddit who said in an
       | AMA that they were paid to be a serial poster under that
       | username.
       | 
       | I don't think they posted anything political; they were similar
       | to /u/GallowBoob where they were good at making the front page.
        
       | DANmode wrote:
       | [flagged]
        
         | dang wrote:
         | "_Please don't complain about tangential annoyances--e.g.
         | article or website formats, name collisions, or back-button
         | breakage. They're too common to be interesting._"
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
         | ryanisnan wrote:
         | If only there were a medium where we could transfer text across
         | the internet...
        
         | pmarreck wrote:
         | I'm on Linux (NixOS) and have zero trouble viewing PDFs. And
         | coming from a former Mac user (for whom the PDF format was
         | practically _native_ to the OS), that says a LOT.  "Evince" is
         | Gnome's document/PDF viewer. It's a fine format that is broadly
         | usable. I hit Spacebar, the PDF "quicklooks" (just like on
         | macOS!) thanks to Gnome Sushi, it's great.
         | 
         | If you're a Linux guy, here's a bash function to view
         | sexed-up man pages as PDFs (requires evince):
         | 
         | pman() {
         |     # Render the man page to a temporary PDF, then open it.
         |     tmpfile=$(mktemp --suffix=.pdf "/tmp/$1.XXXXXX")
         |     man -Tpdf "$@" >> "$tmpfile" 2>/dev/null
         |     evince "$tmpfile"
         | }
        
         | lockhouse wrote:
         | Is there a better high fidelity, printing friendly format with
         | viewers available for all major platforms out there?
         | 
         | PDF has a lot of flaws, but what's the alternative here? Word
         | .DOC files are proprietary, RTF files are too limited, and HTML
         | allows for too much variation between browsers and looks like
         | garbage when printed.
        
           | DANmode wrote:
           | > HTML allows for too much variation between browsers and
           | looks like garbage when printed.
           | 
           | Disagree.
           | 
           | Serve it how you want the user to see it, but don't stop the
           | user from choosing their own formats for comfort or
           | accessibility afterward.
        
             | lockhouse wrote:
             | Or just use PDF where it will look nearly identical
             | everywhere, even in print.
             | 
             | Not everybody has time to test multiple browsers to make
             | sure content actually looks good across Chrome, Firefox,
             | Safari, and Edge, plus their mobile counterparts. Oh, and
             | make a nice looking PDF on top of that.
             | 
             | PDF is perfectly fine for distributing a paper. In fact,
             | I'd say it is the _preferred_ means for most people.
        
           | jvanderbot wrote:
           | > HTML allows for too much variation between browsers and
           | looks like garbage when printed.
           | 
           | Yeah, nowadays. But it's pretty simple to print _just text_
           | pretty nicely from a website.
        
             | Barrin92 wrote:
             | >But it's pretty simple
             | 
             | it really isn't. Depending on what you display your text
             | with the results may look very different depending on the
             | platform. 'Portable' is in the name for a reason, the big
             | advantage of pdfs is that they'll look the same everywhere.
        
               | tiffanyg wrote:
               | _... they'll look the same everywhere._
               | 
               | Hahahahhahhahahhahhahahahha ahhahahhaha ...
               | ajajjjajjajaja...
               | 
               | Look, I'm laughing so hard, I suddenly turned Spanish*.
               | Lol.
               | 
               | That wasn't true in the late 80s and 90s, and it isn't
               | true today. The reasons are different, and PDF _does_
               | allow a lot of control, PLUS, PDFs undoubtedly DO look
               | "the same everywhere" to the degree of "fidelity" you're
               | implicitly assuming (and, to be fair, the degree that is
               | most important in this kind of casual discussion), but,
               | my experiences with PDF and "same everywhere" (including
               | across types of media) lead to uproarious laughter on the
               | outside (and crying and gnashing of the teeth on the
               | inside *<:o) ).
               | 
               | * j is so close to h in qwerty, mashing the keys =
               | suddenly Spanish (where jajaja is often used / sensible
               | with "soft j")
        
               | DANmode wrote:
               | But will they be able to execute the same malicious code
               | everywhere?!
               | 
               | (No, they will not.)
        
             | lockhouse wrote:
             | > But it's pretty simple to print just text pretty nicely
             | from a website.
             | 
             | That hasn't been my experience at all. I've had to resort
             | to taking screenshots or copying and pasting text from
             | websites into Word to print something actually readable.
        
             | adrian_b wrote:
             | I do not know what browser you are using, but both Chrome
             | and Firefox are extremely bad at printing.
             | 
             | Using the "reader mode" to print "just text" works on some
             | sites, but on other sites it loses essential information
             | from the original page or formats the text in such an
             | inappropriate way that it becomes very hard to read. The
             | "reader mode" is intended to be read aloud and not as an
             | aid for printing.
             | 
             | What is needed is a print command that renders the page
             | exactly like on the display, instead of rendering it for
             | printing in such a way that much of the text becomes
             | obscured by various junk.
        
           | presbyterian wrote:
           | An HTML version for web viewing, a PDF for printing.
        
             | Modified3019 wrote:
             | PDFs allow for easy and portable saving, sharing, and
             | generally also searching, all in a single file that will
             | always have a consistent presentation.
             | 
             | Reasons not to use HTML
             | 
             | - HTML files generally have loads of external calls to CSS
             | and JavaScript that don't allow for making clean single
             | files as is.
             | 
             | - Reality has proven that web developers will not put effort
             | into making saving pages cleanly viable. Expecting
             | otherwise is _fragile_ thinking.
             | 
             | - Even if you go above and beyond the standard user and use
             | a browser addon for saving webpages as a single file, this
             | doesn't always work out as formatting generally becomes
             | fucked in some way, sometimes severely.
             | 
             | - PDF is often straight up a mandated requirement.
             | 
             | - HTML links will always rot, but a government document
             | faxed, photocopied, faxed again, and scanned into a PDF
             | file spread haphazardly across a dozen university and legal
             | archives is _forever_.
             | 
             | Summarizing the results of some government project is a
             | _perfect_ use case for PDFs.
        
             | lockhouse wrote:
             | That's double work, especially considering that browsers
             | have built-in native PDF viewers these days.
        
               | presbyterian wrote:
               | It's not double work, it's the proper amount of work to
               | do the task you (should) want to do: share information in
               | an accessible way. Also, there are a billion different
               | ways you can create decent HTML and a PDF from one
               | source.
        
               | DANmode wrote:
               | The "Save" dialog in any browser is a good example.
        
               | lockhouse wrote:
               | Save results in the often terrible browser printed
               | output.
        
               | presbyterian wrote:
               | They're saying that because browsers render the HTML/CSS
               | differently, the prints wouldn't look good. I don't think
               | this is necessarily true, but even if it is, CSS lets you
               | style the print version of a page directly so that the
               | print looks good, and you can even make it look like a
               | normal document and not a printed web page.
        
         | _Algernon_ wrote:
         | Scientific papers tend to be.
        
         | ykonstant wrote:
         | Clearly funded by Big Doc.
        
         | MilStdJunkie wrote:
         | Oh goodness. Let's talk about web-based print.
         | 
         | Over the decades, we've gone from non-web typesetting like
         | troff/nroff/groff-t, TeX to first-gen DTP like InterLeaf to
         | often-deceptively-marketed XSL toolchains [0] to vendors like
         | Prince.
         | 
         | Somewhere in there, W3C created Paged Media Module Technical
         | Group for CSS3, which would - someday!! - integrate CSS @media
         | calls to support true typesetting and layout . . just from the
         | web! Wow!
         | 
         | Oh wait . . that was around 2004.
         | 
         | So, since the PMM TG has been, I don't know, making paperclip
         | men for two decades[1], others have taken that bull by the
         | horns and implemented their own web print systems. Or sold
         | them, for . . for a LOT of money; Prince is basically a closed-
         | source CSS PMM implementation.
         | 
         | Disclaimer: I use Asciidoc for everything, so asciidoctor-web-
         | pdf comes to mind. It's built on Paged.JS/Puppeteer
         | (https://pagedjs.org), so issues might lag because of the
         | Puppeteer dependency. On the plus side, it's integrated into
         | Antora (https://antora.org/), so, hey, hello git-based pdf
         | scheduler. How ya doin?
         | 
         | Back in the day, FlyingSaucer was the tech for web print; I've
         | seen and wrenched it in a dozen different S1000D stacks. Its
         | current incarnation is the ubiquitous openhtmltopdf and its
         | descendants; it still has huge adoption in the Android
         | ecosystem.
         | 
         | ReLaXed is the father of all JS web print implementations, but
         | it's been dormant for a while. I wouldn't use it directly.
         | Besides, most of it lives on in Paged.js and others.
         | 
         | Vivliostyle is another great implementation on JS, based on
         | the Chromium renderer rather than riding hard on Puppeteer.
         | It's also got a really great widget that allows you to mark up
         | HTML pages for reviews. Very neat.
         | 
         | Sorry, WeasyPrint. You're rad, but you're on Python, and none
         | of my customers have been cool with that. If Python is cleared
         | for your environment though, check it out. The svg handling is
         | tight.
         | 
         | [0] I'll talk about that later. Let's not go into the afternoon
         | angry.
         | 
         | [1] OK OK, I get it: the user base for web-pdf is always tiny
         | in comparison to web frameworks. Most of the rest of the world
         | has moved on from complex print, and, I mean, seriously, why
         | not? HTML + JS + CSS does anything, and without Acrobat, and
         | with control over load, AND without conditionals for print
         | focused stuff, like trimming SVGs, or color profiles, or a
         | million other things. AND AND AND. BUT. Aerospace, though,
         | we're a print-focused business, we need the dead tree
         | simulator. And so I've been riding this stupid donkey for
         | twenty years now.
        
       | Eisenstein wrote:
       | > After reading all of the 1,000 replies by the user, the human
       | then made the assignment based on the following criteria: (1)
       | "Did the user's replies entirely, or almost entirely support one
       | candidate?"; (2) "Did the user's posts generally contain claims
       | to support their arguments?"; and (3) "Did the user explicitly
       | mention a tie to any campaign?"
       | 
       | If they can't see why these criteria are nonsensical then their
       | entire premise should be discarded.
        
         | lcnPylGDnU4H9OF wrote:
         | > why these criteria are nonsensical
         | 
         | I don't conduct surveys or other research so it's not
         | immediately obvious to me. I would guess that it's some
         | combination of:
         | 
         | 1) Humans making subjective calls like "generally contain
         | claims to support" (and even just the subjectivity of the word
         | "support");
         | 
         | 2) Most opinionated users (read: people who reply 1,000 times)
         | on social media are going to "entirely, or almost entirely
         | support one candidate";
         | 
         | 3) Most people are going to post things which "generally
         | contain claims to support their arguments".
         | 
         | But I'm not really sure if those are (to a researcher who ought
         | to know) the obvious problems with these criteria. Is there
         | another problem you see which is not included here? (Are these
         | suggested "problems" even problems in this case?)
        
       | kodah wrote:
       | Are shills misinformation adjacent in some way? It's interesting
       | to see activists listed as potential shills, though they don't
       | clearly explain the body of evidence that aligns activists as
       | potentially harmful. An activist _can_ be a shill but I'm not
       | sold that they're all bad, but I've definitely run into a good
       | number of activists who border so far on not knowing what they're
       | talking about or how to talk about it that they're essentially
       | harmful to either information quality or discourse among regular
       | people.
        
         | uLogMicheal wrote:
         | It was odd to me that government actors seemed to not be a
         | mention in this study. Maybe they fall under the activist
         | category?
        
           | obviouslynotme wrote:
           | It is funded by the US Government. The only shills it will
           | find are Russian, maybe the Chinese these days, the
           | politically unpopular, and social media marketers.
        
         | at_a_remove wrote:
         | Shills are absolutely misinformation because they distort the
         | weighting.
         | 
         | If I am one guy and I complain about the font used in the
         | question mark under the help menu, I am just one guy. If I
         | spawn or buy or suborn a thousand accounts to complain about
         | it, my message is amplified and can distort priorities, pass
         | thresholds, and so forth. Shills can downvote something out of
         | being seen at all, even on here unless you turn on "showdead."
         | It's basically the commodification of a Sybil attack.
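
A toy illustration of the weighting distortion described above, not anything taken from the paper. The `account` and `operator` fields, the sock-puppet names, and the threshold of 100 are all made up for the example. It counts the same set of complaints two ways: by distinct accounts and by distinct people behind them.

```javascript
// Toy model (all names and numbers are hypothetical): each complaint records
// the account that posted it and the real person operating that account.
function countComplaints(complaints) {
  const accounts = new Set();
  const operators = new Set();
  for (const c of complaints) {
    accounts.add(c.account);
    operators.add(c.operator);
  }
  return { byAccount: accounts.size, byOperator: operators.size };
}

// One person controlling 1,000 accounts, plus one ordinary user.
const complaints = [];
for (let i = 0; i < 1000; i++) {
  complaints.push({ account: `sock${i}`, operator: 'one-guy' });
}
complaints.push({ account: 'alice', operator: 'alice' });

const counts = countComplaints(complaints);
const THRESHOLD = 100; // assumed "enough people care" cutoff

console.log(counts.byAccount >= THRESHOLD);  // true: looks like a groundswell
console.log(counts.byOperator >= THRESHOLD); // false: it is really two people
```

In practice a site only sees the account column; the operator column is what shill detection is trying to recover.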
        
           | uLogMicheal wrote:
           | or even passive bots that run sentiment analysis and attach
           | votes to what they favor...
        
       | adeon wrote:
       | One of the footnotes mentions that you can get a JSON version of a
       | Reddit page by adding .json to the end of the URL.
       | 
       | Sure enough I tried it and it works:
       | https://old.reddit.com/r/MachineLearning/.json
       | 
       | TIL. I wonder if that's going away given the recent Reddit API
       | access shenanigans.
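
A minimal sketch of the URL rewrite described above. The function name `toRedditJsonUrl` is hypothetical, and stripping the trailing slash before appending `.json` is an assumption borrowed from the bookmarklet further down the thread (the example link in this comment keeps the slash). No request is made here; whether the endpoint stays available is outside what this sketch can show.

```javascript
// Hypothetical helper: turn a Reddit page URL into its ".json" variant.
// Uses the standard URL class so any query string is left untouched.
function toRedditJsonUrl(pageUrl) {
  const url = new URL(pageUrl);
  // Assumption: drop trailing slashes first, as the thread's bookmarklet does.
  url.pathname = url.pathname.replace(/\/+$/, '') + '.json';
  return url.href;
}

console.log(toRedditJsonUrl('https://old.reddit.com/r/MachineLearning/'));
// https://old.reddit.com/r/MachineLearning.json
```

Rewriting only the pathname means a URL such as `.../r/MachineLearning/?limit=5` keeps its `?limit=5` after the `.json` suffix instead of having the suffix glued onto the query string.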
        
         | knodi123 wrote:
         | I found that trick a while back when trying to write a
         | bookmarklet for downloading videos from reddit. I got it to
         | work, but it downloads a version without audio... Since this is
         | often superior, I stopped work and declared victory.
         | 
         | Try making a bookmark whose url is this:
         | javascript:fetch(document.location.href.replace(/\/$/, '') +
         | '.json').then(r => r.json()).then(j => {document.location.href
         | = (j[0].data['children'][0].data.secure_media ?
         | j[0].data['children'][0].data.secure_media.reddit_video.fallback_url :
         | j[0].data['children'][0].data.crosspost_parent_list[0].secure_media.reddit_video.fallback_url)})
         | 
         | Then go to any reddit page where the main post is a video file,
         | and click the bookmark!
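
The lookup inside the bookmarklet above is easier to follow when unpacked. This is a sketch under stated assumptions: the field names (`secure_media`, `crosspost_parent_list`, `reddit_video`, `fallback_url`) come straight from the bookmarklet, while the function name `findFallbackVideoUrl`, the `null` return for posts without a Reddit-hosted video, and the sample data are additions for illustration. The sample object is invented, not a real Reddit response.

```javascript
// `listing` is the parsed ".json" response for a post page (an array whose
// first element holds the post itself), as the bookmarklet assumes.
function findFallbackVideoUrl(listing) {
  const post = listing[0].data.children[0].data;
  // Prefer the post's own media; otherwise look at the crossposted original.
  const media =
    post.secure_media || post.crosspost_parent_list?.[0]?.secure_media;
  // Assumption: return null instead of throwing when there is no video.
  return media?.reddit_video?.fallback_url ?? null;
}

// Invented sample shaped like the fields the bookmarklet reads.
const sample = [
  {
    data: {
      children: [
        {
          data: {
            secure_media: {
              reddit_video: { fallback_url: 'https://example.invalid/a.mp4' },
            },
          },
        },
      ],
    },
  },
];

console.log(findFallbackVideoUrl(sample)); // https://example.invalid/a.mp4
```

The original one-liner throws if the post has neither field; the optional chaining here only changes that failure case.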
        
       | boredumb wrote:
       | Have they tried walking down the hall to find the various people
       | whose campaigns were/are funding them?
        
       | rootsudo wrote:
       | This reminds me of this:
       | 
       | https://cryptome.org/2012/07/gent-forum-spies.htm
       | 
       | The Gentleperson's Guide To Forum Spies
        
       | specproc wrote:
       | I think the interesting thing about this article is how quaint it
       | feels five years on. In terms of methods, sure, but also the
       | goal.
       | 
       | There's this implicit assumption that a) ML could effectively
       | spot bad actors, and b) this is a fight someone, somewhere is
       | able to win.
       | 
       | I feel the information environment has degraded so badly over the
       | last decade that this feels naive.
        
       ___________________________________________________________________
       (page generated 2023-07-12 23:01 UTC)