[HN Gopher] Characterizing and identifying shills in social medi...
___________________________________________________________________
Characterizing and identifying shills in social media (2017) [pdf]
Author : uLogMicheal
Score : 40 points
Date : 2023-07-12 16:04 UTC (6 hours ago)
(HTM) web link (sbp-brims.org)
(TXT) w3m dump (sbp-brims.org)
| nonameiguess wrote:
| Not sure about the title of this submission. The construction is
| ambiguous. The shills identified here were not funded by the DoD.
| They seem to have probably been funded by political campaigns.
| This study was funded by the DoD.
| Der_Einzige wrote:
| Sounds like what a shill would say
| uLogMicheal wrote:
| Tried to fit it all, maybe "DoD Report:" would have been
| better? On another note, would be great to see the amount of
| money that flows from our gov into such programs.
| dang wrote:
| The site guidelines ask you to "_Please use the original
| title, unless it is misleading or linkbait; don't
| editorialize._"
|
| If the study was DoD funded, that's a great piece of
| information to share--but please do so via a comment in the
| thread. Title fields are not supposed to be for sharing the
| details that the submitter considers important. (Other social
| news sites work that way, but HN intentionally doesn't.)
|
| Past explanations here in case helpful:
| https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
| uLogMicheal wrote:
| Were there any other actions taken to moderate this post
| beyond the title change? Genuinely curious: momentum halted
| and the post fell to page 4 shortly after.
| dang wrote:
| It got all the actions: first a moderator downweighted it
| (so that it would be lower on the front page, but not off
| it), then it set off the flamewar detector, and it also
| got flagged by users. I've rolled back most of that now.
| uLogMicheal wrote:
| A sad reality of discussion surrounding this topic. Thank
| you for the transparency.
| HWR_14 wrote:
| Why lead with the DoD at all? If anything, the funding source
| is of secondary interest.
| uLogMicheal wrote:
| Why do people care more when they see YC in the title of a
| post?
| HWR_14 wrote:
| YC hosts this forum. If we were on a DoD forum it would
| matter more.
| 2OEH8eoCRo0 wrote:
| They don't mention /u/Cant_Trust_Hillary on Reddit who said in an
| AMA that they were paid to be a serial poster under that
| username.
|
| I don't think they posted anything political, they were similar
| to /u/GallowBoob where they were good at making the front page.
| DANmode wrote:
| [flagged]
| dang wrote:
| "_Please don't complain about tangential annoyances--e.g.
| article or website formats, name collisions, or back-button
| breakage. They're too common to be interesting._"
|
| https://news.ycombinator.com/newsguidelines.html
| ryanisnan wrote:
| If only there were a medium where we could transfer text across
| the internet...
| pmarreck wrote:
| I'm on Linux (NixOS) and have zero trouble viewing PDFs. And
| coming from a former Mac user (for whom the PDF format was
| practically _native_ to the OS), that says a LOT. "Evince" is
| Gnome's document/PDF viewer. It's a fine format that is broadly
| usable. I hit Spacebar, the PDF "quicklooks" (just like on
| macOS!) thanks to Gnome Sushi, it's great.
|
| Here, if you're a Linux guy, here's a bash function to view
| sexed-up man pages as PDFs (requires evince):
|   pman() {
|     # Render the man page to a temporary PDF, then open it in evince.
|     tmpfile=$(mktemp --suffix=.pdf "/tmp/$1.XXXXXX")
|     man -Tpdf "$@" >> "$tmpfile" 2>/dev/null
|     evince "$tmpfile"
|   }
| lockhouse wrote:
| Is there a better high fidelity, printing friendly format with
| viewers available for all major platforms out there?
|
| PDF has a lot of flaws, but what's the alternative here? Word
| .DOC files are proprietary, RTF files are too limited, and HTML
| allows for too much variation between browsers and looks like
| garbage when printed.
| DANmode wrote:
| > HTML allows for too much variation between browsers and
| looks like garbage when printed.
|
| Disagree.
|
| Serve it how you want the user to see it, but don't stop the
| user from choosing their own formats for comfort or
| accessibility afterward.
| lockhouse wrote:
| Or just use PDF where it will look nearly identical
| everywhere, even in print.
|
| Not everybody has time to test multiple browsers to make
| sure content actually looks good across Chrome, Firefox,
| Safari, and Edge, plus their mobile counterparts. Oh, and
| make a nice looking PDF on top of that.
|
| PDF is perfectly fine for distributing a paper. In fact,
| I'd say it is the _preferred_ means for most people.
| jvanderbot wrote:
| > HTML allows for too much variation between browsers and
| looks like garbage when printed.
|
| Yeah, nowadays. But it's pretty simple to print _just text_
| pretty nicely from a website.
| Barrin92 wrote:
| >But it's pretty simple
|
| It really isn't. Depending on what you display your text
| with, the results may look very different from platform to
| platform. 'Portable' is in the name for a reason: the big
| advantage of PDFs is that they'll look the same everywhere.
| tiffanyg wrote:
| _... they'll look the same everywhere._
|
| Hahahahhahhahahhahhahahahha ahhahahhaha ...
| ajajjjajjajaja...
|
| Look, I'm laughing so hard, I suddenly turned Spanish*.
| Lol.
|
| That wasn't true in the late 80s and 90s, and it isn't
| true today. The reasons are different, and PDF _does_
| allow a lot of control, PLUS, PDFs undoubtedly DO look
| "the same everywhere" to the degree of "fidelity" you're
| implicitly assuming (and, to be fair, the degree that is
| most important in this kind of casual discussion), but,
| my experiences with PDF and "same everywhere" (including
| across types of media) lead to uproarious laughter on the
| outside (and crying and gnashing of the teeth on the
| inside *<:o) ).
|
| * j is so close to h in qwerty, mashing the keys =
| suddenly Spanish (where jajaja is often used / sensible
| with "soft j")
| DANmode wrote:
| But will they be able to execute the same malicious code
| everywhere?!
|
| (No, they will not.)
| lockhouse wrote:
| > But it's pretty simple to print just text pretty nicely
| from a website.
|
| That hasn't been my experience at all. I've had to resort
| to taking screenshots or copying and pasting text from
| websites into Word to print something actually readable.
| adrian_b wrote:
| I do not know what browser you are using, but both Chrome
| and Firefox are extremely bad at printing.
|
| Using the "reader mode" to print "just text" works on some
| sites, but on other sites it loses essential information
| from the original page or formats the text in such an
| inappropriate way that it becomes very hard to read. The
| "reader mode" is intended to be read aloud, not to serve as an
| aid for printing.
|
| What is needed is a print command that renders the page
| exactly like on the display, instead of rendering it for
| printing in such a way that much of the text becomes
| obscured by various junk.
| presbyterian wrote:
| An HTML version for web viewing, a PDF for printing.
| Modified3019 wrote:
| PDFs allow for easy and portable saving, sharing, and
| generally also searching, all in a single file that will
| always have a consistent presentation.
|
| Reasons not to use HTML
|
| - HTML files generally have loads of external calls to CSS
| and JavaScript that don't allow for making clean single
| files as is.
|
| - Reality has proven that web developers will not put effort
| into making it viable to save pages cleanly. Expecting
| otherwise is _fragile_ thinking.
|
| - Even if you go above and beyond the standard user and use
| a browser addon for saving webpages as a single file, this
| doesn't always work out as formatting generally becomes
| fucked in some way, sometimes severely.
|
| - Often straight up a mandated requirement.
|
| - HTML links will always rot, but a government document
| faxed, photocopied, faxed again, and scanned into a PDF
| file spread haphazardly across a dozen university and legal
| archives is _forever_
|
| Summarizing the results of some government project is a
| _perfect_ use case for PDFs.
| lockhouse wrote:
| That's double work, especially considering that browsers
| have built-in native PDF viewers these days.
| presbyterian wrote:
| It's not double work, it's the proper amount of work to
| do the task you (should) want to do: share information in
| an accessible way. Also, there are a billion different
| ways you can create decent HTML and a PDF from one
| source.
| DANmode wrote:
| The "Save" dialog in any browser is a good example.
| lockhouse wrote:
| Save results in the often terrible browser printed
| output.
| presbyterian wrote:
| They're saying that because browsers render the HTML/CSS
| differently, the prints wouldn't look good. I don't think
| this is necessarily true, but even if it is, CSS lets you
| style the print version of a page directly so that the
| print looks good, and you can even make it look like a
| normal document and not a printed web page
| _Algernon_ wrote:
| Scientific papers tend to be.
| ykonstant wrote:
| Clearly funded by Big Doc.
| MilStdJunkie wrote:
| Oh goodness. Let's talk about web-based print.
|
| Over the decades, we've gone from non-web typesetting like
| troff/nroff/groff-t, TeX to first-gen DTP like InterLeaf to
| often-deceptively-marketed XSL toolchains [0] to vendors like
| Prince.
|
| Somewhere in there, W3C created Paged Media Module Technical
| Group for CSS3, which would - someday!! - integrate CSS @media
| calls to support true typesetting and layout . . just from the
| web! Wow!
|
| Oh wait . . that was around 2004.
|
| So, since the PMM TG has been, I don't know, making paperclip
| men for two decades[1], others have taken that bull by the
| horns and implemented their own web print systems. Or selling
| them, for . . for a LOT of money; Prince is basically a closed-
| source CSS PMM implementation.
|
| Disclaimer: I use Asciidoc for everything, so asciidoctor-web-
| pdf comes to mind. It's built on Paged.JS/Puppeteer
| (https://pagedjs.org), so issues might lag because of the
| Puppeteer dependency. On the plus side, it's integrated into
| Antora (https://antora.org/), so, hey, hello git-based pdf
| scheduler. How ya doin?
|
| Back in the day, FlyingSaucer was the tech for web print; I've
| seen and wrenched it in a dozen different S1000D stacks. Its
| current incarnation is the ubiquitous openhtmltopdf and its
| descendants; it still has huge adoption in the Android
| ecosystem.
|
| ReLaXed is the father of all JS web print implementations, but
| it's been dormant for a while. I wouldn't use it directly.
| Besides, most of it lives on in Paged.js and others.
|
| Vivliostyle is another great implementation in JS, based on
| the Chromium renderer rather than riding hard on Puppeteer.
| It's also got a really great widget that allows you to mark up
| HTML pages for reviews. Very neat.
|
| Sorry, WeasyPrint. You're rad, but you're on Python, and none
| of my customers have been cool with that. If Python is cleared
| for your environment though, check it out. The svg handling is
| tight.
|
| [0] I'll talk about that later. Let's not go into the afternoon
| angry.
|
| [1] OK OK, I get it: the user base for web-pdf is always tiny
| in comparison to web frameworks. Most of the rest of the world
| has moved on from complex print, and, I mean, seriously, why
| not? HTML + JS + CSS does anything, and without Acrobat, and
| with control over load, AND without conditionals for print
| focused stuff, like trimming SVGs, or color profiles, or a
| million other things. AND AND AND. BUT. Aerospace, though,
| we're a print-focused business, we need the dead tree
| simulator. And so I've been riding this stupid donkey for
| twenty years now.
| Eisenstein wrote:
| > After reading all of the 1,000 replies by the user, the human
| then made the assignment based on the following criteria: (1)
| "Did the user's replies entirely, or almost entirely support one
| candidate?"; (2) "Did the user's posts generally contain claims
| to support their arguments?"; and (3) "Did the user explicitly
| mention a tie to any campaign?"
|
| If they can't see why these criteria are nonsensical then their
| entire premise should be discarded.
| lcnPylGDnU4H9OF wrote:
| > why these criteria are nonsensical
|
| I don't conduct surveys or other research so it's not
| immediately obvious to me. I would guess that it's some
| combination of:
|
| 1) Humans making subjective calls like "generally contain
| claims to support" (and even just the subjectivity of the word
| "support");
|
| 2) Most opinionated users (read: people who reply 1,000 times)
| on social media are going to "entirely, or almost entirely
| support one candidate";
|
| 3) Most people are going to post things which "generally
| contain claims to support their arguments".
|
| But I'm not really sure if those are (to a researcher who ought
| to know) the obvious problems with these criteria. Is there
| another problem you see which is not included here? (Are these
| suggested "problems" even problems in this case?)
| kodah wrote:
| Are shills misinformation adjacent in some way? It's interesting
| to see activists listed as potential shills, though they don't
| clearly explain the body of evidence that aligns activists as
| potentially harmful. An activist _can_ be a shill but I'm not
| sold that they're all bad, but I've definitely run into a good
| number of activists who border so far on not knowing what they're
| talking about or how to talk about it that they're essentially
| harmful to either information quality or discourse among regular
| people.
| uLogMicheal wrote:
| It was odd to me that government actors did not seem to be
| mentioned in this study. Maybe they fall under the activist
| category?
| obviouslynotme wrote:
| It is funded by the US Government. The only shills it will
| find are Russian, maybe the Chinese these days, the
| politically unpopular, and social media marketers.
| at_a_remove wrote:
| Shills are absolutely misinformation because they distort the
| weighting.
|
| If I am one guy and I complain about the font used in the
| question mark under the help menu, I am just one guy. If I
| spawn or buy or suborn a thousand accounts to complain about
| it, my message is amplified and can distort priorities, pass
| thresholds, and so forth. Shills can downvote something out of
| being seen at all, even on here unless you turn on "showdead."
| It's basically the commodification of a Sybil attack.
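
A toy sketch of the weighting distortion described above, under loud assumptions: the `operator` field (who actually controls an account) is hypothetical and is exactly what a real platform does not know, and the account names and counts are made up for illustration. It only shows how a per-account tally and a per-person tally can diverge.

```javascript
// Count complaints two ways: by distinct account and by distinct operator.
// `operator` is a hypothetical ground-truth label used only for illustration.
function tally(complaints) {
  const accounts = new Set(complaints.map((c) => c.account));
  const operators = new Set(complaints.map((c) => c.operator));
  return { byAccount: accounts.size, byOperator: operators.size };
}

// One person running a thousand accounts, plus three ordinary users.
const complaints = [];
for (let i = 0; i < 1000; i++) {
  complaints.push({ account: `sock${i}`, operator: 'one-guy' });
}
for (const name of ['ann', 'bob', 'cy']) {
  complaints.push({ account: name, operator: name });
}

console.log(tally(complaints)); // { byAccount: 1003, byOperator: 4 }
```

Any threshold keyed to the per-account number sees 1003 voices where there are four people, which is the amplification the comment is pointing at.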
| uLogMicheal wrote:
| or even passive bots that run sentiment analysis and attach
| votes to what they favor...
| adeon wrote:
| One of the footnotes mentions that you can get a JSON version of a
| Reddit page by adding .json to the end of the URL.
|
| Sure enough I tried it and it works:
| https://old.reddit.com/r/MachineLearning/.json
|
| TIL. I wonder if that's going away given the recent Reddit API
| access shenanigans.
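
A minimal sketch of the URL trick described above, assuming it still behaves as the comment reports. `redditJsonUrl` is a hypothetical helper name, not part of any Reddit API; it simply appends `.json` to the page path (matching the example URL above) and leaves any query string in place.

```javascript
// Hypothetical helper: turn a Reddit page URL into its ".json" variant
// by appending ".json" to the path, as in the example URL in the comment.
function redditJsonUrl(pageUrl) {
  const url = new URL(pageUrl);
  // Append to the path only, so a query string such as ?limit=5 survives.
  url.pathname = url.pathname + '.json';
  return url.toString();
}

console.log(redditJsonUrl('https://old.reddit.com/r/MachineLearning/'));
// https://old.reddit.com/r/MachineLearning/.json
```

The result could then be passed to `fetch` and parsed with `response.json()`; whether Reddit keeps serving it is an open question, as the comment notes.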
| knodi123 wrote:
| I found that trick a while back when trying to write a
| bookmarklet for downloading videos from reddit. I got it to
| work, but it downloads a version without audio... Since this is
| often superior, I stopped work and declared victory.
|
| Try making a bookmark whose url is this:
| javascript:fetch(document.location.href.replace(/\/$/, '') +
| '.json').then(r => r.json()).then(j => {document.location.href
| = (j[0].data['children'][0].data.secure_media ?
| j[0].data['children'][0].data.secure_media.reddit_video.fallback_url :
| j[0].data['children'][0].data.crosspost_parent_list[0].secure_media.reddit_video.fallback_url)})
|
| Then go to any reddit page where the main post is a video file,
| and click the bookmark!
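
For readers who find the one-line bookmarklet hard to follow, here is a hedged, readable sketch of just its lookup step. `findFallbackVideoUrl` is a hypothetical name, and the field names (`secure_media`, `reddit_video`, `fallback_url`, `crosspost_parent_list`) are taken from the bookmarklet above rather than checked against Reddit's documentation. Unlike the bookmarklet, this version returns null instead of throwing when a field is missing.

```javascript
// Given the parsed JSON of a Reddit post page (an array of listings),
// return the video's fallback URL, or null if none is present.
function findFallbackVideoUrl(listings) {
  const post = listings?.[0]?.data?.children?.[0]?.data;
  if (!post) return null;
  // Prefer the post's own media; otherwise try the crosspost parent.
  const media =
    post.secure_media ?? post.crosspost_parent_list?.[0]?.secure_media;
  return media?.reddit_video?.fallback_url ?? null;
}

// Mock input shaped like the structure the bookmarklet assumes.
const mockListings = [
  {
    data: {
      children: [
        {
          data: {
            secure_media: {
              reddit_video: { fallback_url: 'https://example.invalid/a.mp4' },
            },
          },
        },
      ],
    },
  },
];

console.log(findFallbackVideoUrl(mockListings)); // https://example.invalid/a.mp4
```

As the comment says, the URL found this way points to a version without audio.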
| boredumb wrote:
| Have they tried walking down the hall to find the various people
| whose campaigns were/are funding them?
| rootsudo wrote:
| This reminds me of this:
|
| https://cryptome.org/2012/07/gent-forum-spies.htm
|
| The Gentleperson's Guide To Forum Spies
| specproc wrote:
| I think the interesting thing about this article is how quaint it
| feels five years on. In terms of methods, sure, but also the
| goal.
|
| There's this implicit assumption that a) ML could effectively
| spot bad actors, and b) this is a fight someone, somewhere is
| able to win.
|
| I feel the information environment has degraded so badly over the
| last decade that this feels naive.
___________________________________________________________________
(page generated 2023-07-12 23:01 UTC)