[HN Gopher] Twitter showed us its algorithm - what does it tell us?
___________________________________________________________________
Twitter showed us its algorithm - what does it tell us?
Author : randomwalker
Score : 83 points
Date : 2023-04-11 00:39 UTC (22 hours ago)
(HTM) web link (knightcolumbia.org)
(TXT) w3m dump (knightcolumbia.org)
| mountainofdeath wrote:
| Given what has been open sourced so far, it makes sense that
| content that is likely to be controversial, or content that
| generates neutral to negative engagement would have a smaller
| probability of being displayed.
|
| I suspected this and told my far-right/left wing acquaintances,
| that no, Twitter (and Facebook too) isn't suppressing you, your
| content is just a net-negative from the platform's perspective.
| The platform is in the business of keeping the bulk of its users
| and advertisers happy.
| akira2501 wrote:
| When people talk about suppression do they mean that their own
| tweets are suppressed, or the tweets of people they follow are?
| Or tweets from news organizations are, depending on their
| content?
| whimsicalism wrote:
| The evidence is abundant for active suppression of certain
| political views beyond just predicted engagement. No clue about
| Twitter's algo, but Facebook certainly does this.
| dekhn wrote:
| Can you, uh, share that abundant evidence?
| whimsicalism wrote:
| Clear-cut example #1, which you might consider "flipping
| switches" more than part of the algo, is the suppression of
| posts about Hunter Biden's laptop on Twitter.
|
| Instagram also downweights posts about Biden passing the 1994
| crime bill. [0]
|
| [0]:
| https://twitter.com/perma___ben/status/1339293381625864195
| dekhn wrote:
| Do you have something substantive? IE, evidence that
| there is a systemic, large-scale policy being applied?
| Anyway, it looks like they outsource their fact checking
| and applied a fact check. The fact check is extensively
| documented here:
| https://www.usatoday.com/story/news/factcheck/2020/07/03/fac...
|
| I'm a bit tired of the "aggrieved people claiming
| censorship" over small potatoes that often turns out to
| be "the company applied their policy uniformly"
| paulpauper wrote:
| Tweets with hashtags and links do worse.
|
| It would seem anti-woke tweets do very well. I see such tweets a
| lot when logged out.
|
| Replies posted from an unverified account are automatically
| collapsed. Only Twitter Blue accounts get to post replies in
| comments and not have those replies collapsed.
| They can also post replies to their own tweets without the
| replies being collapsed.
| simplotek wrote:
| I was disappointed by this article and how it omitted the fact
| that Twitter hardcoded how references to Russia's invasion of
| Ukraine should be downranked.
|
| https://news.ycombinator.com/item?id=35410841
| theteapot wrote:
| That HN article is "[flagged]". Reading the comments, I think
| it's because there is no analysis of what "UkraineCrisisTopic"
| actually means or does. The author seems to have just grepped the
| code base for "Ukraine" and then drawn whatever conclusion fits
| the narrative.
| ROTMetro wrote:
| Versus the push that we should draw the conclusion that it's
| a nothing burger? It definitely highlights that Twitter is
| happy to categorize and treat Ukrainian content differently
| which is insightful to know.
| matthew9219 wrote:
| [dead]
| paulpauper wrote:
| If I had to guess it's because it gets poor engagement and
| would crowd out other topics due to the popularity of the
| topic.
| simplotek wrote:
| If that was remotely true then why wouldn't the topic be
| handled at the training level? Instead, Twitter is censoring
| references to Russia's invasion of Ukraine through the same
| mechanism used to kill DMCA violations. This is not an
| approach motivated by "engagement".
| xwdv wrote:
| Training is a waste of time when you know exactly what you
| want to block.
| hathawsh wrote:
| HN did the same thing with Bitcoin when it was soaring in
| value. I remember the front page felt like it was 90%
| Bitcoin. I was grateful that HN added an exception,
| bringing the discussion back into balance.
| GenerocUsername wrote:
| I don't know what period this dataset dates from, but there
| was def a time when Ukraine news was beyond saturation.
| madeofpalk wrote:
| It's the sort of thing I would expect on a highly opinionated
| Hacker News (iirc like how posts about Apple have a penalty
| applied to them to counteract massive usual interest in
| them), but less so on something more general audience like
| Twitter.
|
| I'm not really looking to twitter to say "Actually, we've all
| heard a bit too much about Tennis". I don't want Twitter's
| timeline to have an editorial voice.
| luckylion wrote:
| If I recall correctly, Twitter has explicitly said before
| that they do rank topics very differently because otherwise
| Justin Bieber would be trending all day, every day.
| whimsicalism wrote:
| All social media algo timelines have an editorial voice.
| Otherwise, you would exclusively be seeing engagement bait.
| simplotek wrote:
| > All social media algo timelines have an editorial
| voice.
|
| Sorry, that's a bullshit excuse. Russia's invasion of
| Ukraine was hardcoded to be downranked like DMCA
| violations and high toxicity content.
|
| This has zero to do with "editorial voice" or any other
| bullshit excuse. This was a blatant attempt to smother
| any reference to what Russia is doing to Ukraine.
| whimsicalism wrote:
| I was simply responding to the comment that was made.
| phailhaus wrote:
| Because if he's going to claim to be the internet's town
| square, he can't pick and choose topics that he's personally
| 'tired of'.
| [deleted]
| btilly wrote:
| Anyone who is interested should also read
| https://twitter.com/aakashg0/status/1641976869460275201 for a
| rather different take.
|
| It particularly interested me that Twitter under Musk is trying
| NOT to discuss Ukraine, and PENALIZES people who attempt to
| interact with those outside of their general political circle. I
| can give arguments for why they should do both, but I think both
| are ultimately bad ideas.
| simplotek wrote:
| > It particularly interested me that Twitter under Musk is
| trying NOT to discuss Ukraine, and PENALIZES people who attempt
| to interact with those outside of their general political
| circle.
|
| While Musk's Twitter explicitly censors references to Russia's
| genocide of Ukraine, Musk himself feigns ignorance and false
| indignation accusing the "western press" of insisting "on
| pushing such a lopsided view of the conflict".
|
| https://twitter.com/VsimPohuy/status/1645699649003569152?t=v...
| hathawsh wrote:
| From your link (thanks!):
|
| > 9. Making up words or misspelling hurts
|
| > Words that are identified as "unknown language" are given
| 0.01, which is a huge penalty.
|
| Does that mean if I tweet about coding and use identifiers like
| "setUserName", which is not an English word, the tweet gets a
| huge penalty? If so, that's disappointing.
| strken wrote:
| That jumped out at me as a possible misreading of the code.
| Is it detecting the language of the whole tweet, or just a
| word as the author claims?
|
| Demoting a tweet that's entirely unidentifiable as any human
| language seems fair enough.
| giraffe_lady wrote:
| Man if someone asked me to build a system to merely
| identify whether a unicode string is human language or not
| I would flatly refuse. There are thousands of spoken
| languages, many of them with no standard written form, some
| that are transcribed into multiple different writing
| systems, some with no writing tradition at all and with
| only ad-hoc transliteration unique to each user and use.
|
| Even being 90% confident would be a massive undertaking,
| and "speakers of this language may/may not use the
| internet" feels like high stakes for getting it wrong.
|
| It seems a little niche but I'm sure a few times a year
| some far out town gets connected and suddenly there are
| speakers of a previously unknown-to-the-internet language
| newly online.
| iudqnolq wrote:
| Note that the metric here is "is the tweet in one of the
| languages spoken by the user". This hypothetically allows
| more nuanced implementations than you contemplate.
|
| For example, they could have a language "unrecognized"
| and assume everyone speaks it.
|
| I broadly find this useful: I see tweets in other
| languages when they're retweeted by people I follow, and
| about half the time I machine translate them. But I don't
| want my whole feed to be that.
| thrashh wrote:
| Well if someone asked me to do that, I would suggest that
| it'd be based off their recent tweet history and not just
| one tweet. And I would make my case in the meeting.
|
| Second, it's already been done, so my next suggestion
| would be to look at what all the computational linguistics
| majors have been up to.
| giraffe_lady wrote:
| Yeah that struck me too. I can see the reasons why you'd want
| it but the collateral damage on that must be huge.
|
| For example do they check for the common but nonstandard
| transliteration systems arabic speakers use? There have to be
| similar systems in other languages that don't use the roman
| or cyrillic alphabets too right?
|
| Or for that matter what about languages twitter simply isn't
| aware of? There are thousands with native speakers after all,
| does this make it basically impossible for them to
| organically use twitter together?
| Mezzie wrote:
| Also RIP the Conlang community on Twitter...
| stingraycharles wrote:
| The actual code comment doesn't mention "words" but rather if
| the "tweet language" isn't in one of the user's
| "understandable" languages. As such, I assume your example is
| perfectly fine (would be extremely surprising if it wasn't).
|
| Whether the user implies the reader or author, I don't know,
| I assume the reader as that would make most sense.
|
| https://twitter.com/aakashg0/status/1641976943141699584?s=61...
| simonw wrote:
| I was not at all impressed with the analysis in that thread. It
| makes a bunch of assumptions that don't feel very thorough to
| me, but announces them as if they are unimpeachable facts.
|
| Biggest example is this one:
|
| "9. Making up words or misspelling hurts - Words that are
| identified as "unknown language" are given 0.01, which is a
| huge penalty."
|
| The code in the screenshot for that looks like this:
| // Boost (demotion) if the tweet language is not one of user's
| // understandable languages, nor interface language.
| optional double unknownLanguageBoost = 0.01
|
| That doesn't match the description of "Making up words or
| misspelling hurts" at all!
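|
| A minimal sketch of the reading that code comment suggests, assuming
| the 0.01 value is applied as a per-tweet multiplier on a ranking
| score. The function and parameter names here are hypothetical, and
| treating "user" as the reader is also an assumption; only the 0.01
| constant comes from the screenshot.

```python
# Hypothetical sketch: names and structure are assumptions, not Twitter's code.
UNKNOWN_LANGUAGE_BOOST = 0.01  # value quoted from the screenshot


def language_multiplier(tweet_lang, understandable_langs, ui_lang):
    """Return a score multiplier based on the tweet's detected language.

    The check is per tweet, not per word: a single language tag for the
    whole tweet is compared with the languages the user is assumed to
    understand, plus the interface language.
    """
    if tweet_lang in understandable_langs or tweet_lang == ui_lang:
        return 1.0  # no demotion
    return UNKNOWN_LANGUAGE_BOOST  # heavy demotion


def adjusted_score(base_score, tweet_lang, understandable_langs, ui_lang):
    """Apply the language multiplier to a base ranking score."""
    return base_score * language_multiplier(tweet_lang, understandable_langs, ui_lang)
```

| Under this reading, an English tweet containing an identifier like
| "setUserName" would not be demoted, as long as the tweet as a whole is
| tagged with a language the user understands.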
| LegitShady wrote:
| input youtube thumbnail of cat in the hat enraged "DR SEUSS
| CANCELLED?! TWITTER WON'T COMMENT!" ragebait youtuber.
| treis wrote:
| The systemic racism people are going to go wild about this
| randomwalker wrote:
| OP here. Unfortunately this thread is mostly misinformation.
| There were a bunch of viral threads from the growth hacker /
| influencer crowd, including this one, within hours of the code
| release with a very superficial understanding of the code (and
| how recsys work in general). That's partly what motivated me to
| write this article.
|
| See here for a rebuttal of the main tweet in that thread (near
| the bottom of the article).
| https://solomonmg.github.io/post/twitter-the-algorithm/
| ROTMetro wrote:
| If this is for their Crisis Misinformation Policy, why is there
| only one specific callout, and why is it directed specifically at
| Ukraine? It seems like a generous assumption on your part that
| it's a nothing burger. The takeaway should be that we now know
| they are willing, internally, to programmatically segment out
| Ukraine-related topics. The question this new knowledge should
| lead to is why there is a policy of segmenting this (rather than
| immediately jumping to 'nothing burger' or, as you put it in the
| above post, 'misinformation').
| wunderland wrote:
| It's unclear if it penalizes discussion of Ukraine equally
| though.
|
| There have been many stories that have come to light in the
| last few months. Merkel and Macron admitting the Minsk
| agreements were used to buy time for the CIA and the British to arm
| rebels since 2014 was a big story. So were the large amounts of
| money the US has supplied to Ukraine and the lack of oversight over
| where it is going (the total US aid now surpasses Russia's entire
| military budget per year). But this same poster (aakashg0) claims
| these stories have been suppressed, even though they would be
| counter to the dominant narrative in western media.
|
| I think algorithmic moderation on a particular topic is hard;
| you still need someone in there boosting the stories you want
| people to read and downplaying the stories you don't.
| stefan_ wrote:
| Tell us more, who are the "rebels" in this story and what
| arms did Merkel send?
|
| (Is this what news in the PRC feel like?)
| wunderland wrote:
| The rebels are right-wing paramilitary groups. And Germany
| didn't send any weapons during 2014-2022, but she said in a
| Der Spiegel interview from Oct 2022 that during the Minsk
| negotiations, it became clear that the US' objective was to
| buy time to secretly arm Ukraine (which is newsworthy
| because this would imply a violation of the Minsk
| agreement).
| stefan_ wrote:
| So in a year the US has sent more than Russia's yearly
| defence budget, yet Minsk (which one, even?) was needed
| to secretly (what was secret?) arm the Ukraine over 8
| years? Who are the "right-wing paramilitary groups" and
| if they are Ukraine, since this is who you are alleging
| is being armed, why are they rebels if they are
| government-aligned?
| wunderland wrote:
| This is all very easy stuff for you to verify for
| yourself and wasn't the original point of my top comment
| (which was that these stories are hard to suppress
| without manual effort-- although apparently many
| Americans are unaware).
|
| But to be clear, the US was funding Ukrainian rebel
| groups (right-wing paramilitary organizations) 2014-2022
| but through clandestine means. This is much more
| difficult to do without the support of congress because
| the support has to be indirect -- the funding has to be
| off-the-books -- because this was a violation of the
| Minsk agreement.
|
| Since 2022, the floodgates have opened and the US is now
| openly sending money and weapons systems, now totaling
| over $100B since the Russian invasion. The Russian
| Defence budget is estimated to be $70-80B per year.
| Fauntleroy wrote:
| The gigantic key factor in all this that you're leaving
| out is that Ukraine is defending itself against a full-on
| invasion by a hostile neighbor.
| matthewdgreen wrote:
| I mean, the fact that the Ukraine re-armed itself after
| Russia invaded their territory isn't news, is it? I think it
| was reported on pretty substantially. And a good thing too
| since they were invaded a second time, this time with a
| strike towards their capital. I sort of assumed that was
| obvious public knowledge and don't understand why people are
| making it into a "story."
| jawns wrote:
| I would assume that if "probability that other users will
| positively engage with a tweet" is the primary determiner of
| reach, then the more you can help Twitter accurately predict that
| probability, the better, because otherwise the default
| probability is likely no higher than middle-of-the-pack.
|
| If that assumption holds, then I would guess this type of
| algorithm favors consistency of content. In other words, someone
| who picks a certain topic and consistently tweets only about that
| topic is going to be easier to form predictions around versus
| someone whose tweets show much more variety, in topics, styles,
| etc.
|
| What that might mean, from a "gaming the system" point of view,
| is that if you're a person who intends to primarily tweet about
| two or three disparate things, you might be better off creating a
| separate account for each, rather than a single account where
| engagement is harder to predict.
| gregbander wrote:
| As the article calls out, the code is right there. Post your
| results of tests, not knee jerk conjecture. Wrong opinions are
| a dime a dozen.
| whimsicalism wrote:
| It's probably less this and more "if you only talk about one
| topic, then when we show your posts to similar users, they are
| more likely to like it"
| just_boost_it wrote:
| That doesn't really make sense because most big accounts tweet
| about a range of topics. There are pretty well-established ways
| of estimating the probability based on how the range of topics
| you might tweet about would match with the range of topics a
| user likes. That means you have to try and figure out what your
| base likes to see, and be like that. Tweeting about only a
| single topic means that you're only targeting people who are
| likely to like tweets from accounts that tweet about that one
| topic.
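|
| One common way to express that kind of matching, shown here as a
| rough sketch rather than anything taken from Twitter's code, is
| cosine similarity between an account's topic weights and a reader's
| topic weights. The topic names and weights in the usage note are made
| up for illustration.

```python
import math


def cosine_similarity(account_topics, reader_topics):
    """Cosine similarity between two sparse topic-weight dicts.

    Each dict maps a topic name to a non-negative weight. This is an
    illustrative stand-in for "how well the range of topics matches",
    not Twitter's actual method.
    """
    shared = set(account_topics) & set(reader_topics)
    dot = sum(account_topics[t] * reader_topics[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in account_topics.values()))
    norm_r = math.sqrt(sum(w * w for w in reader_topics.values()))
    if norm_a == 0.0 or norm_r == 0.0:
        return 0.0  # no topic information on one side
    return dot / (norm_a * norm_r)
```

| For example, an account weighted {"ml": 0.5, "cooking": 0.5} overlaps
| partly with a reader weighted {"ml": 1.0}, while a single-topic
| account only matches readers who care about that one topic.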
| teruakohatu wrote:
| > That doesn't really make sense because most big twitter
| accounts tweet about a range of topics.
|
| People engage with celebrities on everything. If an A list
| celeb announces they enjoy a slice of lemon in hot water,
| twenty news articles will be published around the world.
| delecti wrote:
| Incidentally, those same conditions seem to apply to other
| sites. Lots of Youtubers have multiple channels (a main
| channel, a livestream channel, a shorts channel).
| netcraft wrote:
| I've thought many times over the years that I would love to
| be able to subscribe to a particular playlist or "show" from
| a channel. There are several channels whose main stuff I want to
| see, but not their side content. Or a particular game from a
| let's-player, but not their other games.
| delecti wrote:
| Surprisingly, Youtube did have that functionality, though it
| was removed quite a while ago. I specifically remember
| being able to subscribe to "Is It A Good Idea To Microwave
| This?" (in the late 2000s), without also subscribing to the
| other videos on the channel.
| suddenclarity wrote:
| Pitch meetings and Ars Technica's interviews about old games
| come to mind. Fortunately, pitch meetings got their own
| channel last year.
| thrashh wrote:
| I don't think it's only helpful to the algorithm. It's also
| helpful for me as a subscriber.
|
| If I want to watch episodes of Breaking Bad, I don't want you
| to randomly throw in episodes of M*A*S*H (even though both
| are good).
___________________________________________________________________
(page generated 2023-04-11 23:00 UTC)