[HN Gopher] Social networks are getting stingy with their data
       ___________________________________________________________________
        
       Social networks are getting stingy with their data
        
       Author : hubraumhugo
       Score  : 95 points
       Date   : 2024-02-11 15:56 UTC (7 hours ago)
        
 (HTM) web link (techcrunch.com)
 (TXT) w3m dump (techcrunch.com)
        
       | henriquez wrote:
       | I keep seeing more and more traffic to my sites from the
       | fediverse. I think the centralized social network model is in
       | deep shit, at least as far as the WWW goes. The days of zero
       | interest money and investment based only on user counts are over
       | and traditional social networks are increasingly irrelevant.
       | TikTok is obviously not in deep shit but that's a separate can of
       | worms.
        
         | calamari4065 wrote:
         | I really don't see centralized social networks being
         | competitive with the fediverse for much longer. After all, how
         | do you convince millions of people to use your service and pay
         | for it while you spy on them, sell every scrap of data you can,
         | and even then still show them invasive ads?
         | 
         | Meanwhile mastodon is free as in freedom. If you want to, you
         | can buy your admin a beer. People seem to like this arrangement
         | quite a lot. You, an individual, are giving money directly to
         | another individual for services they're providing to you with
         | zero obligations. It feels more like buying your plumber friend
         | a case of beer for fixing your sink and less like throwing
         | money into a corporate void for no discernible benefit as the
         | price slowly and invariably creeps upward.
         | 
         | The really cool part though is that centralized social media
         | has to compete with the fediverse, but the fediverse does not
         | need to compete. It's not interested in competing. It will
         | simply continue to exist for as long as there are users. No one
         | cares what percentage of the global population uses it, or
         | about infinite geometric growth forever. It's just people
         | talking to other people. It's not an experience that you get on
         | traditional social media anymore.
        
           | huimang wrote:
           | The public at large doesn't care about any of that at all. I
           | don't think mastodon will ever grow that large due to how
           | painful the signup process is. And that's fine! But we have
           | to acknowledge that only a very small portion of people care
           | about things like privacy to the point of uprooting their
           | social media where their friends are.
           | 
           | AT Proto is more interesting than ActivityPub anyway, but I
           | think BlueSky will eventually succeed because the signup is
           | painless. You just sign up on bluesky, there's no need of
           | discussion on which instance, etc. There's no worry about
           | losing everything if the particular instance you're on goes
           | down, or having to deal with migrations.
        
             | gargron wrote:
             | What in your opinion are the pain points in the current
             | sign up process?
        
               | calamari4065 wrote:
               | Mastodon.social. These enormous monolithic servers offer
               | the worst experience and degrade the whole network. And
               | now joinmastodon.org points you at mastodon.social first,
               | then a list of the largest servers.
               | 
               | In other words, centralizing.
        
         | GenerWork wrote:
         | >I think the centralized social network model is in deep shit,
         | at least as far as the WWW goes.
         | 
         | Meta just beat Q4 estimates by 10%. Just because you're seeing
         | more traffic to your site from the fediverse doesn't mean that
         | traditional social media is dying.
        
           | ceejayoz wrote:
           | You can beat estimates for a while by ramping up prices and
           | cutting staff and costs.
           | 
           | Facebook has, among other things, cut their developer support
           | to basically zero. Bugs in the APIs sit around for years. The
           | Groups API is just being discontinued entirely.
        
             | tracerbulletx wrote:
             | I imagine strategically they all think they got what they
             | needed out of being an open platform and don't need to do
             | it anymore. I guess we'll find out if that's true or if
             | someone can take their bacon.
        
             | StableAlkyne wrote:
             | > Facebook has, among other things, cut their developer
             | support to basically zero.
             | 
             | They're still doing well on PyTorch at least! Although they
             | did drop GPU support for Windows soon after WSL2 became
             | usable (presumably because now users could just install the
             | Linux version on Win)
        
             | nradov wrote:
             | Why should they care about APIs anymore? Those no longer
             | contribute to profitability.
        
         | add-sub-mul-div wrote:
         | Come on, it's hard to deny that the last year has proven that
         | the vast majority don't leave when platforms like Reddit and
         | Twitter succumb to extreme enshittification. The eternal
         | September is passive and docile.
         | 
         | Those sites are already spiritually dead in the sense that
         | they're only tolerated and no longer enjoyed. But they've
         | achieved too big a network effect to be replaced anytime soon.
        
       | WesolyKubeczek wrote:
       | They needed devs to grow back in the day, but now the devs have
       | outlived their usefulness.
        
         | amelius wrote:
         | But seriously, didn't most devs see this coming?
         | 
         | It's the same everywhere. "Thank you for helping us make the
         | AppStore great, now give us 30% of your revenue."
        
           | candiodari wrote:
           | Obscurity in the future? Or obscurity now?
           | 
           | There was no alternative. The old internet, frankly, requires
           | people to pay. They don't if they can avoid it at all. This
           | is what killed most desktop software before, and it is what's
           | killing most internet software.
        
             | StableAlkyne wrote:
             | > They don't if they can avoid it at all.
             | 
             | Probably because the most common model is a $5-$20/month
             | subscription to some "premium" version of the site/app.
             | Subscription fatigue is a monster of the industry's own
             | making.
        
           | WesolyKubeczek wrote:
           | It was easy not to see in maybe 2008; then in, AFAIR, 2010
           | Facebook started tightening the screws citing security first
           | (and, well, there have been good reasons not to give full
           | graph access to everyone at all), but then it became obvious
           | that they were twiddling with feeds and what they didn't like
           | at all was inadvertently giving you tools to build your own
           | feed but without the ads.
        
       | Moldoteck wrote:
       | I don't use it that much, but the farcaster protocol as a
       | backbone for a social app is super hot for devs right now. Most
       | of the stuff is open source, they even changed some login steps
       | so that it would be closer to web2 experience compared to usual
       | 'web3'. Idk for how long it'll keep being this friendly for devs,
       | but that's the state for now
        
       | j1elo wrote:
       | I was about to open a Mastodon account, but because I don't
       | really publish anything too often anyway, I left it for
       | later.
       | 
       | A couple of months passed and the instance I had in mind closed
       | for good.
       | 
       | I've been told before in HN that Mastodon's solution is
       | non-existent for these situations. Has the landscape changed, or
       | are you still f*d if you choose the wrong one? (aka. an incentive
       | for centralization, or for always choosing only among the most
       | popular instances)
        
         | ldjb wrote:
         | The solution is to run your own Mastodon instance.
         | Unfortunately, I find doing so can be quite a hassle. Obviously
         | it's not free, and not only do you need to configure it
         | properly, but you need to handle backups, updates and so on.
         | Even for me, with a fair bit of technical experience, it can be
         | challenging. I think there are significant barriers to doing so
         | for people with no or little technical knowledge.
        
           | ceejayoz wrote:
           | I'm paying a coffee a month for masto.host to do it for me.
           | Great little service.
        
           | CaptainOfCoit wrote:
           | > The solution is to run your own Mastodon instance
           | 
           | The problem is that Mastodon isn't really great for single-
           | user instances (hassle to upgrade, slurps resources for
           | breakfast, etc) but there is plenty of other ActivityPub-
           | compatible software that is great for single-user usage, like
           | Pleroma, Micro.blog, Akkoma and more.
        
           | Helmut10001 wrote:
           | I did it with a free VM and wrote a guide here [1]. Zero
           | administration work with docker automatic updates.
           | 
           | [1]: https://du.nkel.dev/blog/2023-12-12_mastodon-docker-
           | rootless...
        
             | fanfanfly wrote:
             | What service did you get your VM from?
        
               | Helmut10001 wrote:
               | Oracle Free Tier
        
             | mysteria wrote:
             | While following a guide can get someone set up quickly, the
             | problem is that they may not have the background to deal
             | with issues/breakage down the road. Maybe it's a botched
             | update, a strange error, or so forth. The security side is
             | also another can of worms and I recommend that people do
             | some reading into this topic before running a public web
             | service.
             | 
             | I'm not trying to dissuade you from writing this - such
             | guides are always appreciated by the community. It's just
             | that I've seen a lot of inexperienced people try setting
             | these things up with a guide and getting bit in the back a
             | couple months later.
        
               | Helmut10001 wrote:
               | Yes, your skepticism makes sense. I have been running
               | services for 7 years, so it's not that I didn't know
               | what I was getting into.
        
             | abdullahkhalids wrote:
             | Thanks. What's stopping someone from creating a .deb that
             | does most/all of this?
        
         | proactivesvcs wrote:
         | You're equally messed up if your account on a centralised
         | service is banned, in which case you chose the wrong one. You
         | could choose a server that's signed up to the Mastodon Covenant
         | to provide some more peace of mind, and should always take a
         | backup of your account from time to time.
        
         | huimang wrote:
         | You're incentivized to either choose a stable, robust instance,
         | or self-host. A lot of instances are pet projects that people
         | kill off once they get bored of mastodon and don't want to pay
         | for the upkeep anymore.
         | 
         | Self-hosting is the only way to really guarantee that it'll
         | still be up years later. But you might get randomly banned from
         | certain instances that blanket-ban single-user instances.
        
           | michaelt wrote:
           | I've got to say, that doesn't sound like a very good
           | solution.
           | 
           | How are people who aren't already part of the community
           | supposed to know what a stable, robust instance is?
           | 
           | And self-hosting is all very well, but if I wanted to join a
           | community comprised exclusively of greybeard unix sysadmins,
           | I'd use IRC :)
        
         | bachmeier wrote:
         | > I've been told before in HN that Mastodon's solution is non-
         | existent for these situations.
         | 
         | What are you worried about losing? You can download everything
         | if it's a concern. Honestly though, the concern you're raising
         | is non-existent on every platform, because the others only have
         | a single instance.
        
         | keep320909 wrote:
         | Run your own blog, on your own domain with github. It takes
         | like 2 hours to set up and costs $20/year. Lately, it really
         | feels like the whole "social media" decade was a dead end.
        
           | quaintdev wrote:
           | what 20$? I am running it for free except for domain name
           | cost.
        
             | fanfanfly wrote:
             | Can you share your setup/tech stack?
        
             | LTL_FTC wrote:
             | I'm guessing that's what their $20/year is for. GitHub
             | pages lets you use your own domain once you buy it.
        
         | Tijdreiziger wrote:
         | It's basically analogous to e-mail. To have an e-mail account,
         | you have to choose a host first. Most people choose a popular
         | host (Gmail, Outlook), but there are various reasons one could
         | prefer a different one.
        
         | stevenicr wrote:
         | I really feel there is a need and future growth in distributed
         | backup as a service -
         | 
         | A few of us should get together and make a 'backup your stuff
         | service' that can pull from mastodon and any other service, and
         | make 2 backups in two different places around the world.
         | 
         | Offer addons for storing BnW copies of pics maybe, addons for
         | other services.
         | 
         | Should login, add link to your thing, walk through authorizing
         | whatever is needed, and getting an email or DM that backup
         | succeeds every week or something.
        
         | prmoustache wrote:
         | I chose a mastodon instance for which the financing is quite
         | clear. I donate once a year. The admin is responsive about
         | updates and also uses the instance daily.
         | 
         | If I find out the admin has stopped giving updates or updating
         | the instance, or the donations are not meeting the goal, I
         | assume I would have time to start up my own instance and/or
         | move somewhere else.
        
       | sehugg wrote:
       | This could have been a headline from 2011
        
       | rebolek wrote:
       | _their_ data?
        
         | renegat0x0 wrote:
         | I was wondering quite often about data possession. It is their
         | software. It sits on their servers. They convert and maintain
         | it. Do I own my data? Currently - of course I do not. They do
         | with it whatever they want. They sell it to whom they want.
         | 
         | My only option is not to use their service.
        
         | Eisenstein wrote:
         | Read any of the contracts you have to agree to in order to use
         | one of these networks. You own your data but grant them a
         | perpetual, royalty-free, no-exceptions license to use it
         | however they want forever. They get to eat the cake and have
         | it.
        
       | verticalscaler wrote:
       | Of course it isn't their data it is ours. But only worth anything
       | in aggregate.
       | 
       | It was well known how much this stuff was worth before and that
       | amount has been declining. With AI hype convincing people of
       | unknown potential it becomes a speculative asset again.
       | 
       | Reddit was a financial failure and now it is trying to IPO again
       | on the basis that your shit posts are the new NFTs. This issue
       | will resolve itself; I don't think devs are missing out.
        
         | AznHisoka wrote:
         | Of all the social networks, I think Reddit is actually the one
         | with the highest signal to noise ratio. It's invaluable to get
         | real human opinions on product recommendations, travel
         | recommendations, how to do something, etc.
        
           | verticalscaler wrote:
           | It is infamously astroturfed to the maximum and has been for
           | over a decade. Nowadays not just for future landfill items
           | but also politics.
           | 
           | And by selling the data to LLMers its fate is sealed.
        
       | mvkel wrote:
       | Data is oil (including synthetic), and consumer data companies
       | (social networks) are sensing that their data is soon going to be
       | the only defensible IP they have. Gotta hoard every little bit of
       | it to maximize market cap
        
       | seydor wrote:
       | "Megaphones are trying to keep the shouts for themselves"
       | 
       | This is such a silly concept. Users and 'influencers' use social
       | media as megaphones, and they'll easily give their data to e.g.
       | openAI if they ask for it. Social media have no moat there
        
       | notsure357 wrote:
       | Does anyone really believe that an AI based on Reddit or
       | Twitter/X data would somehow be superior to other AIs? Or
       | that it would somehow provide a snarky competitive advantage
       | included with other data? I don't see it.
        
         | notyourwork wrote:
         | Superior may depend on your goal. If it's mass disinformation
         | campaigns produced by generative AI, those mentioned data sets
         | may be ripe for the cause.
        
           | causal wrote:
           | Yup. They're exactly what you need if you want to imitate a
           | redditor or Twitter user.
           | 
           | Also not useless for just learning language.
        
         | Cheer2171 wrote:
         | Doesn't matter. Execs, MBA types, and VCs only hear "data is
         | the new oil" and think it is just as fungible.
        
           | delfinom wrote:
           | Unfortunately data post launch of ChatGPT is now worthless as
           | it's contaminated by the very same bots
        
             | fatihpense wrote:
             | reminds me of https://en.wikipedia.org/wiki/Low-background_steel
        
       | markhahn wrote:
       | Data autonomy. How do we get across to people that there's real
       | value in owning your data - controlling it, hosting it, not just
       | being someone else's product?
       | 
       | Why should we not take the broadest possible view? You own your
       | likes, your comments, your amazon order history, your dental
       | xrays, your histology reports, everything. One way to incentivize
       | data consumers and processors would be to make them liable for
       | mishandling, make the data so radioactive that they don't even
       | want to hold onto it.
        
         | randunel wrote:
         | So... GDPR? You can download your own data, force the data
         | controller which acquired it to delete it and they're liable
         | for mishandling it. Companies don't usually want to hold on to
         | EU citizens' data because GDPR makes it quite radioactive, for
         | them and whomever they sell it to.
        
         | plagiarist wrote:
         | Yeah, leaking financial data for millions of people should ruin
         | the company and have fallout that hits the members of the board
         | and C-suite. Instead it's actually just an opportunity to sell
         | an identity theft "protection" subscription.
         | 
         | I have no faith that people will get clued in and make it
         | happen. Everyone is merrily lining up to use the third-party
         | face scanner at the airports.
        
         | paulryanrogers wrote:
         | Once data gets out it's impossible to completely take back.
         | Regulations could help corral law-abiding actors, so I agree
         | with the idea.
         | 
         | Though I can also see how the incentives of social platforms
         | encourage clamping down further and further on 3P access.
        
           | r3trohack3r wrote:
           | > Once data gets out it's impossible to completely take back
           | 
           | As a society, we understand that rights !== access. Just
           | because you share your data with Facebook, giving them access
           | to it, in the normal course of interacting with their servers
           | does not grant them rights to that data.
           | 
           | When Netflix delivers a video to your device, society
           | understands you can't make a copy of that video and share it
           | with your neighbor. That's called "The Pirate Bay."
           | 
           | Data on the internet is lacking equity: an ownership interest
           | in property. As you generate data online, platforms
           | accumulate equity in your data giving them control over that
           | valuable property.
           | 
           | If, instead, you accumulated equity in your data the entire
           | data broker market would be "The Pirate Bay."
        
           | godelski wrote:
           | This. Honestly I'd love a government to take up the privacy
           | mantle and protect its citizens. Sure, you lose the ability
           | to spy on your citizens but so does every other country in
           | the world. Seems like that's a net win.
           | 
           | If USPS was made to ensure that all Americans can
           | communicate, even making it explicit in the constitution, I
           | don't know why this wouldn't also apply to cell phones and
           | the internet. Are they not modern evolutions? Put the code on
           | the gov's GitHub along with the rest. Other players can
           | exist, but it sets a baseline standard. But any country can
           | do this, doesn't need to be the US.
        
         | cyanydeez wrote:
         | this isn't being done for privacy; these orgs want to keep their
         | LLM gold chests to themselves.
        
         | happytiger wrote:
         | Pay them to hold on to their data and manage it? That's about
         | the only way for end-users to experience the actual value of
         | data.
        
         | nradov wrote:
         | Legally speaking you already own your healthcare data such as
         | dental X-rays. In many cases you can even download your data
         | from provider and payer organizations in industry standard
         | formats (although dentistry specifically is way behind in this
         | area). But for most patients this data is worth zero. Legal and
         | privacy concerns aside, there's just not much use for it.
         | Several start-ups have tried to de-identify and aggregate such
         | data for sale to researchers but consistency and quality issues
         | make it tough to use in real studies.
        
         | idle_zealot wrote:
         | I don't think the approach of getting consumers to care can
         | work. Most people really just don't care about taking
         | precautions when the negative impacts of not doing so are so
         | diffuse and time-delayed; it's an unfortunate aspect of human
         | nature. We usually overcome these individual failings by
         | organizing into groups better suited for long-term planning. In
         | this case the solution that comes to mind would be to make
         | personal data legally onerous to hold and process for
         | companies, to the extent that they would go out of their way to
         | design their products and services to never touch the stuff,
         | and if they _do_ need it to operate then they would be
         | incentivized to store it locally on users' devices and only
         | synchronize it in a completely encrypted form such that they
         | never have to deal with the legal implications of having access
         | to it.
        
         | CuriouslyC wrote:
         | AI is going to push data autonomy hard. Users are going to want
         | to subscribe to different models for different things, and plug
         | those models into whatever they're using. Everyone is going to
         | support it, and to make it work there needs to be data
         | exchange. Companies that don't support it to try and keep the
         | data walled are going to hemorrhage customers.
        
       | tgv wrote:
       | Devs? Or other leeching companies that might cut into their
       | revenue?
        
         | computerfriend wrote:
         | I used to make and run Twitter bots. I wasn't a leeching
         | company that might cut into their revenue.
        
           | dudinax wrote:
           | You're collateral damage
        
       | vdaea wrote:
       | After what they did to Facebook because of what Cambridge
       | Analytica did using API access, this makes all the sense in the
       | world. API access, even if it's read-only, is a huge liability.
        
         | tqi wrote:
         | Yeah the media loves to have it both ways because at the end of
         | the day, they're competitors in the same attention business.
        
       | axegon_ wrote:
       | I was having this exact thought around 2015, not long after
       | spaCy became open source. When I first tested it out, I was blown
       | away how well it performed in every possible task: it was light
       | years ahead of nltk and gensim, which were the big and well
       | established players in that space. Even back then I was certain
       | that in the not-so-distant future, data would cost a fortune:
       | considerably more than it already did. And I won't lie, starting
       | to harvest data online on a massive scale, and capitalizing on it
       | when the day came, did cross my mind. And now I really regret not
       | doing it. Reddit closed itself off, so did stackoverflow, twitter
       | is a no go, facebook made it nearly impossible. Cloudflare makes
       | traditional scraping nearly impossible, if scraping wasn't
       | already a nightmare with the modern web stacks: the doors are
       | nearly shut and it will only get worse.
       | 
       | It really hurts me to know that I expected this to happen and
       | didn't do anything about it. Oh well... One of many missed
       | opportunities in my life I suppose (WAY more than I'd like to
       | admit).
        
       | Eisenstein wrote:
       | This is hilarious because Sam Altman was on the board of reddit
       | until recently. I don't believe that reddit closed off API access
       | due to AI data scraping. They did it to force people to use their
       | shitty app so they can get more ad impressions before the IPO.
        
       | molticrystal wrote:
       | Twitter used to view itself as a microblogging service and was so
       | open that they allowed syndication by offering an rss endpoint,
       | we are light years away from that at this point.
        
       | chaseadam17 wrote:
       | Farcaster solves this. I'm surprised it's not mentioned in the
       | article or anywhere in the comments.
        
       | seanwbren wrote:
       | Farcaster is a decentralized social network, much like Twitter
       | but with Channels (which are similar to subreddits). All the data
       | is open.
       | 
       | A cryptographic signing key is attached to each account, and they
       | have been experiencing very fast growth over the past month.
       | 
       | A dashboard is possible to share because the data is open:
       | https://dune.com/pixelhack/farcaster
        
         | INTPenis wrote:
         | What's the role of Ethereum in farcaster? I noticed it in the
         | design Overview but I don't really have the time or motivation
         | to get deeper.
        
       | Yhippa wrote:
       | It seems to me that what you did _inside_ the social networks
       | used to have a lot of value but now it seems like what you do
       | _outside_ may be even more valuable to try to model users' wants
       | (product or otherwise). If everybody starts closing their doors I
       | wonder if there will be a breakdown of the prediction abilities?
       | 
       | I guess every app and webpage is willing to sell your behavior to
       | the highest bidder so it probably doesn't matter.
        
       | INTPenis wrote:
       | And the great people-driven fediverse should also be mindful
       | about who they let in as users. Because you need a user account
       | to get an API key.
       | 
       | The gatekeeper is always the mod who approves new user accounts.
       | Focus on that part and we might keep the data hungry monopolies
       | out of the fedi too.
       | 
       | Many small instances are much more viable than a few huge ones
       | that let anyone in.
        
       | srameshc wrote:
       | They were always stingy with their data. It has been repeatedly
       | challenging when they change their ToS. I will never build on top
       | of another social network unless they are federated in nature.
        
       | throwaway98797 wrote:
       | if only there was a way to guarantee data availability through
       | some kind of system
       | 
       | like redundant hubs
       | 
       | check out farcaster.xyz or download the client on warpcast.xzy
        
       | simplify wrote:
       | Farcaster is a decentralized "enough" social network, which fixes
       | this. Easy to run your own node too
        
       | solobalbo wrote:
       | Use nostr
        
       | geor9e wrote:
       | I (stupidly) stored all my 4k video footage on facebook. It was
       | great for a while. Only later did I learn that if the video
       | viewership drops below a threshold, they delete the 4k stream
       | without warning, leaving only the 360p stream. So I just have a
       | bunch of blurry videos when I export all my facebook data.
        
         | casefields wrote:
         | Terrible for you but saves them a boatload of money at their
         | scale.
        
       | rad_gruchalski wrote:
       | About a month after closing an account on one of these social
       | networks one realises that nothing of value was lost.
       | 
       | Furthermore, "their data".
        
       ___________________________________________________________________
       (page generated 2024-02-11 23:01 UTC)