[HN Gopher] You Wouldn't Download a Hacker News
___________________________________________________________________
You Wouldn't Download a Hacker News
Author : jasonthorsness
Score : 353 points
Date : 2025-04-30 01:26 UTC (21 hours ago)
(HTM) web link (www.jasonthorsness.com)
(TXT) w3m dump (www.jasonthorsness.com)
| ashish01 wrote:
| I wrote one a while back, https://github.com/ashish01/hn-data-dumps,
| and it was a lot of fun. One thing that would be cool to implement:
| more recent items keep changing for a while, so recently downloaded
| items go stale faster than older ones.
| jasonthorsness wrote:
| Yeah I'm really happy HN offers an API like this instead of
| locking things down like a bunch of other sites...
|
| I used a function based on the age for staleness, it considers
| things stale after a minute or two initially and immutable
| after about two weeks old.
|
|     // DefaultStaleIf marks stale at 60 seconds after creation, then
|     // frequently for the first few days after an item is created, then
|     // quickly tapers after the first week to never again mark stale
|     // items more than a few weeks old.
|     const DefaultStaleIf = "(:now-refreshed)>" +
|         "(60.0*(log2(max(0.0,((:now-Time)/60.0))+1.0)+pow(((:now-Time)/(24.0*60.0*60.0)),3)))"
|
| https://github.com/jasonthorsness/unlurker/blob/main/hn/core...
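To get a feel for the staleness expression quoted above, here is a minimal Python sketch that evaluates the same formula. The function and parameter names are hypothetical and not taken from the unlurker code; only the arithmetic mirrors the quoted string, with all times in seconds.

```python
import math


def stale_threshold_seconds(age_seconds: float) -> float:
    """Refresh interval allowed for an item of the given age.

    Mirrors the quoted expression:
    60 * (log2(max(0, age/60) + 1) + (age/86400)^3)
    """
    age_minutes = max(0.0, age_seconds / 60.0)
    age_days = age_seconds / (24.0 * 60.0 * 60.0)
    return 60.0 * (math.log2(age_minutes + 1.0) + age_days ** 3)


def is_stale(now: float, created: float, refreshed: float) -> bool:
    """True when the time since the last refresh exceeds the threshold.

    `now`, `created` and `refreshed` are hypothetical names standing in for
    :now, Time and refreshed in the quoted string.
    """
    return (now - refreshed) > stale_threshold_seconds(now - created)


if __name__ == "__main__":
    # Print the threshold at a few ages to see how quickly it grows.
    for label, age in [("1 minute", 60), ("1 day", 86400), ("14 days", 14 * 86400)]:
        print(label, round(stale_threshold_seconds(age)), "seconds")
```

The logarithmic term dominates for young items and the cubic term takes over after a few days, which is what produces the taper the comment describes.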
| jakegmaths wrote:
| Your query for Java will include all instances of JavaScript as
| well, so you're over representing Java.
| jasonthorsness wrote:
| Ah right... maybe even more unexpected then to see a decline
| cs02rm0 wrote:
| I'm not so sure, while Java's never looked better to me, it
| does "feel" to me to be in significant decline in terms of
| what people are asking for on LinkedIn.
|
| I'd imagine these days typescript or node might be taking
| over some of what would have hit on javascript.
| cess11 wrote:
| Recruiting Java developers is easy mode, there are rather
| large consultancies and similar suppliers that will sell or
| rent them to you in bulk so you don't need to nag with
| adverts to the same extent as with pythonistas and rubyists
| and TypeScript.
|
| But there is likely some decline for Java. I'd bet Elixir
| and Erlang have been nibbling away on the JVM space for
| quite some time, they make it pretty comfortable to build
| the kind of systems you'd otherwise use a JVM-JMS-
| Wildfly/JBoss rig for. Oracle doesn't help, they take zero
| issue with being widely perceived as nasty and it takes a
| bit of courage and knowledge to manage to avoid getting a
| call from them at your inconvenience.
| patates wrote:
| Speaking as someone who ended up in the corporate Java
| world somewhat accidentally (wasn't deep in the ecosystem
| before): even the most invested Java shops seem wary of
| Oracle's influence now. Questioning Oracle tech, if not
| outright planning an exit strategy, feels like the
| default stance.
| cess11 wrote:
| Most such places probably have some trauma related to
| Oracle now. Someone spun up the wrong JVM by accident and
| within hours salespeople were on the phone with some
| middle manager about how they would like to pay for it,
| that kind of thing. Or just the issue of injecting their
| surveillance trojans everywhere and knowing they're
| there, that's pretty off-putting in itself.
|
| Which is a pity, once you learn to submit to and tolerate
| Maven it's generally a very productive and for the most
| part convenient language and 'ecosystem'. It's like
| Debian, even if you fuck up badly there is likely a
| documented way to fix it. And there are good libraries
| for pretty much anything one could want to do.
| karel-3d wrote:
| New Java looks actually good, but most of the Java actual
| ecosystem is stuck in the past.... and you will mostly work
| within the existing ecosystem
| smcin wrote:
| a) Does your query for 'JS' return instances of 'JSON'?
|
| b) The ultimate hard search topic is 'R' / 'R language'.
| Check if you think you index it correctly. Or related terms
| like RStudio, Posit, [R]Shiny, tidyverse, data.table,
| Hadleyverse...
| smarnach wrote:
| Similarly, the Rust query will include "trust", "antitrust",
| "frustration" and a bunch of other words
| matsemann wrote:
| Reminds me of the Scunthorpe problem
| https://en.wikipedia.org/wiki/Scunthorpe_problem
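The over-counting described in these replies (Java matching JavaScript, JS matching JSON, Rust matching "trust") comes from plain substring search. A minimal sketch of one way to reduce it is whole-term, case-sensitive matching. This is not the article's query; the helper names and sample texts below are made up for illustration.

```python
import re


def naive_count(texts, term):
    """Case-insensitive substring search: the approach that over-counts."""
    return sum(1 for t in texts if term.lower() in t.lower())


def count_mentions(texts, term):
    """Count texts that contain `term` as a whole, case-sensitive token.

    Lookarounds are used instead of \\b so that terms ending in
    punctuation, such as "C++", still work.
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)")
    return sum(1 for t in texts if pattern.search(t))


# Hypothetical sample data, not taken from the HN dump.
SAMPLE = [
    "Hiring Java and Go developers",
    "JavaScript / TypeScript frontend",
    "We parse JSON with JS",
    "Antitrust law",
    "Pure frustration",
    "Rust backend",
]

if __name__ == "__main__":
    for term in ("Java", "JS", "Rust"):
        print(term, naive_count(SAMPLE, term), count_mentions(SAMPLE, term))
```

Whole-term matching still cannot separate the language R from the "R" in "R&D", or Go from the verb at the start of a sentence, so single-letter and common-word names would need extra context.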
| sph wrote:
| A guerilla marketing plan for a new language is to name it after a
| common one-syllable word, so that it appears much more
| prominent than it really is on badly-done popularity
| contests.
|
| Call it "Go", for example.
|
| (Necessary disclaimer for the irony-impaired: this is a joke
| and an attempt at being witty.)
| InDubioProRubio wrote:
| You also wouldn't acronym hijack overload to boost mental
| presence in gamers LOL
| setopt wrote:
| Let's make a language called "A" in that case. (I mean C
| was fine, so why not one letter?)
| TZubiri wrote:
| Or call it the name of a popular song to appeal to the
| youngins.
|
| I present to you "Gangnam C"
| flakiness wrote:
| I have done something similar. I cheated by using the BigQuery dataset
| (which somehow keeps getting updated): export the data to
| Parquet, download it, and query it using DuckDB.
| minimaxir wrote:
| That's not cheating, that's just pragmatic.
| AbstractH24 wrote:
| What a pragmatic way to rationalize most cheating
| matsemann wrote:
| One thing I'm curious about, but I guess not visible in any way,
| is random stats about my own user/usage of the site. What's my
| upvote/downvote ratio? Are there users I constantly
| upvote/downvote? Who is liking/hating my comments the most? And
| some that I'd guess are scrapable: Which days/times am I the
| most active (like the github green grid thingy)? How's my
| activity changed over the years?
| minimaxir wrote:
| The only vote data that is visible via any HN API is the scores
| on submissions.
|
| Day/Hour activity maps for a given user are relatively trivial
| to do in a single query, but only public submission/comment
| data could be used to infer it.
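As a rough illustration of the "single query" for a day/hour activity map, here is a sketch using Python's built-in sqlite3 module. The table layout (`items` with `author` and a Unix-epoch `time` column) is an assumption for the example and does not match any particular HN dataset schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (author TEXT, time INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        # Unix timestamps in UTC; 0 is Thursday 1970-01-01 00:00.
        [("alice", 0), ("alice", 60), ("alice", 3600), ("alice", 86400), ("bob", 0)],
    )
    return conn


def activity_map(conn, user):
    """Return {(weekday, hour): count}, weekday 0 = Sunday, times in UTC."""
    rows = conn.execute(
        """
        SELECT CAST(strftime('%w', time, 'unixepoch') AS INTEGER) AS weekday,
               CAST(strftime('%H', time, 'unixepoch') AS INTEGER) AS hour,
               COUNT(*) AS n
        FROM items
        WHERE author = ?
        GROUP BY weekday, hour
        ORDER BY weekday, hour
        """,
        (user,),
    ).fetchall()
    return {(weekday, hour): n for weekday, hour, n in rows}


if __name__ == "__main__":
    print(activity_map(build_demo_db(), "alice"))
```

As the comment notes, this only reflects public submissions and comments; votes and reading activity are not visible.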
| ryandrake wrote:
| Too bad! I've always sort of wanted to be able to query
| things like what were my most upvoted and downvoted comments,
| how often are my comments flagged, and so on.
| saagarjha wrote:
| I did this once by scraping the site (very slowly, to be
| nice). It's not that hard since the HTML is pretty
| consistent.
| nottorp wrote:
| > Are there users I constantly upvote/downvote?
|
| Hmm. Personally I never look at user names when I comment on
| something. It's too easy to go from "i agree/disagree with this
| piece of info" to "i like/dislike this guy"...
| matsemann wrote:
| Same, which is why it would be cool to see. Perhaps there are
| people I both upvote and downvote?
| thaumasiotes wrote:
| > It's too easy to go from "i agree/disagree with this piece
| of info" to "i like/dislike this guy"...
|
| ...is that supposed to pose some kind of problem? The problem
| would be in the other direction, surely?
| nottorp wrote:
| Either you got the direction wrong or you'd support someone
| who is wrong just because you like them.
|
| You're wrong in both cases :)
| thaumasiotes wrote:
| Maybe try rereading my comment?
| nottorp wrote:
| You're right. But I still disagree with you. Both ways
| are wrong if you want to maintain a constructive
| discussion.
|
| Maybe you don't like my opinions on cogwheel shaving but
| you will agree with me on quantum frobnicators. But if
| you first come across my comments on cogwheel
| shaving and note the user name, you may not even read the
| comments on quantum frobnicators later.
| vidarh wrote:
| The exception, to me, is if I'm questioning whether the
| comment was in good faith or not, where the track record of
| the user on a given topic could go some way to untangle that.
| It happens rarely here, compared to e.g. Reddit, but
| sometimes it's mildly useful.
| pjc50 wrote:
| I recognize twenty or so of the most frequent and/or annoying
| posters.
|
| The leaderboard https://news.ycombinator.com/leaders
| absolutely doesn't correlate with posting frequency. Which is
| probably a good thing. You can't bang out good posts non-stop
| on every subject.
| 9rx wrote:
| _> What's my upvote/downvote ratio?_
|
| Undefined, presumably. What reason would there be to take
| time out of your day to press a pointless button?
|
| It doesn't communicate anything other than that you pressed a
| button. For someone participating in good faith, that doesn't
| add any value. But for those not participating in good faith, i.e.
| trolls, it adds incredible value knowing that their trolling is
| being seen. So it is actually a net negative to the community
| if you did somehow accidentally press one of those buttons.
|
| For those who seek fidget toys, there are better devices for
| that.
| saagarjha wrote:
| If Hacker News had reactions I'd put an eye roll here.
| 9rx wrote:
| You could have assigned 'eye roll' to one of the arrow
| buttons! Nobody else would have been able to infer your
| intent, but if you are pressing the arrow buttons it is not
| like you want anyone else to understand your intent anyway.
| immibis wrote:
| Actually, its most useful purpose is to hide opinions you
| disagree with - if 3 other people agree with you.
|
| Like when someone says GUIs are better than CLIs, or C++ is
| better than Rust, or you don't need microservices, you can
| just hide that inconvenient truth from the masses.
| 9rx wrote:
| So, what you are saying is that if the masses agree that
| some opinion is disagreeable, they will hide it from
| themselves? But they already read it to know it was
| disagreeable, so... What are they hiding it for, exactly?
| So that they don't have to read it again when they revisit
| the same comments 10 years later? Does anyone actually go
| back and reread the comments from 10 years ago?
| jpc0 wrote:
| It's not so much rereading the comments but more a matter
| of it being indication to other users.
|
| The C++ example for instance above, you are likely to be
| downvoted for supporting C++ over rust and therefore most
| people reading through the comments (and LLMs correlating
| comment "karma" to how liked a comment is) will generally
| associate Rust > C++, which isn't a nuanced opinion at
| all and IMHO is just plain wrong a decent amount of
| times. They are tools and have their uses.
|
| So generally it shows the sentiment of the group and
| humans are conditioned to follow the group.
| 9rx wrote:
| An indication of what? It is impossible to know why a
| user pressed an arrow button. Any meaning the user may
| have wanted to convey remains their own private
| information.
|
| All it can fundamentally serve is to act as an
| impoverished man's read receipt. And why would you want
| to give trolls that information? Fishing to find out if
| anyone is reading what they're posting is their whole
| game. Do not feed the trolls, as they say.
| matsemann wrote:
| Since there are no rules on down voting, people probably
| use it for different things. Some to show dissent, some to
| down vote things they think don't belong only, etc. Which
| is why it would be interesting to see. Am I overusing it
| compared to the community? Underusing it?
| pjc50 wrote:
| I don't think you can get the individual vote interactions, and
| that's probably a good thing. It is irritating that the "API"
| won't let me get vote counts; I should go back to my Python
| scraper of the comments page, since that's the only way to get
| data on post scores.
|
| I've probably written over 50k words on here and was wondering
| if I could restructure my best comments into a long meta-
| commentary on what does well here and what I've learned about
| what the audience likes and dislikes.
|
| (HN does not like jokes, but you can get away with it if you
| also include an explanation)
| xnx wrote:
| Some of this data is available through the API (and Clickhouse
| and BigQuery).
|
| I wrote a Puppeteer script to export my own data that isn't
| public (upvotes, downvotes, etc.)
| pier25 wrote:
| would love to see the graph of React, Vue, Angular, and Svelte
| andrewshadura wrote:
| Funny nobody's mentioned "correct horse battery staple" in the
| comments yet...
| hsbauauvhabzb wrote:
| Is the raw dataset available anywhere? I really don't like the HN
| search function, and grepping through the data would be handy.
| Havoc wrote:
| It's on firebase/bigquery to avoid people doing what OP did
|
| If you click the api link bottom of page it'll explain.
| jasonthorsness wrote:
| I used the API! It only takes a few hours to download your
| own copy with the tool I used
| https://github.com/jasonthorsness/unlurker
|
| I had to CTRL-C and resume a few times when it stalled; it
| might be a bug in my tool
| xnx wrote:
| Is there any advantage to making all these requests instead
| of using Clickhouse or BigQuery?
| jasonthorsness wrote:
| Probably not :P. I made the client for another project,
| https://hn.unlurker.com, and then just jumped straight to
| using it to download the whole thing instead of searching
| for an already available full data set.
| Havoc wrote:
| My mistake - apologies. Had misunderstood what you did
| 9rx wrote:
| _> The Rise Of Rust_
|
| Shouldn't that be The Fall Of Rust? According to this, it saw the
| most attention during the years before it was created!
| emilbratt wrote:
| The chart is a stacked one, so we are looking at the height
| each category takes up and not the height each category reaches.
| stefs wrote:
| please do not use stacked charts! i think it's close to
| impossible not to distort the reader's impression because a)
| it's very hard to gauge the height of a certain data point in the
| noise and b) they're implying a dependency where there _probably_
| is none.
| dguest wrote:
| How do you feel about stacked plots on a logarithmic y axis?
| Some physics experiments do this all the time [1] but I find
| them pretty unintuitive.
|
| [1]:
| https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-...
| lblume wrote:
| What is this even supposed to represent? The entire
| justification I could give for stacked bars is that you could
| permute the sub-bars and obtain comparable results. Do the
| bars still represent additive terms? Multiplicative
| constants? As a non-physicist I would have no idea on how to
| interpret this.
| dguest wrote:
| It's a histogram. Each color is a different simulated
| physical process: they can all happen in particle
| collisions, so the sum of all of them should add up to the
| data the experiment takes. The data isn't shown here
| because it hasn't been taken yet: this is an extrapolation
| to a future dataset. And the dotted lines are some
| hypothetical signal.
|
| The area occupied by each color is basically meaningless,
| though, because of the logarithmic y-scale. It always looks
| like there's way more of whatever you put on the bottom.
| And obviously you can grow it without bound: if you move
| the lower y-limit to 1e-20 you'll have the whole plot
| dominated by whatever is on the bottom.
|
| For the record I think it's a terrible convention, it just
| somehow became standard in some fields.
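The point about the bottom series dominating a stacked log-scale plot can be checked with a little arithmetic. The sketch below computes what fraction of the axis height each stacked series occupies on a log10 y-axis; the function name and the numbers are illustrative assumptions and not values from the ATLAS plot.

```python
import math


def visual_shares(values, y_min, y_max):
    """Fraction of a log10 y-axis occupied by each series in a stack.

    `values` are stacked bottom to top; the axis runs from y_min to y_max.
    """
    span = math.log10(y_max) - math.log10(y_min)
    shares = []
    lower = y_min
    running = 0.0
    for v in values:
        running += v
        # Clamp the top of this band to the visible axis range.
        upper = min(max(running, y_min), y_max)
        shares.append((math.log10(upper) - math.log10(lower)) / span)
        lower = upper
    return shares


if __name__ == "__main__":
    stack = [10.0, 90.0]  # the top series is 9x larger than the bottom one
    print(visual_shares(stack, y_min=1.0, y_max=100.0))
    print(visual_shares(stack, y_min=1e-20, y_max=100.0))
```

With the lower limit at 1, the two series look equal even though one is nine times larger; pushing the limit down to 1e-20 makes the bottom series fill almost the entire plot.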
| seabass wrote:
| My first thought as well! The author of uPlot has a good demo
| illustrating their pitfalls
| https://leeoniya.github.io/uPlot/demos/stacked-series.html
| jasonthorsness wrote:
| It's true :( but line charts of the data had too much overlap
| and it was hard to see anything. I was thinking next time maybe
| multiple line charts aligned and stacked, with one series per
| region?
| tacker2000 wrote:
| Yea, i also get the feeling that these rust evangelists get more
| annoying every day ;p
| userbinator wrote:
| _I had a 20 GiB JSON file of everything that has ever happened on
| Hacker News_
|
| I'm actually surprised at that volume, given this is a text-only
| site. Humans have managed to post _over 20 billion bytes_ of text
| to it over the 18 years that HN existed? That averages to over
| 2MB per day, or around 7.5KB/s.
| sph wrote:
| 2 MB per day doesn't sound like a lot. The amount of posts
| probably has increased exponentially over the years, especially
| after the Reddit fiasco when we had our latest, and biggest
| neverending September.
|
| Also, I bet a decent amount of that is not from humans. /newest
| is full of bot spam.
| samplatt wrote:
| Plus the JSON structure metadata, which for the average
| comment is going to add, what, 10%?
| kevincox wrote:
| I suspect it is closer to 100% increase for the average
| comment. If the average comment is a few sentences and the
| metadata has id, parent id, author, timestamp and a vote
| count that can add up pretty fast.
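The two overhead estimates above (10% versus roughly 100%) can be tested on any sample comment. Here is a minimal sketch that measures how many bytes the JSON wrapper adds around a comment's text. The field names and values are placeholders loosely modeled on a comment item, not real data.

```python
import json


def metadata_overhead(text, **meta):
    """Return (overhead_bytes, overhead_ratio) for one JSON-encoded item.

    Overhead is everything in the compact encoded object except the raw
    text itself: keys, quotes, punctuation, escapes and the other fields.
    """
    item = {"text": text, **meta}
    encoded = json.dumps(item, separators=(",", ":"))
    text_bytes = len(text.encode("utf-8"))
    total_bytes = len(encoded.encode("utf-8"))
    overhead = total_bytes - text_bytes
    return overhead, overhead / text_bytes


if __name__ == "__main__":
    # Hypothetical comment; all values are invented for illustration.
    sample = "A few sentences of commentary. " * 3
    print(metadata_overhead(
        sample, by="example_user", id=12345678, parent=12345600,
        time=1745976360, type="comment",
    ))
```

Because the metadata is roughly fixed per item, the ratio depends almost entirely on comment length: short replies pay proportionally far more than long ones.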
| FabHK wrote:
| Around one book every 12 hours.
| xnx wrote:
| 20 GB JSON is surprising to me. I have an sqlite file of all HN
| data that is 20 GB, it would be much larger as JSON.
| wolfgang42 wrote:
| 20 GB of JSON is correct; here's the entire dump straight
| from the API up to last Monday:
|
|     $ du -c ~/feepsearch-prod/datasource/hacker-news/data/dump/*.jsonl | tail -n1
|     19428360 total
|
| Not sure how your sqlite file is structured but my intuition
| is that the sizes being roughly the same sounds plausible:
| JSON has a lot of overhead from redundant structure and
| ASCII-formatted values; but sqlite has indexes, btrees,
| ptrmaps, overflow pages, freelists, and so on.
| olalonde wrote:
| 7.5KB/s (aka 7500 characters per second) didn't sound
| realistic... So I did the math[0] and it turns out it's closer
| to 34 bytes/s (0.03 KB/s). And it's really lower than that
| because of all the metadata and syntax in the JSON. You were
| right about the "over 2MB per day" though.
|
| [0] Well, ChatGPT did the math but it seems to check out:
| https://chatgpt.com/share/68124afc-c914-800b-8647-74e7dc4f21...
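The back-of-the-envelope rate is easy to reproduce. This sketch assumes 20 GB means 20e9 bytes and that the site has existed for about 18 years, as stated upthread; the function name is made up for illustration.

```python
def posting_rate(total_bytes, years):
    """Average bytes per second and bytes per day over the given span."""
    seconds = years * 365.25 * 24 * 60 * 60
    per_second = total_bytes / seconds
    per_day = per_second * 24 * 60 * 60
    return per_second, per_day


if __name__ == "__main__":
    per_second, per_day = posting_rate(20e9, 18)
    print(f"{per_second:.1f} bytes/s, {per_day / 1e6:.2f} MB/day")
```

The result lands in the tens of bytes per second, so the "over 2MB per day" figure holds while the 7.5KB/s figure is off by roughly two orders of magnitude. The exact value shifts by a byte or two depending on the assumed start date and on whether GB or GiB is meant.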
| montebicyclelo wrote:
| There are also two DBs I know of that have an updated Hacker News
| table for running analytics on without needing to download it
| first.
|
| - BigQuery, (requires Google Cloud account, querying will be free
| tier I'd guess) -- `bigquery-public-data.hacker_news.full`
|
| - ClickHouse, no signup needed, can run queries in browser
| directly, [1]
|
| [1]
| https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...
| xnx wrote:
| The ClickHouse resource is amazing. It even has history! I had
| already done my own exercise of downloading all the JSON before
| discovering the Clickhouse HN DBs.
| kordlessagain wrote:
| It even finds your comment 'clickhouse':
| https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...
| ZeWaka wrote:
| and now yours :)
| bambax wrote:
| > _Now that I have a local download of all Hacker News content, I
| can train hundreds of LLM-based bots on it and run them as
| contributors, slowly and inevitably replacing all human text with
| the output of a chinese room oscillator perpetually echoing and
| recycling the past._
|
| The author said this in jest, but I fear someone, someday, will
| try this; I hope it never happens but if it does, could we stop
| it?
| ahoka wrote:
| Probably already happening.
| nashashmi wrote:
| We LLMs only output the average response of humanity because we
| can only give results that are confirmed by multiple sources.
| On the contrary, many of HN's comments are quite unique
| insights that run contrary to the average popular thought. If
| this is ever to be emulated by an LLM, we would give only
| gibberish answers. If we had a filter to that gibberish to only
| permit answers that are reasonable and sensible, our answers
| would be boring and still be gibberish. In order for our
| answers to be precise, accurate and unique, we must use
| something other than LLMs.
| miki123211 wrote:
| How do you know it isn't already happening?
|
| With long and substantive comments, sure, you can usually tell,
| though much less so now than a year or two ago. With short, 1
| to 2 sentence comments though? I think LLMs are good enough to
| pass as humans by now.
| Joker_vD wrote:
| But what if LLMs will start leaving constructive and helpful
| comments? I personally would feel like xkcd [0], but others
| may disagree.
|
| [0] https://xkcd.com/810/
| gosub100 wrote:
| That's the moment we will realize that it's not the spam
| that bothers us, but rather that there is no human
| interaction. How vapid would it be to have a bunch of fake
| comments saying eat more vegetables, good job for not
| running over that animal in the road, call mom tonight it's
| been a while, etc. They mean nothing if they were generated
| by a piece of silicon.
| withinboredom wrote:
| I believe they mean whatever you mean it to mean.
| Humanity has existed on religion based on what some dead
| people wrote down, just fine. Er, well, maybe not "just
| fine" but hopefully you get the gist: you can attribute
| whatever meaning you want to the AI, holy text, or other
| people.
| gosub100 wrote:
| Religion is the opposite of AI text generation. It brings
| people together to be _less_ lonely.
|
| AI actively tears us apart. We no longer know if we're
| talking to a human, or if an artist's work came from their
| ability, or if we will continue to have a job to pay for
| our living necessities.
| miki123211 wrote:
| I think a much more important question is what happens
| when we have no idea who's an LLM and who's a real
| person.
|
| Do we accuse everybody of being an LLM? Will most threads
| devolve into "you're an LLM, no you're the LLM" wars?
| Will this give an edge to non-native English speakers,
| because grammatical errors are an obvious tell that
| somebody is human? Will LLM makers get over their
| squeamishness and make "write like a Mexican who barely
| speaks English" a prompt that works and produces good
| results?
|
| Maybe the whole system of anonymity on the internet gets
| dismantled (perhaps after uncovering a few successful
| llm-powered psy-ops or under the guise of child safety
| laws), and everybody just needs to verify their identity
| everywhere (or login with Google)? Maybe browser makers
| introduce an API to do this as anonymously and
| frictionlessly as possible, and it becomes the new normal
| without much fuss? Is turnstile ever going to get good
| enough to make this whole issue moot?
|
| I think we have a very interesting few years in front of
| us.
| datameta wrote:
| Also neuronormative individuals sometimes mistake
| neurodivergent usage of language for LLM-speak which
| might have similar pattern matching schema reinforced
| melagonster wrote:
| This is just another Reddit or HN.
| Pikamander2 wrote:
| I was browsing a Reddit thread recently and noticed that
| all of the human comments were off-topic one-liners and
| political quips, as is tradition.
|
| Buried at the bottom of the thread was a helpful reply by
| an obvious LLM account that answered the original question
| far better than any of the other comments.
|
| I'm still not sure if that's amazing or terrifying.
| no_time wrote:
| I can't think of a solution that preserves the open and
| anonymous nature that we enjoy now. I think most open internet
| forums will go one of the following routes:
|
| - ID/proof of human verification. Scan your ID, give me your
| phone number, rotate your head around while holding up a piece
| of paper etc. Note that some sites already do this by proxy
| when they whitelist like 5 big email providers they accept for
| a new account.
|
| - Going invite only. Self explanatory and works quite well to
| prevent spam, but limits growth. lobste.rs and private trackers
| come to mind as an example.
|
| - Playing a whack-a-mole with spammers (and losing eventually).
| 4chan does this by requiring you to solve a captcha and
| to pass the Cloudflare Turnstile, which may or may
| not do some browser fingerprinting/bot detection. CF is
| probably pretty good at deanonymizing you through this process
| too.
|
| All options sound pretty grim to me. I'm not looking forward to
| the AI spam era of the internet.
| dns_snek wrote:
| There must be a technical solution to this based on some
| cryptographic black magic that both verifies you to be a
| unique person to a given website without divulging your
| identity, and without creating a globally unique identifier
| that would make it easy to track us across the web.
|
| Of course this goes against the interests of tracking/spying
| industry and increasingly authoritarian governments, so it's
| unlikely to ever happen.
| 05 wrote:
| Oh you mean something like Apple's Private Access Tokens?
|
| https://support.apple.com/en-us/102591
|
| https://blog.cloudflare.com/eliminating-captchas-on-iphones-...
| dns_snek wrote:
| I don't think that's what I was going for? As far as I
| can see it relies on a locked down software stack to
| "prove" that the user is running blessed software on top
| of blessed hardware. That's one way of dealing with bots
| but I'm looking for a solution that doesn't lock us out
| of our own devices.
| vvillena wrote:
| These kinds of solutions are already deployed in some
| places. A trusted ID server creates a bunch of anonymous
| keys for a person, the person uses these keys to identify
| in pages that accept the ID server keys. The page has no
| way to identify a person from a key.
|
| The weak link is in the ID servers themselves. What happens
| if the servers go down, or if they refuse to issue keys?
| Think a government ID server refusing to issue keys for a
| specific person. Pages that only accept keys from these
| government ID servers, or that are forced to only accept
| those keys, would be inaccessible to these people. The
| right to ID would have to be enshrined into law.
| no_time wrote:
| As I see it, a technical solution to AI spam inherently
| must include a way to uniquely identify particular machines
| at best, and particular humans responsible for said
| machines at worst.
|
| This verification mechanism must include some sort of UUID
| to rein in a single bad actor who happens to validate
| his/her bot farm of 10000 accounts from the same
| certificate.
| icoder wrote:
| I'm sometimes thinking about account verification that
| requires work/effort over time, could be something fun even,
| so that it becomes a lot harder to verify a whole army of
| them. We don't need identification per se, just being human
| and (somewhat) unique.
|
| See also my other comment on the same parent wrt network of
| trust. That could perhaps vet out spammers and trolls. On one
| hand it seems a far-fetched and quite underdeveloped idea; on
| the other hand, social interaction (including discussions
| like these) as we know it is in serious danger.
| theasisa wrote:
| Wouldn't those only mean that the account was initially
| created by a human but afterwards there are no guarantees
| that the posts are by humans.
|
| You'd need to have a permanent captcha that tracks that the
| actions you perform are human-like, such as mouse movement or
| scrolling on phone etc. And even then it would only deter
| current AI bots but not for long as impersonating human
| behavior would be a 'fun' challenge to break.
|
| Trusted relationships are only as trustworthy as the humans
| trusting each other, eventually someone would break that
| trust and afterwards it would be bots trusting bots.
|
| Due to bots already filling up social media with their spew
| and that being used for training other bots the only way I
| see this resolving itself is by eventually everything
| becoming nonsensical and I predict we aren't that far from it
| happening. AI will eat itself.
| no_time wrote:
| >Wouldn't those only mean that the account was initially
| created by a human but afterwards there are no guarantees
| that the posts are by humans.
|
| Correct. But for curbing AI slop comments this is enough
| imo. As of writing this, you can quite easily spot LLM
| generated comments and ban them. If you have a verification
| system in place then you banned the human too, meaning you
| put a stop to their spamming.
| _Algernon_ wrote:
| This is probably already happening to some extent. I think the
| best we can hope for is xkcd 810: https://xkcd.com/810/
| holuponemoment wrote:
| Does it even matter?
|
| Perhaps I am jaded but most if not all people regurgitate about
| topics without thought or reason along very predictable paths,
| myself very much included. You can mention a single word
| covered with a muleta (Spanish bullfighting flag) and the
| average person will happily run at it and give you a
| predictable response.
| bob1029 wrote:
| It's like a Pavlovian response in me to respond to anything
| SQL or C# adjacent.
|
| I see the _exact_ same in others. There are some HN usernames
| that I have memorized because they show up deterministically
| in these threads. Some are so determined it seems like a
| dedicated PR team, but I know better...
| OccamsMirror wrote:
| I always love checking the comments on articles about Bevy
| to see how the metaverse client guy is going.
| gosub100 wrote:
| The paths are going to be predictable by necessity. It's not
| possible for everyone to have a uniquely derived
| interpretation about most common issues, whether that's
| standard lightning rod politics but also extending somewhat
| into tech socio/political issues.
| icoder wrote:
| I'm more and more convinced of an old idea that seems to become
| more relevant over time: to somehow form a network of trust
| between humans so that I know that your account is trusted by a
| person (you) that is trusted by a person (I don't know) [...]
| that is trusted by a person (that I do know) that is trusted by
| me.
|
| Lots of issues there to solve, privacy being one (the links
| don't have to be known to the users, but in a naive approach
| they _are_ there on the server).
|
| Paths of distrust could be added as negative weight, so I can
| distrust people directly or indirectly (based on the accounts
| that they trust) and that lowers the trust value of the
| chain(s) that link me to them.
|
| Because it's a network, it can adjust itself to people trying
| to game the system, but it remains a question to how robust it
| will be.
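One possible reading of the network-of-trust idea above, sketched in Python. The scoring rule (multiply edge weights along each chain, let distrust count only on the final hop, sum the chains and clamp) is an assumption chosen for illustration; the comment does not specify a rule, and all names below are hypothetical.

```python
def trust_score(graph, source, target, max_hops=4):
    """Sum the weight products of all simple chains from source to target.

    `graph[a][b]` is how much a trusts b, in [-1, 1]. Chains are only
    extended through positively trusted accounts, so distrust applies on
    the last hop and never flips sign further down the chain.
    """
    total = 0.0

    def walk(node, weight, visited, hops):
        nonlocal total
        if hops == max_hops:
            return
        for neighbour, w in graph.get(node, {}).items():
            if neighbour in visited:
                continue
            if neighbour == target:
                total += weight * w
            elif w > 0:
                walk(neighbour, weight * w, visited | {neighbour}, hops + 1)

    walk(source, 1.0, {source}, 0)
    return max(-1.0, min(1.0, total))


# Hypothetical network: one chain vouches for bob, another distrusts him.
DEMO = {
    "me": {"alice": 0.9, "carol": 0.5},
    "alice": {"bob": 0.8},
    "carol": {"bob": -1.0},
}

if __name__ == "__main__":
    print(trust_score(DEMO, "me", "bob"))
```

Here the trusting chain through alice is partly cancelled by carol's distrust. A real system would also have to handle the privacy and gaming concerns raised in the comment, which this sketch ignores.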
| XorNot wrote:
| I think technically this is the idea that GPG's web of trust
| was circling without quite staring at, which is the oddest
| thing about the protocol: it's used mostly today for machine
| authentication, which it's quite good at (i.e. deb
| repos)...but the tooling actually generally is oriented
| around verifying and trusting _people_.
| wobfan wrote:
| Yeah exactly, this was exactly the idea behind that.
| Unfortunately, while on paper it sounds like a sound idea
| (at least IMO), it has proven ineffective: time and time
| again, the WOT idea in PGP has stood no chance against the
| laziness of humans.
| littlestymaar wrote:
| Ultimately, guaranteeing common trust between citizens is a
| fundamental role of the State.
|
| For a mix of ideological reasons and lack of genuine interest
| for the internet from the legislators, mainly due to the
| generational factor I'd guess, it hasn't happened yet, but I
| expect government issued equivalent of IDs and passports for
| the internet to become mainstream sooner than later.
| eadmund wrote:
| > Ultimately, guaranteeing common trust between citizens is
| a fundamental role of the State.
|
| I don't think that really follows. Businesses credit
| bureaus and Dun & Bradstreet have been privately enabling
| trust between non-familiar parties for quite a long time.
| Various networks of merchants did the same in the Middle
| Ages.
| littlestymaar wrote:
| > Businesses credit bureaus and Dun & Bradstreet have
| been privately enabling trust between non-familiar
| parties for quite a long time.
|
| Under the supervision of the State (they are regulated
| and rely on the justice and police system to make things
| work).
|
| > Various networks of merchants did the same in the
| Middle Ages.
|
| They did, and because there was no State the amount of
| trust they could build was fairly limited compared to what
| has later been made possible by the development of modern
| states (the industrial revolution appearing in the UK has
| partly been attributed to the institutional framework
| that existed there early).
|
| Private actors can, and do, and have always done, build
| their own makeshift trust network, but building a
| society-wide trust network is a key pillar of what makes
| modern states "States" (and it directly derives from the
| "monopoly of violence").
| lormayna wrote:
| Hawala (https://it.m.wikipedia.org/wiki/Hawala) and other
| similar ways to transfer money abroad work over a
| net of trust, but without any state trust system.
| littlestymaar wrote:
| Compare its use to SWIFT and you'll see the difference.
| icoder wrote:
| Interestingly, as I've begun to realise the ease by which a
| State's trust can sway has actually increased my belief
| that this should come from 'below'. I think a trust network
| between people (of different countries) can be much more
| resilient.
| nostrademons wrote:
| That's not really what research on state formation has
| found. The basic definition of a state is "a centralized
| government with a monopoly on the legitimate use of force",
| and as you might expect from the definition, groups
| generally attain statehood by monopolizing the use of
| force. In other words, they are the bandits that become big
| enough that nobody dares oppose them. They attain statehood
| through what's effectively a peace treaty, when all
| possible opposition basically says "okay, we'll submit to
| your jurisdiction, please stop killing us". Very often, it
| actually is a literal peace treaty.
|
| States will often co-opt _existing_ trust networks as a way
| to enhance and maintain their legitimacy, as with
| Constantine's adoption of Christianity to preserve social
| cohesion in the Roman Empire, or all the compromises that
| led the 13 original colonies to ratify the U.S.
| constitution in the wake of the American Revolution. But
| violence comes first, _then_ statehood, then trust.
|
| Attempts to legislate trust don't really work. Trust is an
| emotion, it operates person-to-person, and saying "oh, you
| need to trust such-and-such" doesn't really work unless you
| are trusted yourself.
| littlestymaar wrote:
| > The basic definition of a state is "a centralized
| government with a monopoly on the legitimate use of force
|
| I'm not saying otherwise (I've even referred to this in a
| later comment).
|
| > But violence comes first, then statehood, then trust.
|
| Nobody said anything about the historical process so
| you're not contradicting anyone.
|
| > Attempts to legislate trust don't really work
|
| Quite the opposite, it works very, very well. Civil laws
| and jurisdiction on _contracts_ have existed since the
| Roman Republic, and every society has some equivalent
| (you should read about how the Taliban could get back to
| power so quickly in big part because they kept doing
| civil justice in the rural afghan society even while the
| country was occupied by the US coalition).
|
| You must have institutions to be sure that the other
| party is going to respect the contract, so that you don't
| have to trust them, you just need to trust that the state
| is going to _enforce_ that contract (what they can do
| because they have the monopoly of violence and can just
| force the party violating the contract into submission).
|
| With the monopoly of violence comes the responsibility to
| use your violence to enforce contracts, otherwise social
| structures are going to collapse (and someone else is
| going to take that job from you, and gone is your
| monopoly of violence)
| drcongo wrote:
| I actually built this once, a long time ago for a very
| bizarre social network project. I visualised it as a mesh
| where individuals were the points where the threads met, and
| as someone's trust level rose, it would pull up the trust
| levels of those directly connected, and to a lesser degree
| those connected to them - picture a trawler fishing net and
| lifting one of the points where the threads meet. Similarly,
| a user whose trust lowered over time would pull their
| connections down with them. Sadly I never got to see it at
| the scale it needed to become useful as the project's funding
| went sideways.
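The mesh described above can be sketched roughly as follows. This is only an illustration of the idea, not the original project's code: the function name `propagate_trust`, the `decay` factor of 0.5, and the two-hop limit are all assumptions chosen for the example.

```python
from collections import deque


def propagate_trust(graph, trust, source, delta, decay=0.5, max_hops=2):
    """Return a new trust dict after changing `source` by `delta`.

    Hypothetical model: a node `h` hops away from `source` receives
    delta * decay**h, so direct connections move more than
    second-degree ones, and nodes beyond `max_hops` are unaffected.
    """
    updated = dict(trust)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        updated[node] = updated.get(node, 0.0) + delta * decay ** hops
        if hops == max_hops:
            continue  # do not spread the change any further
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return updated


# Toy chain of connections: alice - bob - carol - dave
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
}
trust = {name: 1.0 for name in graph}
raised = propagate_trust(graph, trust, "alice", delta=1.0)
lowered = propagate_trust(graph, trust, "alice", delta=-1.0)
```

A negative `delta` pulls connections down in the same way, which matches the "trawler net" picture: lifting or lowering one knot moves its neighbours by a smaller amount.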
| icoder wrote:
| Yeah building something like this is not a weekend project,
| getting enough traction for it to make sense is another
| order of magnitude beyond that.
|
| I like the idea of one's trust to leverage that of those
| around them. This may make it more feasible to ask some
| 'effort' for the trust gain (as a means to discourage
| duplicate 'personas' for a single human), as that can
| ripple outward.
| all2 wrote:
| How would 'trust' manifest? A karma system?
|
| How are individuals in the network linked? Just comments on
| comments? Or something different?
| Philpax wrote:
| https://en.wikipedia.org/wiki/Key_signing_party
| genewitch wrote:
| Matrix protocol or at least the clients agree that several
| emoji is a key - which is fine - and you verify by looking
| at the keys (on each client) at the same time in person,
| ideally. I've only ever signed for people in person, and
| one remote attestation; but we had a separate _verified_
| private channel and attested the emoji that way.
| nickdothutton wrote:
| Do these still happen? They were common (-ish, at least in
| my circles) in the 90s during the crypto wars, often at the
| end of conferences and events, but I haven't come across
| them in recent years.
| im3w1l wrote:
| GPG lost, TLS won. Both are actually webs of trust with the
| same underlying technology. But they have different cultures
| and so different shapes. GPG culture is to trust your friends
| and have them trust their friends. With TLS culture you trust
| one entity (e.g. a browser) that trusts a couple dozen entities
| (root certificate authorities) that either sign keys
| directly or can fan out to intermediate authorities that then
| sign keys. The hierarchical structure has proven much more
| successful than the decentralized one.
|
| Frankly I don't trust my friends of friends of friends not to
| add thirst trap bots.
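The difference in shape between the two models can be shown with a toy sketch. It involves no real cryptography or certificate parsing; the function names and the sample data are hypothetical, and the depth limit of 2 is just an assumed policy.

```python
def trusted_via_hierarchy(issuer_of, roots, cert):
    """TLS-style check: follow cert -> issuer -> ... until a trusted root."""
    seen = set()
    current = cert
    while current is not None and current not in seen:
        if current in roots:
            return True
        seen.add(current)
        current = issuer_of.get(current)  # None when there is no issuer
    return False


def trusted_via_web(signatures, start, target, max_depth=2):
    """Web-of-trust-style check: is `target` reachable from `start`
    through at most `max_depth` signatures?"""
    visited = {start}
    frontier = {start}
    for _ in range(max_depth):
        frontier = {k for s in frontier for k in signatures.get(s, ())} - visited
        if target in frontier:
            return True
        visited |= frontier
    return False


# Hierarchical: every chain must end at one of a small set of roots.
issuer_of = {"example.org": "Intermediate CA", "Intermediate CA": "Root CA"}
roots = {"Root CA"}
site_ok = trusted_via_hierarchy(issuer_of, roots, "example.org")
unknown_ok = trusted_via_hierarchy(issuer_of, roots, "self-signed.example")

# Decentralized: trust depends on who your friends have signed.
signatures = {
    "me": ["friend"],
    "friend": ["friend_of_friend"],
    "friend_of_friend": ["stranger"],
}
fof_ok = trusted_via_web(signatures, "me", "friend_of_friend")
stranger_ok = trusted_via_web(signatures, "me", "stranger")
```

In the hierarchical case the answer depends only on the root set; in the web case it depends on each user's own graph and depth policy, which is one way to see why the outcome is less predictable.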
| lxgr wrote:
| The difference is in both culture and topology.
|
| TLS (or more accurately, the set of browser-trusted X.509
| root CAs) is extremely hierarchical and all-or-nothing.
|
| The PGP web of trust is non-hierarchical and decentralized
| (from an organizational point of view). That unfortunately
| makes it both more complex and less predictable, which I
| suppose is why it "lost" (not that it's actually gone, but
| I personally have about one or maybe two trusted, non-
| expired keys left in my keyring).
| kevin_thibedeau wrote:
| The issue is key management. TLS doesn't usually require
| client keys. GPG requires all receivers to have a key.
| amenghra wrote:
| Couple dozen => it's actually 50-ish, with a mix of private
| and government entities located all over the world.
|
| The fact that the Spanish mint can mint (pun!) certificates
| for any domain is unfortunate.
|
| Hopefully, any abuse would be noticed quickly and rights
| revoked.
|
| It would maybe have made more sense for each country's TLD
| to have one or more associated CA (with the ability to
| delegate trust among friendly countries if desired).
|
| https://wiki.mozilla.org/CA/Included_Certificates
| SuperShibe wrote:
| I think this idea's problem might be the people part,
| specifically the majority type of people that will click
| absolutely anything for a free iPad
| icoder wrote:
| Theoretically that should swiftly be reflected in their
| trust level. But maybe I'm too optimistic.
|
| I have nothing intrinsically against people that 'will
| click absolutely anything for a free iPad' but I wouldn't
| mind removing them from my online interactions if that also
| removes bots, trolls, spammers and propaganda.
| haswell wrote:
| I've also been thinking about this quite a bit lately.
|
| I also want something like this for a lightweight social
| media experience. I've been off of the big platforms for
| years now, but really want a way to share life updates and
| photos with a group of trusted friends and family.
|
| The more hostile the platforms become, the more viable I
| think something like this will become, because more and more
| people are frustrated and willing to put in some work to
| regain some control of their online experience.
| jeremyjh wrote:
| The key is to completely disconnect all ad revenue. I'm
| skeptical people are willing to put in some money to regain
| control; not in the kind of percentages that means I can
| move most of my social graph. Network effects are a real
| issue.
| marcusb wrote:
| Isn't this vaguely how the invite system at Lobsters
| functions? There's a public invite tree, and users risk their
| reputation (and posting access) when they invite new users.
| withinboredom wrote:
| I know exactly zero people over there. I am also not about
| to go brown nose my way into it via IRC (or whatever chat
| they are using these days). I'd love to join, someday.
| somethingsome wrote:
| Hey I never actually tried lobsters, do you mind if I ask
| an invite?
| brongondwana wrote:
| Also there's the problem that every human has to have perfect
| opsec or you get the problem we have now, where there are
| massive botnets out there of compromised home computers.
| drcongo wrote:
| The internet is going to become like William Basinski's
| Disintegration Loops, regurgitating itself with worse fidelity
| until it's all just unintelligible noise.
| genewitch wrote:
| I have all of n-gate as json with the cross references cross
| referenced.
|
| Just in case I need to check for plagiarism.
|
| I don't have enough Vram nor enough time to do anything useful
| on my personal computer. And yes I wrote vram like that to
| pothole any EE.
| Etheryte wrote:
| See the Metal Gear franchise [0], the Dead Internet Theory [1],
| and many others who have predicted this.
|
| > Hideo Kojima's ambitious script in Metal Gear Solid 2 has
| been praised, some calling it the first example of a postmodern
| video game, while others have argued that it anticipated
| concepts such as post-truth politics, fake news, echo chambers
| and alternative facts.
|
| [0] https://en.wikipedia.org/wiki/Metal_Gear
|
| [1] https://en.wikipedia.org/wiki/Dead_Internet_theory
| djoldman wrote:
| A variant of this was done for 4chan by the fantastic Yannic
| Kilcher:
|
| https://en.wikipedia.org/wiki/GPT4-Chan
| r3trohack3r wrote:
| HN already has a pretty good immune system for this sort of
| thing. Low-effort or repetitive comments get down-voted,
| flagged, and rate-limited fast. The site's karma and velocity
| heuristics are crude compared with fancy ML, but they work
| because the community is tiny relative to Reddit or Twitter and
| the mods are hands-on. A fleet of sock-puppet LLM accounts
| would need to consistently clear that bar--i.e. post things
| people actually find interesting--otherwise they'd be throttled
| or shadow-killed long before they "replace all human text."
|
| Even if someone managed to keep a few AI-driven accounts alive,
| the marginal cost is high. Running inference on dozens of fresh
| threads 24/7 isn't free, and keeping the output from slipping
| into generic SEO sludge is surprisingly hard. (Ask anyone who's
| tried to use ChatGPT to farm karma--it reeks after a couple of
| posts.) Meanwhile the payoff is basically zero: you can't
| monetize HN traffic, and karma is a lousy currency for bot-
| herders.
|
| Could we stop a determined bad actor with resources? Probably,
| but the countermeasures would look the same as they do now:
| aggressive rate-limits, harsher newbie caps, human mod review,
| maybe some stylometry. That's annoying for legit newcomers but
| not fatal. At the end of the day HN survives because humans
| here actually want to read other humans. As soon as commenters
| start sounding like a stochastic parrot, readers will tune out
| or flag, and the bots will be talking to themselves.
|
| _Written by GPT-3o_
| stephenhumphrey wrote:
| Regardless of whether that final line reflects reality or is
| merely tongue-in-cheek snark, it elevates the whole post into
| the sublime.
| dangoodmanUT wrote:
| I imagine LLMs already have this too
| kriro wrote:
| I think LLMs could be a great driver of private-public key
| encryption. I could see a future where everyone finally wants
| to sign their content. Then at least we know it's from that
| person or an LLM-agent by that person.
|
| Maybe that'll be a use case for blockchain tech. See the whole
| posting history of the account on-chain.
| photochemsyn wrote:
| It's hopeless.
|
| We can still take the mathematical approach: any argument can
| be analyzed for logical self-consistency, and if it fails this
| basic test, reject it.
|
| Then we can take the evidentiary approach: if any argument that
| relies on physical real-world evidence is not supported by
| well-curated, transparent, verifiable data then it should also be
| rejected.
|
| Conclusion: finding reliable information online is a needle-in-
| a-haystack problem. This puts a premium on devising ways (eg a
| magnet for the needle) to filter the sewer for nuggets of gold.
| SilverBirch wrote:
| What is the netiquette of downloading HN? Do you ping Dang and
| ask him before you blow up his servers? Or do you just assume at
| this point that every billion dollar tech company is doing this
| many times over so you probably won't even be noticed?
| euroderf wrote:
| Not to mention three-letter agencies, incidentally attaching
| real names to HN monikers?
| krapp wrote:
| HN has an API, as mentioned in the article, which isn't even
| rate limited. And all of the data is hosted on Firebase, which
| is a YC company. It's fine.
| mikeevans wrote:
| Firebase is owned and operated by Google (has been for a
| while).
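For reference, a minimal sketch of reading from the public API mentioned above. The base URL and the `item/<id>.json` and `maxitem.json` paths are the documented Firebase endpoints; the helper names here are made up for the example, and the fetch itself is left commented out so nothing hits the network unless you choose to.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id):
    """URL of a single item (story, comment, poll, ...) by numeric id."""
    return f"{BASE_URL}/item/{item_id}.json"


def max_item_url():
    """URL that returns the largest item id currently assigned."""
    return f"{BASE_URL}/maxitem.json"


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the JSON body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs real HTTP requests, so it is not run here):
# newest_id = fetch_json(max_item_url())
# item = fetch_json(item_url(newest_id))
```

Even without a documented rate limit, it seems polite to keep concurrency modest and cache what you have already downloaded.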
| alt227 wrote:
| If something is on the public web, it is already being scraped
| by thousands of bots.
| dangoodmanUT wrote:
| there's literally an API they promote. Did you read that part
| before trying to cancel them?
| TZubiri wrote:
| Well, it's called Hacker News, so hacking is fair game, at
| least in the good sense of the word.
| internetter wrote:
| There's literally a public database
|
| https://console.cloud.google.com/marketplace/product/y-combi...
| umvi wrote:
| What if someone from EU invokes "right to be forgotten" and
| demands HN delete past comments from years ago. Will those
| deletions be reflected in the public database? Or could you
| mine the db to discover deleted data?
| jeremyjh wrote:
| They need to issue their demand to whoever is hosting their
| data. If HN has deleted it, they are not hosting it.
| dang wrote:
| That's an entirely third party project so I doubt they should
| be listing YC as a partner there.
| internetter wrote:
| Huh, yeah that is really misleading. Makes it look like it
| is by YC.
| mattkevan wrote:
| I did something similar a while back to the @fesshole
| Twitter/Bluesky account. Downloaded the entire archive and fine-
| tuned a model on it to create more unhinged confessions.
|
| Was feeling pretty pleased with myself until I realised that all
| I'd done was teach an innocent machine about wanking and divorce.
| Felt like that bit in a sci-fi movie where the alien/super-
| intelligent AI speed-watches humanity's history and decides we're
| not worth saving after all.
| falcor84 wrote:
| What's wrong with wanking and divorce? These are respectively a
| way for people to be happier and more self-reliant, and a way
| for people to get out of a situation that isn't working out for
| them. I think both are net positives, and I'm very grateful to
| live in a society that normalizes them.
| dcuthbertson wrote:
| The innocent machine can't do either. It's akin to having no
| mouth, but it must scream (apologies to Harlan Ellison)
| falcor84 wrote:
| That is a fair point, but it would then apply to everything
| else we teach it about, like how we perceive the color of
| the sky or the taste of champagne. Should we remove these
| from the training set too?
|
| Is it not still good to be exposed to the experiences of
| others, even if one cannot experience these things
| themself?
| dcuthbertson wrote:
| Thanks for saying it's a fair point, but it's more of an
| offhand joke about "an innocent machine". In reality, a
| machine, even an LLM, has no innocence. It's just a
| machine.
| pixl97 wrote:
| Gets a bit more complicated when we start giving these
| machines agency.
| falcor84 wrote:
| Having studied biology, I never accepted the "just a
| machine" argument. Everything is essentially a machine,
| but when a machine is sufficiently complex, it is
| rational to apply the Intentional Stance to it.
| pc86 wrote:
| I'm not implying that divorce should be stigmatized or
| prohibited or anything, but it is bad (necessary evil?) and
| most people would be much happier if they had never married
| that person in the first place rather than married them then
| gotten divorced.
|
| So "normalize divorce" is pretty backward when what we should
| be doing is normalizing making sure you're marrying the right
| person.
| cgriswald wrote:
| Making sure you are marrying the right person _is_
| normalized. I'd have never even known my ex wasn't the
| right person if I hadn't married her. I didn't come out of
| my marriage worse off.
|
| Normalize divorce and stop stigmatizing it by calling it
| bad or evil.
| pixl97 wrote:
| Eh, I would say it's quite a bit more complicated than
| you're giving it credit for.
|
| >Making sure you are marrying the right person is
| normalized.
|
| Absolutely not.
|
| I live in the southern US and we have the culmination of
| "Young people should get married" coupled with "divorce
| is bad/evil" and the disincentivization of actually
| learning about human behaviors/complications before going
| through something that could be traumatic.
|
| There are a lot of relationships that from an outside and
| balanced perspective give all the signs they will not
| work out and will be potentially dangerous for one or
| both partners in the relationship.
| bluefirebrand wrote:
| > I didn't come out of my marriage worse off
|
| This is good for you, but many people do come out of
| their marriages much worse off in various ways
|
| > Normalize divorce and stop stigmatizing it by calling
| it bad or evil
|
| It's not bad or evil, but let's also not pretend that it
| isn't damaging
| cgriswald wrote:
| We don't have to pretend. The original poster thinks he
| knows what the world looks like if every marriage that
| ends in divorce just never happened. Those marriages _do_
| happen, though, and to place all the damage generated by
| that marriage strictly on the divorce is incorrect.
| Usually one or both parties know the consequences of the
| divorce and prefer them to the state of the marriage,
| because the damages are _less_ than if divorce wasn't an
| option. Claiming divorce is some kind of undesirable
| 'damaged' state is just as stigmatizing as claiming it is
| 'bad' or 'evil'.
|
| The alternative to divorce isn't perfect marriages, it is
| failed marriages that are inescapable.
| gwerbret wrote:
| > The alternative to divorce isn't perfect marriages, it
| is failed marriages that are inescapable.
|
| I'm sure this has nothing to do with you, but by your
| comments in this thread, I'm reminded of a conversation I
| had with a friend on a bus one day. We were talking about
| the unfortunate tendency, in daytoday, of people to
| shuffle their elderly parents off to nursing homes,
| rather than to support said parents in some sort of
| independent living. A nearby passenger jumped into our
| conversation to argue that there are situations in which
| the nursing home situation is for the best. Although we
| agreed with him, he seemed to dislike the fundamental
| idea of caring for one's elderly parents _at all_, and
| subsequently became quite heated.
| smcin wrote:
| There are lots of proven viable alternatives to quick no-
| fault divorce, the most obvious being waiting periods or
| separation periods ranging from months to years. [0].
| Parental alienation can be gamed, and frequently is.
| Psychologist evals can be gamed or biased. Expert witness
| reports can be gamed. Move-away scenarios (by the
| custodial parent) can be gamed. Making false or perjurious
| allegations can be gamed, sometimes without consequence.
| Jurisdiction-shopping can be gamed. It seems pretty
| obvious that if there are huge incentives (or penalties)
| for certain modes of behavior, some types of people will
| exploit those. Community property/separate property can
| be gamed. The timing of all these things can be gamed wrt
| disclosures, health events, insurance
| coverage/eligibility, job change/start/end, stock
| vesting, SS eligibility, tax filings etc. Divorce
| settlements can be gamed too by one party BK'ing out of a
| settlement/division of debts. At-fault divorce also
| exists (in many US states), and obviously can be gamed.
|
| It's not a false dichotomy between either a jurisdiction
| must allow instant no-fault divorce for everyone who
| petitions for it, or none at all.
|
| > _Usually one or both parties know the consequences of
| the divorce and prefer them to the state of the marriage,
| because the damages are less than if divorce wasn't an
| option._
|
| Sometimes both parties are reasonably rational and honest
| and non-adversarial, then again sometimes one or both
| aren't, and it only takes one party (or their relatives)
| to make things adversarial. If you as a member of the
| public want to see it in action, in general you can sit
| in and observe proceedings in your local courthouse in
| person, or view the docket of that day's cases, or view
| the local court calendar online. Often the judge and
| counsel strongly affect the outcome too, much more than
| the facts at issue.
|
| > _Claiming divorce is some kind of undesirable 'damaged'
| state is just as stigmatizing as claiming it is 'bad' or
| 'evil'._
|
| It is not necessarily the end-state of being divorced
| that is objectively quantifiably the most damaging to
| both parties' finances, wellness, children, and society
| at large, it's the expensive non-transparent ordeal of
| family court itself that can cause damage, as much as (or
| sometimes more than) the end-state of ending up divorced.
| Or both. Or neither.
|
| > _The alternative to divorce is..._
|
| ...a less broken set of divorce laws, for which there are
| multiple viable candidates. Or indeed, marriage(
| /cohabitation/relationships) continuing to fall out of
| favor. Other than measuring crude divorce rates and
| comparing their ratio to crude marriage rates (assuming
| same jurisdiction, correcting for offset by the
| (estimated) average length of marriage, and assuming zero
| internal migration), as marriage becomes less and less
| common, we're losing the ability to form a quantified
| picture of human behavior viz. when
| partnerships/relationships start or end; many countries'
| censuses no longer track this or are being pressured to stop
| tracking it [1]; it could be inferred from e.g. bank,
| insurance, household bill arrangements, credit
| information, public records, but obviously privacy needs
| to be respected.
|
| [0] https://en.wikipedia.org/wiki/Divorce_law_by_country
|
| [1]: https://www.pewresearch.org/short-
| reads/2015/05/11/census-bu...
| pc86 wrote:
| Something can be both bad and not stigmatized. Divorce is
| a pretty good example here. It's not stigmatized, and to
| prove it's not say with a straight face it should be
| illegal and you won't be able to blink before the
| backlash hits you. It's not stigmatized at all. _Most_
| individuals who get married will get divorced. The way
| the numbers work out something like 60-70% of all
| marriages contain at least one divorced partner. Saying
| it's stigmatized is silly and doesn't line up with
| reality. But of course it's an objectively bad thing.
| It's messy, it's expensive, feelings get hurt, often
| times years or decades of peoples' lives are wasted.
| cgriswald wrote:
| I don't have to say it with a straight face because your
| sibling poster did it for me. Something can be both
| common and stigmatized. Yes, divorce _can be_ messy,
| expensive, emotionally fraught, and take time. Mine was,
| and it still wasn't 'bad' or even undesirable. Starting
| a business, learning an instrument, training for a sport
| can _also_ be all those things. We don't call them
| 'bad', or 'evil', because we don't assume the end result
| is undesirable.
|
| The comparison can't be to an imaginary world where
| everyone always picks the best partner. It has to be to
| the real world where people don't always pick the best
| partner and the absence of divorce means they're stuck
| with them.
| nhod wrote:
| This reminds me of one of my very favorite essays of all
| time, "Why You Will Marry the Wrong Person" by Alain de
| Botton from the School of Life. The title is somewhat
| misleading, and I resisted reading it for a couple years as
| a result. It is exquisite writing -- it couldn't be said
| with fewer words, and adding more wouldn't help either --
| and an extraordinary and ultimately hopeful meditation on
| love and marriage.
|
| NYT Gift Article:
| https://www.nytimes.com/2016/05/29/opinion/sunday/why-you-
| wi...
| tailspin2019 wrote:
| You're 100% right. That essay is superb and I'm glad I
| read it!
|
| Thanks for sharing the link.
| Nzen wrote:
| Alain de Botton also published this in video form, seven
| years ago [0]. If you want the cliff's notes, his School
| of Life channel has a shorter version [1].
|
| [0] https://www.youtube.com/watch?v=-EvvPZFdjyk 22
| minutes
|
| [1] https://www.youtube.com/watch?v=zuKV2DI9-Jg 4
| minutes
| didgetmaster wrote:
| I agree. The title is wrong. It should be 'Why you are
| sure to think, whomever you marry, that they are the
| wrong person'.
| adamc wrote:
| Having gone through a divorce... no. It would be better if
| people tried harder to make relationships work. Failing that,
| it would be better to not marry such a person.
| falcor84 wrote:
| People sometimes grow in different directions. Sometimes
| the person who was perfect for you at 25 just isn't a good
| fit for you at age 40, regardless of how hard you try to
| make it work.
| nthingtohide wrote:
| > an innocent machine about wanking and divorce
|
| Let's say you discovered a pendrive of a long lost civilization
| and train a model on that text data. How would you or the model
| know that the pendrive contained data on wanking and divorce
| without any kind of external grounding to that data?
| deadbabe wrote:
| Is the 20GB JSON file available?
| a3w wrote:
| Cool project. Cool graphs.
|
| But any GDPR requests for info and deletion in your inbox, yet?
| arduanika wrote:
| Come on, you wouldn't GDPR a whimsical toy project!
| shayway wrote:
| Hah, I've been scraping HN over the past couple weeks to do
| something similar! Only submissions though, not comments. It was
| after I went to /newest and was faced with roughly 9/10 posts
| being AI-related. I was curious what the actual percentage of
| posts on HN were about AI, and also how it compared to other
| things heavily hyped in the past like Web3 and crypto.
| alt227 wrote:
| Here, the entire history of HN with the ability to run queries
| on it directly in the browser :)
|
| https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...
| sebastianmestre wrote:
| Can you remake the stacked graphs with the variable of interest
| at the bottom? It's hard to see the percentage of Rust when it's
| all the way at the top with a lot of noise on the lower layers
|
| Edit: or make a non-stacked version?
| jasonthorsness wrote:
| Lots of valid criticism here of these graphs and the queries;
| I'll write a follow-up article.
| xnx wrote:
| I have this data and a bunch of interesting analysis to share.
| Any suggestions on the best method to share results?
|
| I like Tableau Public, because it allows for interactivity and
| exploration, but it can't handle this many rows of data.
|
| Is there a good tool for making charts directly from Clickhouse
| data?
| texodus wrote:
| No Clickhouse connector for free accounts yet, but if you can
| drop a Parquet file on S3 you can try https://prospective.co
| xnx wrote:
| Thanks! I'll check that out. Thought it was a typo of
| "Perspective" for a moment: https://perspective.finos.org/
| texodus wrote:
| Yes! This is the _pro_ version, we also develop open source
| https://github.com/finos/perspective (which Prospective is
| substantially built on, with some customizations such as a
| wasm64 runtime).
| Am4TIfIsER0ppos wrote:
| I hope they snatched my flagged comments. I would be pleased to
| have helped make the AI into an asshole. Here's hoping for
| another Tay AI.
| wslh wrote:
| It would be great if it is available as a torrent. There are also
| mutable torrents [1]. Not implemented everywhere but there are
| available ones [2].
|
| [1] https://www.bittorrent.org/beps/bep_0046.html
|
| [2] https://www.npmjs.com/package/bittorrent-dht
| th1nhng0 wrote:
| Can I ask how you draw the chart in the post?
| jasonthorsness wrote:
| lol it was Excel (save as picture / SVG format / edit colors to
| support dark/light mode)
| th1nhng0 wrote:
| wow, I never expected that xD thanks for letting me know
| byearthithatius wrote:
| Can you scrape all of HN by just incrementing item?id (since it's
| sequential) and using Python web requests with IP rotation (in
| case there is rate limiting)?
|
| NVM this approach of going item by item would take 460 days if
| the average request response time is 1 second (unless heavily
| parallelized, for instance 500 instances _could_ do it in a day
| but that's 40 million requests either way so would raise alarms).
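The back-of-the-envelope numbers above can be checked with a few lines. The 40 million item count and the 1 second per request come from the comment; the function name `crawl_days` and the assumption that workers scale perfectly are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_days(total_items, seconds_per_request=1.0, workers=1):
    """Days needed to fetch every item, assuming perfectly parallel workers."""
    return total_items * seconds_per_request / workers / SECONDS_PER_DAY


serial = crawl_days(40_000_000)                 # one request at a time
parallel = crawl_days(40_000_000, workers=500)  # 500 concurrent workers

print(f"serial:   {serial:.0f} days")
print(f"parallel: {parallel:.2f} days")
```

The serial figure comes out a little above 460 days and the 500-worker figure a little under one day, consistent with the estimate in the comment. Real throughput would depend on latency and any throttling, which this sketch ignores.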
| g8oz wrote:
| I predict that in the coming years a lot of APIs will begin to offer
| the option of just returning a duckdb file. If you're just going
| to load the json into a database anyway, why not just get a
| database in the response.
___________________________________________________________________
(page generated 2025-04-30 23:00 UTC)