[HN Gopher] Track HN: Survival Rate of Show HN Stories
___________________________________________________________________
Track HN: Survival Rate of Show HN Stories
Author : namiwang
Score : 126 points
Date : 2023-06-13 17:25 UTC (5 hours ago)
(HTM) web link (nami.land)
(TXT) w3m dump (nami.land)
| gkoberger wrote:
| I'm happy to say that the reports of my death here are greatly
| exaggerated :)
|
| I'm the owner of both #4 and #140 on the Top-scoring Show HN
| Stories that Didn't Survive... but both are very much alive!
|
| #4 StackSort was a Github.com page, but in 2021 they made it so
| only Github.io works. If dang sees this, I'd really appreciate if
| you could change the URL for
| https://news.ycombinator.com/item?id=5395463 to use github.io!
|
| #140 ReadMe has the same io/com issue, in the opposite direction!
| We redirect readme.io to readme.com now, which seems to be why
| it's flagged.
| codetrotter wrote:
| The main page https://gkoberger.github.io/ that the
| https://gkoberger.github.com/ link suggests going to gives a
| 404 as well. Could be a good idea to add a main page for
| https://gkoberger.github.io/ that links the StackSort page and
| anything else
| 12907835202 wrote:
| How on earth did you get readme.com?
|
| I'm assuming someone else owned it, whenever I see that and all
| the "make an offer" links I move on and ignore it. Was the
| process easy?
| AndrewKemendo wrote:
| I am also, along with gkoberger, happy to say that we didn't die
| after our Show HN (Show HN: A Covid-19 testing location site that
| a group of us are building)
|
| https://news.ycombinator.com/item?id=22650725
|
| In fact we were so successful that we were able to shut it down
| less than a year after we started (It's on the list as a very
| reasonable Type II error ;))
|
| Thanks to the HN community for helping us get an amazing
| Temporary product out and shut down successfully
| karaterobot wrote:
| I know you mention there are lots of reasons for false positives
| and negatives, but does your methodology account for length of
| time at all? Meaning, if a project was posted to HN in 2009, it
| could have been successful for 14 years and then closed down, or
| just changed URLs somewhere along the way, and in that case it
| would be counted as a failure even though it wasn't. Likewise, if
| it was posted in May, 2023 and is still around, that doesn't mean
| much because it's still flying the Grand Opening banner,
| practically.
| h0l0cube wrote:
| Exactly. Some of these graphs are really flawed. Like the
| heatmap for the top 1% which pretty much mirrors the submission
| heatmap. I want to see what _portion_ of submissions _for that
| time slot_ reached 1%, not of all submissions. There could be
| time slots that perform exceedingly well outside of popular
| times.
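|
| A minimal sketch of the per-slot normalization asked for here, on
| made-up data. The (weekday, hour) slot key, the function name
| top_share_by_slot, and the sample rows are assumptions for
| illustration, not the article's schema or method.

```python
from collections import Counter


def top_share_by_slot(stories):
    """Map each (weekday, hour) slot to the fraction of its stories flagged as top."""
    total = Counter()
    top = Counter()
    for weekday, hour, is_top in stories:
        slot = (weekday, hour)
        total[slot] += 1
        if is_top:
            top[slot] += 1
    # Divide by the slot's own submission count, not by all submissions.
    return {slot: top[slot] / total[slot] for slot in total}


# Hypothetical rows: (weekday, hour in UTC, reached the top 1%).
stories = (
    [("Tue", 16, True)] * 2 + [("Tue", 16, False)] * 6
    + [("Sun", 3, True)] * 1 + [("Sun", 3, False)] * 1
)
shares = top_share_by_slot(stories)
for slot, share in sorted(shares.items()):
    print(slot, f"{share:.0%}")
```

| In this toy data the busy Tuesday slot has more top stories in raw
| count (2 vs 1), but the quiet Sunday slot has the higher share.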
| billllll wrote:
| I'd love to get some correlation with rank, or even filtering of
| lower scoring posts.
|
| From what I know, HN posts are often used as a signal for
| viability of a project. In that case, you can't make a conclusion
| on the effectiveness of Show HN posts, because some of them will
| die off by design.
| reaperman wrote:
| > Extra: ChatGPT Gave a Wrong Regex
| >
| > I consulted ChatGPT for a regex to extract domains from urls, and
| > it gave a flawed one:
|
| ^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\\.)?([^:\/\n?]+).
|
| It even gave reasonable detailed explanations which convinced me.
| Later tests revealed that this regex doesn't work for url with @
| in path, such as https://foo.com/@./bar. The correct one should
| be
|
| ^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\\.)?([^:\/?\n]+).
|
| ---------------------
|
| The trick is to ask ChatGPT what the right tool for the job is in
| your language of choice. For python, ChatGPT will happily give
| you:
|
|     from urllib.parse import urlparse
|
|     extract_domain = lambda url: urlparse(url).netloc.replace('www.', '', 1)
|
|     # Example usage
|     url = 'https://foo.com/@./bar'
|     domain = extract_domain(url)
|     print(domain)  # Output: foo.com
|
| -------------
|
| I don't think RegEx is typically the "most" correct tool for the
| job for things which likely have built-in parser libraries (XML,
| HTML, URLs, JSON, etc)
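|
| A minimal sketch of that point using only the Python standard
| library. The helper name host_of and the sample URLs are
| illustrative assumptions. The hostname attribute of the parsed URL
| leaves out userinfo and port, cases a hand-written regex has to
| cover one by one.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the host of a URL, without userinfo, port, or a leading 'www.'."""
    # urlsplit only recognizes a host after '//', so add it for bare hosts.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = urlsplit(url).hostname or ""
    # Strip 'www.' only when it is a prefix, not anywhere in the host.
    return host[4:] if host.startswith("www.") else host


for sample in (
    "https://foo.com/@./bar",
    "https://user@www.example.org:8080/path?q=1",
    "foo.com/bar",
):
    print(sample, "->", host_of(sample))
```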
| [deleted]
| tagawa wrote:
| What timezone is used for the submission heatmap?
| littlestymaar wrote:
| Oh, Airmash is dead. I remember seeing it on HN then spending
| half of my workday that day playing it.
| gadgetoid wrote:
| The community revived it to https://airmash.online/ pretty
| sharpish, does this count as dead?
| coding123 wrote:
| Is this why HN was so slow yesterday?
| Semaphor wrote:
| The top 250 has 8 dead projects from 2023. Of those 8, 5 are not
| dead at all, 1 is alive but has an expired certificate and only 2
| (the lowest ranked) are dead. This does not seem like useful
| data.
| elaus wrote:
| Recently I was browsing through old threads where users showed
| off their personal websites and blogs. I wanted to find some
| inspiration for my own website.
|
| What I found instead were about 3/4 dead links - even though the
| threads were all from the last 4-5 years. I found that quite sad,
| because people often talked with great passion about their
| websites and they sounded really cool. Also I LOVE those small,
| personal islands in the big, commercialized and in many ways
| centralized web.
| manuelmoreale wrote:
| Sadly that is nothing new. I used to run a website gallery and
| the rate of link rot is incredibly high.
|
| Same is true for another couple of projects I'm running now.
| I'm collecting personal websites and quirky small web
| experiments and the same is happening there.
|
| Somewhat related is the phenomenon of dead blogs. Plenty of
| those with a couple of interesting posts and then abandoned.
| nvy wrote:
| Neat idea, thanks for sharing.
|
| Curious choice to highlight Show HNs that didn't survive, but not
| the ones that did.
|
| Is there a reason for this?
| malfist wrote:
| Same, I read the article twice in case I missed it, but no,
| nothing about the ones that did survive, even on the "more
| data" section.
| zX41ZdbW wrote:
| > Looking for a Sponsor to Host the Database Publicly
|
| > In the meantime, it'd be great if anyone can query the database.
| I tried to host a public database and real-time query interface
| online, but couldn't afford the bill for a smooth Postgres
| instance to hold around 20G (40M rows plus indices) data. While a
| $20 instance could suffice, it's pretty slow from usable,
| comparing to the local one on my M2 MacBook Air.
|
| Here is the database with publicly available SQL endpoint:
| https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...
| SushiHippie wrote:
| Nice, but seems to be last updated 2022-12-12 and funnily the
| IDs that don't exist have a time of 1970-01-01 00:00:00
| smallerfish wrote:
| Phind (#2 on your list) is still up and running also
| (https://www.phind.com/search?q=false%20negative&source=searc...).
| CryptoBanker wrote:
| How do you have 40mm rows of data on Show HN for only ~126,000
| stories?
| SushiHippie wrote:
| Comments and the stories that are not "SHOW HN".
|
| From TFA:
|
| > For this analyze, I considered submissions made before May
| 31, 2023, 23:59 UTC. The dataset consists of 4,714,023 stories
| and 30,363,533 comments from 867,097 users.
| david_shaw wrote:
| Thanks for making and sharing this - although I'm surprised it's
| not a "Show HN" itself!
|
| I was curious about the top post that didn't survive - an HTML5
| game called "airma.sh" - and I wanted to check it out. I _think_
| I found a working mirror: https://www.crazygames.com/game/airmash
|
| It's possible that this is a different game, but it seems to fit
| the description.
|
| Interestingly, the person who submitted that post stopped being
| active on HN after that discussion.
| flyinglizard wrote:
| Airmash lives very well on this community hosted site:
| https://airmash.online/
|
| The original author was never to be heard from again.
| gadgetoid wrote:
| Airmash still lives at https://airmash.online/ and there's also a
| space mod - Starmash - at https://airmash.cc/
|
| I apologise in advance for the hours you'll lose to these
| (again?)
| ravenstine wrote:
| You're telling me substack.com doesn't even make the top 100?
| oliverobscure wrote:
| Great visualisation. I was quite surprised that the submission
| dates and times appeared unimodal around an American morning
| peak.
| _dain_ wrote:
| n=1 but I know at least one non-american who has stayed up late
| so that the submission coincides with this peak time
| oezi wrote:
| Using a stacked barchart for dead vs alive isn't a great choice
| in my mind. Normalize to 100% please.
| TomNomNom wrote:
| Just a silly aside with regards to the regex to extract domains
| from URLs, my little tool called unfurl [0] exists to solve that
| exact sort of problem :)
|
| [0]https://github.com/tomnomnom/unfurl
| opello wrote:
| bagder (of curl) also made trurl to address URL manipulation:
|
| https://github.com/curl/trurl
| folli wrote:
| Nice work!
|
| I'd actually be interested in factors that make a Show HN a
| success vs failure.
|
| Objectively, there's an obvious one in your dataset: time of
| submission. Tuesday afternoon (which timezone? I assume US west
| coast?) seems to be key. No way this correlates with the quality
| of submissions.
|
| Subjectively: it seems to have become much harder recently. I managed
| once a couple of years ago for a short time to reach the front
| page with an Android app, now I'm barely able to get above 20
| points, even though the product is (again, subjectively) cooler
| and has a possibly wider audience
| (https://news.ycombinator.com/item?id=35671245).
|
| Not complaining, but perhaps nowadays Show HN is not an easy way
| anymore to "get the word out" and get some early user feedback
| for and from indie hackers? Any other sites that might be of
| interest?
| OJFord wrote:
| Its badge on a product's home page is to me a negative signal,
| but partly since it does still happen (quite a lot) - people do
| seem to use ProductHunt.
|
| (I suppose I'd use it - and pretty much anything - but just not
| put 'omg #1' badge on my site, if I had something to launch
| myself.)
|
| Completely tangential now, but I think its problem is right in
| the title - who is hunting a product? It's a complete echo
| chamber, surely nobody who doesn't have something to launch is
| actively using it - 'it's Wednesday so I need a new Gmail-
| integrating Jira spline reticulator'.
| trewqasdf wrote:
| The pandemic really got the activity going during 2020 (first bar
| chart), but maybe not so surprising with everyone pivoting to
| remote work. And obviously all discussions about vaccines and
| how different governments were handling things.
| hawski wrote:
| Regarding database hosting, if you would consider giving the data
| away, I would suggest converting it to an SQLite database and
| sharing it over Torrent.
| xnx wrote:
| I second this. You've done a great service to collect this
| data. I'm guessing the file must be much smaller than 20GB when
| compressed.
| zX41ZdbW wrote:
| It is only around 5 GB in ClickHouse. Details:
| https://github.com/ClickHouse/ClickHouse/issues/29693
| zX41ZdbW wrote:
| I also did an experiment by generating and searching
| embeddings for all the comments on HN. Here is the
| walkthrough: https://www.youtube.com/watch?v=hGRNcftpqAk
| _andrei_ wrote:
| Phind, the 2nd entry, is live and well.
| jumploops wrote:
| No affiliation, but the second to top deceased site is still
| alive and kicking [0]
|
| Spot checking the top results might give a better estimate for
| how many are actually alive vs. just using bot protection.
|
| [0]https://news.ycombinator.com/item?id=35543668
| sentrysapper wrote:
| https://harvestsignal.com/ is also still alive, but the site
| certificate expired.
| bagels wrote:
| It just errors out right now. How can we differentiate: always
| errors out vs dead?
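|
| One hedged way to separate "erroring right now" from "dead" is to
| probe a site repeatedly over time and only call it dead after a
| sustained run of failures. This is a sketch, not the article's
| method; the function name classify and both thresholds are
| assumptions.

```python
from datetime import datetime, timedelta


def classify(probes, min_failures=3, min_span=timedelta(days=7)):
    """Classify a site from (timestamp, ok) probes sorted by time.

    Returns "alive" if the latest probe succeeded, "dead" if the trailing run
    of failures is long enough in both count and time span, else "unknown".
    """
    if not probes:
        return "unknown"
    if probes[-1][1]:
        return "alive"
    # Collect the trailing run of consecutive failures, newest first.
    trailing = []
    for when, ok in reversed(probes):
        if ok:
            break
        trailing.append(when)
    span = trailing[0] - trailing[-1]
    if len(trailing) >= min_failures and span >= min_span:
        return "dead"
    return "unknown"


start = datetime(2023, 6, 1)
# A single failed probe (for example during a hosting outage) stays "unknown".
outage = [(start, True), (start + timedelta(days=12), False)]
# Three failures spread over twelve days count as "dead".
gone = [(start + timedelta(days=d), False) for d in (0, 5, 12)]
print(classify(outage), classify(gone))
```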
| gkoberger wrote:
| Vercel (and AWS) are down right now, hence the error.
___________________________________________________________________
(page generated 2023-06-13 23:01 UTC)