[HN Gopher] Show HN: Free tool to find RSS feeds, even if not li...
___________________________________________________________________
Show HN: Free tool to find RSS feeds, even if not linked on the
page
I developed a small tool to find RSS feeds for websites. You can
try it out here: https://lighthouseapp.io/tools/feed-finder

In >90% of cases the standard way of checking meta tags is enough to
find the feeds. But my goal for this tool is that it finds feeds
regardless of whether they're linked somewhere or not: if this feed
finder doesn't find a feed, no feed exists. It's a big goal and
admittedly not there yet, but it does a few things that are a step
in that direction.

* Checks meta tags of parent pages (sometimes the article itself
  doesn't have the meta tag, but the main blog page does)
* Checks common suffixes like /rss, /index.xml and many others
  (sometimes the feed exists but isn't linked)
* Checks the sitemap
* Checks all links on the page
* Checks 3rd party feeds (OpenRSS for now, when I find more such
  repositories I'll add them too)

There are a couple of additional ideas I have, like checking
search engines and crawling the entire domain (highly inefficient,
but possible).

Would love it if you could try it, and even more if you post sites
where it doesn't work.
Author : domysee
Score : 102 points
Date : 2024-09-10 12:05 UTC (10 hours ago)
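As a rough illustration of the "parent pages" and "common suffixes" checks described in the post, here is a minimal Python sketch that only generates candidate URLs and makes no requests. It is not the tool's actual code: the function names and the suffix list are assumptions (only /rss and /index.xml are named in the post).

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical suffix list: "/rss" and "/index.xml" come from the post,
# the remaining entries are illustrative guesses.
COMMON_SUFFIXES = ["/rss", "/index.xml", "/feed", "/rss.xml", "/atom.xml"]


def parent_pages(url: str) -> list[str]:
    """Return the page itself followed by each parent path up to the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    pages = []
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        pages.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return pages


def candidate_feed_urls(url: str) -> list[str]:
    """Append each common suffix to the page and to every parent page."""
    candidates = []
    for page in parent_pages(url):
        base = page.rstrip("/")
        for suffix in COMMON_SUFFIXES:
            candidate = base + suffix
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates
```

Each candidate would still have to be fetched and checked for a feed content type, which is where the request count grows quickly.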
| cranberryturkey wrote:
| Cool. I wrote a script to search google and find sites with rss
| feeds so I can create a collection on a particular topic.
| domysee wrote:
| That's awesome. Is there any specific search text you used to
| find the feeds? I know Bing has a command to do that but don't
| know about Google.
| djbusby wrote:
| Don't forget DDG and Kagi - they might have some tools too
| GavCo wrote:
| Cool. I'm a big fan of RSS feeds.
|
| Wondering if it's necessary to continue with the other checks if
| you find a feed in the meta tags?
| domysee wrote:
| Probably not, but I'm trying to find all feeds.
|
| I guess the best option is to show results as soon as they are
| found, without waiting for everything to complete.
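The "show results as soon as they are found" idea can be sketched in Python with a thread pool that yields each check's result as it completes. This is only an assumption about how it could be done, not how the tool works; `stream_results` and the two stand-in checks are hypothetical and perform no network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def stream_results(checks, url):
    """Run every check concurrently and yield (name, feeds) as each one finishes.

    `checks` maps a check name to a function taking the URL and returning
    a list of feed URLs. An exception raised by a check propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {pool.submit(fn, url): name for name, fn in checks.items()}
        for future in as_completed(futures):
            yield futures[future], future.result()


# Stand-in checks that only simulate work (hypothetical, no requests made).
def fast_meta_check(url):
    return [url.rstrip("/") + "/feed.xml"]


def slow_sitemap_check(url):
    time.sleep(0.3)  # pretend the sitemap takes a while to download
    return []
```

A caller can iterate over `stream_results(...)` and render each batch immediately, so a quick meta tag hit is visible while slower checks are still running.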
| nanna wrote:
| Great work! I've stopped using Twitter but I managed to taper
| from it by following things using RSS feeds drawn from Nitter.
| Don't know if that still works but could be an idea?
| domysee wrote:
| Twitter feeds would definitely be great to have, will check
| Nitter to see how I can get them. Thanks for the suggestion!
| oidar wrote:
| The tool misses reddit rss feeds.
| domysee wrote:
| Thanks for the hint, will fix that!
| snthd wrote:
| check out
|
| https://github.com/DIYgod/RSSHub/
|
| https://github.com/DIYgod/RSSHub-Radar
| domysee wrote:
| This is great, thank you!
| richardbui95 wrote:
| I tried it on my website, ebookany.com, but didn't find anything.
| So sad :(( But your idea is quite interesting.
| domysee wrote:
| That's good to know, thank you, it helps me with debugging
| DamonHD wrote:
| FYI it's only finding one (Atom) feed at earth.org.uk, even
| though there are several feeds, Atom and RSS.
|
| Your method described above should have found at least two feeds
| I think.
| domysee wrote:
| Interesting, I'll check that, thanks for letting me know!
| superkuh wrote:
| This is 100% a feature that should be in the browser, not a third
| party tool. I still use a very old version of Firefox for this.
| Too bad Mozilla decided auto-discovery wasn't necessary in 2016
| and removed it. Then two years later claimed no one was aware of
| RSS/Atom feeds and didn't use them (I wonder why?!?). All so they
| could try to replace it with their profit/adware that is Pocket,
| and we all know how that went.
|
| >Mozilla is working on alternatives such as Pocket or Reader
| Mode, and on improving WebExtensions which could provide features
| related to RSS/Atom feeds without the toll on maintenance. (ref:
| https://www.ghacks.net/2018/07/25/mozilla-plans-to-remove-rs...)
| cxr wrote:
| [deleted]
| domysee wrote:
| That's super interesting, will definitely try it, thank you!
| AiAi wrote:
| Interesting. These days I was trying to subscribe to some blogs,
| and they didn't have an RSS button on their page, so I had to
| inspect the page to find out the feed URL. Not sure why they keep
| an RSS feed but hide it from visitors. It could be that they
| expected the feed reader to be able to identify it, but since I
| was using Thunderbird it did not.
| domysee wrote:
| Most feed readers find at least feeds that are linked with a
| link tag in the header, if it's <link rel="alternate"
| type="application/rss+xml" ... />
|
| Probably they're expecting people to just paste the website URL
| in the feed reader and have it identify the feed. But it would be
| nice to see the RSS URL linked somewhere.
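For reference, the link-tag autodiscovery mentioned above can be sketched with Python's standard `html.parser`. This is a minimal illustration under assumptions, not any particular feed reader's implementation: `FeedLinkParser` and `find_feed_links` are hypothetical names, and only the RSS and Atom MIME types are checked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch (an assumption, not exhaustive).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect hrefs of <link rel="alternate"> tags that declare a feed type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        if "alternate" in rel and attr.get("type", "").lower() in FEED_TYPES:
            href = attr.get("href")
            if href:
                # Resolve relative hrefs such as "/rss.xml" against the page URL.
                self.feeds.append(urljoin(self.base_url, href))


def find_feed_links(html, base_url):
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

Given a page whose head contains `<link rel="alternate" type="application/rss+xml" href="/rss.xml" />`, `find_feed_links(html, "https://example.com/blog/post")` returns the absolute feed URL.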
| Klonoar wrote:
| Some of these cases are sites that are built on a CMS that
| exposes RSS by default, but people don't consider showing a
| link/button/whatever in their design.
| account42 wrote:
| > Application error: a client-side exception has occurred (see
| the browser console for more information).
|
| Ok then.
|
| Also, this would make more sense as a browser extension.
| Especially if it brought back the RSS icon in the address bar to
| indicate when a feed is available (although maybe you don't want
| it to do all of the checks until prompted).
| domysee wrote:
| Which URL did you try?
|
| Yeah the checks are quite expansive, depending on the URL it
| might make more than a hundred requests.
|
| A browser extension would make sense. Guess I have another
| project :D
| djbusby wrote:
| 100!? I have a tool to find feeds from sites - checks like 4
| things.
| mdp2021 wrote:
| Well, it must miss many then: my list already is only (and
| omits a few variations e.g. with 'atom'):
| .../rss , .../rss.xml , .../.rss , .../rss_full.xml ,
| .../feed , .../rss-feed , .../feed/all/ , .../MySection.xml
| , .../MySection.atom , feedserver.example.com/section/index
| Circlecrypto2 wrote:
| I am very grateful for this actually. I still read RSS and when I
| find a good news site I tend to spend 15 minutes or more looking
| for their feed.
| jayemar wrote:
| Are you opposed to this being used programmatically? I've been
| working on a site [0] that replays feeds, but the initial step is
| to first find the feed given a website, and it's not always able
| to find it. I'd be interested in using your service to try to
| find the feed when I'm unable to do so.
|
| [0] https://refeed.to
| pogue wrote:
| Can you explain what the purpose of replaying a feed is?
| jayemar wrote:
| My initial use case was for reading content from blogs that
| had been published before I'd subscribed to their feed. I
| could visit their site and read their previous posts, but I
| much prefer the slow drip of an RSS feed. So I created
| refeed.to to be able to add 1 post per day from the blog to
| my feed starting from their first post.
|
| Since creating it I also use it to inject a few extra
| cartoons into my feed (xkcd every day!) and have also had fun
| with tech flashbacks from trustedreviews.com. So it's just a
| way to add a little variation to my feed.
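The "1 post per day, starting from the first post" replay described above could look roughly like the following Python sketch. It is only a guess at the scheduling logic, not refeed.to's code: `replay_schedule`, the `per_day` parameter, and the post dictionaries are all hypothetical.

```python
from datetime import date, timedelta


def replay_schedule(posts, start, per_day=1):
    """Assign each post, oldest first, a release date beginning at `start`.

    `posts` is a list of dicts with "title" and "published" (a date) keys.
    """
    ordered = sorted(posts, key=lambda post: post["published"])
    return [
        (start + timedelta(days=index // per_day), post["title"])
        for index, post in enumerate(ordered)
    ]
```

With three archived posts and a start date of 2024-09-10, the oldest post is released on the 10th, the next on the 11th, and the newest on the 12th.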
| LorenDB wrote:
| If I can't find an RSS link directly, I generally copy the root
| URL into archive.org and search for all URLs matching "xml",
| which includes content type, not just URL names.
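A similar lookup can probably be scripted against the Wayback Machine's CDX server instead of the web UI. The Python sketch below only builds the query URL and does not send it; the choice of parameters and the MIME-type regex are assumptions about what would approximate the manual "xml" search, and `cdx_xml_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_xml_query(domain: str, limit: int = 50) -> str:
    """Build a CDX query for captures on `domain` whose MIME type mentions xml."""
    params = [
        ("url", domain),
        ("matchType", "domain"),         # include subdomains of the given domain
        ("filter", "mimetype:.*xml.*"),  # regex filter on the recorded content type
        ("collapse", "urlkey"),          # one row per distinct URL
        ("output", "json"),
        ("limit", str(limit)),
    ]
    return CDX_ENDPOINT + "?" + urlencode(params)
```

Fetching the resulting URL should return rows of captured URLs, which can then be checked by hand or fed into a feed validator.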
| Cieric wrote:
| Tried the hacker news front page
| (https://news.ycombinator.com/news) and when clicking on OpenRSS
| I get this error:
|
| TypeError: URL constructor: is not a valid URL. [NextJS]
| (5603-cb6f1c5a9761f9d0.js:14:5466)
|
| Browser is Firefox 130.0 on Windows.
|
| Would be really nice to see this working really well since I
| search for RSS feeds a lot for a bunch of different things.
| Whether the RSS feed is good is always another question.
| sodality2 wrote:
| Great idea. I tried it with my personal site
| (https://matthew.science) and it didn't find any feeds. The site
| admittedly doesn't have any meta tags, but the feed is linked in
| the footer at https://matthew.science/atom.xml. It was the default
| feed URL for my SSG. I'd recommend adding this to the common
| suffix list.
| rollcat wrote:
| Quick rant about websites that go into all the trouble of
| _having_ an RSS feed but not linking to it in the <head>... I
| don't want to go hunting for the cute orange button, I want to
| copy and paste "https://example.com" into my feed reader and let
| the computer handle the work.
|
| If you maintain any website with a news feed, go right now and
| check that you have this in your <head>:
| <link rel="alternate" type="application/rss+xml" href="/rss.xml" title="News feed" />
| ^^^^^^^^ change! ^^^^^^^^^
|
| (Also note whether and where you need to use application/rss+xml,
| application/atom+xml, or application/json.)
| jacobvespers wrote:
| Missed this one. https://www.occourts.org/rss/news-events
| asddubs wrote:
| my suggestion is a way to have users of the extension suggest a
| feed URL if it doesn't find one
| dotBen wrote:
| RIP Google Reader
| jcul wrote:
| This is great, it's hard to believe sites can have RSS feeds but
| make them so difficult to find.
|
| I suspect some sites are just running some framework that enables
| it and don't even realize they have one.
|
| I have used this site in the past to find feeds:
| https://www.rsssearchhub.com/
|
| In the past I was looking for a feed for https://ra.co, but could
| not find it, though I had seen old posts referencing an RSS feed.
|
| I ended up emailing them and, to my delight, they let me know
| they still have an unsupported RSS feed here:
|
| https://ra.co/xml/rss_news.xml
|
| Just for feedback, this tool doesn't find the feed, though it
| doesn't look like a standard URL to me.
___________________________________________________________________
(page generated 2024-09-10 23:01 UTC)