[HN Gopher] Show HN: OpenOrb, a curated search engine for Atom a...
       ___________________________________________________________________
        
       Show HN: OpenOrb, a curated search engine for Atom and RSS feeds
        
       Alternative search engines are neat, as are RSS feeds. OpenOrb is a
       self-hosted app which allows visitors to search over a list of
       blogs you love. If you put your 10 favourite blogs in there, it'll
       search just those blogs and not show you any sponsored content or
       machine-generated garbage (unless... you follow blogs written by
       machines?)  Personal RSS feed readers can usually do this sort of
       thing, but RSS readers aren't meant to be shared, so you can think
       of the search engine as a 'curated feed list as a public service'.
       I wrote a longer blog post about OpenOrb here:
       https://raphael.computer/blog/openorb-curated-search-engine/
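
       A rough sketch of the "search only the feeds you picked" idea, in
       Python. This is not OpenOrb's code: the function names, the sample
       feed and the naive term-count scoring are all assumptions made for
       illustration, and it only reads RSS 2.0 <item> elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for one blog on the curated list.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>Self-hosting a feed reader</title>
<link>https://example.com/a</link>
<description>Notes on RSS and Atom feeds.</description></item>
<item><title>Gardening</title>
<link>https://example.com/b</link>
<description>Tomatoes.</description></item>
</channel></rss>"""


def parse_rss_items(xml_text):
    """Return a list of dicts (title, link, description) for each RSS item."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def search(items, query):
    """Return items containing every query term, most occurrences first."""
    terms = query.lower().split()
    scored = []
    for item in items:
        text = (item["title"] + " " + item["description"]).lower()
        if terms and all(term in text for term in terms):
            score = sum(text.count(term) for term in terms)
            scored.append((score, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]
```

       Calling search(parse_rss_items(SAMPLE_FEED), "rss") returns only
       the first item. A real instance would fetch each feed over HTTP
       and keep a persistent index, both of which are left out here.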
        
       Author : lowercasename
       Score  : 225 points
       Date   : 2024-04-22 10:26 UTC (12 hours ago)
        
 (HTM) web link (openorb.idiot.sh)
 (TXT) w3m dump (openorb.idiot.sh)
        
       | gbrindisi wrote:
       | I really like the idea! At some point I put up a miniflux
       | instance and it has surprisingly been a breath of fresh air for
       | my content consumption. What miniflux and my setup lacks is a way
       | to retrieve stuff I read and this OpenOrb might fit the use
       | case... I will try it out!
        
         | freetonik wrote:
         | What do you mean by "retrieve stuff I read"?
        
           | gbrindisi wrote:
           | Sometimes I stumble onto something I'm sure I've already read
           | about in an article, but if I didn't bookmark it or make a
           | note of it, it's very hard to find again (miniflux flushes
           | content after a certain age).
        
             | freetonik wrote:
             | Ah, I see, thanks for the explanation! I'm asking because
             | I'm working on a similar project, which lets you both
             | search and save blog posts permanently.
        
               | flir wrote:
               | Speaking for me personally, I've always felt "search my
               | history" should be implemented in the browser, not as an
               | external tool. "Search and save blog posts" seems like a
               | subset of the real problem.
        
             | gen220 wrote:
             | You can set the age to arbitrary points in the past, if
             | storage isn't a concern. I've actually found miniflux's
             | search feature fairly solid for dredging up old stuff I've
             | read!
        
             | corney91 wrote:
             | I've set the following in Miniflux to stop it deleting
             | things:
             | 
             | CLEANUP_ARCHIVE_READ_DAYS=-1
             | CLEANUP_ARCHIVE_UNREAD_DAYS=-1
        
         | swyx wrote:
         | https://github.com/miniflux/v2 in case anyone else was also
         | wondering
        
       | renegat0x0 wrote:
       | You can find many RSS feeds and links in my repository:
       | 
       | https://github.com/rumca-js/Internet-Places-Database/tree/ma...
       | 
       | It also contains domain lists, which include a tag indicating
       | whether each domain is personal or not.
        
       | marginalia_nu wrote:
       | Tangentially on this note, if anyone is interested, I can produce
       | a list of every RSS feed known to the marginalia search crawler.
       | It's a pretty noisy list, but I'm happy to do anything I can to
       | help the spread, discovery and adoption of RSS, so just let me
       | know.
       | 
       | I have a tool in place to export this data to help power the
       | experimental RSS preview feature[1], but haven't had the
       | inspiration to do much with that yet.
       | 
       | [1] e.g. https://search.marginalia.nu/site/jvns.ca
       | 
       | --edit-- Ok so there was interest. Give me a moment, I'll need to
       | run an extraction script. Check back in a few hours or bookmark
       | https://downloads.marginalia.nu/exports/
        
         | djoldman wrote:
         | I think the community would be interested in the list and
         | you'd get a lot of downloads if you offered it up.
        
         | petercooper wrote:
         | I would be very keen to have access to that list and to,
         | ideally, have a go at cleaning it up and producing a topical
         | subset for broader use in certain fields I'm interested in
         | (e.g. all the "developer blogs", say). I offer an OPML file of
         | several hundred engineering/dev related blogs at
         | https://engineeringblogs.xyz/ but I'm starting to think a
         | little bigger.
        
         | lowercasename wrote:
         | That would be _so_ cool! What an amazing resource that would
         | be.
        
         | hactually wrote:
         | I started a submission-based platform (bao.social, but it's not
         | currently resolving) as a side project because I missed the
         | accessibility of RSS. I'd be keen on the list, or even just on
         | connecting with you and OP.
        
           | marginalia_nu wrote:
           | Feel free to shoot me an email if you want to have a chat,
           | bounce ideas or whatever. That goes for other people as well
           | ;-)
           | 
           | I'm a bit busy with finalizing my grant-funded work in the
           | immediate future so reply times may be a bit slow, but such
           | is life.
        
         | yakkomajuri wrote:
         | I'd be keen on this and would import all of them into Recess
         | (https://app.recessfeed.com) - also working on RSS adoption and
         | discovery!
        
         | mariusor wrote:
         | Isn't there a way to integrate this type of info into the
         | actual search engine? Ie, search for type:rss or atom and
         | return the links to the RSS feeds?
         | 
         | [edit] I mean, to have it closer to what OP showed.
        
           | 8organicbits wrote:
           | I know the Google search console lets you upload a site map,
           | which can be an RSS feed, so the information is readily
           | available. I suspect Google isn't incentivised to promote
           | RSS, especially after they killed Google Reader.
        
         | marginalia_nu wrote:
         | Alright, about half a million RSS feeds available at:
         | https://downloads.marginalia.nu/exports/ [select feeds.csv]
         | 
         | The data is, as mentioned, pretty noisy. It's a best-effort
         | guess as to which is the canonical RSS feed for the particular
         | domain. There doesn't appear to be any convention for
         | specifying this, so when there are multiple feeds, a fair bit
         | of guesswork is involved. Expect a fair number of dead URLs and
         | lots of spam from CRMs that generate uninteresting feeds.
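
         A possible first cleaning pass over a noisy feed list, sketched
         in Python. The column layout of feeds.csv is not described in
         the thread, so the "domain,url" layout, the url_column
         parameter and the function name below are assumptions. The
         sketch only drops malformed and duplicate URLs; finding dead
         ones would need network requests, which are left out.

```python
import csv
import io
from urllib.parse import urlsplit


def load_feed_urls(csv_text, url_column=1):
    """Return de-duplicated http(s) feed URLs from CSV text.

    url_column is a guess at where the feed URL lives; adjust it to
    match the real file.
    """
    seen = set()
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= url_column:
            continue  # short or empty row
        url = row[url_column].strip()
        try:
            parts = urlsplit(url)
        except ValueError:
            continue  # e.g. a malformed IPv6 host
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # not an absolute web URL
        # Treat host case and a trailing slash as insignificant.
        key = (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)
        if key in seen:
            continue
        seen.add(key)
        urls.append(url)
    return urls
```

         The first spelling of each URL wins, so the output order follows
         the input file.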
        
       | keepamovin wrote:
       | I wonder when RSS will experience its "Google Search in 1997"
       | moment? Right now it's beginning to nibble at Yahoo Directory
       | days
        
         | tl wrote:
         | That would be 2005 when Google Reader launched. RSS for people
         | who didn't know what RSS was.
        
           | keepamovin wrote:
           | No, I mean: "Google" moment as in what Google originally was.
           | Let me edit my original comment to say "Google Search"
           | moment.
           | 
           | Basically, when Google came on the scene in 1997, it blew
           | away Yahoo Directory. Do I have my dates right? Hahaha :)
        
             | cykros wrote:
             | RSS is, if anything, in decline rather than on the ascent,
             | because in many ways it offers access to content in a way
             | that diminishes ad views.
             | 
             | It's not impossible that it could come back from this
             | state, and indeed, outside of this issue, there's nothing
             | wrong with it as a system, and podcasts make heavy use of
             | it. But it's worth being aware of this headwind.
        
               | keepamovin wrote:
               | No I totally know it's in ascent, that's my point! Haha!
               | :) Hmm, how to express what I'm saying more clearly --
               | seems it's been missed? Haha! :)
               | 
               | I mean, RSS seems like where the web was in 1996 - on the
               | ascent! - waiting for its "Google Search" moment, whereas
               | RSS curation efforts like this product and other recent
               | ones are a little bit like Yahoo Directory!
        
               | pantulis wrote:
               | I think you are being slightly over enthusiastic here.
        
               | tl wrote:
               | > No I totally know it's in ascent, that's my point!
               | Haha! :)
               | 
               | How do you "know" this? Show some proof! RSS has two
               | well-known use cases: news and podcasts. It is fighting a
               | pitched battle against players with deep pockets who want
               | you to consume content where they can monetize it with
               | ads.
               | 
               | Google Reader survived for as long as it did because such
               | a service is incredibly cheap to run. Google only ended
               | it to push people to Google+. Many of the various
               | competing providers that popped up during that period are
               | still around, but I would not say it is flourishing.
               | 
               | This is what Google thinks of RSS:
               | 
               | https://trends.google.com/trends/explore?date=all&geo=US&
               | q=R...
               | 
               | Note a rise and plateau centered around 2005 and a brief
               | peak in 2013 (when Google killed Reader).
        
               | toyg wrote:
               | I agree with your view, but if we put down our old
               | greybeard hats for a minute - isn't it nice to see a new
               | generation of people potentially getting excited about
               | RSS? The parent comment is clearly by an optimistic
               | youngster, who has just discovered an awesome technology
               | that (he thinks) could change the world. And maybe it
               | can! Just because we've seen it beaten once (well, a few
               | times), it doesn't mean it's dead, and maybe, just maybe,
               | there is something we can't see that will be the real RSS
               | killer app.
               | 
               | Take podcasting - when RSS was first devised, nobody
               | thought of such a use-case; it just happened that the
               | media-attachment hacks tacked on top of it merged, at a
               | particular time and place, with some other emerging tech
               | (the iPod), creating something so good that it's still
               | around.
        
               | tl wrote:
               | > The parent comment is clearly by an optimistic
               | youngster
               | 
               | My only objection is when the "youngster" had his
               | viewpoint questioned, his response was "no I totally know
               | it's in ascent". Objective evidence points in the
               | opposite direction.
        
               | toyg wrote:
               | Youth be youth, innit ;)
        
               | throwaway14356 wrote:
               | rss _is_ the advertising.
               | 
               | It allows me to conveniently keep track of tens of
               | thousands of websites.
               | 
               | If you don't have a feed, no problem. I'll just read
               | something else.
               | 
               | With few exceptions I can't be bothered to keep looking
               | at a web page hoping something new has happened.
        
               | toyg wrote:
               | RSS advertises _the content_, but not the actual
               | _sponsors_ of such content (i.e. commercial ads). It's
               | also pretty hard to make it track readers.
               | 
               | That's why the likes of Meta and Google just don't like
               | it.
        
       | yakkomajuri wrote:
       | Someone compiled this a while ago which is a pretty good starter
       | list for content discovery:
       | https://github.com/outcoldman/hackernews-personal-blogs
       | 
       | I've imported most of them into https://app.recessfeed.com/ and
       | found some nice ones to follow through that
        
       | freetonik wrote:
       | Nice! I was thinking about the same kind of tool a while back,
       | and developed a community-based curated feed reader with full-
       | text search. It's not public yet (sign ups are behind an
       | invitation code), but search works for guests:
       | https://minifeed.net/global
        
         | lowercasename wrote:
         | This is super nice, and it looks like it's going to have some
         | really great features, well beyond OpenOrb's! Excited to keep
         | an eye on this.
        
           | freetonik wrote:
           | Thanks for the kind words! I'm still a bit hesitant to make a
           | "Show HN" at this time, but there are indeed potentially
           | interesting implemented and planned features, like:
           | 
           | - full text search across all blogs (implemented) and across
           | blogs user subscribed to (planned)
           | 
           | - subscribing to users to see the blogs they follow in your
           | "friendfeed" (implemented)
           | 
           | - favorites, with contents saved to permanent storage
           | (implemented)
           | 
           | - custom lists of blogs and posts (planned)
           | 
           | - comments (not sure about this one yet)
        
       | lbhdc wrote:
       | This is a cool idea.
       | 
       | When I searched for "history" it returned only technical
       | articles, and heavily favored Dan Luu's website.
       | 
       | Are technical blogs the primary focus?
        
         | freetonik wrote:
         | I believe the instance currently has very few blogs indexed:
         | https://openorb.idiot.sh/feeds
         | 
         | But you can deploy your own instance and add any blogs you
         | want.
        
         | marginalia_nu wrote:
         | Given that techy people have a strong disposition to have a
         | blog, more so than other demographics, there's an implicit bias
         | toward the technical within the blogosphere, especially in its
         | diminished state.
        
       | tiffanyh wrote:
       | For a brief moment, I thought this was related to
       | https://OrbStack.dev
        
       | toastal wrote:
       | Great to see it's hosted on a free software forge too, not
       | locking in contributions!
       | 
       | Not sure I always agree that feeds should have the full post tho.
       | This not only (obviously) bloats the size of the feed, but there
       | are valid reasons to want to drive users to your site--especially
       | if you have demos or you write about code & have your code blocks
       | syntax highlighted (statically, never do this with JavaScript)
       | as it provides a better reading experience. You can put styling
       | technically in Atom/RSS but even then, a lot of readers won't be
       | applying the styling. That said, I definitely appreciate the full
       | post if your site is full of trackers, ads, marketing garbage or
       | other bloat since I can skip the site. Is this some site
       | engineers giving us the nod on a better UX? I read a gridiron
       | football news site & boy does _that_ feed take the site from
       | unusable to pleasant (good photography).
        
         | fabianholzer wrote:
         | As a feed consumer I am always happy if a feed contains the
         | full content, but I am not sure if the feed must also include
         | all articles that a site ever published. That would basically
         | make the feed a serialized version of the whole website (which
         | is indeed what a few feeds that I subscribe to do by including
         | sections that are common on personal sites like
         | about/contact/now as items of their feed - but those are the
         | minority). That would actually be fine as long as the archive
         | is small or, beyond a certain size, the feed is paginated. But
         | I am under the impression that most feed generators do not have
         | pagination in mind, and I don't know how well the individual
         | aggregators and readers handle it on the consuming end.
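
         For reference, RFC 5005 ("Feed Paging and Archiving") lets a
         feed point at its next page with a link element whose rel is
         "next". A minimal Python sketch of how a consumer might look for
         that link in an Atom document follows. The function name and the
         sample feed are made up for illustration, and nothing here says
         how any particular reader or aggregator actually behaves.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical first page of a paged Atom feed.
SAMPLE_PAGE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <link rel="self" href="https://example.com/feed.xml"/>
  <link rel="next" href="https://example.com/feed.xml?page=2"/>
</feed>"""


def next_page_url(atom_xml):
    """Return the href of the feed-level rel="next" link, or None."""
    root = ET.fromstring(atom_xml)
    # Only direct children of <feed> are checked, so links that belong
    # to individual entries are ignored.
    for link in root.findall(ATOM_NS + "link"):
        if link.get("rel") == "next":
            return link.get("href")
    return None
```

         A consumer could call this in a loop, fetching each returned URL
         until it gets None, ideally with a page limit so that a
         misbehaving feed cannot keep it busy forever.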
        
       | INGSOCIALITE wrote:
       | but does this filter out the rss feeds that are just a headline
       | and then a "click here to read the whole story" link?
       | 
       | that's what killed rss: it wasn't google reader going away, it
       | was the ad-weaponizing of the feeds themselves
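
       The thread does not say whether OpenOrb filters such feeds. As a
       sketch only, one way to flag headline-plus-link entries in Python
       is a word-count and teaser-phrase heuristic. The phrase list, the
       80-word threshold and the function name are all arbitrary
       assumptions, and short but complete posts would be misflagged.

```python
import re

# Assumed teaser phrases; extend or replace for the feeds you follow.
TEASER_PHRASES = ("read more", "continue reading", "click here to read")


def looks_truncated(description_html, min_words=80):
    """Guess whether a feed entry is only a teaser for the full post."""
    # Crude tag stripping, good enough for a heuristic.
    text = re.sub(r"<[^>]+>", " ", description_html)
    words = text.split()
    lowered = " ".join(words).lower()
    if any(phrase in lowered for phrase in TEASER_PHRASES):
        return True
    return len(words) < min_words
```

       An indexer could skip flagged entries, or fetch the linked page
       instead of relying on the feed body.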
        
       ___________________________________________________________________
       (page generated 2024-04-22 23:00 UTC)