[HN Gopher] Show HN: Feedsmith -- Fast parser & generator for RS...
___________________________________________________________________
Show HN: Feedsmith -- Fast parser & generator for RSS, Atom, OPML
feed namespaces
Hi HN! While working on a project that involves frequently parsing
a lot of feeds, I needed a fast JavaScript-based parser to extract
specific fields from feed namespaces. Existing Node packages were
either too slow or merged all feed formats, losing namespace
information. So I decided to write it myself and created this npm
package with a simple API. Feedsmith supports all feed formats and
many popular namespaces, including: Podcast, Media, iTunes, Dublin
Core, and more. It can also parse and generate OPML files. I am
currently adding support for more namespaces and feed generation
for RSS, Atom and RDF. The library grew into something bigger than
I initially anticipated, so I also started creating a dedicated
documentation website to describe all the features.
Author : macieklamberski
Score : 35 points
Date : 2025-05-06 18:03 UTC (4 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| piotrkulpinski wrote:
| Looks great! Do you have any benchmarks comparing the performance
| with similar packages?
| macieklamberski wrote:
| Thanks! For now I have some benchmarks for parsing, as this has
| been my main focus regarding performance. It consistently ranks
| in the top 2 with the caveat that other libs do not support
| most of the feed namespaces that Feedsmith does.
|
| Here are the results:
| https://github.com/macieklamberski/feedsmith/blob/main/bench....
| jauntywundrkind wrote:
| Well done, congrats! Those are great looking results!
| renegat0x0 wrote:
| Nice project! Good job!
|
| Some of you might also find what I have done interesting.
|
| - I decided that implementing an RSS reader for the 100th time is
| really stupid, so naturally I wrote my own [0]
|
| - my RSS reader takes the form of an API [1], which I use for crawling
|
| - it can be installed via Docker. The user only has to parse the JSON
| returned by the API. No need to deal with requests, browsers, or
| status codes
|
| - my weapon of choice is Python. There is the Python feedparser
| package, but I had problems using it in parallel because of some
| XML shenanigans and errors
|
| - my reader serves a crawling purpose, so I am only interested in the
| most basic elements, like thumbnails, and all the nuance of RSS is
| lost
|
| - detects feeds from sites automatically
|
| Links
|
| [0] https://github.com/rumca-js/crawler-buddy/blob/main/src/webt...
|
| [1] https://github.com/rumca-js/crawler-buddy
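
The last bullet in renegat0x0's comment, detecting feeds from sites automatically, is commonly done by scanning a page for `<link rel="alternate">` tags with a feed MIME type. Below is a minimal sketch of that convention using only the Python standard library. It is not taken from crawler-buddy; the names `FeedLinkFinder` and `discover_feeds` are hypothetical, and the set of accepted MIME types is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: these two MIME types are the ones treated as feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Hypothetical helper: collects feed URLs from <link rel="alternate">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (bare attributes), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").strip().lower()
        href = attr.get("href", "").strip()
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return the feed URLs advertised in an HTML document, in page order."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.feeds
```

Fetching the page is left out on purpose: `discover_feeds` takes HTML that has already been downloaded, so it can sit behind whatever crawler or HTTP client is in use.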
___________________________________________________________________
(page generated 2025-05-06 23:00 UTC)