[HN Gopher] Uncertain Future for Marginalia Search
___________________________________________________________________
Uncertain Future for Marginalia Search
Author : panic
Score : 88 points
Date : 2022-04-29 01:27 UTC (1 day ago)
(HTM) web link (memex.marginalia.nu)
(TXT) w3m dump (memex.marginalia.nu)
| marginalia_nu wrote:
| Hopefully this will turn out to be a good thing. Maybe having
| some time to work on the project full time is exactly what's
| needed to push it forward.
|
| Still a bit uncomfortable how sketchy it feels in the longer
| term. But whatever. All I can do about it is do a good job.
| O_H_E wrote:
| This might be intentional on your part, but I couldn't find
| your Patreon linked anywhere from the blog.
|
| This might be a good time to start linking that in obvious
| places.
|
| Fwiw it was very easy to find it through Google, but ironically
| not through marginalia.
|
| I wish you the best in your endeavors.
| marginalia_nu wrote:
| Yeah I have it linked from the search engine as a top
| link[1], but I can only have 2-3 of them so I haven't linked
| to it anywhere in the blog.
|
| Getting donations hasn't really been a priority since I've
| had more than enough income.
|
| Maybe I should look over the design.
|
| [1] https://memex.marginalia.nu/projects/edge/supporting.gmi
| imiric wrote:
| I've been following the project for a while now, and while I
| don't use it yet, we need it and more like it to succeed if we
| ever hope to loosen Google's chokehold on the web.
|
| Best of luck to you and to the project!
|
| I'm curious about a few things:
|
| 1. What's your (planned) business model?
|
| 2. Have you tried asking for sponsorships, either from companies
| or individuals? You should have an easy way for people to donate.
| I'm sure you'd have some support there, especially if your day
| job situation is unstable.
|
| 3. Is it just you working on it right now? Have you considered
| open sourcing it to get community contributions, or hiring more
| devs (once donations pick up or maybe someone would be willing to
| work on it on their free time as you do)? I can imagine that
| writing a search engine is a gargantuan effort, and doing it
| alone must be close to impossible.
| marginalia_nu wrote:
| > 1. What's your (planned) business model?
|
| Dunno. In general I don't have a lot of faith in the
| profitability of search engines. Ads _can_ work if you're
| Google-scale, the other option is subscriptions, but in that
| case, you need to be _really_ good and my search engine just
| isn't, outside of some areas. That's actually one of my bigger
| design problems, how to let people understand which queries are
| likely to be useful. It looks like Google, and people assume it
| has the affordances of Google. It doesn't, and if you go in
| with those assumptions, you'll be disappointed.
|
| My model, as far as I've planned one, is just to keep the
| operation as cheap as possible and subsist on donations and
| maybe partnerships with other search engines. A big part of
| what I'm exploring is ways of doing as much as possible with
| low power hardware. I think rather than indexing 1 billion
| documents, 90% of which are garbage that will never be a good
| search result for any query ever, if I can index 100 million,
| 50% of which are potentially good hits, then maybe that goes a
| decent way.
|
| > 2. Have you tried asking for sponsorships, either from
| companies or individuals? You should have an easy way for
| people to donate. I'm sure you'd have some support there,
| especially if your day job situation is unstable.
|
| I haven't really been fishing for this. I honestly didn't foresee
| having to change jobs, as I am right now. I do have a donations
| page from before, but all of this was fairly sudden, so I
| haven't really gone over that whole process all too much yet.
|
| > 3. Is it just you working on it right now? Have you
| considered open sourcing it to get community contributions, or
| hiring more devs (once donations pick up or maybe someone would
| be willing to work on it on their free time as you do)? I can
| imagine that writing a search engine is a gargantuan effort,
| and doing it alone must be close to impossible.
|
| It's been just me up until now. Solo work can be ridiculously
| efficient when beginning a new project, especially when doing
| the sort of exploratory programming this has been. I also
| haven't felt I have had enough bandwidth to manage an open
| source project. But I am approaching a point where it's
| becoming a bit much to do all by myself, especially given this
| isn't my only project.
|
| So I am considering open sourcing it or bringing more people
| in, just need to think a bit about a good format for such a
| collaboration. It's relatively high maintenance and requires
| manual operations to keep going. As it stands, a lot of the
| code isn't trivially testable, running it (even with few
| documents) requires large language models and so on.
| Kye wrote:
| A handful of people have paid me about $100/month
| collectively for a few years expecting nothing in return via
| Patreon. Marginalia has a much, much, much bigger audience
| and I suspect you could manage a multiple of that sort of "I
| don't care what you make, don't expect anything in return,
| and I'm just glad to see you making stuff" support.
|
| I would recommend Ko-fi though, since they create real, live
| subscriptions directly on your Stripe, so you can migrate them
| if you ever decide to do it in-house.
| hahnchen wrote:
| I just use Google to search HN: "site:news.ycombinator.com
| <query>"
| NeutralForest wrote:
| Very important project, I hope you'll be able to settle into
| something comfortable!
| benwills wrote:
| In a very different way, I'm also involved in a search-related
| project. (edited to add: also going solo on my project as well)
| If you ever want to bounce ideas around, I'd totally be up for
| that.
|
| Related: you mention other sources than Common Crawl for WARC
| data. Is there a list of those somewhere?
| marginalia_nu wrote:
| Sure, my email is in my profile if you want to chat.
|
| Some WARCs that go into IA get published on archive.org, not
| all of them, but some:
| https://archive.org/search.php?query=warc
|
| It's also an all-around useful format as you can produce it
| from wget and other common tools. But the big reason I'm moving
| toward something relatively homomorphic to WARCs is to be able
| to (in the future) publish my own crawls.
| benwills wrote:
| Thanks for that link. I've done a bit of work with the Common
| Crawl data (and proposed moving to ZSTD with a proof of
| concept and performance metrics in C a few years ago).
|
| I'll send you an email later this weekend to connect.
| theobeers wrote:
| This is a great search engine. I entered "Persian
| transliteration," since that's what I was working on today. It
| sent me to the readme for a program written in 1996,[0] which
| takes a Latin-script transliteration of some Persian text, and
| generates ASCII art that resembles the way that text would be
| written in Persian script. Useful? Eh... Delightful? 100%. It
| would never have occurred to me that such a program would exist.
|
| Best wishes to you, Marginalia developer.
|
| [0]: http://www.payvand.com/gerdsooz/README.html
| jmclnx wrote:
| I'd never heard of it, but it looks good. I hope they can succeed. And
| good luck to you too.
| ColinHayhurst wrote:
| I wish you well and we welcome what you are doing with
| marginalia. As you know search needs a shakeup. One vital
| approach to a real shakeup is true independence of crawler and
| index. If it's any encouragement, Marc our founder started Mojeek
| as a hobby project back in 2004.
| marginalia_nu wrote:
| Thanks, man.
| yuhong wrote:
| kumarsw wrote:
| Using Marginalia always reminds me just how much we have lost
| since the golden age (2000-2010) of the internet. Thanks for
| bringing it back in a small way.
| daxfohl wrote:
| Surprised Elon bought that dumpster fire instead of something
| like this.
| marginalia_nu wrote:
| Yeah, it would be far cheaper too :-/
___________________________________________________________________
(page generated 2022-04-30 23:00 UTC)