[HN Gopher] Show HN: Marginalia - Exploration Mode
___________________________________________________________________
Show HN: Marginalia - Exploration Mode
I've been a bit obsessed with the idea of flipping through the
internet a bit like you would a magazine, of undirected browsing as
a discovery mechanism, and I think I'm approaching something that's
beginning to feel pretty fun. The link at the top will return
results out of a pool of approximately 10,000 domains; you can
refresh to get new ones. You can also explore in a directed fashion
by using the 'Similar Domains' buttons. These are not random. A
sampler, beyond the random sites offered with the head link:
https://search.marginalia.nu/explore/www.amiga-news.de
https://search.marginalia.nu/explore/www.aaronsw.com
https://search.marginalia.nu/explore/therealbitcoin.org
I don't have thumbnails for all 500k domains in the database yet,
but I think it's getting to a number where it's reasonably useful.
Author : marginalia_nu
Score : 161 points
Date : 2022-01-23 16:29 UTC (6 hours ago)
(HTM) web link (search.marginalia.nu)
(TXT) w3m dump (search.marginalia.nu)
| ancientsofmumu wrote:
| This feels like what StumbleUpon was (a positive correlation :) )
| -- would you be willing to add what criteria "similar" is based
| on in the info box upper left? For example I have no clue what
| this below domain is, so would be curious as to what the
| algorithm uses as "similar" to show me more (keywords? links?
| domain names? hosting providers? tech used? country located?
| etc.) https://search.marginalia.nu/explore/cblgh.org
| marginalia_nu wrote:
| It's mostly adjacency in the link graph. I use a mix of direct
| neighbors and Personalized PageRank to produce the list.
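|
| To illustrate the idea (a minimal sketch, not Marginalia's actual
| code): Personalized PageRank is ordinary PageRank whose random
| jump always returns to a chosen seed set, here the single domain
| being explored. The example graph, the domain names, and the 2x
| boost for direct neighbors are made-up assumptions; the thread
| does not say how the two signals are really mixed.

```python
def personalized_pagerank(links, seeds, damping=0.85, iterations=50):
    """Power iteration where the teleport step always lands on `seeds`."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    nodes |= set(seeds)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Every node starts the round with its share of the teleport mass.
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = links.get(src, [])
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
            else:
                # Dangling node: hand its rank back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[src] * teleport[n]
        rank = nxt
    return rank


def similar_domains(links, domain, top_n=5, neighbor_boost=2.0):
    """Rank other domains by PPR seeded at `domain`, favoring direct links.

    `neighbor_boost` is an arbitrary illustrative weight (an assumption).
    """
    rank = personalized_pagerank(links, {domain})
    neighbors = set(links.get(domain, []))
    scored = {
        d: r * (neighbor_boost if d in neighbors else 1.0)
        for d, r in rank.items()
        if d != domain
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_n]


# Hypothetical link graph: domain -> domains it links to.
example_links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["a.example"],
}

print(similar_domains(example_links, "a.example"))
```

| Because all the teleport mass goes to the seed, domains that the
| seed cannot reach through links (d.example above) score zero,
| which is what makes the ranking "personal" to the starting site.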
| arendtio wrote:
| Feels pretty cool, more like the traditional internet.
|
| After a few minutes I found that I would prefer a page that is
| not left-aligned, something like
|
|     #article {
|       margin: 0 auto;
|     }
|
| A minor change that makes it much more comfortable to use IMHO.
| marginalia_nu wrote:
| Hmm, are you on mobile, desktop? What's your browser and screen
| resolution?
|
| I did relatively recently redesign the whole stylesheet, so
| there's probably a few minor problems to iron out.
| arendtio wrote:
| Desktop, Firefox 3840x2160 with window.devicePixelRatio = 1.5
|
| So I run into the max-width of 160ch (which feels good), but
| I have a lot of whitespace on the right.
| marginalia_nu wrote:
| Hmm, yeah. I think I see what you mean. Good call. I've
| pushed a new CSS.
| arendtio wrote:
| Cool, it looks great now. Thanks :-)
| laputan_machine wrote:
| This is really cool, it's like being back in 1998 again when
| browsing the internet was exciting, bookmarked! I'll be exploring
| this for a while I think
| gbuk2013 wrote:
| I don't get it - I clicked on about 10 sites and none of them
| look anything like the screenshot picture?
| marginalia_nu wrote:
| I wanted to provide an example of what the content of the
| websites looks like, which you'll rarely find on the front page.
| So the screenshots are of URLs that are actually indexed by my
| search index. If you use the 'Info' link you can usually find
| the particular page. On the flip side, actually linking to
| those URLs may land you on a privacy policy or some weird deep
| link.
|
| Dunno, maybe it's a confusing choice.
| broahmed wrote:
| This is really cool. I have 7 tabs of quirky barely known
| websites open after maybe less than 5 minutes of interacting with
| Marginalia. This is so much fun!
| marsa wrote:
| any final fantasy series related fansites? for me there's
| always at least 1-2 on the list whenever i reload.
| marginalia_nu wrote:
| Haha, the fansite-sphere is one of like 6-7 hotspots the
| random function favors. Maybe there's a smidge too many
| right now, but I've tried to get a good mix of bits and bobs
| with hopefully a little bit for everyone.
| MayeulC wrote:
| It's awesome! I like most suggestions, it feels a bit like
| https://wiby.me/surprise but generally less weird.
|
| I've often wanted to have a go at making my own search engine,
| and I think I would penalize any form of advertising (especially
| big ad networks, referral links) or tracking (Google Analytics,
| etc.) as these can create (or reveal) perverse incentives. This
| would likely get rid of most of the "SEO spam" that we see
| nowadays. Reading the about page[1], this seems like what you are
| doing here, but I can't really tell as it's light on details.
|
| Q: would this be able to handle foreign-language sites? I don't
| yet have a blog/personal website, but if I did, I guess it would
| be mixed-language. Should I submit some of my friends' blogs,
| even though they might not be entirely (or at all) written in
| English?
|
| A relatively new sort of search-engine junk, especially visible
| in non-English results from big search engines, is auto-generated
| (or probably machine-translated) websites, full of nonsensical
| content. They might be translated from genuine sites
| in other languages, I'm not sure. It would seem hard to fend
| these off, but luckily, fighting perverse incentives such as
| advertisement revenue probably gets rid of them too.
|
| I also wondered if this was a curated list, and if the list was
| available somewhere, but it seems it's just a good old spider,
| and I guess that exposing too much info about the metrics might
| enable some to game the system? Not that marginalia is big enough
| to make it an attractive target, of course!
|
| [1]: https://memex.marginalia.nu/projects/edge/about.gmi
| marginalia_nu wrote:
| I'm keeping a few of the details intentionally sketchy, but in
| general, I do think it's relatively resilient to manipulation.
| I'm using a Personalized PageRank which uses the opinions of a
| secret subset of websites to calculate a ranking. I've also
| selected those websites to not be particularly likely to be
| bribed.
|
| Bilingual sites should be fine, I think. It will reject
| individual pages that don't have enough English text on them,
| but as long as it finds pages with English relatively easily
| they ought to get indexed.
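|
| One way such a per-page check could work, as a rough sketch only:
| count how many words on a page are common English function words
| and keep the domain if at least one page clears a threshold. The
| stopword list, the 0.15 threshold, and the function names are all
| assumptions for illustration; the thread does not describe the
| real filter.

```python
import re

# Tiny, hypothetical stopword list; a real filter would be more robust.
ENGLISH_STOPWORDS = frozenset({
    "the", "and", "of", "to", "a", "in", "is", "that", "it", "for",
    "was", "on", "with", "as", "are", "this", "be", "at", "by", "not",
})


def english_fraction(text):
    """Share of words in `text` that are common English function words."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    if not words:
        return 0.0
    return sum(w in ENGLISH_STOPWORDS for w in words) / len(words)


def looks_english(text, threshold=0.15):
    """Per-page decision; the threshold is an arbitrary assumption."""
    return english_fraction(text) >= threshold


def should_index_domain(pages, min_english_pages=1):
    """Keep a mixed-language site if enough of its pages pass the check."""
    return sum(looks_english(p) for p in pages) >= min_english_pages


english_page = "The cat sat on the mat and it was not happy with this."
swedish_page = "Det här är en sida om väder och vind i norra Sverige."
print(should_index_domain([english_page, swedish_page]))
```

| A bilingual blog passes here because the decision is made page by
| page: the Swedish page is dropped, the English one is kept.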
| joebob42 wrote:
| It's gem after gem after gem, this is brilliant
| marsa wrote:
| > undirected browsing as a discovery mechanism
|
| this seems to be a thoroughly underrated way of discovery and
| it's so disappointing when websites focus solely on search,
| forcing users to know and articulate what they came for.
|
| browsing through these random sites you list is indeed a very fun
| -- and liberating -- experience. thank you for putting this
| together!
| marginalia_nu wrote:
| I do think a lot of recommendation algorithms these days are
| a bit too good at finding things that are similar to what we
| like. Which means you never discover things that you'll like,
| but are not similar to the things you've tried before. It
| becomes incredibly samey after a while.
|
| The great joy of, say, flipping through a magazine or browsing
| a library is that they are passive, and don't know who you are,
| and can't adapt what you read based on what you're likely to
| read. So you might read something unexpected, you might
| discover something you didn't even know about yourself.
| marsa wrote:
| thing is they're often really really bad too. e.g. when i
| 'explore' albums on youtube music it force feeds me a limited
| selection of new releases based on popularity, perceived
| genre preference, and my geographic location, probably some
| other stuff as well. less than 5% of those recommendations
| end up being of any interest to me.
|
| meanwhile all i really desire is a full list of releases
| ordered by date and just let me sift through that myself, but
| there seems to be no way to get that list, at least not
| through the regular user interface.
|
| it's very frustrating.
| blowski wrote:
| For the last few months, I've consumed every piece of media
| reviewed by the FT Weekend. It's been a mixed bag, but I've
| made some wonderful discoveries.
| marsa wrote:
| thanks for the suggestion -- is this what you're referring
| to? https://www.ft.com/arts/music/albums
|
| i'll probably add it to my sources, but still a
| comprehensive list of new releases would be a dream come
| true.
| ncpa-cpl wrote:
| > browsing through these random sites you list is indeed a very
| fun -- and liberating -- experience. thank you for putting this
| together!
|
| Yeah! I really liked the concept too.
|
| This reminds me of the early StumbleUpon or even channel
| surfing cable TV back when it was analog!
| marginalia_nu wrote:
| I really liked StumbleUpon before it kinda turned to shit. I
| also kinda miss the feeling of not having everything be tuned
| for user engagement. It's a big part of why there are no vote
| arrows, thumbs up, stars, et cetera involved here. You shake
| the snow globe and get what you get.
| phendrenad2 wrote:
| Hey this works great. Found some new and interesting sites.
| slx26 wrote:
| Quirky sites alleviate my disdain for humanity, somewhat. Thanks.
| rixed wrote:
| It looks like a book shop and I like the idea.
|
| A nitpick though: Shouldn't the "capture in progress" pages be
| excluded from the random search?
| marginalia_nu wrote:
| Yeah, the whole thing isn't super polished, still a work in
| progress. There's also a few thumbnails that were captured mid-
| loading I'd like to improve down the line.
|
| Right now it's a mix between domains I simply haven't captured
| a thumbnail for yet, and domains that for some reason won't be
| captured (errors, etc). Once I reduce the first category, I'll
| look for a way of hiding the second category.
| was_a_dev wrote:
| I almost kinda didn't bother because of the cloudflare DDoS
| protection. I know that can be petty, but I wouldn't have waited
| if it was from a Google results page for example.
| marginalia_nu wrote:
| I just turned it up a notch right now for a moment, a lot of
| people are really aggressively bot-scraping new HN submissions
| for whatever reason. It's like a minor DoS every time you
| submit a link.
|
| That's fine for a blog I guess, but here I perform a
| non-trivial calculation for each request, so I'd rather not have
| bot spam. (This is hosted on a computer in my living room, so I
| can't just scale it up)
| Nextgrid wrote:
| > I perform a non-trivial calculation for each request
|
| Any reason why caching wouldn't work here? Do the results
| have to be different on each request instead of being cached
| for a short while (10 seconds)?
| marginalia_nu wrote:
| Oh yeah, you could probably do some sort of caching to that
| effect. This is just a fun toy I hacked together, so it's
| not super optimized.
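|
| The short-lived caching suggested above could look roughly like
| this. It is a sketch under assumptions, not how the site works:
| the `TTLCache` class, the "explore" key, and the injectable clock
| (used only so the expiry can be tested without sleeping) are all
| hypothetical.

```python
import time


class TTLCache:
    """Remember each computed value for `ttl_seconds`, then recompute."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the expensive work
        value = compute()
        self._entries[key] = (now, value)
        return value


# Demonstration with a fake clock so no real waiting is needed.
fake_now = [0.0]
cache = TTLCache(ttl_seconds=10.0, clock=lambda: fake_now[0])
calls = []


def expensive_random_selection():
    """Stand-in for the per-request calculation mentioned in the thread."""
    calls.append(1)
    return len(calls)


first = cache.get_or_compute("explore", expensive_random_selection)
fake_now[0] = 5.0
second = cache.get_or_compute("explore", expensive_random_selection)
fake_now[0] = 10.0
third = cache.get_or_compute("explore", expensive_random_selection)
```

| The trade-off for a random explore page is that everyone who
| arrives inside the same 10-second window sees the same selection,
| in exchange for the calculation running at most once per window.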
| was_a_dev wrote:
| I mean fair enough. I gave you the time since I came from HN
| and knew the risk/reward for good content was strong.
|
| If it was more than a toy, it would need to be less aggressive.
| kreeben wrote:
| >> This is hosted on a computer in my living room
|
| Since you serve Swedish weather info from marginalia I'm
| assuming you live in Sweden, is that correct? Could you very
| briefly explain how you host and serve pages from your living
| room and what your bandwidth is?
|
| Does your ISP get cranky when you see DOS type of traffic?
|
| I'm a fellow hobbyist search engine dev, also from Sweden.
| Whenever I demonstrate my search engine by hosting in the
| cloud the expenses get so big I have to go offline after a
| short while and I've therefore been contemplating personal,
| living room hosting.
| marginalia_nu wrote:
| > Since you serve Swedish weather info from marginalia I'm
| assuming you live in Sweden, is that correct?
|
| It is indeed.
|
| > Could you very briefly explain how you host and serve
| pages from your living room and what your bandwidth is?
|
| 100/100 Mbit/s municipal broadband, through Bahnhof.
|
| > Does your ISP get cranky when you see DOS type of
| traffic?
|
| Haven't heard a word from them, although you'd be surprised
| how far I am from saturating my line. Your average
| bittorrent enthusiast probably uses a lot more. I do try to
| not be a nuisance though. Cloudflare helps take the edge
| off things, as does running a local DNS cache.
|
| > I'm a fellow hobbyist search engine dev, also from
| Sweden. Whenever I demonstrate my search engine by hosting
| in the cloud the expenses get so big I have to go offline
| after a short while and I've therefore been contemplating
| personal, living room hosting.
|
| You might also consider server rental. Can get away with
| SEK 2-4k/month. My server, including UPS and other expenses
| is like SEK 40k, plus I expect to burn through an SSD once
| a year or so.
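|
| As a back-of-the-envelope comparison of those two figures (a
| sketch only): dividing the one-off SEK 40k by the monthly rental
| gives the number of months after which owning comes out ahead.
| This ignores the yearly SSD replacement, electricity, and
| bandwidth, since the thread gives no numbers for them; the
| function name is made up.

```python
def break_even_months(purchase_sek, rental_sek_per_month):
    """Months of rental that add up to the one-off purchase price."""
    return purchase_sek / rental_sek_per_month


SERVER_COST_SEK = 40_000  # figure quoted in the thread

for monthly_rent in (2_000, 4_000):  # the quoted SEK 2-4k/month range
    months = break_even_months(SERVER_COST_SEK, monthly_rent)
    print(f"SEK {monthly_rent}/month: break-even after {months:.0f} months")
```

| So the purchase pays for itself after about 20 months at the low
| end of the rental range and about 10 months at the high end,
| before the running costs left out above.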
| kreeben wrote:
| Very helpful, thank you. Which part of Sweden are you in,
| by the way?
| marginalia_nu wrote:
| Up north.
| [deleted]
| gillesjacobs wrote:
| You posted on related topics a few weeks back with your
| Marginalia projects and I spent an hour browsing your sites.
| Making the "small web" and its creative weirdness visible again
| pulls on my nostalgia strings. Good work!
| 1vuio0pswjnm7 wrote:
| The <h1> banner is "Search the internet" but are we only
| searching www servers?
|
| Can we use marginalia.nu to search for servers offering other
| protocols like ftp?
| marginalia_nu wrote:
| It's been my ambition to support Gemini and Gopher down the
| line.
___________________________________________________________________
(page generated 2022-01-23 23:00 UTC)