[HN Gopher] Show HN: Answer Overflow - Indexing Discord content...
___________________________________________________________________
Show HN: Answer Overflow - Indexing Discord content into the web
Hi! I'm Rhys, I develop Answer Overflow, a search engine for
Discord channels. Answer Overflow indexes content from channels
into Google, making it discoverable on the web. I'm sharing this
again after seeing a lot of discussion during the Reddit blackout
about the inaccessibility of information sent in Discord servers.
Answer Overflow is a verified bot in over 100 communities, fully
complies with the Discord ToS, and is open source!
https://github.com/AnswerOverflow/AnswerOverflow

Check out some of the communities here!

T3 Community - https://www.answeroverflow.com/c/966627436387266600
C# - https://www.answeroverflow.com/c/143867839282020352
Reactiflux - https://www.answeroverflow.com/c/143867839282020352
All - https://www.answeroverflow.com/browse

Please let me know what feedback you have, thanks for checking it out!
Author : rhyssullivan1
Score : 80 points
Date : 2023-06-18 19:50 UTC (3 hours ago)
(HTM) web link (www.answeroverflow.com)
(TXT) w3m dump (www.answeroverflow.com)
| berkle4455 wrote:
| I'm sure Discord and their communities are absolutely ecstatic
| about opening up the doors to OpenAI and others to scrape their
| collective work for the latest LLM.
|
| Walled gardens are going to get a whole lot stricter.
| mdaniel wrote:
| Welcome back. How does this compare to Linen
| (https://github.com/linen-dev/linen.dev#readme), which claims to
| support Slack and Discord? I do see the license difference, but
| didn't know if that was the major differentiator
| rhyssullivan1 wrote:
| Couple key differences:
|
| - Answer Overflow works on a consent basis for displaying
| messages (https://docs.answeroverflow.com/user-
| settings/displaying-mes...), while Linen displays all the
| messages in a community. The consent system Answer Overflow has
| helps a lot with respecting user privacy while also getting
| content indexed.
|
| - Linen appears to be building out a competitor to Slack &
| Discord while Answer Overflow is focused on building on top of
| those platforms, so we've got very different roadmaps. From
| what I can gather from the Linen roadmap, they're implementing
| things like voice chat, private channels, etc., whereas with
| Answer Overflow some of the things I'm focused on are answer
| automation, tracking outdated answers, analytics for where to
| improve your docs, etc.
|
| - Answer Overflow is pretty much only focused on Discord
| servers. It wouldn't be too hard to support both Slack and
| Discord, but what's nice about focusing on Discord for now is
| that it helps with our goal of being the best indexing tool
| specifically for Discord.
|
| - Global search (https://www.answeroverflow.com/search): you
| can search all Answer Overflow communities at the same time.
|
| The team at Linen have built out a great product though and
| it's cool watching them succeed with it!
| mid-kid wrote:
| I was talking about needing a solution like this just a second
| ago. Down from the heavens, descends this. I'll be sure to give
| it a try!
| rhyssullivan1 wrote:
| Send me a message if you have any questions! Happy to help with
| getting it set up.
| mid-kid wrote:
| This might sound a little bit picky, but from a cursory look
| around the project, it feels a bit too corporate and
| platform-ey for my tastes. I'm only interested in two things:
| generating (ideally static, and seo-friendly) web pages out
| of a discord forum channel and selfhosting it so we can
| archive the data ourselves (and won't be bound to content
| policies of answeroverflow.com). All of the extra bells and
| whistles with the bot auto-managing channels, analytics, AI
| and whatever else are superfluous and make me sweat a little, as
| I'll have to comb through the documentation to make sure
| everything is set up correctly. It's also really a shame to
| read that selfhosting will be a "Pro" feature. I'll give
| props for considering users wanting to opt-out, however, and
| it does at least seem rather simple to set up.
| rhyssullivan1 wrote:
| Where did you see self hosting is a pro feature? My bad if
| the website gives that impression. It will be free; the
| whole codebase is MIT licensed.
|
| The extra bells and whistles are mainly for people who are
| doing community support at scale, who would be paid
| customers - I do sort of need a way to support myself so I
| can buy groceries. The core of the product that matters is
| free and working well for indexing content, so now the
| focus is "what else can we do to improve community support
| as a whole?"
|
| As for self hosting, if you submit a PR for supporting it
| I'd be happy to get that merged, but it's not really a
| priority at the moment. The codebase is set up to be pretty
| easy to make a self hosted version, though.
| mid-kid wrote:
| Haha, that's fair. I'll consider trying to set it up
| myself and see how it goes.
|
| I got the idea that it was a pro feature from the
| roadmap list on the website, where it's listed as "coming
| soon", and "pro" is only mentioned when you click on the
| waitlist join link. If it means custom domains, it might
| be better off being listed as "custom domains" or
| something similar. That's what it's called on Google apps
| and such. It also doesn't help that the roadmap on the
| website doesn't match the one on the GitHub page; I
| thought the roadmap features on the GitHub page might be
| pro features as well.
| rhyssullivan1 wrote:
| Ah I see how that's confusing, sorry about that! I'll
| update it in both places to make that clearer
| tudorw wrote:
| nice, there is a lot of good stuff on discord!
| apignotti wrote:
| Genuine question: I love Discord, but how on earth is it possible
| that such functionality was not built-in to begin with?
|
| I really don't understand how the need for indexing and search
| was overlooked.
| Kiro wrote:
| It makes no sense to index the vast majority of content. You
| would need to cherry pick really hard among all the noise to
| find the stuff worth putting online.
| jasonjmcghee wrote:
| Interesting comment. I would think Reddit is similar in terms
| of content, yet "site:reddit.com <query>" is common as a
| general search pattern (pre-blackout)
| michaelmior wrote:
| I would argue it makes no sense to index the vast majority of
| content _without good search_. If your search is good enough,
| you can index everything and then surface only the good stuff
| at query time.
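
The "index everything, filter at query time" idea above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `quality` field (a hypothetical per-message score, e.g. derived from reactions or an accepted-answer flag), the `min_quality` threshold, and the whitespace tokenizer. It is not how any particular search engine works.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in doc["text"].lower().split():
            index[token].add(doc_id)
    return index


def search(query, docs, index, min_quality=0.5, limit=10):
    """Return ids of documents matching every query token, best quality first.

    All documents are indexed; low-quality ones are only dropped here,
    at query time, using the hypothetical `quality` score.
    """
    tokens = query.lower().split()
    if not tokens:
        return []
    # A candidate must contain every token in the query.
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    good = [d for d in candidates if docs[d]["quality"] >= min_quality]
    return sorted(good, key=lambda d: docs[d]["quality"], reverse=True)[:limit]


docs = {
    "a": {"text": "how to fix import error", "quality": 0.9},
    "b": {"text": "lol import error again", "quality": 0.1},
    "c": {"text": "import error solved by reinstalling", "quality": 0.7},
}
index = build_index(docs)
results = search("import error", docs, index)
```

The noisy message "b" stays in the index but never surfaces, because filtering happens when the query runs rather than when content is ingested.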
| thunky wrote:
| What I wonder is why anyone who cares about archiving/search
| would choose to use Discord.
| esafak wrote:
| It's not made for knowledge discovery; it's for gamers. Just
| look at that busy UI! The content is assumed to have no
| historical value.
| thrashh wrote:
| Discord is a chatroom first. What non-enterprise chat comes
| with archives?
|
| A forum is totally different.
|
| And even then, forums weren't designed to be archived from the
| start. People just wrote web crawlers and search engines.
|
| (I know Discord has some forum-like functionality now but the
| point stands.)
| rhyssullivan1 wrote:
| I think it's due to how Discord evolved as a platform.
|
| Discord started as "your private place for your friends to talk"
| at a time when there were a lot of privacy issues with
| other communication methods.
|
| Then as it grew beyond this scope of being a private place for
| friends, it would have been good for indexing to be added but
| indexing a normal text channel is really hard since you don't
| know where the conversation starts / stops to submit to a
| sitemap.
|
| Now we've got large public communities and forum channels so
| it's possible they roll out their own version soon, but it does
| still slightly go against how their product was originally
| created so there may be some hesitation with adding it due to
| not knowing what the community reaction will be like.
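
To make the sitemap point above concrete: once a conversation has an explicit boundary (as a forum-channel thread does), each thread can become one sitemap entry. The following is only a rough sketch under stated assumptions. The `Thread` type, its fields, and the `https://example.com/m/` URL pattern are hypothetical and are not taken from the Answer Overflow codebase; only the sitemap namespace and the `urlset`/`url`/`loc`/`lastmod` element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


@dataclass
class Thread:
    thread_id: str      # hypothetical identifier of a forum thread
    last_activity: str  # W3C date string, e.g. "2023-06-18"


def build_sitemap(threads, base_url="https://example.com/m/"):
    """Emit one <url> entry per thread; base_url is a placeholder."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for thread in threads:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url + thread.thread_id
        ET.SubElement(url, "lastmod").text = thread.last_activity
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    Thread("123", "2023-06-18"),
    Thread("456", "2023-06-17"),
])
```

A plain text channel has no equivalent of `thread_id`, which is the difficulty described above: something would first have to decide where each conversation starts and stops.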
| madeofpalk wrote:
| Discord has 'indexing' and search, just like how Slack does.
| It's just not on the public & open web - only searchable inside
| of Discord.
| easygenes wrote:
| While I see the value here, I don't really think most Discord
| communities are appropriate to be indexed. It breaks the whole
| cozy web aspect of it. [1]
|
| [1] https://maggieappleton.com/cozy-web
| rhyssullivan1 wrote:
| Most Discord communities aren't meant to be indexed, I agree!
| Thanks for linking that article, it was interesting to read.
|
| There are lots that have support channels, though, for
| programming libraries, games, etc., and having all of that
| content locked away can be really damaging.
|
| One of the interesting things I've noticed is when a community
| for a more niche game / programming library joins Answer
| Overflow, they often shoot up to being top performers on the
| site which is great to see.
|
| Along with that, not all channels are indexed, mainly just help
| channels. What's nice with this is it keeps that cozy feeling
| of a private place to talk, while helping more people find a
| community they will enjoy and keeping information accessible.
|
| Long term, I'd like to implement anti-abuse tools for
| communities to use so they can understand what kinds of
| people join their server from Answer Overflow. For
| example, if it turns out that 90% of the people who join are
| abusive, then it'd make sense for them to turn off indexing.
|
| You could possibly make the argument that for the long term
| health of some communities, having indexed content helps to
| keep the community active
| TeMPOraL wrote:
| The "cozy web" is out of control these days. A lot of social
| utility is lost by default because everyone uses Whatsapp and
| Discord and other such information black holes, places where
| knowledge goes to die. It's OK if you're using these to chat
| with your family or friends, but it's kind of... less OK, when
| every open source project these days, including major
| programming languages, tells you to join their Slack or Discord
| for support and learning.
|
| What's happening is that these "communities" demand that you
| commit _first_, and deny value to passive
| participants. If that sounds reasonable to some, let me point
| out that the _entire value of the Internet_ is built on doing
| the opposite. Wikipedia, Reddit, StackOverflow, everything that
| you can find through a search engine - those are all resources
| made available by people and groups that, for various reasons,
| decided to _share_ knowledge instead of hoarding it, invite
| passive participation instead of demanding active commitment.
| The good days of the Internet, the ones people mourn, back
| before it got fully commercialized? They were built on the
| sentiment of openly sharing information, giving it away "pay
| it forward" style - not gate-keeping it in webs of trust,
| and/or demanding that people pay with effort.
|
| Maybe I'm too old, but I _hate_ the "cozy web" with a passion.
| philippejara wrote:
| Most Discord communities that are big enough to get indexed
| were supposed to be forums anyway, or part of one.
___________________________________________________________________
(page generated 2023-06-18 23:00 UTC)