[HN Gopher] Show HN: Auto Wiki - Turn your codebase into a Wiki
___________________________________________________________________
Show HN: Auto Wiki - Turn your codebase into a Wiki
Hi HN! I'm Omar from Mutable.ai. We want to introduce Auto Wiki
(https://wiki.mutable.ai/), which lets you generate a Wiki-style
website to document your codebase. Citations link to code, with
clickable references to each line of code being discussed. Here are
some examples of popular projects: React:
https://wiki.mutable.ai/facebook/react Ollama:
https://wiki.mutable.ai/jmorganca/ollama D3:
https://wiki.mutable.ai/d3/d3 Terraform:
https://wiki.mutable.ai/hashicorp/terraform Bitcoin:
https://wiki.mutable.ai/bitcoin/bitcoin Mastodon:
https://wiki.mutable.ai/mastodon/mastodon Auto Wiki makes it easy
to see at a high level what a codebase is doing and how the work is
divided. In some cases we've identified entire obsolete sections of
codebases by seeing a section for code that was no longer
important. Auto Wiki relies on our citations system which cuts back
on hallucinations. The citations link to a precise reference or
definition which means the wiki generation is grounded on the basis
of the code being cited rather than free form generation. We've
run Auto Wiki on the most popular 1,000 repos on GitHub. If you
want us to generate a wiki of a public repo for you, just comment
in this thread! The wikis take time to generate as we are still
ramping up our capacity, but I'll reply that we've launched the
process and then come back with a link to your wiki when it's
ready. For private repos, you can use our app
(https://app.mutable.ai) to generate wikis. We also offer private
deployments with our own model for enterprise customers; you can
ping us at info@mutable.ai. Anyone that already has access to a
repo through GitHub will be able to view the wiki, only the person
generating the wikis needs to pay to create them. Pricing starts at
$4 and ramps up by $2 increments depending on how large your repo
is. In an upcoming version of Auto Wiki, we'll include other
sources of information relevant to your code and generate
architectural diagrams. Please check out Auto Wiki and let us know
your thoughts! Thank you!
Author : oshams
Score : 78 points
Date : 2024-01-08 18:26 UTC (4 hours ago)
(HTM) web link (wiki.mutable.ai)
(TXT) w3m dump (wiki.mutable.ai)
| eduardosalaz wrote:
| Does it parse Julia files? I am having trouble with generating
| the wiki for a Julia repository, what surprised me was that it
| could parse and understand .tex files! Looks promising.
| oshams wrote:
| Hey! Yes, it should work. Is there a public repo in particular
| you'd like us to Auto Wiki? Please bear with us as we ramp up
| on capacity.
| e28eta wrote:
| Should all the repos in the "Explore" section already be
| generated? I clicked on apple/swift from the Languages tab and
| got "Wiki not found", which isn't what I expected to see
| oshams wrote:
| Sorry to hear that, can you try again?
| https://wiki.mutable.ai/apple/swift works for me.
| e28eta wrote:
| Looks good now, thanks!
| janfoeh wrote:
| I have the same issue with https://wiki.mutable.ai/d3/d3 -
| "Wiki not found".
| OmarShehata wrote:
| The Bitcoin and Mastodon links don't seem to be working! (wiki
| not found)
|
| Would love to see this for Godot
| (https://github.com/godotengine/godot). Maybe Maplibre too
| (https://github.com/maplibre/maplibre-native)!
| oshams wrote:
| We're trying to work out why it doesn't load for a subset of
| people. We tested on all browsers/OS configs with 0.5%
| coverage. Please accept my apologies.
|
| We are generating those two wikis now. Thanks for the request.
| CGamesPlay wrote:
| I'd love to see the wiki generated for a less already-documented
| example. These high-profile projects are good demos and the
| results look compelling (I checked out AutoGPT's and NeoVim's),
| but these projects already have a ton of documentation that helps
| the model substantially. What are the smaller projects where it
| has to generate documentation from code (and not necessarily
| well-commented code) rather than existing documentation?
| oshams wrote:
| Great point! Here's an example of a more obscure repo with a good
| wiki: https://wiki.mutable.ai/dadongshangu/async_FIFO
| CGamesPlay wrote:
| Impressive. I would be interested in this once it hits
| general availability. I would also love to see it operate on
| a local repository, because I may not be hosting my source
| code on Github.
| oshams wrote:
| Hey! We're able to do completely private deployments.
| localhost wrote:
| Would it scale to something like the Linux kernel? How much would
| it cost to process something of that size?
| oshams wrote:
| Yes, there's no practical limit to how far we can scale it. We
| would actually do that for free since it's open source. Do you
| want it?
| localhost wrote:
| I think it would be a wonderful education tool for future
| hackers on that codebase!
| comex wrote:
| [Edit: Apparently I'm reviewing the wrong product; see replies.]
|
| I tried the app version on one of my old repos. It's a somewhat
| challenging test case because there are few comments and parts of
| the code are incomplete, though I'd say the naming convention is
| pretty good. The app suggested the question "What is the purpose
| of the 'safemode-ui-hook.m' file?" I accepted the suggestion, and
| the output was... completely wrong.
|
| I'm not surprised it guessed the purpose wrong; even a human
| would need some context to understand what's going on in that
| particular file, though of course the AI did worse by being
| confidently wrong rather than saying it didn't know. But the AI
| also made specific claims that could be seen as wrong just by
| reading the file. It claimed the file "defines a
| SUBSTITUTE_safemodeUIHook C struct" when neither that struct name
| nor anything like it appears anywhere in the file. The name seems
| to just be mashed together from the repo name and file name.
|
| Which makes me wonder, did the AI even see the content of the
| file? Is it pre-summarized somehow in a way that makes it know
| very little about the file? Or did the AI see it in full, but
| hallucinate anyway?
| oshams wrote:
| It sounds like you're referring to our chat product. We are
| aware of the limitations of that, this is why we created auto
| wiki! We plan to integrate the two in the future.
| comex wrote:
| I see...
|
| The site said an Auto Wiki didn't exist for my repo but could
| be generated via app.mutable.ai, so I went there and assumed
| it was substantially the same product despite the slightly
| different interface. I guess I didn't find the actual Auto
| Wiki functionality on the app domain.
| huevosabio wrote:
| Love the idea! I will try it on my own repo!
|
| As an aside, I've been thinking of creating an auto-wiki for game
| lore based on what AI npcs say, i.e. convert their hallucinations
| into canon.
|
| How was your experience of taking unstructured text (though code
| is more structured) and making it into wikis?
|
| How difficult is it to have it do incremental updates vs re-create
| it all?
| oshams wrote:
| Thank you for the praise. Sounds cool re your idea, is there
| something you want to try it on that we can help you with? We
| cache much of the work so incremental updates are quite easy,
| the asymptotic work is low.
| huevosabio wrote:
| So, I have this game (https://zaranova.xyz/) where the AI
| tries to guess who is a human and you as a player must
| pretend to be an AI.
|
| Now, most of the conversations are between the AIs and they
| hallucinate stuff based on the very small background lore I
| gave as a prompt. I've been thinking of having a background
| process that takes all of these conversations through many
| different games and start building a coherent game lore wiki,
| automatically.
|
| Basically, letting these interactions build the game lore.
|
| It is quite different from using a repo because code is 1)
| relatively structured, 2) coherent, 3) and refers to a single
| instance. In contrast, the corpus of conversations may have
| conflicting narratives of the game lore which need to be
| reconciled and there is close to no structure!
|
| Anyway, happy to chat if you think this is an interesting
| topic for auto-wikis.
| pbowyer wrote:
| Would be great to see for https://github.com/symfony/symfony,
| thanks! As that's a monorepo it may provide a challenge to the
| tool.
| oshams wrote:
| We love a challenge, on it!
| vinibrito wrote:
| Wow, looks nice! I almost felt like I could understand Bitcoin's
| code xD
|
| Could you do Appwrite? https://github.com/appwrite/appwrite
|
| I'm not affiliated to them, just wanted to get started hacking
| it.
| oshams wrote:
| On it! Even if you were affiliated we would do it.
| oshams wrote:
| And that's great to hear you can now understand Bitcoin, this
| was one of my prime motivations actually!
| int_19h wrote:
| I'd love to see what it can do with
| https://github.com/microsoft/debugpy - and especially how it
| would handle vendored dependencies.
| oshams wrote:
| On it!
| faizshah wrote:
| I would like to see how it performs on a lesser known project
| with fewer stars, these are all well known projects which would
| exist in the training data in blog posts etc.
| oshams wrote:
| Does this meet that requirement?
| https://wiki.mutable.ai/dadongshangu/async_FIFO
| faizshah wrote:
| I'm seeing "Wiki not found" for that link, perhaps an AuthZ
| issue?
| oshams wrote:
| Try reloading; that usually works. BTW, we identified the
| issue that was affecting the subset of users on the first
| load. We are pushing out a fix as we speak.
| teraflop wrote:
| Cool concept. Right off the bat I see some big issues with the
| generated CPython documentation:
|
| > This provides a register-based virtual machine that executes
| the bytecode through simple opcodes.
|
| Python's VM is stack-based, not register-based.
|
| > The tiered interpreter in .../ceval.c can compile bytecode
| sequences into "traces" of optimized microoperations.
|
| No such functionality exists in CPython, as far as I know.
|
| > The dispatch loop switches on opcodes, calling functions to
| manipulate the operand stack. It implements stack manipulation
| with macros.
|
| No it doesn't. If you look at the bytecode interpreter, it's full
| of plain old statements like `stack_pointer += 1;`.
|
| > The tiered interpreter is entered from a label. It compiles the
| bytecode sequence into a trace of "micro-operations" stored in
| the code object. These micro-ops are then executed in a tight
| loop in the trace for faster interpretation.
|
| As mentioned above, this seems to be a complete hallucination.
|
| > During initialization, .../pylifecycle.c performs several
| important steps: [...] It creates the main interpreter object and
| thread
|
| No, the code in this file creates an internal thread _state_
| object, corresponding to the already-running thread that calls
| it.
|
| > References: Python/clinic/import.c.h The module implements
| finding and loading modules from the file system and cached
| bytecode.
|
| This is kinda sorta technically correct, but the description
| never mentions the crucial fact that most of this C code only
| exists to bootstrap and support the _real_ import machinery,
| which is written in Python, not C. (Also, the listed source file
| is the wrong one: it just contains auto-generated function
| _wrappers_ , not the actual implementations.)
|
| > Core data structure modules like .../arraymodule.c provide
| efficient implementations of homogeneous multidimensional arrays
|
| Python's built-in array module provides only one-dimensional
| arrays.
|
| And so on.
| oshams wrote:
| Thank you for this feedback. We actually have an Auto Wiki v2
| in the works which is even higher quality, would be interesting
| to see how it changes when that comes out.
| faizshah wrote:
| Can you talk a little bit about the crawler or what
| information are you feeding the agent about the repository? My
| main concern is that this is just hallucinating the
| documentation and that with the more well known repos like
| React it can pull the data from training data like blogs etc.
|
| I think the concept is really great; I would just like to
| understand, especially for enterprise use cases.
| oshams wrote:
| It's purely based on the code and we force the LLM to cite
| the code it describes to cut back on hallucination. We are
| adding a self-verification step (like chain-of-verification,
| https://arxiv.org/abs/2309.11495) in Wiki V2.
| GrinningFool wrote:
| I would expect something like this to output only factually
| correct documentation since it would be used as reference;
| but it sounds like even under the upcoming v2 that's not the
| case?
| HL33tibCe7 wrote:
| Autogenerating documentation that could then be corrected
| by a human could still have value, in fairness
| oshams wrote:
| We think chain-of-verification and fine-tuning will help a
| lot; see the other response. We are really excited to launch
| v2 ASAP so we can address these concerns.
| nerdponx wrote:
| Great example of plausible but completely incorrect outputs
| from an AI model that would go largely undetected by a non-
| expert human.
| Amigo5862 wrote:
| The only thing I see that this adds over existing docs-to-HTML
| tooling is that it uses a wikipedia-inspired theme.
|
| Meanwhile on the negative side, it adds hallucinations. You say
| you "cut back" on them but as teraflop's comment shows, it still
| has plenty.
|
| BTW: even the Mastodon link from your OP says "wiki not found"
| for me.
| hk__2 wrote:
| That's nice but the name is confusing: it's not generating a wiki
| at all, but a documentation website with a Wikipedia-like theme.
| Wikis are collaborative websites; Wikipedia is only one of them.
| TachyonicBytes wrote:
| As long as this is happening, might as well try some of my
| favorites: https://github.com/wasm3/wasm3,
| https://github.com/WebAssembly/wabt,
| https://github.com/bytecodealliance/wasmtime
| TachyonicBytes wrote:
| Also, how does it "know" which parts are the important parts?
| Example, from the React repo, we have this:
| The key components of React's implementation include:
| The reconciler, implemented in .../react-reconciler, which
| contains the ReactFiberReconciler class and algorithms for
| recursively diffing virtual DOM trees and scheduling rendering
| work. The beginWork and completeWork phases drive the
| reconciliation process.
|
| But the reconciler seems to be an experimental, not core,
| recent package, not a key one.
| noone_youknow wrote:
| Nice! I'd be interested to see how it handles
| https://github.com/rosco-m68k/rosco_m68k , it's a mixed software
| / hardware repo, with a lot of code in assembler and C (for an
| old platform). Might be a challenge?
| codetrotter wrote:
| When I click the link https://wiki.mutable.ai/bitcoin/bitcoin
| from your post it says that the wiki doesn't exist yet
|
| Then when I clicked again it loaded.
|
| Then I clicked the one for D3 and it said the same. And I clicked
| it again and it still said the same.
|
| Is this some kind of weird manifestation of a DB conn error or
| something?
| oshams wrote:
| Try reloading, we identified the issue and are pushing out the
| fix now (in code review, almost done!)
| codetrotter wrote:
| (Edited my post before you responded. Originally my comment
| only mentioned clicking Bitcoin and getting message that
| there was no wiki.)
| paxys wrote:
| The entire point of a Wiki is that it can be collaboratively
| edited. This is static documentation, just with a Wikipedia-like
| UI.
| oshams wrote:
| Would this be your top request? We're thinking of adding that
| functionality.
| paxys wrote:
| I haven't used it enough to know if the two models can
| feasibly coexist. It may be better to just name it more
| accurately. There will be immense demand for an "automatic
| documentation generator". A wiki, not quite as much.
| hobo_mark wrote:
| Not the op but I don't think it's a request, it's semantics.
| You are calling wiki something that is really not a wiki,
| because wikis are editable.
| loktarogar wrote:
| Regardless if it's a wiki or not right now, documentation
| that is wrong and cannot be fixed is worthless
| oshams wrote:
| FYI, we fixed the issue that was causing people to not see wikis
| on the initial load. Thank you for the feedback!
| TheEzEzz wrote:
| Super cool. When I think about accelerating teams while
| maintaining quality/culture, I think about the adage "if you want
| someone to do something, make it easy."
|
| Maintaining great READMEs, documentation, onboarding docs, etc,
| is a lot of work. If Auto Wiki can make this substantially
| easier, then I think it could flip the calculus and make it much
| more common for teams to invest in these artifacts. Especially
| for the millions of internal, unloved repos that actually hold an
| org together.
| oshams wrote:
| Thank you! We like the analogy of dehydrating knowledge that
| can be used (hydrated) later. Beyond even unloved repos, we'd
| even argue broader organizational knowledge that seems to have
| been lost to history like Roman Concrete or how to precisely
| build the Saturn V could potentially be "stored" using AI.
| sysread wrote:
| Why does it request write access to my repos via gh auth?
| solarkraft wrote:
| I'd quite like some more high level documentation for the Matrix
| JS SDK (https://github.com/matrix-org/matrix-js-sdk). I've been
| looking at it for quite some time and still don't understand how
| timelines work. Would be fascinating if your tool could bring
| something useful to light.
| oshams wrote:
| On it!
| stevage wrote:
| Please do stevage/map-gl-utils
|
| And turfjs/turf
|
| Feedback: it's confusing that you're using the word wiki. I guess
| you mean, the style is similar to Wikipedia? But otherwise the
| concept of a wiki, an editable set of interconnected pages, seems
| irrelevant and just confusing here?
| oshams wrote:
| On it! Appreciate the feedback. We might actually make them
| editable and then that would kill the confusion.
| velox_neb wrote:
| Reading these wikis makes me feel we need to invent some visual
| convention to indicate AI-generated text. Like a particular color
| or font. This would make it so people don't feel cheated after
| they realize they just spent several minutes trying to make sense
| of something churned out by an LLM. (I mean this as a voluntary
| design enhancement for sites that want to be nice, of course
| people can always cheat.)
___________________________________________________________________
(page generated 2024-01-08 23:00 UTC)