[HN Gopher] Show HN: Transform your codebase into a single Markd...
___________________________________________________________________
Show HN: Transform your codebase into a single Markdown doc for
feeding into AI
CodeWeaver is a command-line tool designed to weave your codebase
into a single, easy-to-navigate Markdown document. It recursively
scans a directory, generating a structured representation of your
project's file hierarchy and embedding the content of each file
within code blocks. This tool simplifies codebase sharing,
documentation, and integration with AI/ML code analysis tools by
providing a consolidated and readable Markdown output.
Author : tesserato
Score : 189 points
Date : 2025-02-14 13:23 UTC (9 hours ago)
(HTM) web link (tesserato.web.app)
(TXT) w3m dump (tesserato.web.app)
| ainiriand wrote:
| My codebase sitting at 4M lines: hold my spaghetti.
| nahco314 wrote:
| This is self-promotional, but https://github.com/nahco314/feed-llm
| has a TUI to choose what to give to the LLM. There are many
| similar tools out there, but I think this approach is
| relatively effective for larger code bases.
| ycombinatornews wrote:
| You can ask Cursor to use information from specific folder (aka
| your 4M lines) and it would summarize it and use that.
|
| Not a replacement for full 4M lines but it might work for some
| tasks/prompts
| pmx wrote:
| How does this compare to / differ from
| https://github.com/yamadashy/repomix ?
| akoculu wrote:
| or https://github.com/azer/llmcat
| imdsm wrote:
| I simply have a bash script called printall which takes in
| some args, and outputs markdown codeblocks with filenames and
| a tree. One of hundreds of scripts built up over the years.
| akoculu wrote:
| if you add fzf to speed up file / folder selection, you'll
| have your own llmcat :)
| ycombinatornews wrote:
| My question exactly. Repomix seems to be a tested util for
| something like that.
| apineda wrote:
| There is also https://github.com/regenrek/codefetch which I
| personally like
| ActVen wrote:
| Same question here. I have found repomix to get the job done
| really well.
| tesserato wrote:
| Some advantages of CodeWeaver are that it is compiled, so it
| might be faster; you can grab a compatible executable from the
| releases section instead of using `go install` so, no
| dependencies. You can manually specify what to exclude via a
| comma-separated list of regular expressions so it might be more
| flexible. I never used Repomix so, those assumptions might not
| hold. On the other hand, Repomix seems to be awfully more
| complete, a full-fledged solution to convert source code to
| monolithic representations. I wrote CodeWeaver because I only
| needed something that worked and, occasionally, I could trust
| to keep sensitive data away from sketchy LLMs (And wasn't aware
| of other solutions).
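The exclusion mechanism described above (a comma-separated list of regular expressions) can be illustrated with a minimal Python sketch. This is not CodeWeaver's actual implementation, which is written in Go; the function names `compile_excludes` and `is_excluded` and the sample patterns are assumptions made up for illustration.

```python
import re


def compile_excludes(spec: str) -> list[re.Pattern]:
    """Split a comma-separated string of regular expressions and compile each part."""
    return [re.compile(part.strip()) for part in spec.split(",") if part.strip()]


def is_excluded(path: str, patterns: list[re.Pattern]) -> bool:
    """A path is skipped when any pattern matches anywhere inside it."""
    return any(p.search(path) for p in patterns)


# Hypothetical exclude list: VCS metadata, vendored packages, secret files
patterns = compile_excludes(r"\.git/, node_modules, \.env$")
paths = ["src/main.go", ".git/config", "web/node_modules/x.js", "deploy/.env"]
kept = [p for p in paths if not is_excluded(p, patterns)]
print(kept)  # ['src/main.go']
```

One caveat of splitting on commas: a pattern that itself contains a comma, such as `a{1,3}`, would be cut in two, so a real tool would need an escaping rule or a repeatable flag.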
| SillyUsername wrote:
| or https://github.com/bodo-run/yek
| fragmede wrote:
| Which like, kinda neat that it exists, but who's using tooling
| that bad that they're manually copying and pasting that much code
| into, what, a web browser text entry box?
|
| Use better tools people!
| rorytbyrne wrote:
| This seems useful for _building_ new tools. It's not strictly
| an end-user tool.
| dazzawazza wrote:
| Exactly, the LLM-RAG boffins are all over stuff like this.
| nahco314 wrote:
| I have always used o1 pro and deep research, but these are only
| available through the web UI. There is no doubt that Cursor and
| others have a better UI, but the demand for this type of tool
| exists because OpenAI does not release an API
| mkagenius wrote:
| I have one for CVEs in case there are security folks here -
| recursively finds details like code commit diff which fixed the
| vulnerability in references links too to generate one single
| json.
|
| 1. https://github.com/BandarLabs/cveingest
| rorytbyrne wrote:
| Does anyone know of tools that go the other direction? i.e.
| taking a technical writeup (scientific paper, architecture docs,
| or similar) and emitting a candidate codebase.
| elashri wrote:
| Maybe I don't understand but isn't this what you use LLMs for?
| codazoda wrote:
| I don't know of a tool but I've had some success doing this
| with a one shot short prompt. I say something like, "Here's a
| readme. Develop this in Go." Followed by the readme.
|
| I've been getting complete working code with this strategy but
| I'm creating projects that are relatively simple.
|
| I also notice that I have to give a little deeper context about
| "how" it should work, which I normally wouldn't do.
| lgas wrote:
| Yes, I often use one LLM to generate a PRD and then include it
| in the codebase, then ask Cursor agent to implement some part
| of the system using the PRD as a reference. It can't emit an
| entire codebase in one-shot (unless it's a trivial project like
| "build me a flappy bird clone") but you can use it as
| scaffolding to manage implementing a whole project in chunks.
| tecleandor wrote:
| As a note, CodeWeaver might be a confusing name, as CodeWeavers
| (the Wine development company) has existed since 1996... (
| https://en.wikipedia.org/wiki/CodeWeavers )
| teekert wrote:
| My first thought: Is this somehow using Wine?
|
| It's not mentioned on the page but is it using [0] in the
| background? Edit -> It's a Go program so I guess not.
|
| [0] https://github.com/microsoft/markitdown
| retropragma wrote:
| I really want a tool like this that can extract a function and
| its dependency graph (to a certain depth maybe, and/or exclude
| node_modules).
|
| I wrote this library [1] and hope to add the fine-grained
| "reference resolution" utility to it at some point, which could
| make implementing such a tool a lot simpler.
|
| [1]: https://github.com/aleclarson/ts-module-graph
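The idea above, pulling a function plus its dependency graph up to a certain depth while skipping `node_modules`, amounts to a depth-limited breadth-first walk. The sketch below assumes the graph has already been resolved into a plain dictionary; the graph contents and the `reachable` function are hypothetical and unrelated to ts-module-graph's API.

```python
from collections import deque


def reachable(graph: dict[str, list[str]], start: str, max_depth: int) -> dict[str, int]:
    """Breadth-first walk returning each reachable symbol and its distance from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand past the depth limit
        for dep in graph.get(node, []):
            if dep.startswith("node_modules/"):
                continue  # third-party code is left out of the extract
            if dep not in depths:
                depths[dep] = depths[node] + 1
                queue.append(dep)
    return depths


# Hypothetical, already-resolved reference graph
graph = {
    "render": ["formatDate", "fetchUser"],
    "fetchUser": ["httpGet", "node_modules/axios"],
    "httpGet": ["retry"],
}
print(reachable(graph, "render", 2))
```

With a limit of 2, `retry` (three hops away) is left out; raising the limit pulls it in. The hard part in practice is building the graph, which is what a reference-resolution utility would provide.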
| reddalo wrote:
| Unfortunate naming, given that CodeWeavers is already a company
| making a Windows "emulator" for Linux and macOS. [1]
|
| [1] https://www.codeweavers.com/
| lgas wrote:
| All names are taken. There's no need to point this out every
| time.
| Rexxar wrote:
| Some are more confusing than others.
| anamexis wrote:
| Not all names are registered trademarks for software.
| Arch-TK wrote:
| CodeWeavers are actually making wine, not just some "emulator".
| They then distribute this along with some QOL tools as a
| commercial product called CrossOver.
| Keyframe wrote:
| Wouldn't it be wonderful to have a tool where you interact with
| AI interactively through the codebase via IDE / vim / emacs tree?
| Say, you open your codebase and start with prompts and AI+tool
| navigates to a function or a place where it needs to and modifies
| stuff while chatting to you about it? Or you jump to somewhere,
| highlight where you are to scope down the focus of it (while it
| still retains all of the code in history/memory). Sort of like
| pair programming. It sounds so obvious that I'm almost sure I've
| missed that already existing somewhere. I think I tried google's
| thing (forgot the name) but it sucked / wasn't that.
| hk__2 wrote:
| I tried various solutions but I still haven't found a chat tool
| that allows me to navigate a large monorepo. I'd like to be
| able to say "open the file where there is the function to do
| <xyz>", but current tools don't understand that.
| lgas wrote:
| This works fine in Cursor. As far as I know, you can't say
| "open the file..." but you can say "where is the function to
| do <xyz>" and it'll include a link to the file in its
| response and then you can click to open it.
| zknowledge wrote:
| Apologies if I'm missing something, but aren't you describing
| Cursor/Copilot/Windsurf?
| Keyframe wrote:
| you're not. looks like that's kind of it, but would the thing
| have the context of the whole project when I'm in a
| file/class/function? With copilot, in my case, it was so far
| mostly like a fancy autocomplete that has immediate vicinity
| in its memory where it would be vastly more useful if it had
| the context of the whole project / all files.
| cjonas wrote:
| Cursor indexes the entire codebase with embeddings. It
| works well in small single app projects
| kohlerm wrote:
| it is also the "right thing to do" IMHO.
| ajoseps wrote:
| the vscode extension cline also does this
| meesles wrote:
| This doesn't sound good to me, you end up with a large codebase
| that no human has actually laid eyes on. When you get a bug
| weird enough that you can't reason the LLM through it, then
| what? What if a bug is because of interactions between two
| systems, and you don't own one of them? What if there's an
| issue due to convoluted business process failures, that just
| end in a bug report like "my data is missing!"? I honestly
| think in the latter case, the LLM will just fix a 'bug' and
| miss the forest for the trees.
|
| I prefer the idea of the other comment reply where you use AI
| as a tool to explore a codebase and assist you, not something
| you instruct to do the work. It can accelerate you building
| that experience and intuition at a level we've never been able
| to do before.
| Keyframe wrote:
| Nothing like that at all. For example I have a few codebases
| kind of large (for certain quantity of large) where I know
| the code since either I wrote it or participated heavily in.
| Talking snippets at a time loses a ton of context which would
| yield better offered solutions if you had, well.. the whole
| context.
| squeegee_scream wrote:
| I think you're describing Aider.chat. There are 2 Emacs
| packages for it, one official and a very recent fork. Aider is
| a cli so it works great with vim as well.
|
| In Emacs I've had good experience with gptel as well but I
| prefer aider for the coding workflow
| Keyframe wrote:
| I'll check it out, thanks!
| causal wrote:
| This could be a lot better. The example linked in the Github
| README is a markdown file full of binary garbage because it also
| tried to convert gzip files to markdown.
|
| Pretty big flag that this isn't ready for primetime.
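One common way to avoid dumping compressed or binary files into the output is a content sniff before embedding. The sketch below is a heuristic under stated assumptions (a NUL byte or invalid UTF-8 in the first 8 KiB means "binary"); it is not how CodeWeaver addressed the issue, and `looks_binary` is a made-up name.

```python
import gzip


def looks_binary(data: bytes, sample_size: int = 8192) -> bool:
    """Heuristic: a NUL byte or invalid UTF-8 in the first chunk suggests binary content."""
    sample = data[:sample_size]
    if b"\x00" in sample:
        return True
    try:
        sample.decode("utf-8")
    except UnicodeDecodeError as exc:
        # A multi-byte character cut off by the sample boundary is not evidence of binary
        truncated_tail = len(data) > sample_size and exc.start >= sample_size - 3
        return not truncated_tail
    return False


print(looks_binary(gzip.compress(b"hello world")))        # True
print(looks_binary("def main(): pass\n".encode("utf-8")))  # False
```

Text in legacy encodings such as Latin-1 would be flagged as binary by this check, so a real tool might fall back to an extension allowlist or a configurable encoding.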
| tesserato wrote:
| Thank you for pointing that out. Just fixed it.
| tempoponet wrote:
| A new tool like this comes out every week, and that's great! But
| I think it's fair to ask how this compares to popular ones like
| RepoMix? Anyone keeping an eye on this space will want to know
| why this is different from what's already out there and being
| used.
| tesserato wrote:
| I actually wrote this a couple of months ago, so perhaps
| nothing similar existed back then (I remember doing some
| research back then, mostly focused on VS Code plugins).
| Nevertheless, the idea was also to test how Golang could
| facilitate the distribution of such micro tools throughout the
| internal team, so I probably would have still made it. It is
| nice to know that similar tools exist. I'll take a look at
| them.
| atum47 wrote:
| Damn, I did that the other day but manually. I just cat
| everything from a folder in the order that I wanted and fed it to
| ChatGPT so it could write a README for tiny.js
| lars512 wrote:
| I've been enjoying `files-to-prompt` by Simon Willison:
| https://github.com/simonw/files-to-prompt
| franze wrote:
| here is mine
|
| https://github.com/franzenzenhofer/thisismy
|
| supports files, recursive directories, .gitignore and
| .thisismyignore and online resources / URLs + tree commands
|
| also available as a chrome extension
| https://thisismy.franzai.com/
| rapind wrote:
| Somewhat related. I built an Elm app all in one file as an
| experiment and to see if I like it. It's a little over 7k lines
| and I'm occasionally adding more to it.
|
| It's actually pretty straightforward if you're in a language with
| lexical scoping, and it simplifies some things, like includes /
| cyclical, no modules, no hunting through files, etc.
|
| I feel like this set up could integrate really well w/ AI models.
|
| I've found that the only real limitation, at least in my
| experiment, was a lack of decent editor support. I use vim so
| this wasn't really much of an issue for me with many great ways
| to navigate a file, and a combination of vertical and horizontal
| splits on a large screen, but when I opened it up in other
| "modern" editors the ergonomics fell apart quite a bit.
|
| I think the biggest downside was re-using variable names between
| large scopes occasionally made it hard to find the reference I
| wanted (E.g. i, x, key, val), but again, better editor support
| allowing you to limit your search to within the current scope
| would help. Also easily mitigated with more verbose throwaway
| variable naming.
| squeegee_scream wrote:
| I write Elm and use Emacs primarily, and sometimes neovim. Are
| you using lsp in vim? You're doing it right by staying in one
| file until it hurts, that's the recommendation for Elm, but I
| can't recall if I've had issues using go-to-def or other lsp
| functions like you're describing
| rapind wrote:
| No LSP. It honestly doesn't speed me up any. I already have
| the standard library memorized, plus some of the common
| community lib methods (List.Extra) and my typing speed is
| faster than I can think anyways.
|
| I'm thinking the same approach would also work well in F#,
| Haskell, OCaml.
| Aurornis wrote:
| > no hunting through files, etc.
|
| It's easy to switch to files by name with a few keystrokes.
| Files are names to group things I'm looking for.
|
| I would much rather do that than try to search through a 7,000
| line file for what I need.
|
| > I feel like this set up could integrate really well w/ AI
| models.
|
| Massive files or too many files break AI models. Grouping
| functionality into smaller files and including only relevant
| files is key. The file and folder names can be hints about
| where to find the right files to include.
| rapind wrote:
| > I would much rather do that than try to search through a
| 7,000 line file for what I need.
|
| I mean I'm not arguing for it as a best practice. I did it as
| an experiment (as I stated), and discovered it's actually
| really easy, and snappy for me to navigate in Vim. Mileage
| may vary with other editors. Have you tried it?
|
| > Massive files or too many files break AI models
|
| It's growing faster than I code! With the latest Gemini at
| least it's much larger at 1-2 mil tokens. I'm sure we'll hit
| a ceiling though, but I also think we may find some context
| caching / rag type optimizations eventually.
| cruffle_duffle wrote:
| The big problem with that is you'll eventually blow your
| context window feeding the model with stuff that it mostly
| doesn't need in order to complete its task.
| rapind wrote:
| I can't think of anything I'd want to add to the context for
| Elm at least, assuming the standard libraries are already in
| the model (or can be added via RAG). Gemini is 2m tokens now
| and I expect this will grow at least until it's no longer
| meaningful.
| crisbal_ wrote:
| I use the following for feeding into AI
|
|     find . -print -exec cat {} \; -exec echo \;
|
| Which will return for each file (and subfolders) the filename and
| then the content of the file.
|
| Then `| pbcopy` to copy to clipboard and paste it into ChatGPT or
| similar.
| DrPhish wrote:
| That's very nice and compact. I do the same with a short bash
| script, but wrap each file in triple-backticks and attempt to
| put the correct language label on each eg:
|
| Filename: demo.py
|
| ```python
| ...python code here...
| ```
| mbonnet wrote:
| Mind sharing the script?
| genewitch wrote:
| Seconded because just having something autowrapped like that
| and put on the clipboard would save me time: release the
| snyder cut, er, bash script!
| singpolyma3 wrote:
| I guess this only works for very small codebase?
| OsrsNeedsf2P wrote:
| Correct, but it's the same as what OP shared.
|
| You should use Aider/Cursor for proper indexing/intelligent
| codebase referencing
| boredemployee wrote:
| not sure if it's cursor's fault, but very often it doesn't
| give me the real or complete code of my codebase when auto
| editing/auto completing.
|
| any tips?
| soco wrote:
| I'm still puzzled how come people are convinced by Cursor,
| while my experience was meh at best. Can it index your
| stuff? okay it can. Can it refactor a simple function? No
| it cannot, it can't even rename a damn Java class. How can
| I trust it to generate then code based on my codebase? So,
| what is your use case then? Or can anybody point me to some
| blog/articles/videos showing some _real_ use cases for
| Cursor? Real as in, something that it provenly can do?
| risyachka wrote:
| I think you know the correct answer:)
| schaefer wrote:
| Wait, just one question...
|
| Can I call this c++ code "machine code" now?
| __mharrison__ wrote:
| Interesting. I've been converting Jupyter notebooks into markdown
| for the same purpose. Am considering making a custom tool.
| tesserato wrote:
| I also have this use case, and would be interested in such a
| tool. If you intend to write your tool in Golang, consider
| instead extending CodeWeaver.
| cjonas wrote:
| I could see this being quite useful in the background for apps
| like cursor when they need to perform a full codebase search. I
| imagine it could be more effective in breaking up larger
| codebases where embeddings start to fall out. If you could fit
| the entire document into context, you'd be able to "point the
| model" in the right direction.
|
| The challenge is maintaining it... But you'd maybe ask the model
| to do that incrementally on every commit, or just throw it away
| and regenerate from scratch occasionally.
| tribeca18 wrote:
| https://www.repoprompt.com is better. You need more granular
| control if you're planning to use this in real large codebases.
| maurycy wrote:
| find . -type f -name '*.py' -exec sh -c 'echo "# $1"; cat "$1";
| echo ""' _ {} \; | pbcopy
| lornajane wrote:
| For extra points, compile your docs into one file and feed it
| that as well.
|
| (unless the reason you're giving AI the code is that you don't
| have any docs for either humans or machines)
| squeegee_scream wrote:
| This is great, but I'm pretty sure this is trivial using Emacs
| and org mode. You could then use pandoc to convert org to
| markdown
| lgas wrote:
| It's trivial using a number of approaches, eg. a simple bash or
| python script. But I think there's still a fair amount of value
| in building a common tool for these sorts of things. Everyone
| that builds their own one off solution will inevitably
| encounter more and more of the edge cases (oh I need to honor
| .gitignore... oh, I need to be able to override .gitignore and
| include some ignored things... oh I need to deal with huge
| files... etc) and with a common tool the tool can collect the
| ways of dealing with all of these edge cases.
|
| Now no one will need something that can handle all of the edge
| cases, but whatever edge cases they need to be handled will
| already be handled. The overall time and frustration saved this
| way can be huge.
| therealmarv wrote:
| I use aider /copy-context command for that
|
| https://aider.chat/docs/usage/copypaste.html
|
| and with /paste you can apply the changes.
| beklein wrote:
| Tip: If you ever need to do this on a public GitHub repository
| you can use "gitingest".
|
| This will open a website that creates a copy of all the file
| contents of the repo (code, docs, ...) It's a great tool to use
| when using new/obscure code with LLMs in my opinion.
|
| The UX is just so easy and great, change the URL from
| <https://github.com/user_name/repo_name> to
| <https://gitingest.com/user_name/repo_name>
|
| //edit: fixed URLs
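The URL swap described above is simple enough to script. A minimal sketch, assuming only what the comment states (the host `github.com` becomes `gitingest.com` and the path stays the same); `to_gitingest` is a hypothetical helper, not part of any gitingest client.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitingest(url: str) -> str:
    """Rewrite a GitHub repository URL to the matching gitingest.com URL."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url}")
    # Only the host changes; scheme, path, query and fragment are kept as-is
    return urlunsplit(parts._replace(netloc="gitingest.com"))


print(to_gitingest("https://github.com/user_name/repo_name"))
# https://gitingest.com/user_name/repo_name
```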
| mkagenius wrote:
| I copied the UX to my https://gitpodcast.com (creates podcast
| on a github repo, same replace `hub` with `podcast`)
| skeledrew wrote:
| This is like a rediscovery of an org-mode capability that has
| existed for decades, and doesn't do as much.
| hatmatrix wrote:
| Is it? I use org-babel regularly but wasn't aware of it -
| what's the function called? As great as org-mode / org-babel
| is, the user base is too small to not be overlooked.
| ActVen wrote:
| Any unique benefits over using this vs something like Repomix?
| https://github.com/yamadashy/repomix
| tesserato wrote:
| CodeWeaver is compiled, so it might be faster. Also, you can
| grab a compatible executable from the releases section, and
| you're good to go, instead of using `go install` so, no
| dependencies. Personally, I considered following the
| `.gitignore` route but found that manually specifying what to
| exclude via a comma-separated list of regular expressions
| provided me with the flexibility I needed (initial setup might
| be a bit tedious, though, but, then again, you can use an LLM
| for that).
| resters wrote:
| See the script I created that does something similar with a few
| improvements for large projects:
|
| https://paste.mozilla.org/9rD95yAy
|
| I would like to be able to create sets of files that I can easily
| send to the clipboard in this kind of format. The files could
| correspond to the ones relevant to a particular feature, etc.
| They don't always fall under the same subtree of the source code,
| and the entire source code is too big for the context.
| roskelld wrote:
| Link says snippet deleted.
| resters wrote:
| I made a better one that lets you add the files/paths and
| refresh and copy to the clipboard:
|
| https://paste.mozilla.org/omP4EKE8
| Conasg wrote:
| I made a similar tool in Golang,
| https://github.com/foresturquhart/grimoire. It tries to be a bit
| cleverer, by prioritising files that have had many commits,
| respecting .gitignore files, and excluding useless content like
| binaries or vector images.
| tesserato wrote:
| I can think of no use case where binaries are desired in such
| representation, so I might bake binary exclusion into
| CodeWeaver as well. SVGs, on the other hand, might be wanted
| sometimes, in web design contexts. I'll take a look at your
| implementation and see what I can learn.
| franze wrote:
| thisismy has a -g option for greedy which then also takes
| binaries
| codecraze wrote:
| Nice! Written in go. I like that :)
| Alifatisk wrote:
| There is also repo2txt.simplebasedomain.com/local.html
| OsrsNeedsf2P wrote:
| This thread has convinced me that Aider/Cursor need to do more
| marketing.
| larusso wrote:
| Maybe. But maybe some like the more disconnected way of coding
| with ai.
| lgas wrote:
| Why? It's just moving more of the grunt work of shuffling
| things around to the human?
| larusso wrote:
| For me it's about still feeling in control, and the fact that
| I don't want to inject it into every workflow. I'm open to
| AI and use it daily. But my terms may be different than
| others. I want to control what I share and how. People have
| secrets and other things in a project. I sometimes rename
| things because the AI should only deal with the big
| picture. Paint me paranoid but that's how it is for me.
| esafak wrote:
| The future is not evenly distributed.
| rane wrote:
| Cursor is all the rage. Nobody talks about Aider, sadly.
| forrest321 wrote:
| I created something similar.
| https://github.com/forrest321/code2text
| megadragon9 wrote:
| I would say the demand for this kind of tool definitely exists.
| Good work! From a rough glance it looks pretty similar to another
| tool that I've been using https://github.com/mufeedvh/code2prompt
| stan_kirdey wrote:
| Nice! Built something similar in Rust that supports local and
| remote repos: https://crates.io/crates/r2md
| tesserato wrote:
| I thought of using Rust, but ultimately chose Go. I'll take a
| look and see how something similar came out in Rust!
| jdironman wrote:
| Something I didn't dig to find, but is it possible for these
| applications to also respect .gitignores? Might be a handy
| flag!
| mtrovo wrote:
| Anybody with experience of using something like this with a big
| codebase and Gemini 2M context window? I tried a while ago
| (before 2.0 Flash) to solve some refactoring tasks and even after
| spending some time on prompt wrangling I didn't manage to get
| good results out of it.
|
| I don't know what kind of agent architecture Cursor uses
| internally but it seems much better designed at finding where
| changes need to be made.
| tesserato wrote:
| In my experience with feeding large codebases to Gemini, simple
| tasks work ok (enumerate where such and such happens, find
| where a certain function is called, list TODOs throughout the
| code, etc), but tasks that require a bit more logic are
| trickier. Nevertheless, I had some success with moderately
| complex refactoring tasks in Python codebases.
| the_king wrote:
| Files-to-prompt has been a surprisingly useful tool for this kind
| of workflow.
|
| https://github.com/simonw/files-to-prompt
| narmiouh wrote:
| If I'm reading this correctly, why include all code into the
| markdown? It's almost like the AI model that would use this is
| necessarily using all concatenated code plus explanation of the
| code, I'm not sure which is better because the LLM then already
| has access to the entire code as part of markdown?
| emmelaich wrote:
| Is this related to https://gitingest.com/ at all? Which seems to
| be a service doing a similar thing.
| tesserato wrote:
| It is not. Others have commented pointing to services similar
| to this one, though.
| BoorishBears wrote:
| There are a ridiculous number of projects doing this.
|
| I'm always baffled by the response they get since doing this is
| also the most impractical, poorly scaling, way to insert an LLM
| into your development process.
|
| On one hand if you realize that, there may be times where you
| get lucky with the size of a codebase and the nature of your
| questions and it works acceptably.
|
| But on the other, this feels like the kind of thing someone
| who's hearing others rave about the utility of AI will try with
| too large of a codebase, insert the result into ChatGPT, and
| then get an LLM underperforming because it's being flooded with
| irrelevant context for every basic operation it's being asked
| to do.
|
| There are very few times when providing the entire codebase in
| the context window instead of the relevant code to a single
| operation makes sense.
| thesurlydev wrote:
| I literally just wrote something similar called techdocs[1] in
| Rust and uses Claude to generate a README. It includes API and
| CLI.
|
| [1] https://github.com/thesurlydev/techdocs
| tesserato wrote:
| Nice! I thought of using Rust. I'll check how you implemented
| it.
| strizzo wrote:
| There's ClipSource for VSCode that does this
| strizzo wrote:
| I made the same but for VSCode two weeks ago, called it
| ClipSource it's in the extensions marketplace
| https://marketplace.visualstudio.com/items?itemName=Strizzo....
| You can right click on a directory in the workspace and copy all
| content in markdown
| mmanfrin wrote:
| I built a simple tool to do something similar. It's meant for a
| monorepo and will build each subfolder into a (subfolder-code.txt)
| text file that you can upload to AIs.
|
| https://github.com/manfrin/bundle-codebases
|
| I don't see much merit in things like markdown or syntax
| highlighting as that's just extra noise for the AI. My script
| tries to cut down on any extraneous data since the things I'm
| working on are near the context limit of consumer AIs.
|
| My script also ignores anything in .gitignore and will take a
| .codebundlerwhitelist (i hate this name and have meant to change
| it) to only bundle files matching patterns you specify.
| antirez wrote:
| Not just extra noise, but also extra tokens.
| mmanfrin wrote:
| Exactly.
| sandGorgon wrote:
| how does this compare to code2prompt or files2prompt ? any
| benchmarks on which one works better for LLMs ?
| adityamwagh wrote:
| Another alternative is Gitingest [0]. What are the differences?
|
| [0] https://gitingest.com/
| Terretta wrote:
| _CodeWeavers is a software company that focuses on Wine
| development and sells a proprietary version of Wine called
| CrossOver for running Windows applications on macOS, ChromeOS and
| Linux._
|
| https://en.wikipedia.org/wiki/CodeWeavers
|
| Trademark is active. It's an r not just a (tm), registered not
| just trademarked. To keep it, they have to demonstrate they
| defend it.
|
| https://www.trademarkia.com/codeweavers-76546826
|
| While this project drops the final "s", you don't get to launch
| an OS called "Window". The test is a fuzzy match based on
| likelihood of confusion.
| jychang wrote:
| Yeah, I was thinking "what do the Wine guys have to do with
| this?"
|
| This project is definitely going to get C&D'd.
| _puk wrote:
| Whilst the pendulum seems well on its way to be swinging from
| microservices back to monoliths, I'm thinking we'll end up in a
| place that limits the volume and complexity of the code in a
| single service so that it's just large enough to encompass a
| point of single responsibility.
|
| Then we can easily drop in and out of using LLMs in the code
| space.
|
| Service Oriented Architecture lends itself well to the limited
| context of these models.
|
| Maybe we can revive literate programming and simply build
| everything from a single markdown document..
| azthecx wrote:
| Microservices lend themselves to architectural decisions that
| LLMs are just not trained to understand.
|
| It's one thing for it to be trained on billions of LOC and be
| useful; it's another for it to have a dataset of high enough
| quality to understand something like Kafka partition ordering and
| its possible interactions with something like a database and
| at-least-once delivery. It will give you an explanation of those
| things in isolation, but not in combination.
| nunodonato wrote:
| This kind of context is really useful for LLMs, but in any
| significant project, including all code in this manner will
| easily exceed context limitations. I've been wanting to do
| something like this for my PHP projects, but instead of dumping
| entire files, I would just create a map of each file's method
| signatures, variables, etc. That should give good enough
| information of what each file is used for and can do, while being
| small enough to be ingested by AI.
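The "map of signatures instead of full files" idea can be sketched with a parser that keeps definitions and drops bodies. The comment is about PHP; the example below swaps in Python source and the standard `ast` module (Python 3.9+ for `ast.unparse`) purely for illustration, and `signature_map` is a made-up name.

```python
import ast


def signature_map(source: str) -> list[str]:
    """Return one line per class and function definition, without bodies."""
    lines: list[str] = []

    def visit(node: ast.AST, indent: int) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                lines.append("  " * indent + f"class {child.name}")
                visit(child, indent + 1)  # descend to list the class's methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                args = ast.unparse(child.args)
                returns = f" -> {ast.unparse(child.returns)}" if child.returns else ""
                lines.append("  " * indent + f"def {child.name}({args}){returns}")

    visit(ast.parse(source), 0)
    return lines


sample = '''
class Invoice:
    def total(self) -> float:
        return sum(self.lines)

    def add_line(self, amount: float):
        self.lines.append(amount)

def load(path: str) -> Invoice:
    return Invoice()
'''
print("\n".join(signature_map(sample)))
```

The output is a few lines per file regardless of how long the bodies are, which is the point: the model sees what each file offers without paying for the implementation.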
| panarky wrote:
| _> including all code in this manner will easily exceed context
| limitations_
|
| The context window for Gemini 2.0 Flash can handle roughly
| 50000 lines of code, and 2.0 Pro can handle twice that.
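Whether a dump fits a given context window can be estimated before pasting it anywhere. The sketch below assumes a rough rule of thumb of about 4 characters per token; real tokenizers vary by model and by language, so both the ratio and the function names are assumptions, not properties of any particular model.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate from character count (assumed ratio, not a tokenizer)."""
    return int(len(text) / chars_per_token)


def fits(text: str, context_tokens: int, reserve_tokens: int = 0) -> bool:
    """True when the estimate plus a reserve for the prompt and reply fits the window."""
    return estimate_tokens(text) + reserve_tokens <= context_tokens


# 50,000 lines of 40 characters each (39 characters plus a newline)
dump = ("a" * 39 + "\n") * 50000
print(estimate_tokens(dump))  # 500000
```

Under these assumptions, 50,000 lines averaging 40 characters come to roughly 500,000 tokens; denser or longer lines shift that figure considerably, so using the target model's own tokenizer is the only reliable check.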
| novemp wrote:
| How do you do the opposite of this? Transform your markdown files
| into a codebase that AI can't leech off of?
| hatmatrix wrote:
| Such a functionality would be useful for developing some scripts
| and then converting to a Quarto document [1].
|
| [1] https://quarto.org/
| mbonnet wrote:
| Second hooray for Quarto. Great tool.
___________________________________________________________________
(page generated 2025-02-14 23:00 UTC)