[HN Gopher] Show HN: Replace "hub" by "ingest" in GitHub URLs fo...
       ___________________________________________________________________
        
       Show HN: Replace "hub" by "ingest" in GitHub URLs for a prompt-
       friendly extract
        
       Gitingest is an open-source micro dev-tool that I made over the last
       week. It turns any public GitHub repository into a text extract
       that you can easily give to your favourite LLM. Today I added this
       URL trick to make it even easier to use!

       How I use it myself:
       - Quickly generate a README.md boilerplate for a project
       - Ask LLMs questions about an undocumented codebase

       It is still very much a work in progress, and I plan to add many
       more options (file size limits, exclude patterns...) and a public
       API.

       I hope this tool can help you. Your feedback is very valuable to
       help me prioritize, and contributions are welcome!
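
The URL trick from the title can be sketched in a few lines of Python. This is only an illustration of the rewrite itself, assuming the path is carried over unchanged; the function name `github_to_gitingest` is hypothetical and not part of gitingest.

```python
from urllib.parse import urlsplit, urlunsplit


def github_to_gitingest(url: str) -> str:
    """Swap the github.com host for gitingest.com, keeping the rest of the URL."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url}")
    # Only the host changes; scheme, path, query and fragment are kept as-is.
    return urlunsplit(
        (parts.scheme, "gitingest.com", parts.path, parts.query, parts.fragment)
    )


print(github_to_gitingest("https://github.com/cyclotruc/gitingest"))
# prints https://gitingest.com/cyclotruc/gitingest
```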
        
       Author : cyclotruc
       Score  : 143 points
       Date   : 2024-12-05 15:24 UTC (7 hours ago)
        
 (HTM) web link (gitingest.com)
 (TXT) w3m dump (gitingest.com)
        
       | Exuma wrote:
       | Isn't there a limit on prompt size? How would you actually use
       | this? I'm not very up to speed on this stuff.
        
         | lolinder wrote:
         | Most projects would be way too big to put into a prompt. Even
         | if you're technically within the official context window, those
         | figures are often misleading: the window in which input is
         | actually useful is usually much smaller than advertised.
         | 
         | What you can do with something like this is store it in a
         | database and then query it for relevant chunks, which you then
         | feed to the LLM as needed.
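
A minimal sketch of the chunk-and-query idea described above. It is an assumption-heavy stand-in: an in-memory list replaces the database, and plain keyword overlap replaces whatever retrieval method (for example embeddings) a real setup would use. The names `chunk_lines` and `top_chunks` are hypothetical.

```python
import re
from collections import Counter


def chunk_lines(text: str, lines_per_chunk: int = 20) -> list[str]:
    """Split an extract into fixed-size groups of lines."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


def tokenize(text: str) -> Counter:
    """Lowercase word counts; identifiers such as fetch_stars stay whole."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def top_chunks(chunks: list[str], query: str, k: int = 3) -> list[str]:
    """Return up to k chunks sharing the most words with the query."""
    query_words = tokenize(query)
    scored = []
    for index, chunk in enumerate(chunks):
        chunk_words = tokenize(chunk)
        score = sum(min(chunk_words[w], query_words[w]) for w in query_words)
        if score > 0:
            scored.append((score, index, chunk))
    # Highest score first; ties keep the original chunk order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:k]]
```

Only the chunks returned by `top_chunks` would then be pasted into the prompt, rather than the whole extract.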
        
           | jackstraw14 wrote:
           | Ideally let the LLM chunk it up and figure out when to use
           | those chunks.
        
           | tom1337 wrote:
           | I wonder about building a local version of this that resolves
           | the dependency paths of the file you're currently working on,
           | down to a certain depth, so the LLM gains context from related
           | files instead of just the whole repo (which could be insane if
           | you use a monorepo).
        
         | xnx wrote:
         | Gemini Pro has a 2 million character context window which is
         | ~1000 pages of code.
        
       | modelorona wrote:
       | Very cool! I will try this over the weekend with a new Android
       | app to see what kind of README I can generate.
       | 
       | Do you have any plans to expand it?
        
         | cyclotruc wrote:
         | Yes, I want to add a way to target a token count so you can
         | control your LLM costs.
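
One way such a token target could work, sketched under loud assumptions: the 4-characters-per-token ratio is a rough heuristic rather than a real tokenizer, and the smallest-files-first policy is just one possible choice. Nothing here reflects how gitingest implements or plans to implement it; both function names are hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate (assumed ratio), rounded up."""
    return -(-len(text) // chars_per_token)


def select_within_budget(files: dict[str, str], max_tokens: int) -> list[str]:
    """Pick files smallest-first until the estimated budget would be exceeded."""
    chosen: list[str] = []
    used = 0
    for path, body in sorted(files.items(), key=lambda kv: (len(kv[1]), kv[0])):
        cost = estimate_tokens(body)
        if used + cost > max_tokens:
            break  # every remaining file is at least as large
        chosen.append(path)
        used += cost
    return chosen
```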
        
       | matt3210 wrote:
       | The example buttons are a nice touch
        
       | spencerchubb wrote:
       | GitHub already has a way to get the raw text files.
        
         | barbazoo wrote:
         | All of them in one operation? How?
        
           | johnisgood wrote:
           | I think he's confusing this with the "plain" or "raw" view of
           | a file, so probably not all of them.
        
       | cyclotruc wrote:
       | Hey! OP here: gitingest is getting a lot of love right now, sorry
       | if it's unstable but please tell me what goes wrong so I can fix
       | it!
        
       | nfilzi wrote:
       | Looks neat! From what I understood, it's like zipping up your
       | codebase in a streamlined TXT version for LLMs to ingest better?
       | 
       | What would you say are the differences from using something like
       | Cursor, which has access to your codebase already?
        
         | cyclotruc wrote:
         | It's in the same lane, but sometimes you need a quick and
         | handy way to get that streamlined TXT from a public repo
         | without leaving your browser.
        
       | anamexis wrote:
       | It seems to be broken, getting errors like "Error processing
       | repository: Path ../tmp/pallets-flask does not exist"
        
         | cyclotruc wrote:
         | Thank you, I'll look into it
        
       | ComputerGuru wrote:
       | Instead of a copy icon, it would be better to just generate the
       | entire content as plaintext in the result (not in an HTML div on
       | a rich HTML page) so the entire URL could be used as an
       | attachment or its contents piped directly into an agent/tool.
       | 
       | Ctrl-a + ctrl-c would remain fast.
        
         | vallode wrote:
         | Agreed, it's a missed opportunity not to be able to change a URL
         | from github.com/cyclotruc/gitingest to
         | gitingest.com/cyclotruc/gitingest and simply receive the result
         | as plain text. A very useful little tool nonetheless.
        
           | cyclotruc wrote:
           | Yeah I'm going to do that very soon with the API :)
        
         | wwoessi wrote:
         | for that you can use https://uithub.com (g -> u)
         | 
         | - for browsers it shows HTML
         | - for curl it returns raw text
        
       | prophesi wrote:
       | Since the site was hugged to death by HN, this appears to be the
       | repo[0] for anyone wanting to run it locally.
       | 
       | [0] https://github.com/cyclotruc/gitingest
        
         | bryant wrote:
         | and of course, using the repo as an input for the service
         | renders this[1]
         | 
         | [1] https://gitingest.com/cyclotruc/gitingest
        
           | mdaniel wrote:
           | // Fetch stars when page loads
           | fetchGitHubStars();
           | 
           | I _do not_ understand why in the world so much of the code is
           | related to poking the GH api to fetch the star count
        
             | cyclotruc wrote:
             | I know the code is not great, but contributions are very
             | much welcome because there's a lot of low-hanging fruit.
        
             | johnisgood wrote:
             | Probably generated by AI, prompted by a non-developer or a
             | junior dev. This is my opinion, of course, but it looks like
             | code generated by an LLM.
        
       | moralestapia wrote:
       | This is really nice, congrats on shipping.
       | 
       | I also really like this idea in general of APIs being domains,
       | eventually making the web a giant supercomputer.
       | 
       | Edit: There is literally nothing wrong with this comment but feel
       | free to keep downvoting, only 5,600 clicks to go!
        
       | Mockapapella wrote:
       | https://uithub.com is also a good one for this. They also have an
       | API with more options.
        
       | fastball wrote:
       | Might be good to have some filtering as well. I added a repo that
       | has a heap of localized docs that don't make much sense to ingest
       | into an LLM but probably use up a majority of the tokens.
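
The kind of filtering suggested here (and the "exclude patterns" option mentioned in the original post) could look roughly like the sketch below. It assumes shell-style glob patterns matched against repo-relative paths; `filter_paths` is a hypothetical helper, not an existing gitingest option.

```python
from fnmatch import fnmatchcase


def filter_paths(paths: list[str], exclude_patterns: list[str]) -> list[str]:
    """Drop every path that matches at least one shell-style pattern.

    Note that fnmatch's `*` also matches `/`, so "docs/fr/*" excludes
    everything below docs/fr, however deeply nested.
    """
    return [
        path
        for path in paths
        if not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
    ]


paths = ["src/main.py", "docs/fr/index.md", "docs/en/index.md", "README.md"]
print(filter_paths(paths, ["docs/fr/*"]))
# prints ['src/main.py', 'docs/en/index.md', 'README.md']
```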
        
       | Cedricgc wrote:
       | Does this use the txtar format created for developing the Go
       | language?
       | 
       | I actually use txtar with a custom CLI to quickly copy multiple
       | files to my clipboard and paste them into an LLM chat. I try not
       | to get too far from the chat paradigm so I can stay flexible
       | about which LLM provider I use.
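
For readers who have not seen txtar: an archive is an optional free-text comment followed by files, each introduced by a `-- name --` marker line. The sketch below writes that layout in a simplified way; it is not the commenter's CLI, `to_txtar` is a hypothetical name, and edge cases (such as file contents that themselves contain marker lines) are ignored. The thread does not say whether gitingest uses this format.

```python
def to_txtar(files: dict[str, str], comment: str = "") -> str:
    """Serialize files into a simplified txtar-style archive."""
    out: list[str] = []
    if comment:
        out.append(comment if comment.endswith("\n") else comment + "\n")
    for name, body in files.items():
        out.append(f"-- {name} --\n")
        # Each file's data ends with a newline so the next marker
        # starts on its own line.
        if body and not body.endswith("\n"):
            body += "\n"
        out.append(body)
    return "".join(out)


print(to_txtar({"a.txt": "hello", "b/c.py": "print(1)\n"}), end="")
```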
        
         | maleldil wrote:
         | If I understand correctly, this sounds like
         | https://github.com/simonw/files-to-prompt/.
         | 
         | It's quite useful, with some filtering options (hidden files,
         | gitignore, extensions) and support for Claude-style tags.
        
       | Fokamul wrote:
       | Nothing against gitingest.com, but this is really the peak of
       | technology. Having LLMs that require feeding them info with
       | copy&paste, the peak of efficiency too. OMFG.
        
       | wwoessi wrote:
       | Hi, great tool!
       | 
       | I made https://uithub.com 2 months ago. Its speciality is the
       | fact that seeing a repo's raw extract is a matter of changing 'g'
       | to 'u'. It also works for subdirectories, so if you just want the
       | docs of Upstash QStash, for example, just go to
       | https://uithub.com/upstash/docs/tree/main/qstash
       | 
       | Great to see this keeps being worthwhile!
        
         | Arcuru wrote:
         | That looks awesome. You didn't mention it, but uithub.com also
         | has an API. I can definitely see myself using this for a new
         | tool.
        
       | nonethewiser wrote:
       | I implemented this same idea in bash for local use. Useful but
       | only up to a certain size of codebase.
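
The commenter's bash script is not shown, so the following is only a guess at the general shape of a local version, written in Python: walk a directory, skip `.git`, oversized files and anything that is not valid UTF-8, and concatenate the rest under path headers. The `=====` header style, the `max_bytes` default and the function name are all assumptions.

```python
from pathlib import Path


def ingest_directory(root: str, max_bytes: int = 100_000) -> str:
    """Concatenate the readable text files under root into one extract."""
    root_path = Path(root)
    sections: list[str] = []
    for path in sorted(root_path.rglob("*")):
        relative = path.relative_to(root_path)
        if not path.is_file() or ".git" in relative.parts:
            continue
        if path.stat().st_size > max_bytes:
            continue  # crude size limit to keep the extract small
        try:
            body = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # treat undecodable files as binary and skip them
        sections.append(f"===== {relative.as_posix()} =====\n{body}")
    return "\n".join(sections)
```

Calling `ingest_directory(".")` from a project root returns a single string that can be copied into a chat, which is where the size ceiling mentioned above starts to matter.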
        
       | lukejagg wrote:
       | Is Unicode really the best way to display the file structure?
       | The special Unicode characters are encoded into 2 tokens, so I
       | doubt it would function better overall for larger repos.
        
         | shawnz wrote:
         | Also, even if different characters were used, the 2D ASCII art
         | style representation of the directory tree in general strikes
         | me as something that's not going to be easily interpreted by an
         | LLM, which might not have a conception of how characters are
         | laid out in 2D space.
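
As an illustration of an alternative to box-drawing characters, the sketch below renders the same information as a plain indented outline using only spaces and a trailing `/` for directories. Whether this is cheaper in tokens or easier for a model to interpret depends on the tokenizer and model and is not measured here; `indented_tree` is a hypothetical helper.

```python
def indented_tree(paths: list[str], indent: str = "  ") -> str:
    """Render slash-separated paths as an indented outline."""
    lines: list[str] = []
    seen: set[tuple[str, ...]] = set()
    for path in sorted(paths):
        parts = path.split("/")
        for depth in range(len(parts)):
            prefix = tuple(parts[: depth + 1])
            if prefix in seen:
                continue  # this directory was already printed
            seen.add(prefix)
            is_dir = depth < len(parts) - 1
            lines.append(f"{indent * depth}{parts[depth]}{'/' if is_dir else ''}")
    return "\n".join(lines)


print(indented_tree(["src/app.py", "src/utils/io.py", "README.md"]))
# README.md
# src/
#   app.py
#   utils/
#     io.py
```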
        
       ___________________________________________________________________
       (page generated 2024-12-05 23:00 UTC)